--- On Wed, 1/27/10, Antoine Pitrou <solip...@pitrou.net> wrote: > From: Antoine Pitrou <solip...@pitrou.net> > Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time > To: python-dev@python.org > Date: Wednesday, January 27, 2010, 6:15 AM > Nick Coghlan <ncoghlan <at> > gmail.com> writes: > > > > The big practical concern is actually the memory cost > of adding yet > > another pointer (potentially 8 bytes of data) to every > list object, even > > empty ones. > > It needn't be, actually. Why? > You only need to store this other pointer when you have an > orphaned area in the > first place. But then you can store the pointer *in* the > orphaned area :-) > > So let's say for example: > - when ob_size >= 0, there are no orphans > - when ob_size < 0, the pointer to the orphaned area is > given by > (PyObject **) self->ob_items[-1] > > This makes a couple of micro-operations slightly slower > (you need to take > ob_size's absolute value to get the length (*)); and it > kills code which uses > Py_SIZE() on lists. But PyList_GET_SIZE() exists for a > reason. > > (*) or you could use ob_size's MSB, and then a simple > binary AND gets you the > length; or even ob_items's LSB, since the pointer will > always be aligned on more > than 1 byte > > (and, yes, this is a bit insane) >
A slightly more sane alternative would be to leave ob_size and ob_item alone with their current semantics, and then replace allocated with self->excess, where self->excess == excess_above * 256 + excess_below. Right now self->allocated is only used in a few places: list_resize listextend listsort list_init (but only in an assert) list_sizeof Those places would need to compute: allocated = self->ob_size + self->excess / 256 + self->excess % 256 The right hand side could be done with shifting/bitmasking inside a macro. Excess_left would obviously not be allowed to exceed 256; excess_right would max out at PY_SIZE_MAX / 256, instead of PY_SIZE_MAX / 8 + 6, which is probably the right thing anyway. Here is the current algorithm for allocating extra bytes on the right: new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6); On a 32-bit platform, excess_right would max out at 16M instead of the current 512M. Any method that wanted to find the originally allocated pointer would compute self->ob_item - (self->excess % 256) To the extent that most of the ugly details could be encapsulated in macros, I do not think this would complicate the list code itself greatly, but it would complicate debugging. Guido has expressed a strong desire to keep a hard pointer to the originally allocated chunk, especially with regard to future garbage collection changes. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com