--- On Wed, 1/27/10, Antoine Pitrou <solip...@pitrou.net> wrote:

> From: Antoine Pitrou <solip...@pitrou.net>
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: python-dev@python.org
> Date: Wednesday, January 27, 2010, 6:15 AM
> Nick Coghlan <ncoghlan <at>
> gmail.com> writes:
> > 
> > The big practical concern is actually the memory cost
> of adding yet
> > another pointer (potentially 8 bytes of data) to every
> list object, even
> > empty ones.
> 
> It needn't be, actually. Why?
> You only need to store this other pointer when you have an
> orphaned area in the
> first place. But then you can store the pointer *in* the
> orphaned area :-)
> 
> So let's say for example:
> - when ob_size >= 0, there are no orphans
> - when ob_size < 0, the pointer to the orphaned area is
> given by 
>   (PyObject **) self->ob_items[-1]
> 
> This makes a couple of micro-operations slightly slower
> (you need to take
> ob_size's absolute value to get the length (*)); and it
> kills code which uses
> Py_SIZE() on lists. But PyList_GET_SIZE() exists for a
> reason.
> 
> (*) or you could use ob_size's MSB, and then a simple
> binary AND gets you the
> length; or even ob_items's LSB, since the pointer will
> always be aligned on more
> than 1 byte
> 
> (and, yes, this is a bit insane)
> 

A slightly more sane alternative would be to leave ob_size and ob_item alone 
with their current semantics, and then replace allocated with self->excess, 
where

  self->excess == excess_above * 256 + excess_below.

Right now self->allocated is only used in a few places:
  
  list_resize
  listextend
  listsort
  list_init (but only in an assert)
  list_sizeof

Those places would need to compute:

  allocated = self->ob_size + self->excess / 256 + self->excess % 256

The right hand side could be done with shifting/bitmasking inside a macro.

Excess_left would obviously not be allowed to exceed 256; excess_right would 
max out at PY_SIZE_MAX / 256, instead of PY_SIZE_MAX / 8 + 6, which is probably 
the right thing anyway.

Here is the current algorithm for allocating extra bytes on the right:

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

On a 32-bit platform, excess_right would max out at 16M instead of the current 
512M.

Any method that wanted to find the originally allocated pointer would compute

  self->ob_item - (self->excess % 256)

To the extent that most of the ugly details could be encapsulated in macros, I 
do not think this would complicate the list code itself greatly, but it would 
complicate debugging.  Guido has expressed a strong desire to keep a hard 
pointer to the originally allocated chunk, especially with regard to future 
garbage collection changes.


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to