Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

Raymond Hettinger Wed, 27 Jan 2010 15:51:04 -0800

On Jan 27, 2010, at 2:56 PM, Steve Howell wrote:

> 
> --- On Wed, 1/27/10, Raymond Hettinger <raymond.hettin...@gmail.com> wrote:
> 
>> From: Raymond Hettinger <raymond.hettin...@gmail.com>
>> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
>> To: "Martin v. Löwis" <mar...@v.loewis.de>
>> Cc: "Steve Howell" <showel...@yahoo.com>, python-dev@python.org
>> Date: Wednesday, January 27, 2010, 1:49 PM
>> 
>> On Jan 27, 2010, at 11:38 AM, Martin v. Löwis wrote:
>> 
>>>> Lists are indeed used *everywhere* and *vast*
>> quantities, which is
>>>> why optimizations on list operations are
>> worthwhile to pursue.
>>> 
>>> Unfortunately, the proposed change is a pessimisation,
>> not an
>>> optimisation. We probably shouldn't discuss it
>> further, at least not
>>> until a PEP gets written by its proponents.
>>> 
>> 
>> I concur (on both points).
>> 
>> AFAICT, the performance of the proposal:
>> 
>> * increases space requirements by a small fixed amount
>> 
>> * s.append() performance slightly degraded.
>> 
>> * the s.insert(0, value) case isn't helped -- still O(n)
>> 
>> * repeated s.pop(0) have amortized O(1) but either
>>    needs to waste space indefinitely (likely
>> not what
>>    the programmer intended by popping off
>> values)
>>    or needs to do occasional memmoves (which
>> isn't
>>    as good as a deque which never moves the
>> data).
>> 
>> * the resize performance doesn't work well with the
>>    memory allocator -- while a series of
>> appends can often
>>    resize in-place without a need to do an
>> O(n) memmove,
>>    but a series of pop(0) calls doesn't have
>> a resize in-place
>>    option.
>> 
>> What currently unknown is the effect on third-party
>> extensions
>> and debugging tools that access the structure directly.
>> Also, am not sure if this affects psyco or the other 
>> implementations such as Jython which may implement
>> lists in terms of existing Java containers.
>> 
>> ISTM, the *only* benefit is that an occasional s.pop(0)
>> will perform better than it does now but not as well
>> as a deque which never has to move data).
>> 
>> 
> 
> I agree with most of what's said above, with a few clarifications.
> 
> The speedup is not limited to pop(0); it also considerably speeds up code 
> like below:
> 
> n = 800000
> for i in range(n):
>       x = lst[:10]
>       del lst[:10]
>


For this code, the slicing notation is nicer than the equivalent pop 
one-at-a-time code for deques.

The underlying performance isn't a good though -- a deque would free block as 
they become available and would never call a memmove.

The use case is a bit weird.  If people needed something like this, we would 
already see it in reverse order:

   lst.reverse()   # then delete slices from the right

ISTM, that there aren't actually any use cases for left popping in the absence 
of right appending or left inserting; otherwise you could just retrieve the 
slices directly (without doing any deletes).   In cases where the list is 
growing on one end and shrinking on another, a deque wins because it frees 
memory readily and doesn't need memmoves.

> Understood that it is a contrived benchmark, and real code like that could be 
> replaced with a technique that Nones out the used up elements and advances a 
> start pointer.  

Right.


> 
> Since there is a lot of discussion about tradeoffs, I would remind folks that 
> asking somebody to use a deque instead of a list also forces tradeoffs; you 
> lose the comfort and familiarity of lists (not to be underestimated) as well 
> as some features (you can't spell d[:10] as d[:10] for example).


If it were actually needed, there is no reason deques couldn't support slicing 
notation.  But then, I've never had a request for it ever (and users typically 
haven't been shy about asking for what they want).

The familiarity argument doesn't hold much water for two reasons:

* deques seem to have a nearly zero learning curve.  There have been no posts 
or reports on people being challenged by the API.  The *only* issue that has 
ever arisen is that a fair number of people have heard of a queue but not heard 
of the term, deque.

* changing the list implementation makes it harder to decide which data 
structure to use.  Currently, it is simple -- if you need to append or pop on 
the left, use a deque; otherwise, use a list.  With the proposed change, the 
performance trade-offs are harding to understand (you can do pop(0) but not 
insert(0,v) unless you've popped first but not after a resize, a deque is 
better is most cases but it is hard to describe real use cases with list.pop(0) 
would be preferable).

* the current design encourages people to use the right data structure for a 
given need.  the proposed change makes the trades-off murky and implementation 
dependent.  you have to know a lot more in order to make the right choices.  
that's not good for usability.  we want tools that are easy to use 
correctly/well.

Raymond

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

Reply via email to