On Jan 27, 2010, at 2:56 PM, Steve Howell wrote:
>
> --- On Wed, 1/27/10, Raymond Hettinger <raymond.hettin...@gmail.com> wrote:
>
>> From: Raymond Hettinger <raymond.hettin...@gmail.com>
>> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
>> To: "Martin v. Löwis" <mar...@v.loewis.de>
>> Cc: "Steve Howell" <showel...@yahoo.com>, python-dev@python.org
>> Date: Wednesday, January 27, 2010, 1:49 PM
>>
>> On Jan 27, 2010, at 11:38 AM, Martin v. Löwis wrote:
>>
>>>> Lists are indeed used *everywhere* and *vast*
>> quantities, which is
>>>> why optimizations on list operations are
>> worthwhile to pursue.
>>>
>>> Unfortunately, the proposed change is a pessimisation,
>> not an
>>> optimisation. We probably shouldn't discuss it
>> further, at least not
>>> until a PEP gets written by its proponents.
>>>
>>
>> I concur (on both points).
>>
>> AFAICT, the performance of the proposal:
>>
>> * increases space requirements by a small fixed amount
>>
>> * s.append() performance slightly degraded.
>>
>> * the s.insert(0, value) case isn't helped -- still O(n)
>>
>> * repeated s.pop(0) have amortized O(1) but either
>> needs to waste space indefinitely (likely
>> not what
>> the programmer intended by popping off
>> values)
>> or needs to do occasional memmoves (which
>> isn't
>> as good as a deque which never moves the
>> data).
>>
>> * the resize performance doesn't work well with the
>> memory allocator -- while a series of
>> appends can often
>> resize in-place without a need to do an
>> O(n) memmove,
>> but a series of pop(0) calls doesn't have
>> a resize in-place
>> option.
>>
>> What currently unknown is the effect on third-party
>> extensions
>> and debugging tools that access the structure directly.
>> Also, am not sure if this affects psyco or the other
>> implementations such as Jython which may implement
>> lists in terms of existing Java containers.
>>
>> ISTM, the *only* benefit is that an occasional s.pop(0)
>> will perform better than it does now but not as well
>> as a deque which never has to move data).
>>
>>
>
> I agree with most of what's said above, with a few clarifications.
>
> The speedup is not limited to pop(0); it also considerably speeds up code
> like below:
>
> n = 800000
> for i in range(n):
> x = lst[:10]
> del lst[:10]
>
For this code, the slicing notation is nicer than the equivalent pop
one-at-a-time code for deques.
The underlying performance isn't a good though -- a deque would free block as
they become available and would never call a memmove.
The use case is a bit weird. If people needed something like this, we would
already see it in reverse order:
lst.reverse() # then delete slices from the right
ISTM, that there aren't actually any use cases for left popping in the absence
of right appending or left inserting; otherwise you could just retrieve the
slices directly (without doing any deletes). In cases where the list is
growing on one end and shrinking on another, a deque wins because it frees
memory readily and doesn't need memmoves.
> Understood that it is a contrived benchmark, and real code like that could be
> replaced with a technique that Nones out the used up elements and advances a
> start pointer.
Right.
>
> Since there is a lot of discussion about tradeoffs, I would remind folks that
> asking somebody to use a deque instead of a list also forces tradeoffs; you
> lose the comfort and familiarity of lists (not to be underestimated) as well
> as some features (you can't spell d[:10] as d[:10] for example).
If it were actually needed, there is no reason deques couldn't support slicing
notation. But then, I've never had a request for it ever (and users typically
haven't been shy about asking for what they want).
The familiarity argument doesn't hold much water for two reasons:
* deques seem to have a nearly zero learning curve. There have been no posts
or reports on people being challenged by the API. The *only* issue that has
ever arisen is that a fair number of people have heard of a queue but not heard
of the term, deque.
* changing the list implementation makes it harder to decide which data
structure to use. Currently, it is simple -- if you need to append or pop on
the left, use a deque; otherwise, use a list. With the proposed change, the
performance trade-offs are harding to understand (you can do pop(0) but not
insert(0,v) unless you've popped first but not after a resize, a deque is
better is most cases but it is hard to describe real use cases with list.pop(0)
would be preferable).
* the current design encourages people to use the right data structure for a
given need. the proposed change makes the trades-off murky and implementation
dependent. you have to know a lot more in order to make the right choices.
that's not good for usability. we want tools that are easy to use
correctly/well.
Raymond
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com