--- On Mon, 1/25/10, Benjamin Peterson <[email protected]> wrote:
> From: Benjamin Peterson <[email protected]>
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" <[email protected]>
> Cc: [email protected]
> Date: Monday, January 25, 2010, 3:15 PM
> 2010/1/25 Steve Howell <[email protected]>:
> >> From: Raymond Hettinger <[email protected]>
> >>
> >> On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:
> >>
> >> >
> >> > Deque does not support all the operations
> that list
> >> does. It is also roughly twice as slow for
> accessing
> >> elements (I've measured it).
> >>
> >>
> >> ISTM that apps that want to insert or pop from the
> front of
> >> list are also apps that don't care about accessing
> arbitrary
> >> elements in the middle using the position index.
> When
> >> lists are growing or shrinking from the front, the
> meaning
> >> of the i-th element changes. So, it doesn't
> >> make sense for an application to track indices of
> objects in
> >> such a list.
> >>
> >> i = s.find('abc')
> >> s.pop(0)
> >> print s[i] # i no longer
> >> points at 'abc'
> >>
> >
> > I am not going to directly address your point, but I'd
> like to give a examples of code that uses pop(0) from the
> standard library.
> >
> > If you look at the code for
> multiprocessing/connection.py, you will see that
> PipeListener creates _handle_queue as an ordinary Python
> list, and in line 317 it uses pop(0) to pop the first handle
> off the top of the queue.
> >
> > Why does that code not use a deque? In hindsight, it
> probably should. But to make the change now, it's not a
> simple matter of fixing just PipeListener, because
> PipeListener passes off _handle_queue to Finalize, which
> also expects a list.
> >
> > In order to understand why Finalize expects a list,
> you need to look at how it uses args, and here is one
> example usage:
> >
> > res = self._callback(*self._args, **self._kwargs)
> >
> > Ok, so now you need to know what self._callback is
> doing, so now you have to trace through all callers of
> Finalize are passing in for their args.
> >
> > So what seems like a trivial matter--switching over a
> list to a deque--actually requires a lot of thinking.
> >
> > It turns out that the callback for PipeListener just
> iterates through the remaining handles and closes them. So
> a deque would not break that code.
> >
> > If you look at difflib.py, it also does pop(0) in a
> loop. Why doesn't it use a deque? Simplicity, maybe?
> >
> > codecs.py also deletes from the top of the list:
> >
> > del self.linebuffer[0]
>
> Yes, but in either of these cases is there an excellent
> performance
> improvement to be gained and is it worth the complexity of
> your
> optimization? I think not.
>
I am starting to think that the optimization would be drowned out by the cost
of processing each line, unless you had some combination of the following:
1) a pretty large list (plausible)
2) a very inexpensive operation that you were applying to each line (rare)
3) a really slow memmove implementation (extremely doubtful)
The complexity of the optimization does not phase me for some reason. If
ruthless simplicity were the only goal, then I'd also simplify/remove some of
this code:
/* Bypass realloc() when a previous overallocation is large enough
to accommodate the newsize. If the newsize falls lower than half
the allocated size, then proceed with the realloc() to shrink the list.
*/
if (allocated >= newsize && newsize >= (allocated >> 1)) {
assert(self->ob_item != NULL || newsize == 0);
Py_SIZE(self) = newsize;
return 0;
}
/* This over-allocates proportional to the list size, making room
* for additional growth. The over-allocation is mild, but is
* enough to give linear-time amortized behavior over a long
* sequence of appends() in the presence of a poorly-performing
* system realloc().
* The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
*/
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
In my mind, though, the complexity within CPython does not have to leak up to
the Python level.
_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com