New submission from Joshua Bronson jabron...@gmail.com:
From http://docs.python.org/library/heapq.html:
> The latter two functions (nlargest and nsmallest) perform best for
> smaller values of n. For larger values, it is more efficient to use
> the sorted() function. Also, when n==1, it is more efficient to use
> the builtin min() and max() functions.
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
FWIW, 2.7 and 3.1 already have automatic selection of sorted()/min()/max()
alternatives. They use pure Python to dispatch to the underlying C
functions:
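(The dispatch code itself is missing from the archived message. A minimal sketch of the selection logic being described, reconstructed from the discussion rather than taken from the actual heapq.py source, might look like:)

```python
import heapq

def nsmallest_sketch(n, iterable):
    # n == 1: a plain min() scan beats any heap machinery
    if n == 1:
        try:
            return [min(iterable)]
        except ValueError:        # empty iterable
            return []
    # n close to the input size: a full sort is faster
    try:
        size = len(iterable)
    except TypeError:             # generators/iterators have no len()
        size = None
    if size is not None and n >= size:
        return sorted(iterable)[:n]
    # otherwise, fall back to the heap-based algorithm
    return heapq.nsmallest(n, iterable)
```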
Joshua Bronson jabron...@gmail.com added the comment:
Oh, that's great!
(I also noticed that the previously unused line _heappushpop = heappushpop
is now doing something in the heapq.py you linked to, nice.)
It looks like the docs haven't been updated yet though. For instance,
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
I prefer the docs the way they are. They help the reader understand the
relationship between min, max, nsmallest, nlargest, and sorted.
I'm not sure where you got the n * 10 = len(iterable) switch-over
point. That is
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
One other thought: When memory is tight, the programmer needs to be
able to select the heap algorithm in favor of sorted() even for
relatively large values of n. I do not want an automatic switchover
point that takes away a
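(To make the memory point concrete, here is a toy example of my own, not from the thread: the heap algorithm holds at most n items at a time, while sorted() must first materialize the entire input.)

```python
import heapq

# A generator producing a million values. sorted() would have to
# build a million-element list; nsmallest() keeps only 3 items.
values = (i % 1000 for i in range(1_000_000))
smallest = heapq.nsmallest(3, values)
print(smallest)   # [0, 0, 0]
```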
Joshua Bronson jabron...@gmail.com added the comment:
> I prefer the docs the way they are. They help the reader understand
> the relationship between min, max, nsmallest, nlargest, and sorted.
Except that it's no longer true that when n==1, it is more efficient to
use the builtin min() and max() functions.
Joshua Bronson jabron...@gmail.com added the comment:
One more thing:
> I prefer the docs the way they are. They help the reader understand
> the relationship between min, max, nsmallest, nlargest, and sorted.
The docs still use the unspecific language "for smaller values of n" and
"for larger values".
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
> Except that it's no longer true that when n==1, it is
> more efficient to use the builtin min() and max() functions.
There's still the dispatch overhead.
If someone needs an n==1 case, they
*should* use min/max for both speed
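(A rough way to see the dispatch overhead being discussed; this is a hypothetical micro-benchmark, and absolute numbers will vary by machine and Python version:)

```python
import heapq
import timeit

data = list(range(10_000))

# Both do a single O(len) scan when n == 1, but nsmallest(1) pays
# extra pure-Python argument handling before reaching the C code.
t_min = timeit.timeit(lambda: min(data), number=200)
t_nsm = timeit.timeit(lambda: heapq.nsmallest(1, data), number=200)
print(f"min():        {t_min:.4f}s")
print(f"nsmallest(1): {t_nsm:.4f}s")
```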
Joshua Bronson jabron...@gmail.com added the comment:
> That is in the pure python version of nsmallest() and that
> code is not used (it is overridden by the C version).
So just because it isn't used by CPython, it should remain in there even
though, as you said yourself, it's completely without
Raymond Hettinger rhettin...@users.sourceforge.net added the comment:
There is a basis for the pure python version to switch to bisect.
There is not a basis for having the final wrapped C function switch to
using sorted(). That is a programmer decision.
The pure python version is there to
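(For reference, the bisect-based strategy the pure-python version can switch to looks roughly like this; a sketch of the idea rather than the actual heapq.py code: keep a sorted list of the n best items seen so far, and insort each smaller element into it.)

```python
from bisect import insort
from itertools import islice

def nsmallest_bisect(n, iterable):
    it = iter(iterable)
    # Seed with the first n items, kept sorted.
    result = sorted(islice(it, n))
    if not result:
        return result
    top = result[-1]              # current worst of the n best
    for elem in it:
        if elem < top:
            insort(result, elem)  # O(n) insert at the bisected spot
            result.pop()          # discard the displaced worst item
            top = result[-1]
    return result
```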