[issue6614] heapq.nsmallest and nlargest should be smarter/more usable/more consistent

2009-07-31 Thread Joshua Bronson
New submission from Joshua Bronson jabron...@gmail.com: From http://docs.python.org/library/heapq.html: The latter two functions (nlargest and nsmallest) perform best for smaller values of n. For larger values, it is more efficient to use the sorted() function. Also, when n==1, it is more
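
As a rough illustration of the relationship the quoted documentation describes, the sketch below compares the heap-based helpers with their sorted() and min()/max() counterparts. The sample list and variable names are made up for this example; it shows only that the results agree, not which spelling is faster.

```python
import heapq

data = [7, 2, 9, 4, 1, 8]  # made-up sample values

# For small n, the heap-based helpers return the same items as a full sort.
smallest_two = heapq.nsmallest(2, data)
largest_two = heapq.nlargest(2, data)

# The sorted()-based spellings the documentation suggests for larger n.
smallest_two_sorted = sorted(data)[:2]
largest_two_sorted = sorted(data, reverse=True)[:2]

# For n == 1, the builtins return the single element directly.
smallest_one = min(data)
largest_one = max(data)
```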

2009-07-31 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: FWIW, 2.7 and 3.1 already have automatic selection of sort()/min()/max() alternatives. They use pure python to dispatch to the underlying C functions:
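
A minimal sketch of what such a pure Python dispatch layer could look like is given below. The function name nsmallest_dispatch is hypothetical and the thresholds are assumptions chosen for illustration; this is not the code shipped in 2.7 or 3.1, which is not reproduced in this thread listing.

```python
import heapq
from itertools import chain, islice


def nsmallest_dispatch(n, iterable):
    """Hypothetical wrapper: pick min(), sorted(), or the heap routine."""
    if n == 1:
        # Peek at one item so an empty input yields [] instead of raising.
        it = iter(iterable)
        head = list(islice(it, 1))
        if not head:
            return []
        return [min(chain(head, it))]

    # Only sized containers can be compared against n up front.
    try:
        size = len(iterable)
    except TypeError:
        size = None

    if size is not None and n >= size:
        # Asking for everything (or more): a full sort does the job.
        return sorted(iterable)[:n]

    # General case: defer to the heap-based implementation.
    return heapq.nsmallest(n, iterable)
```

The caller still sees a single function; the choice of strategy stays an internal detail of the wrapper.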

2009-07-31 Thread Joshua Bronson
Joshua Bronson jabron...@gmail.com added the comment: Oh, that's great! (I also noticed that the previously inutile line _heappushpop = heappushpop is now doing something in the heapq.py you linked to, nice.) It looks like the docs haven't been updated yet though. For instance,

2009-07-31 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: I prefer the docs the way they are. They help the reader understand the relationship between min, max, nsmallest, nlargest, and sorted. I'm not sure where you got the n * 10 <= len(iterable) switch-over point. That is

2009-07-31 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: One other thought: When memory is tight, the programmer needs to be able to select the heap algorithm in favor of sorted() even for relatively large values of n. I do not want an automatic switchover point that takes away a
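
The memory argument can be sketched as follows, assuming the data arrives from a generator. The readings() source and its arithmetic are invented for this example. In CPython the heap-based call keeps roughly n items at a time, whereas sorted() has to materialize the whole input before slicing; both produce the same answer.

```python
import heapq


def readings():
    """Hypothetical data source that yields values one at a time."""
    for i in range(100000):
        yield (i * 7919) % 100003


# Heap-based selection: consumes the generator while holding about n items.
top_heap = heapq.nlargest(1000, readings())

# sorted()-based selection: builds a list of all 100000 items first.
top_sorted = sorted(readings(), reverse=True)[:1000]
```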

2009-07-31 Thread Joshua Bronson
Joshua Bronson jabron...@gmail.com added the comment: "I prefer the docs the way they are. They help the reader understand the relationship between min, max, nsmallest, nlargest, and sorted." Except that it's no longer true that when n==1, it is more efficient to use the builtin min() and

2009-07-31 Thread Joshua Bronson
Joshua Bronson jabron...@gmail.com added the comment: One more thing: "I prefer the docs the way they are. They help the reader understand the relationship between min, max, nsmallest, nlargest, and sorted." The docs still use the unspecific language for smaller values of n and for larger

2009-07-31 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: "Except that it's no longer true that when n==1, it is more efficient to use the builtin min() and max() functions." There's still the dispatch overhead. If someone needs an n==1 case, they *should* use min/max for both speed

2009-07-31 Thread Joshua Bronson
Joshua Bronson jabron...@gmail.com added the comment: "That is in the pure python version of nsmallest() and that code is not used (it is overridden by the C version)." So just because it isn't used by CPython it should remain in there even though as you said yourself it's completely without

2009-07-31 Thread Raymond Hettinger
Raymond Hettinger rhettin...@users.sourceforge.net added the comment: There is a basis for the pure python version to switch to bisect. There is not a basis for having the final wrapped C function switch to using sorted(). That is a programmer decision. The pure python version is there to
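
For readers unfamiliar with the bisect approach mentioned above, here is a minimal sketch of selecting the n smallest items by keeping a sorted window of size n. The name nsmallest_bisect is hypothetical, and the code is written for this example in the spirit of that technique; it is not a copy of the standard library's pure Python implementation.

```python
from bisect import insort
from itertools import islice


def nsmallest_bisect(n, iterable):
    """Hypothetical helper: n smallest items via a sorted window of size n."""
    if n <= 0:
        return []
    it = iter(iterable)
    # Seed the window with the first n items, in sorted order.
    result = sorted(islice(it, n))
    if not result:
        return result
    largest_kept = result[-1]
    for elem in it:
        # Anything not smaller than the current cutoff cannot qualify.
        if elem >= largest_kept:
            continue
        # Insert in order, then drop the item that no longer fits.
        insort(result, elem)
        result.pop()
        largest_kept = result[-1]
    return result
```

The window never grows beyond n items, which is why this variant suits small n relative to the input size.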