On 9/1/2010 9:08 PM, Dmitry Chichkov wrote:
Given: a large list (10,000,000) of floating point numbers;
Task: fastest python code that finds k (small, e.g. 10) smallest
items, preferably with item indexes;
Limitations: in python, using only standard libraries (numpy& scipy
is Ok);
I've tried several methods. With N = 10,000,000, K = 10 The fastest so
far (without item indexes) was pure python implementation
nsmallest_slott_bisect (using bisect/insert). And with indexes
nargsmallest_numpy_argmin (argmin() in the numpy array k times).
Anyone up to the challenge beating my code with some clever selection
algorithm?
Current Table:
1.66864395142 mins_heapq(items, n):
0.946580886841 nsmallest_slott_bisect(items, n):
1.38014793396 nargsmallest(items, n):
10.0732769966 sorted(items)[:n]:
3.17916202545 nargsmallest_numpy_argsort(items, n):
1.31794500351 nargsmallest_numpy_argmin(items, n):
2.37499308586 nargsmallest_numpy_array_argsort(items, n):
0.524670124054 nargsmallest_numpy_array_argmin(items, n):
0.0525538921356 numpy argmin(items): 1892997
0.364673852921 min(items): 10.0000026786
Your problem is underspecified;-).
Detailed timing comparisons are only valid for a particular Python
version running under a particular OS on particular hardware. So, to
actually run a contest, you would have to specify a version and OS and
offer to run entries on your machine, with as much else as possible
turned off, or else enlist a neutral judge to do so. And the timing
method should also be specified.
--
Terry Jan Reedy
--
http://mail.python.org/mailman/listinfo/python-list