On Mon, Feb 17, 2014 at 9:18 AM, Francesc Alted <franc...@continuum.io> wrote: > On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: >> On Sun, Feb 16, 2014 at 6:12 PM, Daπid <davidmen...@gmail.com> wrote: >>> On 16 February 2014 23:43, <josef.p...@gmail.com> wrote: >>>> What's the fastest argsort for a 1d array with around 28 Million >>>> elements, roughly uniformly distributed, random order? >>> >>> On numpy latest version: >>> >>> for kind in ['quicksort', 'mergesort', 'heapsort']: >>> print kind >>> %timeit np.sort(data, kind=kind) >>> %timeit np.argsort(data, kind=kind) >>> >>> >>> quicksort >>> 1 loops, best of 3: 3.55 s per loop >>> 1 loops, best of 3: 10.3 s per loop >>> mergesort >>> 1 loops, best of 3: 4.84 s per loop >>> 1 loops, best of 3: 9.49 s per loop >>> heapsort >>> 1 loops, best of 3: 12.1 s per loop >>> 1 loops, best of 3: 39.3 s per loop >>> >>> >>> It looks quicksort is quicker sorting, but mergesort is marginally faster >>> sorting args. The diference is slim, but upon repetition, it remains >>> significant. >>> >>> Why is that? Probably part of the reason is what Eelco said, and part is >>> that in sort comparison are done accessing the array elements directly, but >>> in argsort you have to index the array, introducing some overhead. >> Thanks, both. >> >> I also gain a second with mergesort. >> >> matlab would be nicer in my case, it returns both. >> I still need to use the argsort to index into the array to also get >> the sorted array. > > Many years ago I needed something similar, so I made some functions for > sorting and argsorting in one single shot. Maybe you want to reuse > them. Here it is an example of the C implementation: > > https://github.com/PyTables/PyTables/blob/develop/src/idx-opt.c#L619 > > and here the Cython wrapper for all of them: > > https://github.com/PyTables/PyTables/blob/develop/tables/indexesextension.pyx#L129
Thanks Francesc That would be very useful to have. However, I don't speak C, and it would require too much maintenance time if I were to include that in statsmodels. I'm trying to concentrate more on stats (and my backlog there), and leave other parts to developers that know those better. Josef > > Francesc > >> >> Josef >> >> >>> I seem unable to find the code for ndarray.sort, so I can't check. I have >>> tried to grep it tring all possible combinations of "def ndarray", >>> "self.sort", etc. Where is it? >>> >>> >>> /David. >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion@scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -- > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion