On Fri, Jan 16, 2015 at 3:33 AM, Lars Buitinck <[email protected]> wrote:

> 2015-01-16 11:55 GMT+01:00  <[email protected]>:
> > Message: 2
> > Date: Thu, 15 Jan 2015 21:24:00 -0800
> > From: Jaime Fern?ndez del R?o <[email protected]>
> > Subject: [Numpy-discussion] Sorting refactor
> > To: Discussion of Numerical Python <[email protected]>
> > Message-ID:
> >         <
> capowhwkf6rnwcrgmcwsmq_lo3hshjgbvlsrn19z-mdpe25e...@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > This changes will make it easier for me to add a Timsort generic type
> > function to numpy's arsenal of sorting routines. And I think they have
> > value by cleaning the source code on their own.
>
> Yes, they do. I've been looking at the sorting functions as well and
> I've found the following:
>
> * The code is generally hard to read because it prefers pointer over
> indices. I'm wondering if it would get slower using indices. The
> closer these algorithms are to the textbook, the easier to insert
> fancy optimizations.
>

They are harder to read, but so cute to look at! C code just wouldn't feel
the same without some magical pointer arithmetic thrown in here and there.
;-)


> * The heap sort exploits undefined behavior by using a pointer that
> points before the start of the array. However, rewriting it to always
> point within the array made it slower. I haven't tried rewriting it
> using indices.
>
> * Quicksort has a quadratic time worst case. I think it should be
> turned into an introsort [1] for O(n log n) worst case; we have the
> heapsort needed to do that.
>
> * Quicksort is robust to repeated elements, but doesn't exploit them.
> It can be made to run in linear time if the input array has only O(1)
> distinct elements [2]. This may come at the expense of some
> performance on arrays with no repeated elements.
>

Java famously changed its library implementation of quicksort to a dual
pivot one invented by Vladimir Yaroslavskiy[1], they claim that with
substantial performance gains. I tried to implement that for numpy [2], but
couldn't get it to work any faster than the current code.

* Using optimal sorting networks instead of insertion sort as the base
> case can speed up quicksort on float arrays by 5-10%, but only if NaNs
> are moved out of the way first so that comparisons become cheaper [3].
> This has consequences for the selection algorithms that I haven't
> fully worked out yet.
>

Even if we stick with selection sort, we should spin it off into an inline
smallsort function within the npy_sort library, and have quicksort and
mergesort call the same function, instead of each implementing their own.
It would make optimizations like the sorting networks easier to implement
for all sorts. We could even expose it outside npy_sort, as there are a few
places around the code base that have ad-hoc implementations of sorting.


> * Using Cilk Plus to parallelize merge sort can make it significantly
> faster than quicksort at the expense of only a few lines of code, but
> I haven't checked whether Cilk Plus plays nicely with multiprocessing
> and other parallelism options (remember the trouble with OpenMP-ified
> OpenBLAS?).
>
> This isn't really an answer to your questions, more like a brain dump
> from someone who's stared at the same code for a while and did some
> experiments. I'm not saying we should implement all of this, but keep
> in mind that there are some interesting options besides implementing
> timsort.
>

Timsort came up in a discussion several months ago, where I proposed adding
a mergesorted function (which I have mostly ready, by the way, [3]) to
speed-up some operations in arraysetops. I have serious doubts that it will
perform comparably to the other sorts unless comparisons are terribly
expensive, which they typically aren't in numpy, but it has been an
interesting learning exercise so far, and I don't mind taking it all the
way.

Most of my proposed original changes do not affect the core sorting
functionality, just the infrastructure around it. But if we agree that
sorting has potential for being an actively developed part of the code
base, then cleaning up its surroundings for clarity makes sense, so I'm
taking your brain dump as an aye for my proposal. ;-)

Jaime

[1] http://iaroslavski.narod.ru/quicksort/DualPivotQuicksort.pdf
[2]
https://github.com/jaimefrio/numpy/commit/a99dd77cda7c4c0f2df1fb17a59c20e19999cd86
[3]
https://github.com/jaimefrio/numpy/commit/2f53c99e7ec6d14fd77a29f9d3c1712d5b955079


>
> [1] https://en.wikipedia.org/wiki/Introsort
> [2] http://www.sorting-algorithms.com/static/QuicksortIsOptimal.pdf
> [3] https://github.com/larsmans/numpy/tree/sorting-nets
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>



-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to