Re: [Numpy-discussion] Sorting refactor

Eelco Hoogendoorn Fri, 16 Jan 2015 04:16:30 -0800

I don't know if there is a general consensus or guideline on these matters,
but I am personally not entirely charmed by the use of behind-the-scenes
parallelism, unless explicitly requested.


Perhaps an algorithm can be made faster, but often these multicore
algorithms are also less efficient, and a less data-dependent way of
putting my cores to good use would have been preferable. Potentially, other
code could slow down due to cache trashing if too many parallel tasks run
in parallel. Id rather be in charge of such matters myself; but I imagine
adding a keyword arg for these matters would not be much trouble?

On Fri, Jan 16, 2015 at 12:43 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:

> On 16.01.2015 12:33, Lars Buitinck wrote:
> > 2015-01-16 11:55 GMT+01:00  <numpy-discussion-requ...@scipy.org>:
> >> Message: 2
> >> Date: Thu, 15 Jan 2015 21:24:00 -0800
> >> From: Jaime Fern?ndez del R?o <jaime.f...@gmail.com>
> >> Subject: [Numpy-discussion] Sorting refactor
> >> To: Discussion of Numerical Python <numpy-discussion@scipy.org>
> >> Message-ID:
> >>         <
> capowhwkf6rnwcrgmcwsmq_lo3hshjgbvlsrn19z-mdpe25e...@mail.gmail.com>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> This changes will make it easier for me to add a Timsort generic type
> >> function to numpy's arsenal of sorting routines. And I think they have
> >> value by cleaning the source code on their own.
> >
> > Yes, they do. I've been looking at the sorting functions as well and
> > I've found the following:
> >
> > * The code is generally hard to read because it prefers pointer over
> > indices. I'm wondering if it would get slower using indices. The
> > closer these algorithms are to the textbook, the easier to insert
> > fancy optimizations.
> >
> > * The heap sort exploits undefined behavior by using a pointer that
> > points before the start of the array. However, rewriting it to always
> > point within the array made it slower. I haven't tried rewriting it
> > using indices.
> >
> > * Quicksort has a quadratic time worst case. I think it should be
> > turned into an introsort [1] for O(n log n) worst case; we have the
> > heapsort needed to do that.
>
> This probably rarely happens in numeric data, and we do have guaranteed
> nlog runtime algorithms available.
> But it also is not costly to do, e.g. the selection code is a
> introselect instead of a normal quickselect.
> I'd say not high priority, but if someone wants to do it I don't see why
> not.
>
> >
> > * Quicksort is robust to repeated elements, but doesn't exploit them.
> > It can be made to run in linear time if the input array has only O(1)
> > distinct elements [2]. This may come at the expense of some
> > performance on arrays with no repeated elements.
> >
> > * Using optimal sorting networks instead of insertion sort as the base
> > case can speed up quicksort on float arrays by 5-10%, but only if NaNs
> > are moved out of the way first so that comparisons become cheaper [3].
> > This has consequences for the selection algorithms that I haven't
> > fully worked out yet.
>
> I was also thinking about this, an advantage of a sorting network is
> that it can be vectorized to be significantly faster than an insertion
> sort. Handling NaN's should also be directly possible.
> The issue is that its probably too much complicated code for only a very
> small gain.
>
> >
> > * Using Cilk Plus to parallelize merge sort can make it significantly
> > faster than quicksort at the expense of only a few lines of code, but
> > I haven't checked whether Cilk Plus plays nicely with multiprocessing
> > and other parallelism options (remember the trouble with OpenMP-ified
> > OpenBLAS?).
>
> you should also be able to do this with openmp tasks, though it will be
> a little less efficient as cilk+ has a better scheduler for this type of
> work.
> But I assume you will get the same trouble as openmp but that needs
> testing, also cilk+ in gcc is not really production ready yet, I got
> lots of crashes when I last tried it (it will be in 5.0 though).
>
>
> >
> > This isn't really an answer to your questions, more like a brain dump
> > from someone who's stared at the same code for a while and did some
> > experiments. I'm not saying we should implement all of this, but keep
> > in mind that there are some interesting options besides implementing
> > timsort.
> >
> > [1] https://en.wikipedia.org/wiki/Introsort
> > [2] http://www.sorting-algorithms.com/static/QuicksortIsOptimal.pdf
> > [3] https://github.com/larsmans/numpy/tree/sorting-nets
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Sorting refactor

Reply via email to