Re: Quadratic time to sort in a simple case?

Ivan Kazmenko Tue, 23 Apr 2013 14:10:19 -0700

On Monday, 22 April 2013 at 21:34:47 UTC, Andrei Alexandrescuwrote:

a) choose median of five

Sounds like it could decrease speed on average, and still perhapsfail on a simple case like [0, 1, 2, 3, ..., 999, 0, 0] orsomething? Didn't test it though.

b) if the array is large, sort a stride of it first (e.g. 1%)then choose the pivot as the median point of that stride

A bad case would be, e.g., the array containing almost sortedparts, and the stride getting taken from the part where lowest orhighest elements reside.

c) add introspection to the algorithm, i.e. if an attempt topartition hits the worst case or near worst case, just tryanother pivot instead of moving forward with the sorting stage.

Now, trying another pivot again and again could give the sameO(n^2) in a bad case.

One simple solution to this (which I forgot to mention) is
picking a pivot at random.

But it would again mean that the sort function depends on globalstate (RNG state). And if it doesn't (a local RNG is initializedsomehow each time), an adversary would just have to hardcode itonce to get the same O(n^2) worst case anytime he wants.


And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:

I filed a bug report for this issue a year ago:
http://d.puremagic.com/issues/show_bug.cgi?id=7767
I've been meaning to fix this issue myself. Time allowing, I'lldo it soon.


What I wonder now, what would be the goal of such a fix?

One possible goal would be to have an O(n log n) worst case sort.And that would perhaps cost some speed and/or memory on average.Besides, such a sort function is already implemented (TimSort),so it's just a matter of setting the default then.

Another goal would be to have O(n^2) only for cases which don'thave a reasonable chance to occur in real world, but could stillbe constructed artificially by an adversary. And that couldperhaps be done by just changing the getPivot implementation.Other languages' experience may be useful here:


1. GNU C++ 4.x std::sort implementation is IntroSort [1].
2. Microsoft .NET used Quicksort up to version 4.0 [2].
   Now in .NET 4.5 they use Introsort too [3].
3. For primitive types, Java 6 used a tuned QuickSort [4].
   The current Java 7 uses Dual-Pivot QuickSort [5].
   Java uses TimSort for Objects [6].
4. Python uses TimSort since version 2.3 [7].

In any case, there are techniques to construct a bad case given aQuickSort implementation. One of them is described by M. DouglasMcIlroy in "A Killer Adversary for Quicksort" [8]. We runQuickSort on a would-be-array "a" controlling the "a[i] < a[j]"predicate, and, using only certain assumptions (not looking atthe code!), we build a killer-case array on the fly. Given somethought, the technique could perhaps be extended to Dual-PivotQuickSort too. As long as some flavor of QuickSort (and topN) isin Phobos, such a unittest would fit the sort implementationnicely, if only to show just how "bad" it can get.


Ivan Kazmenko.

-----

References:

[1]http://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.4/a01027.html#g152148508b4a39e15ffbfbc987ab653a[2]http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.100%29.aspx[3]http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.110%29.aspx[4]http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#sort%28int[]%29[5]http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28int[]%29[6]http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28java.lang.Object[]%29

[7] http://en.wikipedia.org/wiki/Timsort
[8] http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

Re: Quadratic time to sort in a simple case?

Reply via email to