Sorry :( . Please ignore previous post. Warming-up yield to better results in dual-pivot's favor.
On Sat, Sep 12, 2009 at 2:43 PM, Goktug Gokdogan <gokdo...@gmail.com> wrote: > My absolutely non-scientific preliminary tests indicate Arrays.sort > performs better with the same input (5000 items) in nearly all runs > (-client). > public static void main(String[] args) { > final int n = 5000; > int[] array = new int[n]; > int[] array2 = new int[n]; > Random random = new Random(); > for (int i = 0; i < n; i++) { > int r = random.nextInt(n); > array[i] = r; > array2[i] = r; > } > > long st1 = System.nanoTime(); > Arrays.sort(array2); > long st2 = System.nanoTime(); > DualPivotQuicksort.sort(array); > long end = System.nanoTime(); > System.out.println(st2 - st1); > System.out.println(end - st2); > } > > I tried changing which sort runs first, but the results did not change. > Over 100.000 items, dual-pivot consistently beats Arrays.sort. > > Well, this test is not a good reference though :) > > On Fri, Sep 11, 2009 at 5:27 PM, Vladimir Iaroslavski <iaroslav...@mail.ru > > wrote: > >> Hi, >> >> I've tried to use local variable int ak = a[k] in the loop >> and not found saving of time. There are 3 occurrences of a[k] >> in each loop, but only two can be changed because third >> comes after line: swap(a, k, great--); >> >> As summary: I'm changing only hardcoded constant by >> private static final variables. >> >> Thank you, >> Vladimir >> >> >> Ulf Zibis wrote: >> >>> Am 11.09.2009 15:32, Rémi Forax schrieb: >>> >>>> just my two cents : >>>> In the loop tagged "sorting" and "equals element", >>>> a[k] can be stored in a local variable to avoid to recompute it several >>>> time. >>>> >>> >>> I add 2 cents more: >>> Doesn't HotSpot see this, and optimize accordingly? >>> IMO: It's time that javac should do such optimization, as there are less >>> optimized VM's in the world. >>> >>> -Ulf >>> >>> >>> >>>> The algorithm use two constants: 37 and 13, >>>> I think they should be declared as private static final. >>>> >>>> Rémi >>>> >>>> >>>> Le 11/09/2009 12:35, Vladimir Yaroslavskiy a écrit : >>>> >>>>> Hello All, >>>>> >>>>> I'd like to share with you new Dual-Pivot Quicksort which is >>>>> faster than the known implementations (theoretically and >>>>> experimental). I'd like to propose to replace the JDK's >>>>> Quicksort implementation by new one. >>>>> >>>>> Description >>>>> ----------- >>>>> The classical Quicksort algorithm uses the following scheme: >>>>> >>>>> 1. Pick an element P, called a pivot, from the array. >>>>> 2. Reorder the array so that all elements, which are less than >>>>> the pivot, come before the pivot and all elements greater than >>>>> the pivot come after it (equal values can go either way). After >>>>> this partitioning, the pivot element is in its final position. >>>>> 3. Recursively sort the sub-array of lesser elements and the >>>>> sub-array of greater elements. >>>>> >>>>> The invariant of classical Quicksort is: >>>>> >>>>> [ <= p | >= p ] >>>>> >>>>> There are several modifications of the schema: >>>>> >>>>> [ < p | = p | > p ] or [ = p | < p | > p | = p ] >>>>> >>>>> But all of them use *one* pivot. >>>>> >>>>> The new Dual-Pivot Quicksort uses *two* pivots elements in this manner: >>>>> >>>>> 1. Pick an elements P1, P2, called pivots from the array. >>>>> 2. Assume that P1 <= P2, otherwise swap it. >>>>> 3. Reorder the array into three parts: those less than the smaller >>>>> pivot, those larger than the larger pivot, and in between are >>>>> those elements between (or equal to) the two pivots. >>>>> 4. Recursively sort the sub-arrays. >>>>> >>>>> The invariant of the Dual-Pivot Quicksort is: >>>>> >>>>> [ < P1 | P1 <= & <= P2 } > P2 ] >>>>> >>>>> The new Quicksort is faster than current implementation of Quicksort >>>>> in JDK (L. Bentley and M. Douglas McIlroy) and classical Quicksort. >>>>> >>>>> The full description of the Dual-Pivot Quicksort you can find >>>>> on my page: http://iaroslavski.narod.ru/quicksort >>>>> >>>>> Performance tests >>>>> ----------------- >>>>> Here is result of running on different types of input data: >>>>> >>>>> Client VM all 85% organ 0..1 >>>>> 0..4 >>>>> random ascend descend equal equal pipes random 010101 >>>>> random >>>>> Dual-Pivot: 16.83 5.31 5.47 0.35 0.68 10.59 1.06 1.02 >>>>> 2.18 >>>>> Bentley's: 19.77 9.08 10.13 0.63 1.12 13.22 1.63 1.08 >>>>> 2.49 >>>>> >>>>> Server VM all 85% organ 0..1 >>>>> 0..4 >>>>> random ascend descend equal equal pipes random 010101 >>>>> random >>>>> Dual-Pivot: 23.94 6.68 6.63 0.43 0.62 17.14 1.42 1.96 >>>>> 3.41 >>>>> Bentley's: 25.20 10.18 10.32 2.07 1.33 16.72 2.95 1.82 >>>>> 3.39 >>>>> >>>>> The a lot of other tests have been run under client and server mode. >>>>> The most interesting is BentleyBasher test framework. It runs battery >>>>> of tests for all cases: >>>>> >>>>> { 100, 1000, 10000, 1000000 } x >>>>> { sawtooth, rand, stagger, plateau, shuffle } x >>>>> { ident, reverse, reverse_front, reverse_back, sort, dither} >>>>> >>>>> where >>>>> >>>>> 100, ... , 1000000 - array length >>>>> >>>>> sawtooth: x[i] =i%m >>>>> rand: x[i] = rand() % m >>>>> stagger: x[i] = (i*m + i) % n >>>>> plateau: x[i] = min(i, m) >>>>> shuffle: x[i] = rand()%m? (j+=2): (k+=2) >>>>> >>>>> ident(x) - a copy of x >>>>> reverse(x, 0, n) - reversed copy >>>>> reverse_front(x, 0, n/2) - front half reversed >>>>> reverse_back(x, n/2, n) - back half reversed >>>>> sort(x) - an ordered copy >>>>> dither(x) - add i%5 to x[i] >>>>> >>>>> Here is the result of execution: >>>>> Server VM: >>>>> http://spreadsheets.google.com/pub?key=t_EAWUkQ4mD3BIbOv8Fa-AQ&output=html >>>>> Client VM: >>>>> http://spreadsheets.google.com/pub?key=tdiMo8xleTxd23nKUObcz0Q&single=true&gid=0&output=html >>>>> >>>>> Mathematical investigations >>>>> --------------------------- >>>>> It is proved that for the Dual-Pivot Quicksort the average number of >>>>> comparisons is 2*n*ln(n), the average number of swaps is 0.8*n*ln(n), >>>>> whereas classical Quicksort algorithm has 2*n*ln(n) and 1*n*ln(n) >>>>> respectively. Full mathematical proof see in attached proof.txt >>>>> and proof_add.txt files. Theoretical results are also confirmed >>>>> by experimental counting of the operations. >>>>> >>>>> Diff between current and new implementation of Quicksort >>>>> -------------------------------------------------------- >>>>> >>>>> Here is the link to the diff for java.util.Arrays class: >>>>> http://cr.openjdk.java.net/~alanb/6880672/webrev.00 >>>>> >>>>> If you like to look and play with new algorithm, >>>>> please, take attached class DualPivotQuicksort.java >>>>> >>>>> Feedback >>>>> -------- >>>>> >>>>> Also I'd like to share a feedback from Joshua Bloch and >>>>> Jon Bentley who spent a lot of time investigating this >>>>> algorithm, who gave me many advices and tips how to >>>>> make new Quicksort better. >>>>> >>>>> -------- Original Message -------- >>>>> Subject: Re: Integration of new Dual-Pivot Quicksort into JDK 7 >>>>> Date: Thu, 10 Sep 2009 07:20:11 -0700 >>>>> From: Joshua Bloch <j...@google.com> >>>>> >>>>> Jon also says that Vladimir should make every reasonable improvement to >>>>> the basic method before checking in the code. In his words, "It would >>>>> be >>>>> horrible to put the new code into the library, and then have someone >>>>> else come along and speed it up by another 20% by using standard >>>>> techniques." I believe it's not unlikely that this code may end up >>>>> getting ported to many languages and widely deployed in much the manner >>>>> of Bentley and McIlroy's fine sort (which is nearing 20 successful >>>>> years >>>>> in the field). Jon will help Vladimir do this. >>>>> >>>>> -------- Original Message -------- >>>>> Subject: Dual-Pivot Quicksort: Next Steps >>>>> Date: Wed, 09 Sep 2009 15:02:25 -0400 >>>>> From: Jon Bentley <jbent...@avaya.com> >>>>> >>>>> Vladimir, Josh, >>>>> I *finally* feel like I understand what is going on. Now that I >>>>> (think that) I see it, it seems straightforward and obvious. >>>>> Tony Hoare developed Quicksort in the early 1960s. I was very >>>>> proud to make minor contributions to a particularly clean (binary) >>>>> quicksort in the mid 1980s, to a relatively straightforward, industrial >>>>> strength Quicksort with McIlroy in the early 1990s, and then to >>>>> algorithms and data structures for strings with Sedgewick in the mid >>>>> 1990s. >>>>> I think that Vladimir's contributions to Quicksort go way beyond >>>>> anything that I've ever done, and rank up there with Hoare's original >>>>> design and Sedgewick's analysis. I feel so privileged to play a very, >>>>> very minor role in helping Vladimir with the most excellent work! >>>>> >>>>> ----------------------------------------------- >>>>> >>>>> Let me know, if you have any questions/comments. >>>>> >>>>> Thank you, >>>>> Vladimir >>>>> >>>> >