Vladimir, > If we consider case when the whole array is sorted, > we can omit setting sentinel and restoring original value:
Let me define my point more precisely. We can omit setting sentinel not only in a particular case when the whole array is sorted. During partitioning the whole array is split into small chunks. Elements within a chunks remains unsorted, but every element in a certain chunk is greater than every element in preceding (lefter) chunks and less than every element in following (righter) chunks. E.g. [...A...B...]C[...D...E...]F[...G...H...] Elements C and F are pivots. Elements D and E are greater that pivot C and hence greater than A and B, and less than pivot F and hence G and H. Therefore during sorting any non-leftmost chunk we can rely on preceding chunk. Did I miss something? Is this what you mean? Thank you, Dmytro Sheyko > Date: Wed, 5 May 2010 21:23:37 +0400 > From: [email protected] > Subject: Re: New portion of improvements for Dual-Pivot Quicksort > To: [email protected] > CC: [email protected] > > Hi Dmytro, > > Thank you very much for suggestions! I checked it and here > is my results: > > If we consider case when the whole array is sorted, > we can omit setting sentinel and restoring original value: > > // int before = a[left - 1]; a[left - 1] = Integer.MIN_VALUE; > > for (int j, i = left + 1; i <= right; i++) { > int ai = a[i]; > for (j = i - 1; /* j >= left && */ ai < a[j]; j--) { > a[j + 1] = a[j]; > } > a[j + 1] = ai; > } > // a[left - 1] = before; > > I checked this version and found that it works little bit faster > (only 0.4-0.3%) with client VM, but loses more than 2.5% under > server mode in compare with one-pivot version from JDK 6: > > client server > original: 60.75% 48.20% > no setting sentinel: 60.40% 50.88% > > If we add additional check which case is sorted, I think, we > don't win at all. > > And if pivot candidates are not put back to array, it shows > very poor results: > > client server > original: 60.75% 48.20% > no back candidates: 66.80% 53.95% > > I run one pivot version taken from JDK 6: I replace in Dual-Pivot > implementation with all optimizations dual-pivot partitioning by one-pivot: > but it remain still slower than DPQ. So, the main point of DPQ is not only > in optimizations, but in dividing array into 3 parts instead of two. > > Thank you, > Vladimir > > Dmytro Sheyko wrote: > > Hi Vladimir, > > > > The trick with sentinel is quite cute. Going farther, I think it is not > > always necessary to replace left outer element with the least possible value > > (and restore it later after sorting) because after partitioning every > > element in adjoining array part can play the role of sentinel. > > The only exception is when the user requested to sort array partially > > (not from the beginning). Thereby we should care about setting sentinel > > explicitly in this exceptional case only and only once before sorting whole > > array. > > > > Also it seems to me that it is not necessary to put pivot candidates (5 > > elements that are used to choose pivots) back to array > > because anyway they are to be sorted later and likely change their > > positions. > > > > I am also interesting in theoretical rationale why dual pivot quicksort > > is better than single pivot one. The last document that I have seen > > (Last updated: September 22, 2009) > > compared classic quicksort (where pivot is chosen arbitrarily) with > > "classic" dual pivot quicksort (where pivots are also chosen arbitrarily). > > As conclusion they both perform about 2*n*ln(n) key comparisons. However > > jdk had improved quicksort: median-of-three and pseudomedian-of-nine > > approaches were used. And median-of-three approach lead to 12/7*n*ln(n) > > key comparisons. On the other hand, dual pivot quicksort is also improved: > > pivots are chosen from 5 candidates, and hence it must perform less than > > 2*n*ln(n) key comparisons. > > > > Regards, > > Dmytro Sheyko > > > > > Date: Tue, 27 Apr 2010 01:50:08 +0400 > > > Subject: New portion of improvements for Dual-Pivot Quicksort > > > From: [email protected] > > > To: [email protected] > > > > > > Hello, everyone! > > > > > > I've investigated the implementation of the Dual-Pivot Quicksort > > > which is used for sorting primitives and here is my result: > > > > > > http://cr.openjdk.java.net/~alanb/6947216/webrev.00 > > > > > > New implementation of Dual-Pivot Quicksort is faster > > > than previous one of 12% for client VM and few percents for > > > server VM. Tests on Bentley's test suite (Win XP, JDK 7, > > > build 1.7.0-ea-b84, n = 1000000) show geometry mean 0.88 > > > for client VM and 0.98-1.00 for server VM. > > > > > > In compare with sorting from JDK 6 by Jon L. Bentley and > > > M. Douglas McIlroy's (with one pivot) the ratio is 0.61 / 0.50 > > > (client / server). > > > > > > See the execution time for sorting array of 2`000`000 int elements > > > 50 times, client / server VM, in milliseconds: > > > > > > random > > > new: 16723 18776 > > > jdk7: 17571 18975 > > > jdk6: 22241 26524 > > > > > > ascendant > > > new: 3541 4702 > > > jdk7: 4486 4669 > > > jdk6: 8667 7978 > > > > > > descendant > > > new: 3854 4907 > > > jdk7: 4942 5034 > > > jdk6: 8787 8243 > > > > > > equal > > > new: 234 281 > > > jdk7: 291 230 > > > jdk6: 602 1018 > > > > > > organ pipes > > > new: 7673 8613 > > > jdk7: 8167 8993 > > > jdk6: 11902 14044 > > > > > > stagger 1 > > > new: 7648 8591 > > > jdk7: 8161 8979 > > > jdk6: 11908 13810 > > > > > > stagger 2 > > > new: 8349 9299 > > > jdk7: 10968 11916 > > > jdk6: 12194 14734 > > > > > > stagger 4 > > > new: 8475 9622 > > > jdk7: 9221 9682 > > > jdk6: 10523 12006 > > > > > > stagger 8 > > > new: 9321 10689 > > > jdk7: 11125 12387 > > > jdk6: 13829 16214 > > > > > > period 1..2 > > > new: 758 751 > > > jdk7: 870 754 > > > jdk6: 1038 1227 > > > > > > period 1..4 > > > new: 1004 963 > > > jdk7: 1365 1209 > > > jdk6: 1511 1847 > > > > > > period 1..8 > > > new: 1588 1573 > > > jdk7: 1599 1790 > > > jdk6: 2602 3045 > > > > > > random 1..2 > > > new: 1327 1125 > > > jdk7: 1362 1496 > > > jdk6: 1531 2182 > > > > > > random 1..4 > > > new: 1830 2118 > > > jdk7: 1851 2236 > > > jdk6: 2292 3025 > > > > > > where stagger(m) is array like a[i] = i * (m + 1) % length. > > > > > > The summary of changes is: > > > > > > 1. For sorting small arrays is used insertion sort with sentinel > > > instead of traditional, which has the structure: > > > > > > for (int i = left + 1; i <= right; i++) { > > > for (j = i; j > left && a[j-1] > a[j]; j--) { > > > swap(a[i], a[j-1]); > > > } > > > } > > > > > > Note that range check j > left is performed on each iteration, > > > but really helps very rare. To avoid this expensive range check, > > > it was suggested to set minimum value (negative infinity) on the > > > first position. This type of suggestion is used in new version: > > > > > > if left bound > 0, we can put sentinel on a[left - 1], do insertion > > > sort without expensive check range, and then restore a[left - 1] > > > value. If left == 0, traditional insertion sort is used. Please, > > > look at the webrev for details. > > > > > > 2. In previous implementation 5 evenly spaced elements > > > > > > sixth = length / 6; > > > a[sixth], a[2 * sixth], a[3 * sixth], a[4 * sixth], a[5 * sixth] > > > > > > were used as candidates of pivots elements. This case is very > > > sensitive for period inputs, especially by 6. The new suggestion > > > is to take 5 center evenly spaced elements like this: > > > > > > int seventh = length / 7; > > > int midpoint = (left + right) >>> 1; > > > > > > a[midpoint - 2 * seventh], a[midpoint - seventh], a[midpoint], > > > a[midpoint + seventh], a[midpoint + 2 * seventh] > > > > > > and moreover, the seventh is calculated inexpensively: > > > seventh = (length >>> 3) + (length >>> 6) + 1; > > > > > > This schema works the same on random, ascendant, descendant, equal > > > inputs, but much better for period / stagger. > > > > > > 3. The whole structure > > > > > > <choose pivots> > > > > > > if (pivotsDiffer) { > > > <do partitioning for two pivots> > > > } > > > else { > > > <do partitioning for one pivot> > > > } > > > > > > <sort left and right parts> > > > > > > if (!pivotsDiffer) { > > > return; > > > } > > > <swap internal pivot values to ends> > > > > > > <sort center part> > > > > > > was modified to: > > > ---------------- > > > > > > <choose pivots> > > > > > > if (pivot1 < pivot2) { > > > <do partitioning for two pivots> > > > <swap pivots into their final positions> > > > <sort left and right parts> > > > <swap internal pivot values to ends> > > > <sort center part> > > > } > > > else { > > > <do partitioning for one pivot> > > > <sort left and right parts> > > > } > > > > > > 4. Partitioning for both cases have not been changed at all. > > > > > > 5. Minor javadoc and format changes. > > > > > > Please, review new implementation, > > > any comments / suggestions are welcome! > > > > > > Thank you, > > > Vladimir _________________________________________________________________ Hotmail: Powerful Free email with security by Microsoft. https://signup.live.com/signup.aspx?id=60969
