Vladimir Yaroslavskiy <vladimir.yaroslavs...@...> writes: > I'd like to share with you new Dual-Pivot Quicksort which is > faster than the known implementations (theoretically and > experimental). I'd like to propose to replace the JDK's > Quicksort implementation by new one.
This is a great idea; as a mathematician I immediately want to generalize it. Unfortunately, I lack the computing skill to test my generalization in practice. Suppose that instead of 2 pivots we have m (randomly chosen) pivots, where m=sqrt(n+1)-1. Step 1: Sort the m pivots, recursively. Step 2: Build a balanced binary tree from the pivots. Step 3: Sort the remaining n-m elements into (m+1) categories, using the BBT. Step 4: Sort each of the (m+1) categories, recursively. Let S(n) denote the number of steps needed for this algorithm. Calculating roughly: Step 1 requires S(m) steps. Step 2 requires m steps (it is easy since the pivots are already sorted). Step 3 requires (n-m)(log m) comparisons. Step 4 requires, on average, (m+1)S((n-m)/(m+1)) steps. m was chosen so that (n-m)/(m+1)=m, so S(n)=(m+2)S(m)+m+(n-m)(log m) <= (m+2) S(m)+ n log m. Crudely, we replace both m and m+2 by sqrt(n), to get as approximation: S(n) <= 0.5 n log n + n^0.5 [ 0.25 n^0.5 log n + n^0.25 [ 0.125 n^0.25 log n +...]]] = 0.5 n log n [1 + 0.5 + 0.25 + ...] = n log n. This appears to be better than 2 n log n. This argument can be made precise (the only really questionable issue is replacing m+2 by m) by first getting a crude but correct upper bound for 2S(m), which gets absorbed into the -mlogm term. Unfortunately, I'm not up to the detailed comparison vs. swap analysis as done by Mr. Yaroslavskiy, but it seems to me that, if nothing else, having an efficient algorithm to partition the non-pivot elements should provide a savings. Vadim Ponomarenko Dept. of Mathematics and Statistics San Diego State University
