Hi, I found the 2009 discussion and work surrounding dual-pivot quicksort fascinating. At that time, 3-pivot quicksort was described as slower; however, since then a new 3-pivot partition algorithm was published and if you want the fastest sorting time for primitives then the results are again very interesting.
For an array with 10 million elements number of swaps switching to insertion sort at 47 and then not counting swaps: ~74 million 3-pivot quicksort ~74 million dual-pivot quicksort Yet the 3-pivot is 10-15% faster - cache misses are fewer and so are comparisons. Unfortunately as we follow this path some of the original elegance of the algorithm is lost so I pasted the partition algorithm below for reference. If you need faster sorting of primitives then run the code: https://github.com/dmcmanam/quicksort/blob/master/src/Quicksort.java And just uncomment the Arrays.sort line to compare against the current dual-pivot you are using. --David // Pointers a and b initially point to the first element of the array while c // and d initially point to the last element of the array. int a = lo + 2; int b = lo + 2; int c = hi - 1; int d = hi - 1; while (b <= c) { while (A[b] < q && b <= c) { if (A[b] < p) { swap(A, a, b); a++; } b++; } while (A[c] > q && b <= c) { if (A[c] > r) { swap(A, c, d); d--; } c--; } if (b <= c) { if (A[b] > r) { if (A[c] < p) { swap(A, b, a); swap(A, a, c); a++; } else { swap(A, b, c); } swap(A, c, d); b++; c--; d--; } else { if (A[c] < p) { swap(A, b, a); swap(A, a, c); a++; } else { swap(A, b, c); } b++; c--; } } } // swap the pivots to their correct positions a--; b--; c++; d++; swap(A, lo + 1, a); swap(A, a, b); a--; swap(A, lo, a); swap(A, hi, d);