Hi Vladimir, One short comment, see below:
Le jeu. 15 nov. 2018 à 15:39, Vladimir Yaroslavskiy <vlv.spb...@mail.ru> a écrit : > Hello, Tagir! > > I compared Radix sort with Dual-Pivot Quicksort and see > that Radix sort is faster than DPQ (on 2M elements) on random data in > twice times, > but it works slower on repeated elements (in 3-7 times). For highly > structured arrays merging sort is invoked in both cases. Other > data types - almost the same. > Your test is not really fair: 2m ints ~ 8mb so it is larger than typical CPU L3 cache size. RadixSort is doing random accesses and is known to be faster when arrays fits in L2/L3 caches. If larger then using radix on sub parts+merge ! > Do we actually need to use RadixSort if it is not faster on all inputs? > I will do my own performance tests on int[] arrays with sizes in range [50 - 100000] ... Vladimir, I think Tageer proposed to tune its radixSort to obtain an hybrid / adaptive radix sort, as you did in your DPQS. I can help here. Cheers, Laurent What do you think? > > Regards, > Vladimir > > Среда, 14 ноября 2018, 19:17 +03:00 от Tagir Valeev <amae...@gmail.com>: > > Hello, Laurent, Vladimir! > > I created a pull request containing my RadixSort implementation: > https://github.com/bourgesl/nearly-optimal-mergesort-code/pull/1 > On my machine the results produced by Mergesorts.java are like this: > > Runs with individual timing (skips first 10 runs): > adjusted reps: 110 + inner loop: 1 > avg-ms=102.6177(+/- 7 %), algo=PeekSort+iscutoff=24+onlyIncRuns=false, > n=1000000 (55000110) (n=100, µ=102.6177, σ=7.4065714) > avg-ms=95.62607(+/- 4 %), > algo=TopDownMergesort+iscutoff=24+checkSorted=true, n=1000000 > (55000110) (n=100, µ=95.62607, σ=3.5302947) > avg-ms=118.73089(+/- 4 %), > algo=BottomUpMergesort+minRunLen=24+checkSorted=true, n=1000000 > (55000110) (n=100, µ=118.73089, σ=4.464908) > avg-ms=108.36175(+/- 4 %), algo=MarlinSort, n=1000000 (55000110) > (n=100, µ=108.36175, σ=4.5998554) > avg-ms=99.68292(+/- 4 %), algo=MarlinMergeSort, n=1000000 > (55000110) (n=100, µ=99.68292, σ=3.9944465) > avg-ms=75.43999(+/- 3 %), algo=DualPivotQuicksort2018, n=1000000 > (55000110) (n=100, µ=75.43999, σ=2.6127808) > avg-ms=83.80406(+/- 6 %), algo=DualPivotQuicksort2018Ext, n=1000000 > (55000110) (n=100, µ=83.80406, σ=4.734225) > avg-ms=18.886326(+/- 4 %), algo=RadixSort, n=1000000 (55000110) > (n=100, µ=18.886326, σ=0.75667334) > > As you can see RadixSort is much faster than anything. Well, probably > there are some input patterns where it may lose (though it's > performance is much more predictable and much less depend on input > data than in DPQS), but I strongly believe that concentrating efforts > on testing corner cases performance and integrating RadixSort into JDK > would yield much better performance than DPQS, at least for int[] > arrays. Also the implementation is much simpler. > > With best regards, > Tagir Valeev. > On Mon, Nov 12, 2018 at 5:09 PM Laurent Bourgès > <bourges.laur...@gmail.com> wrote: > > > > Hi, > > > > Do you know if someone has written a complete JMH benchmark suite > dedicated > > to Arrays.sort() ? > > with varying array size (trivial) but also testing lots of data > > distributions: (see Vladimir's tests) and possibly all variants (int, > long, > > double, Object[] ) > > > > It could be part of the standard OpenJDK JMH test suite... > > > > For now, I forked the nearly-optimal-mergesort repository on github: > > > https://github.com/bourgesl/nearly-optimal-mergesort-code/tree/master/results > > > > Cheers, > > Laurent > > > > Le sam. 10 nov. 2018 à 12:49, Laurent Bourgès <bourges.laur...@gmail.com> > a > > écrit : > > > > > Dear Vladimir & other Java sort experts, > > > > > > I made the port of the DPQS 2018.2 code last night to support a > secondary > > > array to be sorted and use preallocation (aux/run for merge sort): > > > > > > > https://github.com/bourgesl/nearly-optimal-mergesort-code/blob/master/src/wildinter/net/mergesort/DualPivotQuicksort2018Ext.java > > > > > > I succeded in using this variant in Marlin renderer (dev) and it is > > > promising. I figured out a rendering bug in 1 test wigh 65535 random > > > segments, certainly due to array swaps in mergesort (my side)... > > > > > > I will look again at my changes to check if I miss some permutation (a > // > > > b) or made any mistake on their indices... tricky. > > > > > > PS: Please share your updated webrev when possible to rebase my > changes. > > > > > > Regards, > > > Laurent > > > > > > Le ven. 9 nov. 2018 à 10:08, Laurent Bourgès < > bourges.laur...@gmail.com> > > > a écrit : > > > > > >> Hi Vladimir, > > >> > > >> Thank you for your attention, you are the Sort Master. > > >> > > >> > > >> Le ven. 9 nov. 2018 à 09:02, Vladimir Yaroslavskiy < > vlv.spb...@mail.ru> > > >> a écrit : > > >> > > >>> Hi Laurent, > > >>> > > >>> The new version is still under review, there were a lot of > suggestions > > >>> and ideas from Doug Lea. > > >>> I needed time to apply and check them. I'm going to send final > version > > >>> in few days. > > >>> > > >> > > >> Excellent news, so it will ship in jdk12... > > >> > > >>> > > >>> As to new method sort(a[], low, high, indices): do you mean this > method > > >>> in Arrays class? > > >>> Are you going to extend Arrays API? > > >>> > > >> > > >> I benchmarked my MergeSort and adding extra array implies many more > > >> moves: thresholds should be adjusted and my sort is sometimes better > as it > > >> makes half moves (a <-> buffer swaps), so this complementary sort > should > > >> have its own tuned implementation (tests & benchmarks). > > >> > > >> I coded my sort as such need did not match any Arrays.sort() methods. > > >> Having it publicly in jdk.base would be great for any other data > handling > > >> algorithm (science, AI ?) > > >> > > >> If you agree it is useful, I could try proposing an Arrays/Collection > API > > >> extension like : > > >> sort(<primitive or Value>[], low, high, int[]indices) > > >> > > >> > > >>> About new parameters workbase[]: method sort(...) in > DualPivotQuicksort > > >>> class (version is under review) > > >>> has extra parameter - parallel context (Sorter sorter) which has > > >>> required workbase[]. > > >>> If we run sorting sequentially, sorter parameter is null (no any > > >>> objects) and temporary buffer > > >>> will be created inside in tryMerge method if we go ahead with > merging. > > >>> > > >> > > >> I will have a look. For Marlin, parallel sort is not possible as > > >> rendering shapes can be parallelized in the application (geoserver map > > >> rendering). > > >> > > >> > > >>> I will send new sources in few days and add you in cc, so you can > look > > >>> at it > > >>> and find if new methods are acceptable for you. Then we can discuss > > >>> required changes (if any). > > >>> > > >> > > >> Perfect, I started adapting your code and will share soon the link to > my > > >> github repo. > > >> > > >> > > >>> Thank you, > > >>> Vladimir > > >>> > > >> > > >> Thank you very much for your first thoughts, > > >> Laurent > > >> > > >> > > >>> Пятница, 9 ноября 2018, 10:44 +03:00 от Laurent Bourgès < > > >>> bourges.laur...@gmail.com>: > > >>> > > >>> Hi, > > >>> > > >>> I am currently testing many sort algorithms to improve the Marlin > > >>> renderer (AA 2D shape rasterizer), I integrated since OpenJDK9 and > am still > > >>> improving for OpenJDK12 . > > >>> > > >>> I created my MergeSort (top down, check for sorted parts, array / > buffer > > >>> swap to minimize moves, isort on small sub arrays) that sort 2 int[]: > > >>> crossing + edge indices. > > >>> It is critical as edge crossings are sorted at every scanline, data > are > > >>> almost sorted/reversed, not really random. > > >>> > > >>> 3 questions: > > >>> - why is this patch dormant ? I checked in openjdk12 repo and your > > >>> changes were not integrated. > > >>> - I need a sort() method that works with 2 arrays: data + indices > > >>> (pointer like). Such sorted indices are useful to use for lookups > c[idx[k]] > > >>> or to perform other array permutations... > > >>> Would you agree adding such new sort(a[], low, high, indices) > > >>> - Marlin uses a ThreadLocal context to avoid any allocation. JDK > sort() > > >>> impl do not accept parameters to give any buffer[] and avoid > allocations. > > >>> Would you agree adding such optional parameters (like workbase[]) ? > > >>> > > >>> I will experiment adapting the DualPivotQuickSort in Marlin renderer > and > > >>> perform array sort race & rendering benchmarks. > > >>> > > >>> Cheers, > > >>> Laurent > > >>> > > >>> Le ven. 19 janv. 2018 à 14:38, Vladimir Yaroslavskiy < > vlv.spb...@mail.ru > > >>> <https://e.mail.ru/compose/?mailto=mailto%3avlv.spb...@mail.ru > <vlv.spb...@mail.ru>>> a > > >>> écrit : > > >>> > > >>> Hi team, > > >>> > > >>> In Sept 2009 Josh Bloch, Jon Bentley and I introduced new sorting > > >>> algorithm, Dual-Pivot Quicksort, for primitives in JDK 7 and later > > >>> I suggested several improvements of Dual-Pivot Quicksort, which > > >>> were integrated into JDK 8. > > >>> > > >>> Now I have more optimized and faster version of Dual-Pivot Quicksort. > > >>> I have been working on it for the last 5 years. Please, find the > > >>> summary of changes below and sources / diff at webrev [1]. > > >>> > > >>> All tests and benchmarking were run on the most recent build of JDK > 10, > > >>> jdk-10-ea+39. The new version shows the better performance on > different > > >>> inputs and guarantees n*log(n) on any data. > > >>> > > >>> The new implementation of Dual-Pivot Quicksort is 'all-in-one' > version: > > >>> it contains one code for both parallel and sequential sorting > algorithms. > > >>> > > >>> Suggested version is 10-20% faster on random data, 1.5-4 times faster > > >>> on nearly structured arrays, 1.5-2 times faster on period inputs. > > >>> > > >>> Parallel Dual-Pivot Quicksort is 1.5-3 times faster than current > > >>> algorithm based on merge sort. > > >>> > > >>> Benchmarking on the test suite, suggested by Jon Bentley, shows the > > >>> boost of performance in 1.4 times. This test suite contains several > > >>> types of inputs, such as random data, nearly structured arrays, data > > >>> with period and so on. Also there are several modifications of inputs > > >>> and parameters which are used in data building. We run sorting on all > > >>> combinations to compare two algorithms. > > >>> > > >>> Please let me know if you have any questions / comments. > > >>> > > >>> Summary of changes: > > >>> > > >>> DualPivotQuicksort class > > >>> ------------------------ > > >>> * Pivots are chosen with another step, the 1-st and 5-th candidates > > >>> are taken as pivots instead of 2-nd and 4-th. > > >>> * Splitting into parts is related to the golden ratio > > >>> * Pivot candidates are sorted by combination of 5-element > > >>> network sorting + insertion sort > > >>> * New backwards partitioning is simpler and more efficient > > >>> * Quicksort tuning parameters were updated > > >>> * Merging sort is invoked on each iteration from Quicksort > > >>> * Additional steps for float/double were fully rewritten > > >>> * Heap sort is invoked on the leftmost part > > >>> * Heap sort is used as a guard against quadratic time > > >>> * Parallel sorting is based on Dual-Pivot Quicksort > > >>> instead of merge sort > > >>> > > >>> SortingAlgorithms class > > >>> ----------------------- > > >>> * Merging sort and pair insertion sort were moved from > > >>> DualPivotQuicksort class > > >>> * Pair insertion sort was simplified and optimized > > >>> * New nano insertion sort was introduced for tiny arrays > > >>> * Merging sort was fully rewritten > > >>> * Optimized merging partitioning is used > > >>> * Merging parameters were updated > > >>> * Merging of runs was fully rewritten > > >>> * Fast version of heap sort was introduced > > >>> * Parallel merging sort was also provided > > >>> > > >>> Sorting / ParallelSorting classes > > >>> --------------------------------- > > >>> * New test cases were added > > >>> * Sources of these classes were unified > > >>> > > >>> Arrays class > > >>> ------------ > > >>> * Calls of Dual-Pivot Quicksort were updated > > >>> * Parallel sorting of primitives was switched to parallel > > >>> Dual-Pivot Quicksort > > >>> * Javadoc was modified > > >>> > > >>> ArraysParallelSortHelpers class > > >>> ------------------------------- > > >>> * Old implementation of parallel sorting for primitives was removed > > >>> * Javadoc was updated > > >>> > > >>> Thank you, > > >>> Vladimir > > >>> > > >>> -------------------------------------------------------------------- > > >>> [1] http://cr.openjdk.java.net/~alanb/DualPivotSortUpdate/webrev.01/ > > >>> -------------------------------------------------------------------- > > >>> > > >>> > > >>> > > >>> -- > > >>> Vladimir Yaroslavskiy > > >>> > > >> > > > > -- > Vladimir Yaroslavskiy >