Hi, Do you know if someone has written a complete JMH benchmark suite dedicated to Arrays.sort() ? with varying array size (trivial) but also testing lots of data distributions: (see Vladimir's tests) and possibly all variants (int, long, double, Object[] )
It could be part of the standard OpenJDK JMH test suite... For now, I forked the nearly-optimal-mergesort repository on github: https://github.com/bourgesl/nearly-optimal-mergesort-code/tree/master/results Cheers, Laurent Le sam. 10 nov. 2018 à 12:49, Laurent Bourgès <bourges.laur...@gmail.com> a écrit : > Dear Vladimir & other Java sort experts, > > I made the port of the DPQS 2018.2 code last night to support a secondary > array to be sorted and use preallocation (aux/run for merge sort): > > https://github.com/bourgesl/nearly-optimal-mergesort-code/blob/master/src/wildinter/net/mergesort/DualPivotQuicksort2018Ext.java > > I succeded in using this variant in Marlin renderer (dev) and it is > promising. I figured out a rendering bug in 1 test wigh 65535 random > segments, certainly due to array swaps in mergesort (my side)... > > I will look again at my changes to check if I miss some permutation (a // > b) or made any mistake on their indices... tricky. > > PS: Please share your updated webrev when possible to rebase my changes. > > Regards, > Laurent > > Le ven. 9 nov. 2018 à 10:08, Laurent Bourgès <bourges.laur...@gmail.com> > a écrit : > >> Hi Vladimir, >> >> Thank you for your attention, you are the Sort Master. >> >> >> Le ven. 9 nov. 2018 à 09:02, Vladimir Yaroslavskiy <vlv.spb...@mail.ru> >> a écrit : >> >>> Hi Laurent, >>> >>> The new version is still under review, there were a lot of suggestions >>> and ideas from Doug Lea. >>> I needed time to apply and check them. I'm going to send final version >>> in few days. >>> >> >> Excellent news, so it will ship in jdk12... >> >>> >>> As to new method sort(a[], low, high, indices): do you mean this method >>> in Arrays class? >>> Are you going to extend Arrays API? >>> >> >> I benchmarked my MergeSort and adding extra array implies many more >> moves: thresholds should be adjusted and my sort is sometimes better as it >> makes half moves (a <-> buffer swaps), so this complementary sort should >> have its own tuned implementation (tests & benchmarks). >> >> I coded my sort as such need did not match any Arrays.sort() methods. >> Having it publicly in jdk.base would be great for any other data handling >> algorithm (science, AI ?) >> >> If you agree it is useful, I could try proposing an Arrays/Collection API >> extension like : >> sort(<primitive or Value>[], low, high, int[]indices) >> >> >>> About new parameters workbase[]: method sort(...) in DualPivotQuicksort >>> class (version is under review) >>> has extra parameter - parallel context (Sorter sorter) which has >>> required workbase[]. >>> If we run sorting sequentially, sorter parameter is null (no any >>> objects) and temporary buffer >>> will be created inside in tryMerge method if we go ahead with merging. >>> >> >> I will have a look. For Marlin, parallel sort is not possible as >> rendering shapes can be parallelized in the application (geoserver map >> rendering). >> >> >>> I will send new sources in few days and add you in cc, so you can look >>> at it >>> and find if new methods are acceptable for you. Then we can discuss >>> required changes (if any). >>> >> >> Perfect, I started adapting your code and will share soon the link to my >> github repo. >> >> >>> Thank you, >>> Vladimir >>> >> >> Thank you very much for your first thoughts, >> Laurent >> >> >>> Пятница, 9 ноября 2018, 10:44 +03:00 от Laurent Bourgès < >>> bourges.laur...@gmail.com>: >>> >>> Hi, >>> >>> I am currently testing many sort algorithms to improve the Marlin >>> renderer (AA 2D shape rasterizer), I integrated since OpenJDK9 and am still >>> improving for OpenJDK12 . >>> >>> I created my MergeSort (top down, check for sorted parts, array / buffer >>> swap to minimize moves, isort on small sub arrays) that sort 2 int[]: >>> crossing + edge indices. >>> It is critical as edge crossings are sorted at every scanline, data are >>> almost sorted/reversed, not really random. >>> >>> 3 questions: >>> - why is this patch dormant ? I checked in openjdk12 repo and your >>> changes were not integrated. >>> - I need a sort() method that works with 2 arrays: data + indices >>> (pointer like). Such sorted indices are useful to use for lookups c[idx[k]] >>> or to perform other array permutations... >>> Would you agree adding such new sort(a[], low, high, indices) >>> - Marlin uses a ThreadLocal context to avoid any allocation. JDK sort() >>> impl do not accept parameters to give any buffer[] and avoid allocations. >>> Would you agree adding such optional parameters (like workbase[]) ? >>> >>> I will experiment adapting the DualPivotQuickSort in Marlin renderer and >>> perform array sort race & rendering benchmarks. >>> >>> Cheers, >>> Laurent >>> >>> Le ven. 19 janv. 2018 à 14:38, Vladimir Yaroslavskiy <vlv.spb...@mail.ru >>> <https://e.mail.ru/compose/?mailto=mailto%3avlv.spb...@mail.ru>> a >>> écrit : >>> >>> Hi team, >>> >>> In Sept 2009 Josh Bloch, Jon Bentley and I introduced new sorting >>> algorithm, Dual-Pivot Quicksort, for primitives in JDK 7 and later >>> I suggested several improvements of Dual-Pivot Quicksort, which >>> were integrated into JDK 8. >>> >>> Now I have more optimized and faster version of Dual-Pivot Quicksort. >>> I have been working on it for the last 5 years. Please, find the >>> summary of changes below and sources / diff at webrev [1]. >>> >>> All tests and benchmarking were run on the most recent build of JDK 10, >>> jdk-10-ea+39. The new version shows the better performance on different >>> inputs and guarantees n*log(n) on any data. >>> >>> The new implementation of Dual-Pivot Quicksort is 'all-in-one' version: >>> it contains one code for both parallel and sequential sorting algorithms. >>> >>> Suggested version is 10-20% faster on random data, 1.5-4 times faster >>> on nearly structured arrays, 1.5-2 times faster on period inputs. >>> >>> Parallel Dual-Pivot Quicksort is 1.5-3 times faster than current >>> algorithm based on merge sort. >>> >>> Benchmarking on the test suite, suggested by Jon Bentley, shows the >>> boost of performance in 1.4 times. This test suite contains several >>> types of inputs, such as random data, nearly structured arrays, data >>> with period and so on. Also there are several modifications of inputs >>> and parameters which are used in data building. We run sorting on all >>> combinations to compare two algorithms. >>> >>> Please let me know if you have any questions / comments. >>> >>> Summary of changes: >>> >>> DualPivotQuicksort class >>> ------------------------ >>> * Pivots are chosen with another step, the 1-st and 5-th candidates >>> are taken as pivots instead of 2-nd and 4-th. >>> * Splitting into parts is related to the golden ratio >>> * Pivot candidates are sorted by combination of 5-element >>> network sorting + insertion sort >>> * New backwards partitioning is simpler and more efficient >>> * Quicksort tuning parameters were updated >>> * Merging sort is invoked on each iteration from Quicksort >>> * Additional steps for float/double were fully rewritten >>> * Heap sort is invoked on the leftmost part >>> * Heap sort is used as a guard against quadratic time >>> * Parallel sorting is based on Dual-Pivot Quicksort >>> instead of merge sort >>> >>> SortingAlgorithms class >>> ----------------------- >>> * Merging sort and pair insertion sort were moved from >>> DualPivotQuicksort class >>> * Pair insertion sort was simplified and optimized >>> * New nano insertion sort was introduced for tiny arrays >>> * Merging sort was fully rewritten >>> * Optimized merging partitioning is used >>> * Merging parameters were updated >>> * Merging of runs was fully rewritten >>> * Fast version of heap sort was introduced >>> * Parallel merging sort was also provided >>> >>> Sorting / ParallelSorting classes >>> --------------------------------- >>> * New test cases were added >>> * Sources of these classes were unified >>> >>> Arrays class >>> ------------ >>> * Calls of Dual-Pivot Quicksort were updated >>> * Parallel sorting of primitives was switched to parallel >>> Dual-Pivot Quicksort >>> * Javadoc was modified >>> >>> ArraysParallelSortHelpers class >>> ------------------------------- >>> * Old implementation of parallel sorting for primitives was removed >>> * Javadoc was updated >>> >>> Thank you, >>> Vladimir >>> >>> -------------------------------------------------------------------- >>> [1] http://cr.openjdk.java.net/~alanb/DualPivotSortUpdate/webrev.01/ >>> -------------------------------------------------------------------- >>> >>> >>> >>> -- >>> Vladimir Yaroslavskiy >>> >>