Hi Vladimir,

One short comment, see below:


Le jeu. 15 nov. 2018 à 15:39, Vladimir Yaroslavskiy <vlv.spb...@mail.ru> a
écrit :

> Hello, Tagir!
>
> I compared Radix sort with Dual-Pivot Quicksort and see
> that Radix sort is faster than DPQ (on 2M elements) on random data in
> twice times,
> but it works slower on repeated elements (in 3-7 times). For highly
> structured arrays merging sort is invoked in both cases. Other
> data types - almost the same.
>

Your test is not really fair: 2m ints ~ 8mb so it is larger than typical
CPU L3 cache size.
RadixSort is doing random accesses and is known to be faster when arrays
fits in L2/L3 caches.
If larger then using radix on sub parts+merge !


> Do we actually need to use RadixSort if it is not faster on all inputs?
>

I will do my own performance tests on int[] arrays with sizes in range [50
- 100000] ...

Vladimir, I think Tageer proposed to tune its radixSort to obtain an hybrid
/ adaptive radix sort, as you did in your DPQS.
I can help here.

Cheers,
Laurent


What do you think?
>
> Regards,
> Vladimir
>
> Среда, 14 ноября 2018, 19:17 +03:00 от Tagir Valeev <amae...@gmail.com>:
>
> Hello, Laurent, Vladimir!
>
> I created a pull request containing my RadixSort implementation:
> https://github.com/bourgesl/nearly-optimal-mergesort-code/pull/1
> On my machine the results produced by Mergesorts.java are like this:
>
> Runs with individual timing (skips first 10 runs):
> adjusted reps: 110 + inner loop: 1
> avg-ms=102.6177(+/- 7 %), algo=PeekSort+iscutoff=24+onlyIncRuns=false,
> n=1000000 (55000110) (n=100, µ=102.6177, σ=7.4065714)
> avg-ms=95.62607(+/- 4 %),
> algo=TopDownMergesort+iscutoff=24+checkSorted=true, n=1000000
> (55000110) (n=100, µ=95.62607, σ=3.5302947)
> avg-ms=118.73089(+/- 4 %),
> algo=BottomUpMergesort+minRunLen=24+checkSorted=true, n=1000000
> (55000110) (n=100, µ=118.73089, σ=4.464908)
> avg-ms=108.36175(+/- 4 %), algo=MarlinSort, n=1000000 (55000110)
> (n=100, µ=108.36175, σ=4.5998554)
> avg-ms=99.68292(+/- 4 %), algo=MarlinMergeSort, n=1000000
> (55000110) (n=100, µ=99.68292, σ=3.9944465)
> avg-ms=75.43999(+/- 3 %), algo=DualPivotQuicksort2018, n=1000000
> (55000110) (n=100, µ=75.43999, σ=2.6127808)
> avg-ms=83.80406(+/- 6 %), algo=DualPivotQuicksort2018Ext, n=1000000
>  (55000110) (n=100, µ=83.80406, σ=4.734225)
> avg-ms=18.886326(+/- 4 %), algo=RadixSort, n=1000000 (55000110)
> (n=100, µ=18.886326, σ=0.75667334)
>
> As you can see RadixSort is much faster than anything. Well, probably
> there are some input patterns where it may lose (though it's
> performance is much more predictable and much less depend on input
> data than in DPQS), but I strongly believe that concentrating efforts
> on testing corner cases performance and integrating RadixSort into JDK
> would yield much better performance than DPQS, at least for int[]
> arrays. Also the implementation is much simpler.
>
> With best regards,
> Tagir Valeev.
> On Mon, Nov 12, 2018 at 5:09 PM Laurent Bourgès
> <bourges.laur...@gmail.com> wrote:
> >
> > Hi,
> >
> > Do you know if someone has written a complete JMH benchmark suite
> dedicated
> > to Arrays.sort() ?
> > with varying array size (trivial) but also testing lots of data
> > distributions: (see Vladimir's tests) and possibly all variants (int,
> long,
> > double, Object[] )
> >
> > It could be part of the standard OpenJDK JMH test suite...
> >
> > For now, I forked the nearly-optimal-mergesort repository on github:
> >
> https://github.com/bourgesl/nearly-optimal-mergesort-code/tree/master/results
> >
> > Cheers,
> > Laurent
> >
> > Le sam. 10 nov. 2018 à 12:49, Laurent Bourgès <bourges.laur...@gmail.com>
> a
> > écrit :
> >
> > > Dear Vladimir & other Java sort experts,
> > >
> > > I made the port of the DPQS 2018.2 code last night to support a
> secondary
> > > array to be sorted and use preallocation (aux/run for merge sort):
> > >
> > >
> https://github.com/bourgesl/nearly-optimal-mergesort-code/blob/master/src/wildinter/net/mergesort/DualPivotQuicksort2018Ext.java
> > >
> > > I succeded in using this variant in Marlin renderer (dev) and it is
> > > promising. I figured out a rendering bug in 1 test wigh 65535 random
> > > segments, certainly due to array swaps in mergesort (my side)...
> > >
> > > I will look again at my changes to check if I miss some permutation (a
> //
> > > b) or made any mistake on their indices... tricky.
> > >
> > > PS: Please share your updated webrev when possible to rebase my
> changes.
> > >
> > > Regards,
> > > Laurent
> > >
> > > Le ven. 9 nov. 2018 à 10:08, Laurent Bourgès <
> bourges.laur...@gmail.com>
> > > a écrit :
> > >
> > >> Hi Vladimir,
> > >>
> > >> Thank you for your attention, you are the Sort Master.
> > >>
> > >>
> > >> Le ven. 9 nov. 2018 à 09:02, Vladimir Yaroslavskiy <
> vlv.spb...@mail.ru>
> > >> a écrit :
> > >>
> > >>> Hi Laurent,
> > >>>
> > >>> The new version is still under review, there were a lot of
> suggestions
> > >>> and ideas from Doug Lea.
> > >>> I needed time to apply and check them. I'm going to send final
> version
> > >>> in few days.
> > >>>
> > >>
> > >> Excellent news, so it will ship in jdk12...
> > >>
> > >>>
> > >>> As to new method sort(a[], low, high, indices): do you mean this
> method
> > >>> in Arrays class?
> > >>> Are you going to extend Arrays API?
> > >>>
> > >>
> > >> I benchmarked my MergeSort and adding extra array implies many more
> > >> moves: thresholds should be adjusted and my sort is sometimes better
> as it
> > >> makes half moves (a <-> buffer swaps), so this complementary sort
> should
> > >> have its own tuned implementation (tests & benchmarks).
> > >>
> > >> I coded my sort as such need did not match any Arrays.sort() methods.
> > >> Having it publicly in jdk.base would be great for any other data
> handling
> > >> algorithm (science, AI ?)
> > >>
> > >> If you agree it is useful, I could try proposing an Arrays/Collection
> API
> > >> extension like :
> > >> sort(<primitive or Value>[], low, high, int[]indices)
> > >>
> > >>
> > >>> About new parameters workbase[]: method sort(...) in
> DualPivotQuicksort
> > >>> class (version is under review)
> > >>> has extra parameter - parallel context (Sorter sorter) which has
> > >>> required workbase[].
> > >>> If we run sorting sequentially, sorter parameter is null (no any
> > >>> objects) and temporary buffer
> > >>> will be created inside in tryMerge method if we go ahead with
> merging.
> > >>>
> > >>
> > >> I will have a look. For Marlin, parallel sort is not possible as
> > >> rendering shapes can be parallelized in the application (geoserver map
> > >> rendering).
> > >>
> > >>
> > >>> I will send new sources in few days and add you in cc, so you can
> look
> > >>> at it
> > >>> and find if new methods are acceptable for you. Then we can discuss
> > >>> required changes (if any).
> > >>>
> > >>
> > >> Perfect, I started adapting your code and will share soon the link to
> my
> > >> github repo.
> > >>
> > >>
> > >>> Thank you,
> > >>> Vladimir
> > >>>
> > >>
> > >> Thank you very much for your first thoughts,
> > >> Laurent
> > >>
> > >>
> > >>> Пятница, 9 ноября 2018, 10:44 +03:00 от Laurent Bourgès <
> > >>> bourges.laur...@gmail.com>:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I am currently testing many sort algorithms to improve the Marlin
> > >>> renderer (AA 2D shape rasterizer), I integrated since OpenJDK9 and
> am still
> > >>> improving for OpenJDK12 .
> > >>>
> > >>> I created my MergeSort (top down, check for sorted parts, array /
> buffer
> > >>> swap to minimize moves, isort on small sub arrays) that sort 2 int[]:
> > >>> crossing + edge indices.
> > >>> It is critical as edge crossings are sorted at every scanline, data
> are
> > >>> almost sorted/reversed, not really random.
> > >>>
> > >>> 3 questions:
> > >>> - why is this patch dormant ? I checked in openjdk12 repo and your
> > >>> changes were not integrated.
> > >>> - I need a sort() method that works with 2 arrays: data + indices
> > >>> (pointer like). Such sorted indices are useful to use for lookups
> c[idx[k]]
> > >>> or to perform other array permutations...
> > >>> Would you agree adding such new sort(a[], low, high, indices)
> > >>> - Marlin uses a ThreadLocal context to avoid any allocation. JDK
> sort()
> > >>> impl do not accept parameters to give any buffer[] and avoid
> allocations.
> > >>> Would you agree adding such optional parameters (like workbase[]) ?
> > >>>
> > >>> I will experiment adapting the DualPivotQuickSort in Marlin renderer
> and
> > >>> perform array sort race & rendering benchmarks.
> > >>>
> > >>> Cheers,
> > >>> Laurent
> > >>>
> > >>> Le ven. 19 janv. 2018 à 14:38, Vladimir Yaroslavskiy <
> vlv.spb...@mail.ru
> > >>> <https://e.mail.ru/compose/?mailto=mailto%3avlv.spb...@mail.ru
> <vlv.spb...@mail.ru>>> a
> > >>> écrit :
> > >>>
> > >>> Hi team,
> > >>>
> > >>> In Sept 2009 Josh Bloch, Jon Bentley and I introduced new sorting
> > >>> algorithm, Dual-Pivot Quicksort, for primitives in JDK 7 and later
> > >>> I suggested several improvements of Dual-Pivot Quicksort, which
> > >>> were integrated into JDK 8.
> > >>>
> > >>> Now I have more optimized and faster version of Dual-Pivot Quicksort.
> > >>> I have been working on it for the last 5 years. Please, find the
> > >>> summary of changes below and sources / diff at webrev [1].
> > >>>
> > >>> All tests and benchmarking were run on the most recent build of JDK
> 10,
> > >>> jdk-10-ea+39. The new version shows the better performance on
> different
> > >>> inputs and guarantees n*log(n) on any data.
> > >>>
> > >>> The new implementation of Dual-Pivot Quicksort is 'all-in-one'
> version:
> > >>> it contains one code for both parallel and sequential sorting
> algorithms.
> > >>>
> > >>> Suggested version is 10-20% faster on random data, 1.5-4 times faster
> > >>> on nearly structured arrays, 1.5-2 times faster on period inputs.
> > >>>
> > >>> Parallel Dual-Pivot Quicksort is 1.5-3 times faster than current
> > >>> algorithm based on merge sort.
> > >>>
> > >>> Benchmarking on the test suite, suggested by Jon Bentley, shows the
> > >>> boost of performance in 1.4 times. This test suite contains several
> > >>> types of inputs, such as random data, nearly structured arrays, data
> > >>> with period and so on. Also there are several modifications of inputs
> > >>> and parameters which are used in data building. We run sorting on all
> > >>> combinations to compare two algorithms.
> > >>>
> > >>> Please let me know if you have any questions / comments.
> > >>>
> > >>> Summary of changes:
> > >>>
> > >>> DualPivotQuicksort class
> > >>> ------------------------
> > >>> * Pivots are chosen with another step, the 1-st and 5-th candidates
> > >>> are taken as pivots instead of 2-nd and 4-th.
> > >>> * Splitting into parts is related to the golden ratio
> > >>> * Pivot candidates are sorted by combination of 5-element
> > >>> network sorting + insertion sort
> > >>> * New backwards partitioning is simpler and more efficient
> > >>> * Quicksort tuning parameters were updated
> > >>> * Merging sort is invoked on each iteration from Quicksort
> > >>> * Additional steps for float/double were fully rewritten
> > >>> * Heap sort is invoked on the leftmost part
> > >>> * Heap sort is used as a guard against quadratic time
> > >>> * Parallel sorting is based on Dual-Pivot Quicksort
> > >>> instead of merge sort
> > >>>
> > >>> SortingAlgorithms class
> > >>> -----------------------
> > >>> * Merging sort and pair insertion sort were moved from
> > >>> DualPivotQuicksort class
> > >>> * Pair insertion sort was simplified and optimized
> > >>> * New nano insertion sort was introduced for tiny arrays
> > >>> * Merging sort was fully rewritten
> > >>> * Optimized merging partitioning is used
> > >>> * Merging parameters were updated
> > >>> * Merging of runs was fully rewritten
> > >>> * Fast version of heap sort was introduced
> > >>> * Parallel merging sort was also provided
> > >>>
> > >>> Sorting / ParallelSorting classes
> > >>> ---------------------------------
> > >>> * New test cases were added
> > >>> * Sources of these classes were unified
> > >>>
> > >>> Arrays class
> > >>> ------------
> > >>> * Calls of Dual-Pivot Quicksort were updated
> > >>> * Parallel sorting of primitives was switched to parallel
> > >>> Dual-Pivot Quicksort
> > >>> * Javadoc was modified
> > >>>
> > >>> ArraysParallelSortHelpers class
> > >>> -------------------------------
> > >>> * Old implementation of parallel sorting for primitives was removed
> > >>> * Javadoc was updated
> > >>>
> > >>> Thank you,
> > >>> Vladimir
> > >>>
> > >>> --------------------------------------------------------------------
> > >>> [1] http://cr.openjdk.java.net/~alanb/DualPivotSortUpdate/webrev.01/
> > >>> --------------------------------------------------------------------
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Vladimir Yaroslavskiy
> > >>>
> > >>
>
>
>
> --
> Vladimir Yaroslavskiy
>

Reply via email to