Hi Vladimir, First Thank you for these impressive results: 15% to 70% overall gains is amazing and will make a big difference, Congratulations !
Could you publish the different test environment ? - HW: cpu (lscpu output) - OS version - JDK version + JVM settings Ideally you or I could write a shell script to collect setup and write a simple log file. PS: I suppose you tested DPQS only on intel cpus, could somebody test on ppc or arm cpus as well ? Cheers, Laurent Le jeu. 25 juil. 2019 à 16:04, Vladimir Yaroslavskiy <vlv.spb...@mail.ru> a écrit : > Hi all, > > With the help of Laurent Bourges I run benchmarking of new optimized > Dual-Pivot Quicksort > and summarized results. The tests were run for all types (int / long / ... > / double), for both types: > sequential and parallel sorting, for small, medium and large sizes. The > comparison was done > on several data types, such as: random, period, sorted, equal, stagger, > shuffle. > > Here is the summary of total gains. The gain is defined as (jdk_time - > dpqs_time) / jdk_time, > where jdk_time is sorting time by the current jdk14 and dpqs_time is > sorting time by new > optimized Dual-Pivot Quicksort. To get stable and reproducible results, > min time of many > invocations is taken. > > You can find all detailed results there > https://github.com/iaroslavski/sorting/tree/master/benchmarking/results > > Sources of benchmarking tests are there > https://github.com/iaroslavski/sorting/tree/master/benchmarking/sources > > <https://github.com/iaroslavski/sorting/tree/master/benchmarking/sources>You > can see not good performance for float / double types (parallel sorting). > This behavior can be explained by existing bug in jdk, when NaNs and -0.0s > are not processed properly. New sorting has total losses for small float / > double > arrays, few percents only. For all other cases new optimized sorting is > winner. > > > -------------------------------------------------------------------------------------------------------- > [int] > > sequential, Length = 150, Gain: 0.15 > sequential, Length = 30000, Gain: 0.28 > sequential, Length = 1000000, Gain: 0.31 > parallel 8, Length = 150, Gain: 0.14 > parallel 8, Length = 30000, Gain: 0.15 > parallel 8, Length = 1000000, Gain: 0.14 > > [long] > sequential, Length = 150, Gain: 0.12 > sequential, Length = 30000, Gain: 0.23 > sequential, Length = 1000000, Gain: 0.32 > parallel 8, Length = 150, Gain: 0.11 > parallel 8, Length = 30000, Gain: 0.16 > parallel 8, Length = 1000000, Gain: 0.24 > parallel 88 processors, Length = 1000000, Gain: 0.05 > > [byte] > sequential, Length = 150, Gain: 0.06 > sequential, Length = 30000, Gain: 0.36 > sequential, Length = 1000000, Gain: 0.37 > parallel 8, Length = 150, Gain: 0.13 > parallel 8, Length = 30000, Gain: 0.73 > parallel 8, Length = 1000000, Gain: 0.65 > > [char] > sequential, Length = 150, Gain: 0.10 > sequential, Length = 30000, Gain: 0.00 > sequential, Length = 1000000, Gain: 0.17 > parallel 8, Length = 150, Gain: 0.10 > parallel 8, Length = 30000, Gain: 0.33 > parallel 8, Length = 1000000, Gain: 0.62 > > [short] > sequential, Length = 150, Gain: 0.14 > sequential, Length = 30000, Gain: 0.10 > sequential, Length = 1000000, Gain: 0.18 > parallel 8, Length = 150, Gain: 0.13 > parallel 8, Length = 30000, Gain: 0.41 > parallel 8, Length = 1000000, Gain: 0.65 > > [float] > sequential, Length = 150, Gain: -0.02 > sequential, Length = 30000, Gain: 0.21 > sequential, Length = 1000000, Gain: 0.24 > parallel 8, Length = 150, Gain: -0.05 > parallel 8, Length = 30000, Gain: -0.04 > parallel 8, Length = 1000000, Gain: -0.12 > > [double] > sequential, Length = 150, Gain: -0.03 > sequential, Length = 30000, Gain: 0.19 > sequential, Length = 1000000, Gain: 0.23 > parallel 8, Length = 150, Gain: -0.05 > parallel 8, Length = 30000, Gain: 0.05 > parallel 8, Length = 1000000, Gain: 0.15 > > Thank you, > Vladimir >