[ 
https://issues.apache.org/jira/browse/RNG-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780529#comment-16780529
 ] 

Alex D Herbert commented on RNG-72:
-----------------------------------

I have compiled a new listing using the output of {{ConstructionPerformance}} 
and {{GenerationPerformance}}.

For reference the construction benchmark creates 500 RNGs, using pre-built 
seeds where appropriate. Test the following construction methods:
 * Create using the native constructor with the *new* keyword
 * Create using a cached *{{Constructor<Object>}}* with *{{newInstance}}*
 * Create using a cached *{{Class<?>}}*, lookup the *{{Constructor<Object>}}* 
and use *{{newInstance}}*
 * Create using *{{RandomSource}}* with the *native* seed
 * Create using *{{RandomSource}}* with a *{{null}}* seed
 * Create using *{{RandomSource}}* with a *{{Long}}* seed
 * Create using *{{RandomSource}}* with a *truncated native* seed forcing self 
seeding
 * Create using *{{RandomSource}}* with a *{{byte[]}}* of the appropriate length

Generation time is shown for the given number of values. The time uses the 
native method of the generator (either {{nextInt}} or {{nextLong}}) so *is not 
comparable between generators of different types*. It is indicative of the 
speed it will take to create a generator and output values.

The table is constructed by adding the time for construction to the generation 
time. Only the total time is shown for the construction using *new*. All other 
times are relative to this. The table is sorted by speed of the *new* 
construction and grouped by the number of values generated.

 

*Note: This listing involves adding of times from two benchmarks run on the 
same computer, using the same JMH benchmark settings for average time in 
microseconds. An independent test using construction and generation together 
should be conducted to indicate this is a valid approach.*

 

 
||Values||Generation||RandomSource||new||newInstance||lookupNewInstance||createNativeSeed||createNullSeed||createLongSeed||createSelfSeed||createByteArray||
|0|0|SPLIT_MIX_64|4.55|1.89|5.63|8.62|13.48|9.11|0|12.11|
|0|0|KISS|10.66|1.37|2.82|4.07|67.38|10.79|4.53|6.33|
|0|0|WELL_512_A|12.68|1.36|2.59|4.14|64.37|14.00|4.88|10.26|
|0|0|WELL_1024_A|17.31|1.36|2.37|3.39|45.20|8.31|5.66|13.54|
|0|0|JDK|17.49|0.79|1.93|2.34|4.54|2.82|0|3.68|
|0|0|XOR_SHIFT_1024_S|19.81|1.28|2.10|2.60|75.25|12.26|3.55|7.24|
|0|0|MWC_256|234.62|0.97|1.28|1.18|4.50|2.09|2.37|6.70|
|0|0|WELL_19937_C|247.23|1.10|0.99|1.20|4.58|2.29|4.51|14.01|
|0|0|WELL_19937_A|267.32|0.91|1.17|1.11|4.89|2.13|4.65|12.97|
|0|0|WELL_44497_B|550.72|1.16|1.03|0.97|2.93|2.41|4.50|16.66|
|0|0|WELL_44497_A|594.5|0.98|0.89|1.06|3.06|1.87|5.32|12.29|
|0|0|ISAAC|1230.51|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|0|0|MT_64|3568.38|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|0|0|MT|5501.32|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|0|0|TWO_CMRES|66868.49|1.01|0.95|1.12|1.10|1.09|0|1.08|
|1|0.01|SPLIT_MIX_64|4.56|1.88|5.62|8.60|13.45|9.10|0|12.09|
|1|0.01|KISS|10.67|1.37|2.81|4.07|67.32|10.78|4.53|6.33|
|1|0.01|WELL_512_A|12.69|1.36|2.58|4.14|64.32|13.99|4.88|10.25|
|1|0.01|WELL_1024_A|17.32|1.36|2.37|3.39|45.17|8.31|5.66|13.54|
|1|0.02|JDK|17.51|0.79|1.93|2.34|4.53|2.82|0|3.68|
|1|0.01|XOR_SHIFT_1024_S|19.82|1.28|2.10|2.60|75.21|12.25|3.55|7.24|
|1|0.01|MWC_256|234.63|0.97|1.28|1.18|4.50|2.09|2.37|6.70|
|1|0.01|WELL_19937_C|247.24|1.10|0.99|1.20|4.58|2.29|4.51|14.01|
|1|0.01|WELL_19937_A|267.33|0.91|1.17|1.11|4.89|2.13|4.65|12.97|
|1|0.01|WELL_44497_B|550.73|1.16|1.03|0.97|2.93|2.41|4.50|16.66|
|1|0.01|WELL_44497_A|594.51|0.98|0.89|1.06|3.06|1.87|5.32|12.29|
|1|0.01|ISAAC|1230.52|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|1|0.01|MT_64|3568.39|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|1|0.01|MT|5501.33|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|1|0.01|TWO_CMRES|66868.5|1.01|0.95|1.12|1.10|1.09|0|1.08|
|100|0.44|SPLIT_MIX_64|4.99|1.81|5.22|7.95|12.38|8.40|0|11.13|
|100|0.61|KISS|11.27|1.35|2.72|3.91|63.79|10.26|4.34|6.04|
|100|0.79|WELL_512_A|13.47|1.34|2.49|3.96|60.65|13.24|4.65|9.72|
|100|0.86|WELL_1024_A|18.17|1.34|2.31|3.28|43.11|7.97|5.44|12.95|
|100|1.91|JDK|19.4|0.81|1.84|2.21|4.19|2.64|0|3.42|
|100|0.54|XOR_SHIFT_1024_S|20.35|1.27|2.07|2.56|73.28|11.96|3.48|7.08|
|100|0.45|MWC_256|235.07|0.97|1.28|1.18|4.49|2.09|2.37|6.69|
|100|1.06|WELL_19937_C|248.29|1.09|0.99|1.20|4.57|2.28|4.50|13.95|
|100|0.98|WELL_19937_A|268.3|0.91|1.17|1.11|4.87|2.12|4.64|12.93|
|100|1.09|WELL_44497_B|551.81|1.16|1.03|0.97|2.93|2.41|4.49|16.63|
|100|1.05|WELL_44497_A|595.55|0.98|0.89|1.06|3.06|1.86|5.31|12.27|
|100|0.62|ISAAC|1231.13|0.97|0.91|0.86|1.44|1.03|1.19|2.29|
|100|0.67|MT_64|3569.05|1.05|0.95|1.00|0.79|0.54|0.46|0.98|
|100|0.66|MT|5501.98|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|100|0.54|TWO_CMRES|66869.03|1.01|0.95|1.12|1.10|1.09|0|1.08|
|10000|42.74|SPLIT_MIX_64|47.29|1.09|1.45|1.73|2.20|1.78|0|2.07|
|10000|64.41|KISS|75.07|1.05|1.26|1.44|10.43|2.39|1.50|1.76|
|10000|74.64|WELL_512_A|87.32|1.05|1.23|1.46|10.20|2.89|1.56|2.34|
|10000|76.65|WELL_1024_A|93.96|1.07|1.25|1.44|9.14|2.35|1.86|3.31|
|10000|225.95|JDK|243.44|0.99|1.07|1.10|1.25|1.13|0|1.19|
|10000|55.29|XOR_SHIFT_1024_S|75.1|1.07|1.29|1.42|20.59|3.97|1.67|2.65|
|10000|43.09|MWC_256|277.71|0.98|1.24|1.16|3.96|1.92|2.16|5.82|
|10000|105.91|WELL_19937_C|353.14|1.07|0.99|1.14|3.51|1.90|3.46|10.11|
|10000|100.37|WELL_19937_A|367.69|0.94|1.12|1.08|3.83|1.82|3.66|9.70|
|10000|111.56|WELL_44497_B|662.28|1.14|1.02|0.97|2.60|2.18|3.91|14.02|
|10000|104.18|WELL_44497_A|698.68|0.98|0.91|1.05|2.76|1.74|4.68|10.61|
|10000|61.19|ISAAC|1291.7|0.97|0.92|0.86|1.42|1.03|1.18|2.23|
|10000|66.40|MT_64|3634.78|1.05|0.95|1.00|0.79|0.55|0.47|0.98|
|10000|64.42|MT|5565.74|1.00|1.00|1.06|0.80|0.68|0.62|1.38|
|10000|49.90|TWO_CMRES|66918.39|1.01|0.95|1.12|1.10|1.09|0|1.08|
|1000000|3984.56|SPLIT_MIX_64|3989.11|1.00|1.01|1.01|1.01|1.01|0|1.01|
|1000000|6032.31|KISS|6042.97|1.00|1.00|1.01|1.12|1.02|1.01|1.01|
|1000000|7356.91|WELL_512_A|7369.59|1.00|1.00|1.01|1.11|1.02|1.01|1.02|
|1000000|7444.91|WELL_1024_A|7462.22|1.00|1.00|1.01|1.10|1.02|1.01|1.03|
|1000000|19155.09|JDK|19172.58|1.00|1.00|1.00|1.00|1.00|0|1.00|
|1000000|4889.82|XOR_SHIFT_1024_S|4909.63|1.00|1.00|1.01|1.30|1.05|1.01|1.03|
|1000000|4133.11|MWC_256|4367.73|1.00|1.02|1.01|1.19|1.06|1.07|1.31|
|1000000|10572.00|WELL_19937_C|10819.23|1.00|1.00|1.00|1.08|1.03|1.08|1.30|
|1000000|10005.49|WELL_19937_A|10272.81|1.00|1.00|1.00|1.10|1.03|1.10|1.31|
|1000000|12121.27|WELL_44497_B|12671.99|1.01|1.00|1.00|1.08|1.06|1.15|1.68|
|1000000|11065.87|WELL_44497_A|11660.37|1.00|0.99|1.00|1.11|1.04|1.22|1.58|
|1000000|5976.62|ISAAC|7207.13|0.99|0.99|0.98|1.08|1.00|1.03|1.22|
|1000000|6190.41|MT_64|9758.79|1.02|0.98|1.00|0.92|0.83|0.80|0.99|
|1000000|6635.87|MT|12137.19|1.00|1.00|1.03|0.91|0.86|0.83|1.17|
|1000000|4580.57|TWO_CMRES|71449.06|1.01|0.96|1.11|1.10|1.08|0|1.07|

The effect of construction and generation can be more easily observed using a 
table of the relative contribution of the generation time to the total time. 
Here the total time is made using construction with the *new* keyword or via 
the default *{{RandomSource.create}}* method with a *native* seed or *null* 
seed. Note: Times for the *create* method are relative to *new* (as above) but 
the generating % is computed using the total time.
||Values||Generation||RandomSource||new||Generating 
(%)||createNativeSeed||Generating (%)||createNullSeed||Generating (%)||
|0|0|SPLIT_MIX_64|4.55|0.00|8.62|0.00|13.48|0.00|
|0|0|KISS|10.66|0.00|4.07|0.00|67.38|0.00|
|0|0|WELL_512_A|12.68|0.00|4.14|0.00|64.37|0.00|
|0|0|WELL_1024_A|17.31|0.00|3.39|0.00|45.2|0.00|
|0|0|JDK|17.49|0.00|2.34|0.00|4.54|0.00|
|0|0|XOR_SHIFT_1024_S|19.81|0.00|2.6|0.00|75.25|0.00|
|0|0|MWC_256|234.62|0.00|1.18|0.00|4.5|0.00|
|0|0|WELL_19937_C|247.23|0.00|1.2|0.00|4.58|0.00|
|0|0|WELL_19937_A|267.32|0.00|1.11|0.00|4.89|0.00|
|0|0|WELL_44497_B|550.72|0.00|0.97|0.00|2.93|0.00|
|0|0|WELL_44497_A|594.5|0.00|1.06|0.00|3.06|0.00|
|0|0|ISAAC|1230.51|0.00|0.86|0.00|1.44|0.00|
|0|0|MT_64|3568.38|0.00|1|0.00|0.79|0.00|
|0|0|MT|5501.32|0.00|1.06|0.00|0.8|0.00|
|0|0|TWO_CMRES|66868.49|0.00|1.12|0.00|1.1|0.00|
|1|0.01|SPLIT_MIX_64|4.56|0.22|8.6|0.03|13.45|0.02|
|1|0.01|KISS|10.67|0.09|4.07|0.02|67.32|0.00|
|1|0.01|WELL_512_A|12.69|0.08|4.14|0.02|64.32|0.00|
|1|0.01|WELL_1024_A|17.32|0.06|3.39|0.02|45.17|0.00|
|1|0.02|JDK|17.51|0.11|2.34|0.05|4.53|0.03|
|1|0.01|XOR_SHIFT_1024_S|19.82|0.05|2.6|0.02|75.21|0.00|
|1|0.01|MWC_256|234.63|0.00|1.18|0.00|4.5|0.00|
|1|0.01|WELL_19937_C|247.24|0.00|1.2|0.00|4.58|0.00|
|1|0.01|WELL_19937_A|267.33|0.00|1.11|0.00|4.89|0.00|
|1|0.01|WELL_44497_B|550.73|0.00|0.97|0.00|2.93|0.00|
|1|0.01|WELL_44497_A|594.51|0.00|1.06|0.00|3.06|0.00|
|1|0.01|ISAAC|1230.52|0.00|0.86|0.00|1.44|0.00|
|1|0.01|MT_64|3568.39|0.00|1|0.00|0.79|0.00|
|1|0.01|MT|5501.33|0.00|1.06|0.00|0.8|0.00|
|1|0.01|TWO_CMRES|66868.5|0.00|1.12|0.00|1.1|0.00|
|100|0.44|SPLIT_MIX_64|4.99|8.82|7.95|1.11|12.38|0.71|
|100|0.61|KISS|11.27|5.41|3.91|1.38|63.79|0.08|
|100|0.79|WELL_512_A|13.47|5.86|3.96|1.48|60.65|0.10|
|100|0.86|WELL_1024_A|18.17|4.73|3.28|1.44|43.11|0.11|
|100|1.91|JDK|19.4|9.85|2.21|4.45|4.19|2.35|
|100|0.54|XOR_SHIFT_1024_S|20.35|2.65|2.56|1.04|73.28|0.04|
|100|0.45|MWC_256|235.07|0.19|1.18|0.16|4.49|0.04|
|100|1.06|WELL_19937_C|248.29|0.43|1.2|0.36|4.57|0.09|
|100|0.98|WELL_19937_A|268.3|0.37|1.11|0.33|4.87|0.08|
|100|1.09|WELL_44497_B|551.81|0.20|0.97|0.20|2.93|0.07|
|100|1.05|WELL_44497_A|595.55|0.18|1.06|0.17|3.06|0.06|
|100|0.62|ISAAC|1231.13|0.05|0.86|0.06|1.44|0.03|
|100|0.67|MT_64|3569.05|0.02|1|0.02|0.79|0.02|
|100|0.66|MT|5501.98|0.01|1.06|0.01|0.8|0.01|
|100|0.54|TWO_CMRES|66869.03|0.00|1.12|0.00|1.1|0.00|
|10000|42.74|SPLIT_MIX_64|47.29|90.38|1.73|52.24|2.2|41.08|
|10000|64.41|KISS|75.07|85.80|1.44|59.58|10.43|8.23|
|10000|74.64|WELL_512_A|87.32|85.48|1.46|58.55|10.2|8.38|
|10000|76.65|WELL_1024_A|93.96|81.58|1.44|56.65|9.14|8.93|
|10000|225.95|JDK|243.44|92.82|1.1|84.38|1.25|74.25|
|10000|55.29|XOR_SHIFT_1024_S|75.1|73.62|1.42|51.85|20.59|3.58|
|10000|43.09|MWC_256|277.71|15.52|1.16|13.38|3.96|3.92|
|10000|105.91|WELL_19937_C|353.14|29.99|1.14|26.31|3.51|8.54|
|10000|100.37|WELL_19937_A|367.69|27.30|1.08|25.28|3.83|7.13|
|10000|111.56|WELL_44497_B|662.28|16.84|0.97|17.37|2.6|6.48|
|10000|104.18|WELL_44497_A|698.68|14.91|1.05|14.20|2.76|5.40|
|10000|61.19|ISAAC|1291.7|4.74|0.86|5.51|1.42|3.34|
|10000|66.4|MT_64|3634.78|1.83|1|1.83|0.79|2.31|
|10000|64.42|MT|5565.74|1.16|1.06|1.09|0.8|1.45|
|10000|49.9|TWO_CMRES|66918.39|0.07|1.12|0.07|1.1|0.07|
|1000000|3984.56|SPLIT_MIX_64|3989.11|99.89|1.01|98.90|1.01|98.90|
|1000000|6032.31|KISS|6042.97|99.82|1.01|98.84|1.12|89.13|
|1000000|7356.91|WELL_512_A|7369.59|99.83|1.01|98.84|1.11|89.94|
|1000000|7444.91|WELL_1024_A|7462.22|99.77|1.01|98.78|1.1|90.70|
|1000000|19155.09|JDK|19172.58|99.91|1|99.91|1|99.91|
|1000000|4889.82|XOR_SHIFT_1024_S|4909.63|99.60|1.01|98.61|1.3|76.61|
|1000000|4133.11|MWC_256|4367.73|94.63|1.01|93.69|1.19|79.52|
|1000000|10572|WELL_19937_C|10819.23|97.71|1|97.71|1.08|90.48|
|1000000|10005.49|WELL_19937_A|10272.81|97.40|1|97.40|1.1|88.54|
|1000000|12121.27|WELL_44497_B|12671.99|95.65|1|95.65|1.08|88.57|
|1000000|11065.87|WELL_44497_A|11660.37|94.90|1|94.90|1.11|85.50|
|1000000|5976.62|ISAAC|7207.13|82.93|0.98|84.62|1.08|76.78|
|1000000|6190.41|MT_64|9758.79|63.43|1|63.43|0.92|68.95|
|1000000|6635.87|MT|12137.19|54.67|1.03|53.08|0.91|60.08|
|1000000|4580.57|TWO_CMRES|71449.06|6.41|1.11|5.78|1.1|5.83|

 

Creating 0, 1, or 100 values.

The construction time is the dominant component of the total time. In this use 
case it will be valid to optimise construction speed.

The construction without a seed highlights the slow speed of the seed 
generation by the factory method. To generate 100 values using a SplitMix64 
without a seed takes over 99% of the time constructing the generator.

 

Creating 10000 values.

For the fast to construct generators the generation time is the major component 
of the run-time. All of these generators have either a primitive seed or a seed 
array with a size less than 32. The difference of using the 
{{RandomSource.create}} is noticeable and optimisation of this will be 
beneficial.

Compared to the above generating 10000 values using a SplitMix64 without a seed 
takes 59% of the time constructing the generator, with a seed takes 48% of the 
time but a call to *new* would take 10% of the time constructing the generator. 
Using reflection to create the object is slow.

The slow to construct generators (MT, MT_64, TWO_CMRES) have internal seeding 
routines that dominate total time.

 

Creating 1000000 values.

Only the slow to construct generators (MT, MT_64, TWO_CMRES) with internal 
seeding routines part of the algorithm are still dominated by construction cost.

 

Improvements ideas:
* Generators with an array seed construct a new array (zero filled) and call 
the method to fill the array using the input seed. In the case where the input 
seed is the correct size then seed filling can be skipped and the seed copied 
using a call to {{Arrays.copyOf}} which should be faster as it is JVM intrinsic 
and avoids zero fill.
* The MWC_256 is interesting as it has a seed size of 257. However the 
construction creates a temp array which is filled and then later copied to the 
state which has a size of 256. If the input seed is the correct size then this 
allocation for seed filling is redundant. This provides an optimisation.



> Create a RandomSource.create benchmark
> --------------------------------------
>
>                 Key: RNG-72
>                 URL: https://issues.apache.org/jira/browse/RNG-72
>             Project: Commons RNG
>          Issue Type: New Feature
>          Components: simple
>    Affects Versions: 1.3
>            Reporter: Alex D Herbert
>            Assignee: Alex D Herbert
>            Priority: Minor
>              Labels: performance-benchmark
>             Fix For: 1.3
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The recommended method to construct a {{UniformRandomProvider}} is to use, 
> e.g.:
> {code:java}
> import org.apache.commons.rng.UniformRandomProvider;
> import org.apache.commons.rng.simple.RandomSource;
> UniformRandomProvider rng = RandomSource.create(RandomSource.MWC_256);
> {code}
> The factory method knows the type of seed required for the constructor and 
> generates one as appropriate.
> This factory method could be made more efficient, in particular:
>  * Reducing synchronisation around the single source of random seed data
>  * Adding knowledge of the required seed size for arrays
>  * Changing internal data structures, e.g. {{Map<Class<?>, 
> SeedConverter<?,?>>}} can be changed to {{Map<SeedType, SeedConverter<?,?>>}} 
> using an {{EnumMap}} if a new enum {{SeedType}} was created for all the 
> supported seeds (currently 4 types).
>  * Add a new interface to replace {{SeedConverter<?,?>.convert()}} with a 
> {{.convert(int outputArraySize)}} method to allow conversions to generate 
> appropriately sized arrays. The parameter can be ignored for non-array 
> conversions but could optimise array conversions.
> This ticket is to add a JMH benchmark to compare the speed of construction of 
> all the providers using:
>  * Their native constructor
>  * {{RandomSource}} using the native seed of the correct size (calls a 
> constructor using reflection)
>  * {{RandomSource}} using a non native seed (requires seed conversion)
>  * {{RandomSource}} using no seed (requires seed generation)
> The report will be posted here. It could be added to the user guide for 
> reference.
> This work is motivated by the new {{XorShiRo}} generators in version 1.3 that 
> have a native array seed size of 2, 4, or 8. The current {{RandomSource}} 
> create method will generate a fixed seed of length 128 for seeding.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to