[
https://issues.apache.org/jira/browse/RNG-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691772#comment-16691772
]
Alex D Herbert commented on RNG-57:
-----------------------------------
I have the results of DieHarder and BigCrush for the {{LongProvider}} RNGs.
Parsing DieHarder is easy. Just count FAILED except for the diehard_sums test.
Parsing BigCrush I just count the number of failed tests in the summary at the
end.
I have found discrepancies in the [user
guide|https://commons.apache.org/proper/commons-rng/userguide/rng.html] table
for the current results. Here is what I find:
||RNG identifier||Dieharder||TestU01 (BigCrush)||
|JDK|14 16 15|78 75 76|
|MT|0 0 0|2 2 2|
|WELL_512_A|0 0 0|6 6 6|
|WELL_1024_A|0 0 0|4 4 5|
|WELL_19937_A|0 0 0|2 3 4|
|WELL_19937_C|0 0 0|2 2 2|
|WELL_44497_A|0 0 0|3 3 2|
|WELL_44497_B|0 0 0|2 2 3|
|ISAAC|0 0 0|1 0 0|
|MT_64|0 0 0|2 2 3|
|SPLIT_MIX_64|0 0 0|0 0 0|
|XOR_SHIFT_1024_S|0 0 0|0 0 0|
|TWO_CMRES|0 1 0|0 0 0|
|MWC_256|0 0 0|0 0 0|
|KISS|0 0 0|0 1 0|
* WELL_1024_A is wrong for BigCrush (the user guide states 4,6,5)
* WELL_1024_A and WELL_44497B are wrong for Dieharder as the diehard_sums test
failure is not ignored
Here are the results from the modified {{LongProvider}} RNGs:
||RNG identifier||Dieharder||new||TestU01 (BigCrush)||new||
|MT_64|0 0 0|0 0 0|2 2 3|3 4 3|
|SPLIT_MIX_64|0 0 0|0 0 0|0 0 0|0 1 0|
|XOR_SHIFT_1024_S|0 0 0|0 0 0|0 0 0|0 0 0|
|TWO_CMRES|0 0 0|1 1 1|0 0 0|0 0 1|
So the change has made the RNGs slightly worse with the exception of
XOR_SHIFT_1024_S, which still has no failures.
Here's a breakdown of the failed tests:
||RNG identifier||Test||Old||New||
|MT_64|BigCrush LinearComp, r = 0|3|3|
|MT_64|BigCrush LinearComp, r = 29|3|3|
|MT_64|BigCrush HammingIndep, L=300, r=0|1|0|
|MT_64|BigCrush CollisionOver, t = 14|0|1|
|MT_64|BigCrush SampleProd, t = 16|0|1|
|MT_64|BigCrush RandomWalk1 H (L=10000, r=15)|0|1|
|MT_64|BigCrush RandomWalk1 J (L=1000, r=0)|0|1|
|SPLIT_MIX_64|BigCrush ClosePairs mNP2, t = 9|0|1|
|TWO_CMRES|diehard_dna|0|3|
|TWO_CMRES|BigCrush RandomWalk1 C (L=50, r=25)|0|1|
* MT_64 is still bad at the LinearComp test. This tests complexity of each
successive bit from the bit sequence. It also fails the CollisionOver test
which tests the uniformity of overlapping tuples from the sequence. So the
MT_64 sequence is not random when compared to parts of itself. It also fails
the SampleProd test that uses the entire bit sequence. Perhaps this RNG should
be avoided. Note however that the new failures are just outside the allowed
p-values. The LinearComp test fails with p < 1e-15 compared to the allowed
0.001.
* SPLIT_MIX_64 fails a test using the entire bit sequence. The failure is
p=0.9996 which is just outside p=0.9990. This is marginal fail.
* TWO_CMRES is now bad at the diehard_dna test. This is marked as suspect in
the version I installed (3.31.1) and perhaps these should be ignored.
Failures on the RandomWalk tests (MT_64, TWO_CMRES) of BigCrush are
interesting. According to the user guide (p.120)
[http://simul.iro.umontreal.ca/testu01/guideshorttestu01.pdf]
the walk of _{{L}}_ steps is performed using _{{s}}_ bits from each random
{{int}} after dropping _{{r}}_ bits. So the {{RandomWalk r=0}} test is actually
testing the new implementation of {{nextBoolean}} which obtains booleans using
all the bits from a random sample. The results show a few failures in this test
with the new implementation. The test is more rigorous than the random walk
test in the core module as it tests the number of sign changes during the walk
from zero and the time spent on each side. A failure of this test shows that
the cached {{nextBoolean}} is not robust.
Note that the failures are marginal:
{noformat}
79 RandomWalk1 H (L=10000, r=15) 6.9e-4
76 RandomWalk1 J (L=1000, r=0) 4.1e-4
75 RandomWalk1 C (L=50, r=25) 7.5e-4
{noformat}
Mean run times (in minutes):
Old:
||RNG identifier||Dieharder||TestU01 (BigCrush)||
|JDK|268.68 +/- 24.12|1918.44 +/- 10.08|
|MT|189.15 +/- 3.26|1837.33 +/- 20.28|
|WELL_512_A|195.51 +/- 6.70|1827.63 +/- 5.55|
|WELL_1024_A|101.55 +/- 4.76|1864.89 +/- 24.80|
|WELL_19937_A|108.35 +/- 6.14|1914.86 +/- 14.91|
|WELL_19937_C|122.08 +/- 7.79|1922.84 +/- 8.42|
|WELL_44497_A|111.41 +/- 3.44|1923.36 +/- 10.69|
|WELL_44497_B|112.22 +/- 2.57|1926.86 +/- 33.81|
|ISAAC|105.32 +/- 0.85|1837.63 +/- 10.47|
|MT_64|101.64 +/- 2.94|1819.06 +/- 19.88|
|SPLIT_MIX_64|96.12 +/- 4.80|480.92 +/- 19.77|
|XOR_SHIFT_1024_S|99.05 +/- 3.46|508.50 +/- 5.06|
|TWO_CMRES|105.38 +/- 4.22|502.01 +/- 9.68|
|MWC_256|105.99 +/- 13.19|490.15 +/- 4.02|
|KISS|75.80 +/- 3.61|487.72 +/- 7.65|
New:
||RNG identifier||Dieharder||TestU01 (BigCrush)||
|MT_64|102.30 +/- 2.27|1355.75 +/- 10.52|
|SPLIT_MIX_64|101.80 +/- 1.56|1336.16 +/- 12.60|
|XOR_SHIFT_1024_S|107.09 +/- 5.47|1344.89 +/- 12.96|
|TWO_CMRES|108.57 +/- 6.24|1341.70 +/- 13.50|
The BigCrush algorithm was run on an old 12 core workstation using
hyperthreading with all 12 tests at the same time. However each test actually
maxes out 2 CPUs (one process to generate random numbers and one child process
to do the stats analysis). The machine was at max capacity and are in between
the old run times for the BigCrush test. There is a jump in run times for the
original data for the BigCrush results. The OS and HOST is the same in the
results file so I am not sure what happened to speed up the test 3-fold for
some of the RNGs.
The Dieharder test was run on a 16-core hyper threading machine with 8 tests
concurrently. The times are comparable to the original data.
Summary:
* The changes to use cached values slightly increase the number of failures
* The MT_64 RNG is poor and gets worse. Perhaps avoid using this.
* Failures for a Dieharder tests marked as suspect occur.
* Examination of the failures show some occur in linear walk tests. This is
effectively testing the new cached boolean implementation.
* Reading the definitions of the failed tests indicate that some tests apply
to part of the bit sequence of the {{int}}. This effectively tests the new
cached {{nextBoolean}} function and the tests should be repeated for the
{{IntProviders}}
I will post again when the {{IntProviders}} are done. I'll run them all on the
machine that takes about 1300 minutes per test. There are 3 * 11 providers with
a 12-core machine. This may take around 60 hours.
> CachedUniformRandomProvider for nextBoolean() and nextInt()
> -----------------------------------------------------------
>
> Key: RNG-57
> URL: https://issues.apache.org/jira/browse/RNG-57
> Project: Commons RNG
> Issue Type: Improvement
> Components: core
> Reporter: Alex D Herbert
> Priority: Minor
> Labels: performance
> Fix For: 1.2
>
>
> Implement a wrapper around a {{UniformRandomProvider}} that can cache the
> underlying source of random bytes for use in the methods {{nextBoolean()}}
> and {{nextInt()}} (in the case of {{LongProvider}}). E.g.
> {code:java}
> LongProvider provider = RandomSource.create(RandomSource.SPLIT_MIX_64);
> CachedLongProvider rng = new CachedLongProvider(provider);
> // Uses cached nextLong() 64 times
> rng.nextBoolean();
> // Uses cached nextLong() twice
> rng.nextInt();
> IntProvider provider = RandomSource.create(RandomSource.KISS);
> CachedIntProvider rng2 = new CachedIntProvider(provider);
> // Uses cached nextInt() 32 times
> rng2.nextBoolean();
> // This could be wrapped by a factory method:
> UniformRandomProvider rng = CachedUniformRandomProviderFactory.wrap(
> // Any supported source: IntProvider or LongProvider
> RandomSource.create(RandomSource...));
> {code}
> The implementation should be speed tested to determine the benefit for
> {{nextBoolean()}} and if {{nextInt()}} can be improved for {{LongProviders}}.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)