[jira] [Commented] (MATH-1314) RNG: Warn users about "seeding"

Gilles (JIRA) Sun, 10 Jan 2016 15:47:09 -0800

    [ 
https://issues.apache.org/jira/browse/MATH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091292#comment-15091292
 ]


Gilles commented on MATH-1314:
------------------------------

bq. \[...\] we cannot do anything more than trusting these experts

Quoting ISAAC's designer (from the web page in the Javadoc):
"I provided no official seeding routine because I didn't feel competent to give 
one."

By this issue, I meant to indicate clearly that it may not be sufficient to 
call the default constructor just because the seed is supposedly random 
(current time and memory location); it could be that, out of bad luck, the 
default seed is not good.

In fact, it seems that there are various strategies for preparing an initial 
state that is more likely to be "good".  Examples are indeed provided in 
"AbstractWell" and "ISAACRandom".
We could factor those out of the generators and include them in a 
"SeedingUtils" class.
From the C implementations I've browsed through, it seems that the bit 
generation is not related to the seed generation: it's up to the user to 
decide what to put in the data that make up the "state" of the RNG instance.  
In particular, I have the impression that the standard "setSeed(long)" is only 
so because of the simplicity of the state of the LCGs.
It would be more correct to consider that setting the seed/state is 
algorithm-dependent.  The obvious example being the *non-Java-standard* 
"setSeed(int[])" introduced for the sake of taking advantage of the full state 
of the WELL generators.
Some authors/experts suggest initializing the state of one RNG with the output 
of another RNG. And this leads again to my proposal to overhaul the 
"RandomGenerator" interface and remove the "setSeed" methods from it.  For 
legacy users, we could try and figure out a compatibility layer.
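
As a rough sketch of the "seed one RNG from the output of another" idea, using 
only the JDK: the class name "SeedingSketch" and the method "fillState" are 
hypothetical (nothing like them exists in CM), and "java.util.SplittableRandom" 
is only a convenient stand-in for the source generator, not something the 
authors mentioned above recommend.

```java
import java.util.SplittableRandom;

/** Hypothetical helper, for illustration only; not part of Commons Math. */
final class SeedingSketch {
    private SeedingSketch() {}

    /**
     * Expands a single long seed into an int[] of the requested size by
     * drawing successive outputs from another generator.
     * Assumption: SplittableRandom is used here only because it ships with
     * the JDK; any well-tested generator could play the same role.
     */
    static int[] fillState(long seed, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // Deterministic for a given seed, so results stay reproducible.
        return new SplittableRandom(seed).ints(size).toArray();
    }
}
```

The resulting array could then be handed to a generator that accepts a full 
state, for instance through the "setSeed(int[])" method mentioned above.  
Whether the states produced this way are actually "good" would still have to be 
checked experimentally.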

bq. I don't think we can provide any tools.

By setting fixed seeds for the unit tests, we effectively use a primitive tool: 
manual random choice until the test passes.  So statistical tests used in 
{{RandomGeneratorAbstractTest}} are indeed able to detect some bad seeds.
Anyway, I think that we could mention the "TestU01" software, if only to point 
out that some good seeds are determined experimentally.
The comparative testing presented on [that page|http://xorshift.di.unimi.it/] 
shows that some RNGs fail more often than others.
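
To make the idea of a seed-checking tool a bit more concrete, here is a 
minimal sketch of a chi-square uniformity check on the output of a seeded 
generator.  It is an assumption-laden illustration: the names "SeedCheckSketch" 
and "chiSquare" are hypothetical, "java.util.Random" stands in for the CM 
generators, and a single test like this is far weaker than TestU01 or the 
checks in {{RandomGeneratorAbstractTest}}.

```java
import java.util.Random;

/** Hypothetical seed check, for illustration only; not part of Commons Math. */
final class SeedCheckSketch {
    private SeedCheckSketch() {}

    /**
     * Draws 'samples' values in [0, bins) from the generator and returns the
     * chi-square statistic of the observed bin counts against a uniform
     * distribution (bins - 1 degrees of freedom).
     */
    static double chiSquare(Random rng, int bins, int samples) {
        long[] counts = new long[bins];
        for (int i = 0; i < samples; i++) {
            counts[rng.nextInt(bins)]++;
        }
        double expected = (double) samples / bins;
        double chi2 = 0;
        for (long c : counts) {
            double d = c - expected;
            chi2 += d * d / expected;
        }
        return chi2;
    }
}
```

A call such as {{SeedCheckSketch.chiSquare(new Random(seed), 10, 100000)}} 
returns a statistic that can be compared with a chi-square critical value for 
9 degrees of freedom (about 21.67 at the 1% level); a seed that repeatedly 
exceeds it would be suspicious, but passing proves very little.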

bq. Concerning the performances, \[...\]

Platform:
Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
openjdk version "1.8.0_72-internal"
OpenJDK Runtime Environment (build 1.8.0_72-internal-b05)
OpenJDK 64-Bit Server VM (build 25.72-b05, mixed mode)

Here is the benchmark output:
{noformat}
nextInt() (calls per timed block: 2000000, timed blocks: 100, time unit: ms)
              name      time/call      std error total time      ratio      difference
JDKRandomGenerator 1.34103050e-05 2.45357474e-06 2.6821e+03 1.0000e+00  0.00000000e+00
   MersenneTwister 1.14748419e-05 9.90008146e-07 2.2950e+03 8.5567e-01 -3.87092621e+02
          Well512a 1.34941515e-05 1.69129854e-06 2.6988e+03 1.0063e+00  1.67693020e+01
         Well1024a 1.50723034e-05 8.03930622e-07 3.0145e+03 1.1239e+00  3.32399667e+02
        Well19937a 1.38263157e-05 5.69889705e-06 2.7653e+03 1.0310e+00  8.32021280e+01
        Well19937c 1.59393823e-05 2.35562440e-06 3.1879e+03 1.1886e+00  5.05815457e+02
        Well44497a 1.93589129e-05 2.64356495e-06 3.8718e+03 1.4436e+00  1.18972157e+03
        Well44497b 2.03862615e-05 2.14390340e-06 4.0773e+03 1.5202e+00  1.39519129e+03
             ISAAC 1.32823857e-05 2.02074578e-06 2.6565e+03 9.9046e-01 -2.55838710e+01

nextDouble() (calls per timed block: 2000000, timed blocks: 100, time unit: ms)
              name      time/call      std error total time      ratio      difference
JDKRandomGenerator 2.32352997e-05 1.76107593e-06 4.6471e+03 1.0000e+00  0.00000000e+00
   MersenneTwister 1.95680107e-05 5.15377258e-07 3.9136e+03 8.4217e-01 -7.33457809e+02
          Well512a 2.40500380e-05 2.50013851e-06 4.8100e+03 1.0351e+00  1.62947648e+02
         Well1024a 2.48173367e-05 2.93429135e-06 4.9635e+03 1.0681e+00  3.16407385e+02
        Well19937a 2.75334969e-05 2.45923272e-06 5.5067e+03 1.1850e+00  8.59639442e+02
        Well19937c 2.84004673e-05 1.95285979e-06 5.6801e+03 1.2223e+00  1.03303351e+03
        Well44497a 3.55292049e-05 2.88860968e-06 7.1058e+03 1.5291e+00  2.45878103e+03
        Well44497b 3.69400758e-05 1.55845565e-06 7.3880e+03 1.5898e+00  2.74095521e+03
             ISAAC 2.21769477e-05 2.20944591e-06 4.4354e+03 9.5445e-01 -2.11670411e+02

nextLong() (calls per timed block: 2000000, timed blocks: 100, time unit: ms)
              name      time/call      std error total time      ratio      difference
JDKRandomGenerator 2.24688238e-05 1.33151962e-06 4.4938e+03 1.0000e+00  0.00000000e+00
   MersenneTwister 1.81175642e-05 1.63071216e-06 3.6235e+03 8.0634e-01 -8.70251910e+02
          Well512a 2.27232582e-05 1.06753850e-06 4.5447e+03 1.0113e+00  5.08868800e+01
         Well1024a 2.27862427e-05 1.59353328e-06 4.5572e+03 1.0141e+00  6.34837960e+01
        Well19937a 2.50605559e-05 1.00839808e-06 5.0121e+03 1.1153e+00  5.18346431e+02
        Well19937c 2.68680821e-05 1.11341253e-06 5.3736e+03 1.1958e+00  8.79851660e+02
        Well44497a 3.29918582e-05 7.18500728e-07 6.5984e+03 1.4683e+00  2.10460688e+03
        Well44497b 3.47592845e-05 7.83038103e-07 6.9519e+03 1.5470e+00  2.45809215e+03
             ISAAC 1.99627637e-05 1.40470250e-06 3.9926e+03 8.8847e-01 -5.01212018e+02
{noformat}

It would be interesting to compare various platforms (HW and JVM).
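
For readers comparing their own runs, the derived columns in the output above 
appear to follow from "time/call" alone, with the first row (the 
{{JDKRandomGenerator}}) as reference.  The sketch below only restates that 
arithmetic; the class and method names are hypothetical, and the formulas are 
inferred from the printed numbers, not taken from the {{PerfTestUtils}} source.

```java
/** Hypothetical restatement of the benchmark columns; not PerfTestUtils code. */
final class BenchColumnsSketch {
    private BenchColumnsSketch() {}

    /** Total time (ms) = time per call (ms) * calls per block * number of blocks. */
    static double totalTime(double timePerCall, long callsPerBlock, int blocks) {
        return timePerCall * callsPerBlock * blocks;
    }

    /** Ratio of a generator's time per call to the reference generator's. */
    static double ratio(double timePerCall, double refTimePerCall) {
        return timePerCall / refTimePerCall;
    }

    /** Difference (ms) between a generator's total time and the reference total. */
    static double difference(double timePerCall, double refTimePerCall,
                             long callsPerBlock, int blocks) {
        return totalTime(timePerCall, callsPerBlock, blocks)
             - totalTime(refTimePerCall, callsPerBlock, blocks);
    }
}
```

With the {{MersenneTwister}} and {{JDKRandomGenerator}} values from the 
"nextInt()" block, these formulas give a ratio close to 0.8557 and a difference 
close to -387.09 ms, in line with the printed row.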

> RNG: Warn users about "seeding"
> -------------------------------
>
>                 Key: MATH-1314
>                 URL: https://issues.apache.org/jira/browse/MATH-1314
>             Project: Commons Math
>          Issue Type: Wish
>            Reporter: Gilles
>              Labels: doc
>             Fix For: 4.0
>
>
> The "package-info.java" file of {{o.a.c.m.random}} does not mention the 
> problem of seeding.
> Many users of CM may not be aware that it is not sufficient to "randomly" 
> choose a seed in order to ensure a random sequence.
> I think that this is what is illustrated by random failures of some unit 
> tests (when the seed is "randomly" selected).
> Do the intricate initialization procedures provided in some implementations 
> (WELL family and ISAAC) ensure that all seeds are good enough?
> Should we provide some tool to test a seed?
> By the way, the WELL performances listed on [this 
> table|http://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/random/package-summary.html]
>  do not correspond to the results obtained on my machine with our 
> {{PerfTestUtils}} benchmark: the {{MersenneTwister}} is invariably faster 
> than all WELL implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
