On 18/03/2019 14:12, Gilles Sadowski wrote:
Hi.

[...]

One actual issue is that we are testing long providers using the long to create 
2 int values. Should we test using a series of the upper 32 bits and then a 
series of the lower 32 bits?
Is that useful since the test now sees the integers as they are produced (i.e. 2
values per long)?
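
For illustration, a minimal sketch of the split being discussed, i.e. one
long viewed as an upper and a lower 32-bit int. The class and method names
are hypothetical and not part of Commons RNG; the order in which a provider
actually emits the two halves is not assumed here.

```java
/** Hypothetical helper for illustration only; not Commons RNG code. */
final class LongSplitSketch {
    /** Upper 32 bits of the long (unsigned shift so no sign extension). */
    static int upper32(long v) {
        return (int) (v >>> 32);
    }

    /** Lower 32 bits of the long (the narrowing cast drops the upper half). */
    static int lower32(long v) {
        return (int) v;
    }

    public static void main(String[] args) {
        long v = 0x0123456789abcdefL;
        // Print both halves as 8 hex digits each.
        System.out.printf("%08x %08x%n", upper32(v), lower32(v));
    }
}
```

A test of "upper bits only" would feed a stream of upper32() values to the
test suite, and a test of "lower bits only" a stream of lower32() values.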

It is not relevant if you are concerned about int quality. But if you are 
concerned about long quality then it is relevant. The long quality is important 
for the quality of nextDouble(), although in that case only the upper 53 bits 
of the long matter. Likewise, the quality of a long built from an int provider 
is not covered by the benchmark, as that would require testing alternating ints 
twice using the series: 1, 3, 5, …, 2n+1 and 2, 4, 6, …, 2n.
I don't follow: I'd think that if the full sequence passes the test,
then "decimated"
sequences will too.
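
For reference on the nextDouble() point, a sketch of the usual way a double
in [0, 1) is built from the upper 53 bits of a long. This is the common
conversion and is assumed here; it is not copied from the Commons RNG source,
and the class name is hypothetical.

```java
/** Hypothetical sketch; assumes the common 53-bit conversion. */
final class UpperBitsDoubleSketch {
    /**
     * Discards the lower 11 bits and scales the remaining 53-bit
     * integer by 2^-53, giving a value in [0, 1).
     */
    static double toDouble(long v) {
        return (v >>> 11) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        long v = 0xfedcba9876543210L;
        // Changing only the lower 11 bits does not change the double.
        System.out.println(toDouble(v) == toDouble(v ^ 0x7ffL));
    }
}
```

Under that assumption the lower 11 bits of each long never reach the double,
so only the upper bits affect its quality.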

My position was that even if a series of int values is random, a subset of those values is not necessarily random, because the way the subset is sampled could introduce bias.

However I acknowledge that:

- the test suites may have this covered already

- if it really is random then any subset will also be random, even if it is a systematic subset such as alternating values
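
To make "alternating values" concrete, a small sketch that splits a sequence
into its two decimated subsequences (every second value, starting at index 0
or 1). The class and method names are hypothetical; this is not how
RandomStressTester is written.

```java
import java.util.Arrays;

/** Hypothetical sketch of a systematic (alternating) subset. */
final class DecimationSketch {
    /**
     * Returns the values at indices offset, offset + 2, offset + 4, ...
     * The offset is expected to be 0 or 1.
     */
    static int[] everySecond(int[] values, int offset) {
        int n = (values.length - offset + 1) / 2;
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = values[offset + 2 * i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {10, 11, 12, 13, 14};
        System.out.println(Arrays.toString(everySecond(values, 0)));
        System.out.println(Arrays.toString(everySecond(values, 1)));
    }
}
```

Testing int-pair quality for longs would mean running the suite once on each
of the two subsequences rather than on the interleaved stream.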

Given that half of the int values were previously discarded from the BigCrush 
analysis, the current results on the user guide page actually represent 
BigCrush running on the upper 32-bits of the long, byte reversed due to the 
big/little endian interpretation of the bytes in Java and linux.
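
A sketch of the byte reversal mentioned above, assuming the Java side writes
each int in big-endian order (the ByteBuffer and DataOutputStream default)
and the native reader interprets the same 4 bytes as little-endian. The class
name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch of a big-endian write read back as little-endian. */
final class ByteOrderSketch {
    /** Writes the int big-endian, then reads the same bytes little-endian. */
    static int readLittleEndian(int writtenBigEndian) {
        ByteBuffer buf = ByteBuffer.allocate(4); // big-endian by default
        buf.putInt(writtenBigEndian);
        buf.flip();
        return buf.order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        int v = 0x01020304;
        // Both lines print the same byte-reversed value.
        System.out.printf("%08x%n", readLittleEndian(v));
        System.out.printf("%08x%n", Integer.reverseBytes(v));
    }
}
```

Under that assumption the value seen by the test suite is
Integer.reverseBytes() of the value produced in Java.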

So maybe an update to the RandomStressTester to support analysis of int or 
long quality is needed.
I'm not convinced.

I'm not totally convinced either. It is a lot more work to test upper and lower bits separately.

It may be that a producer of long values has better randomness in the upper bits. Or, put another way, it has fewer than 64 bits of randomness.

The question is whether running the test suite on all the bits (as we currently do) or targeting just the upper or lower 32-bits is useful. E.g. would a RNG that fails a few tests using all the bits pass with just the upper 32-bits and fail more with just the lower 32-bits, or would the failures be the same?

Note: The current results for long providers do not test the lower 32-bits at all, and currently test alternating values from any int providers. So they will have to be rerun anyway.

Previously I looked at systematic failures in the test suite (where the same test always fails). IIRC the MersenneTwister has some systematic failures. Since we are not doing systematic failure analysis for the user guide, and we are not developing the algorithms, then I agree that a more detailed analysis of the failures and their origins is beyond the scope of the quality section.

So leave the testing to just ints and document in the user guide that this is what we are testing.

For now the quality section on the website should just state that the quality 
is for the ‘nextInt()’ method of the RNG.

I have the results of BigCrush using the new bridge C program:

XorShiftSerialComposite : 40, 39, 39 : 608.2 +/- 3.9
Makes sense now. :-}

So it fails.

The XorShiftXorComposite crashed after 2 hours with about 1/4 of the results 
file complete. I am running again so I can monitor it for memory usage. 
Something in the BigCrush suite just cannot handle this generator output.
Strange...

Yep. I restarted it and it crashed after 3 hours again! Monitoring every minute found no obvious memory issues. The BigCrush process never exceeded 2.7% of memory and Java never exceeded 0.1%.

The footer is written by the Java program, so this indicates that the TestU01 bridge is stopping, after which the Java process writes the footer, wraps everything up and stops.

Weirdly my process to follow the output also stopped which is unexpected. I am investigating if my system has some strange walltime limits I do not know about. Since the other composite generators work using the same code I am thinking it may be a bug in TestU01 when the generator is bad (which DieHarder thinks it definitely is). But I am prepared to be wrong on that and also to never find out.

I've changed RandomStressTester to redirect the stderr to stdout (in case that contains any info) and added a line to get the exit code from the Java Process that is running BigCrush. Maybe that will be non-zero. So I re-run and wait 3+ hours again...
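
For anyone following along, a minimal sketch of that kind of change using
the standard ProcessBuilder API: stderr is merged into stdout and the exit
code is read once the child finishes. This is not the actual
RandomStressTester code; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; not the RandomStressTester implementation. */
final class ExitCodeSketch {
    /**
     * Runs the command with stderr merged into stdout, appends all
     * output lines to the given builder, and returns the exit code.
     */
    static int run(StringBuilder output, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // stderr goes to the stdout stream
        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        return p.waitFor(); // blocks until the child process exits
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: ExitCodeSketch <command> [args...]");
            return;
        }
        StringBuilder out = new StringBuilder();
        int exitCode = run(out, args);
        System.out.print(out);
        System.out.println("Exit code: " + exitCode);
    }
}
```

A non-zero exit code, or any text that only appeared on stderr, would then
show up in the results file next to the footer.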

Alex


