Hello Luc.
On Tue, 26 Jan 2016 09:47:20 +0100, Luc Maisonobe wrote:
Le 26/01/2016 01:13, Gilles a écrit :
Hi.
Hi Gilles,
In CM, a base class "BitsStreamGenerator" contains methods for
generating various primitive types from a bit source (to be
implemented by subclasses by overriding the "next(int bits)"
method).
For example, to get a random "long" value, the following code
is called:
---CUT---
public long nextLong() {
final long high = ((long) next(32)) << 32;
final long low = (next(32)) & 0xffffffffL;
return high | low;
}
---CUT---
If the bit source were provided by the JDK's "Random" class,
is it assumed that the "long" value generated by the above
code will always be equal to the "long" value generated by
the call to JDK's "Random" class?
No. There is nothing said anywhere in the javadoc that would imply
one implementation of nextLong is the same as another implementation.
The only thing that user can exapect is that when they choose one
implementation, it is consistent, i.e. it is "random enough" and
"not biased". Two different implementations can fulfill these
requirements without producing the same sequence.
I note that your statement contradicts Phil's[1]:
"[..] you [...] changed the internal implementation of all
of the Bitstream generators."
"[must] verify that the same sequences are generated by the
3_X code and the refactored code [...]"
Minor point: I think that there is a possible ambiguity with
"JDKRandomGenerator" when the doc indicates
"[an] adapter that delegates the random number generation to
the standard [Random] class".
In fact, there are two possible cases:
(a) delegating all methods to "Random", or
(b) only use the source of randomness ("next(int bits)") from
"Random" and create the "long" with the algo provided in CM.
CM implicitly assumes case (a).
A user could infer it from the HTML doc of "JDKRandomGenerator",
as it does not inherit from "BitsStreamGenerator".
But in "RandomGeneratorFactory", it's more confusing.
[By the way, those two are duplicate implementations of the same
functionality.]
In the new "rng" package, I'll actually provide case (b) which
is more consistent with how the other "sources of randomness"
are used by "BitsStreamGenerator" to provide all the "nextXxx"
methods.
In other words, how "standard" are the default implementations
of the methods defined in "BitsStreamGenerator"?
I don't think there are any standard to abide by.
OK, but how do we detect a suboptimal/buggy algorithm?
And what are the compatibility requirements with respect to
reproducibility of the sequences (between releases of the
library)?
It's a grey zone, IMHO. It is nice to preserve reproducibility
as changing it changes the reference test cases with fixed seeds.
AFAIK, having one seed pass the tests does not imply that the RNG
is good.
On the other hand, too much compatibility really is a pain, we
feel it constantly, so I would say that as soon as we need to
change the sequence, we can just do it, and check all the tests
we know of (i.e. those of our own library) still pass and we
go forward.
Are those tests with sample size of 1000 strong enough to detect
that an implementation does not uphold the properties of a RNG?
I mean, isn't it actually enough that
1. a CM unit test verifies that N numbers (i.e. the "int" values
returned by "next(32)") are the same as a set of "reference"
values,
2. a CM unit test verifies that the conversion from "int" to each of
the other types is correct (hence the need to have a reference
for those routines as well)
?
If tests (1) and (2) pass, then the CM the uniformity tests seem to
try and show that the reference algorithm itself is wrong. Which
it'll have a hard time achieving with a small sample. Even more so
with a single, fixed, seed.
In the absence of (2), the uniformity tests are actually testing
the conversion routines, with each of the different sources of
randomness. Testing with a single source should be enough since if
there is a bug in the conversion routines, the same tests will fail
for any source of randomness.
Best regards,
Gilles
[1] http://markmail.org/message/ewat3qekz7ec5q6y
best regards,
Luc
Regards,
Gilles
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org