Alex Herbert created RNG-175:
--------------------------------
Summary: RandomSource.MSWS createByteArraySeed will infinite loop
with faulty UniformRandomProvider
Key: RNG-175
URL: https://issues.apache.org/jira/browse/RNG-175
Project: Commons RNG
Issue Type: Bug
Components: simple
Affects Versions: 1.4
Reporter: Alex Herbert
The seeding routine for the Middle Square Weyl Sequence (MSWS) generator is a
custom routine that generates a hex permutation for part of the seed. This
requires a source of randomness.
The following public API method provides an input source of randomness to
generate a byte[] seed:
{code:java}
public byte[] createSeed(UniformRandomProvider rng);
{code}
The method used by the MSWS routine will directly use the input
UniformRandomProvider. If this is non functional then the routine will infinite
loop. This test fails:
{code:java}
@Test
void testMSWSCreateSeed() {
final UniformRandomProvider broken = new LongProvider() {
@Override
public long next() {
return 0;
}
};
Assertions.assertTimeoutPreemptively(Duration.ofMillis(100), () -> {
RandomSource.MSWS.createSeed(broken);
});
}
{code}
This occurs as the seed generation uses an unbiased random permutation that
uses the nextInt() method from the UniformRandomProvider to implement a custom
shuffle routine. If the values from nextInt() generate rejections then another
value is obtained. This will infinite loop if the values are always rejected.
h2. Note
This issue was discovered when implementing RNG-174 which improves support for
non-zero seeds. All RandomSource entries should allow construction of seeds
with a source RNG that will may generate an all zero seed. To force generation
of an all zero seed for testing is most easily done using a broken input RNG as
above to the seeding routine.
h2. Possible solutions
# Fix the routine to be robust to a broken input RNG
# Mark as won't fix. The use of a broken input RNG is an edge case. Any broken
input RNG is incorrect usage of the API. If you do not provide a good source of
randomness to create a seed then the seed routine will fail to create a
suitable seed.
# Fix the infinite loop by raising an exception inside the seed generation
routine after a set number of int values have been used.
I prefer option 1 as it satisfies the functionality to generate a seed, no
matter how bad the source of randomness. Option 2 has the potential to infinite
loop which should be avoided. Option3 will tell a user they have a bad RNG, but
it will raise errors which may be unexpected.
The bug can be fixed by switching the source of randomness when the input
source has proven to be invalid.
A small test provides a count of the mean, min, and max number of calls to
nextInt(). This is 10 repeats of 2^20 calls to generate the hex permutation.
{noformat}
4.217280387878418 4 6
4.217123985290527 4 6
4.216763496398926 4 6
4.217494010925293 4 6
4.217161178588867 4 6
4.21730899810791 4 6
4.217828750610352 4 6
4.216683387756348 4 6
4.21721076965332 4 6
4.216983795166016 4 6
{noformat}
A suitable generator should have at least 32*4 bits of state and ideally be
equidistributed over 6 ints. There are suitable generators in the library to
provide this functionality. However in the correct usage the input generator
will not be broken and ideally should be used directly. A solution would be to
wrap the input generator and switch to a default implementation when the
provided generator proves to be faulty:
{code:java}
@Override
protected byte[] createByteArraySeed(UniformRandomProvider source) {
// The seed requires approximately 4-6 calls to nextInt().
// Wrap the input and switch to a default if the input is faulty.
final UniformRandomProvider wrapped = new IntProvider() {
private int calls;
private UniformRandomProvider backup;
@Override
public int next() {
if (calls++ < 100) {
return source.nextInt();
}
// The input source is broken.
if (backup == null) {
backup = new SplitMix64(source.nextLong());
}
return backup.nextInt();
}
@Override
public long nextLong() {
// No specific requirements so always use the source
return source.nextLong();
}
};
return NativeSeedType.convertSeedToBytes(createMswsSeed(wrapped));
}
{code}
This solution should provide backward compatibility if the input RNG is
functional. The routine will create the same seed bytes as previous versions
of the library with a good RNG. Exceeding 100 calls to nextInt(int) is
extremely unlikely if the output is a good random source. Switching to a
default generator seeded from the input ensures the method will produce the
same seed with repeat calls of the same generator (initialised at the same
state).
Here a SplitMix64 is used. This is robust to any input seed. It is only 2
equidistributed for int output. Use of a 4-equidistributed generator (in long
output) such as XoShiRo256PlusPlus is possible (this is 8-equidistributed for
int output). However this is not robust to any seed as it cannot accept all
zeros. If the input source RNG is broken then the seeding of the back-up RNG
should be robust and the 4-equidistributed requirement has already been broken
by the faulty input RNG.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)