Alex Herbert created RNG-175:
--------------------------------

             Summary: RandomSource.MSWS createByteArraySeed will infinite loop 
with faulty UniformRandomProvider
                 Key: RNG-175
                 URL: https://issues.apache.org/jira/browse/RNG-175
             Project: Commons RNG
          Issue Type: Bug
          Components: simple
    Affects Versions: 1.4
            Reporter: Alex Herbert


The seeding routine for the Middle Square Weyl Sequence (MSWS) generator is a 
custom routine that generates a hex permutation for part of the seed. This 
requires a source of randomness.

The following public API method provides an input source of randomness to 
generate a byte[] seed:
{code:java}
public byte[] createSeed(UniformRandomProvider rng);
{code}
The method used by the MSWS routine will directly use the input 
UniformRandomProvider. If this is non functional then the routine will infinite 
loop. This test fails:
{code:java}
@Test
void testMSWSCreateSeed() {
    final UniformRandomProvider broken = new LongProvider() {
        @Override
        public long next() {
            return 0;
        }
    };
    Assertions.assertTimeoutPreemptively(Duration.ofMillis(100), () -> {
        RandomSource.MSWS.createSeed(broken);
    });
}
{code}
This occurs as the seed generation uses an unbiased random permutation that 
uses the nextInt() method from the UniformRandomProvider to implement a custom 
shuffle routine. If the values from nextInt() generate rejections then another 
value is obtained. This will infinite loop if the values are always rejected.
h2. Note

This issue was discovered when implementing RNG-174 which improves support for 
non-zero seeds. All RandomSource entries should allow construction of seeds 
with a source RNG that will may generate an all zero seed. To force generation 
of an all zero seed for testing is most easily done using a broken input RNG as 
above to the seeding routine.
h2. Possible solutions
 # Fix the routine to be robust to a broken input RNG
 # Mark as won't fix. The use of a broken input RNG is an edge case. Any broken 
input RNG is incorrect usage of the API. If you do not provide a good source of 
randomness to create a seed then the seed routine will fail to create a 
suitable seed.
 # Fix the infinite loop by raising an exception inside the seed generation 
routine after a set number of int values have been used.

I prefer option 1 as it satisfies the functionality to generate a seed, no 
matter how bad the source of randomness. Option 2 has the potential to infinite 
loop which should be avoided. Option3 will tell a user they have a bad RNG, but 
it will raise errors which may be unexpected.

The bug can be fixed by switching the source of randomness when the input 
source has proven to be invalid.

A small test provides a count of the mean, min, and max number of calls to 
nextInt(). This is 10 repeats of 2^20 calls to generate the hex permutation.
{noformat}
4.217280387878418  4  6
4.217123985290527  4  6
4.216763496398926  4  6
4.217494010925293  4  6
4.217161178588867  4  6
4.21730899810791  4  6
4.217828750610352  4  6
4.216683387756348  4  6
4.21721076965332  4  6
4.216983795166016  4  6
{noformat}
A suitable generator should have at least 32*4 bits of state and ideally be 
equidistributed over 6 ints. There are suitable generators in the library to 
provide this functionality. However in the correct usage the input generator 
will not be broken and ideally should be used directly. A solution would be to 
wrap the input generator and switch to a default implementation when the 
provided generator proves to be faulty:
{code:java}
@Override
protected byte[] createByteArraySeed(UniformRandomProvider source) {
    // The seed requires approximately 4-6 calls to nextInt().
    // Wrap the input and switch to a default if the input is faulty.
    final UniformRandomProvider wrapped = new IntProvider() {
        private int calls;
        private UniformRandomProvider backup;
        @Override
        public int next() {
            if (calls++ < 100) {
                return source.nextInt();
            }
            // The input source is broken.
            if (backup == null) {
                backup = new SplitMix64(source.nextLong());
            }
            return backup.nextInt();
        }
        @Override
        public long nextLong() {
            // No specific requirements so always use the source
            return source.nextLong();
        }
    };
    return NativeSeedType.convertSeedToBytes(createMswsSeed(wrapped));
}
{code}
This solution should provide backward compatibility if the input RNG is 
functional.  The routine will create the same seed bytes as previous versions 
of the library with a good RNG. Exceeding 100 calls to nextInt(int) is 
extremely unlikely if the output is a good random source. Switching to a 
default generator seeded from the input ensures the method will produce the 
same seed with repeat calls of the same generator (initialised at the same 
state).

Here a SplitMix64 is used. This is robust to any input seed. It is only 2 
equidistributed for int output. Use of a 4-equidistributed generator (in long 
output) such as XoShiRo256PlusPlus is possible (this is 8-equidistributed for 
int output). However this is not robust to any seed as it cannot accept all 
zeros. If the input source RNG is broken then the seeding of the back-up RNG 
should be robust and the 4-equidistributed requirement has already been broken 
by the faulty input RNG.

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to