[
https://issues.apache.org/jira/browse/RNG-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517402#comment-17517402
]
Alex Herbert commented on RNG-175:
----------------------------------
I think option 1 is similar to the functionality of all other providers which
use an array seed and change all zero output to have some non-zero bits. In
effect any RandomSource that has additional seeding requirements that are not
met by the input source of randomness can override the input source to ensure
the seed is good.
> RandomSource.MSWS createByteArraySeed will infinite loop with faulty
> UniformRandomProvider
> ------------------------------------------------------------------------------------------
>
> Key: RNG-175
> URL: https://issues.apache.org/jira/browse/RNG-175
> Project: Commons RNG
> Issue Type: Bug
> Components: simple
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Priority: Trivial
>
> The seeding routine for the Middle Square Weyl Sequence (MSWS) generator is a
> custom routine that generates a hex permutation for part of the seed. This
> requires a source of randomness.
> The following public API method provides an input source of randomness to
> generate a byte[] seed:
> {code:java}
> public byte[] createSeed(UniformRandomProvider rng);
> {code}
> The method used by the MSWS routine will directly use the input
> UniformRandomProvider. If this is non functional then the routine will
> infinite loop. This test fails:
> {code:java}
> @Test
> void testMSWSCreateSeed() {
> final UniformRandomProvider broken = new LongProvider() {
> @Override
> public long next() {
> return 0;
> }
> };
> Assertions.assertTimeoutPreemptively(Duration.ofMillis(100), () -> {
> RandomSource.MSWS.createSeed(broken);
> });
> }
> {code}
> This occurs as the seed generation uses an unbiased random permutation that
> uses the nextInt() method from the UniformRandomProvider to implement a
> custom shuffle routine. If the values from nextInt() generate rejections then
> another value is obtained. This will infinite loop if the values are always
> rejected.
> h2. Note
> This issue was discovered when implementing RNG-174 which improves support
> for non-zero seeds. All RandomSource entries should allow construction of
> seeds with a source RNG that will may generate an all zero seed. To force
> generation of an all zero seed for testing is most easily done using a broken
> input RNG as above to the seeding routine.
> h2. Possible solutions
> # Fix the routine to be robust to a broken input RNG
> # Mark as won't fix. The use of a broken input RNG is an edge case. Any
> broken input RNG is incorrect usage of the API. If you do not provide a good
> source of randomness to create a seed then the seed routine will fail to
> create a suitable seed.
> # Fix the infinite loop by raising an exception inside the seed generation
> routine after a set number of int values have been used.
> I prefer option 1 as it satisfies the functionality to generate a seed, no
> matter how bad the source of randomness. Option 2 has the potential to
> infinite loop which should be avoided. Option3 will tell a user they have a
> bad RNG, but it will raise errors which may be unexpected.
> The bug can be fixed by switching the source of randomness when the input
> source has proven to be invalid.
> A small test provides a count of the mean, min, and max number of calls to
> nextInt(). This is 10 repeats of 2^20 calls to generate the hex permutation.
> {noformat}
> 4.217280387878418 4 6
> 4.217123985290527 4 6
> 4.216763496398926 4 6
> 4.217494010925293 4 6
> 4.217161178588867 4 6
> 4.21730899810791 4 6
> 4.217828750610352 4 6
> 4.216683387756348 4 6
> 4.21721076965332 4 6
> 4.216983795166016 4 6
> {noformat}
> A suitable generator should have at least 32*4 bits of state and ideally be
> equidistributed over 6 ints. There are suitable generators in the library to
> provide this functionality. However in the correct usage the input generator
> will not be broken and ideally should be used directly. A solution would be
> to wrap the input generator and switch to a default implementation when the
> provided generator proves to be faulty:
> {code:java}
> @Override
> protected byte[] createByteArraySeed(UniformRandomProvider source) {
> // The seed requires approximately 4-6 calls to nextInt().
> // Wrap the input and switch to a default if the input is faulty.
> final UniformRandomProvider wrapped = new IntProvider() {
> private int calls;
> private UniformRandomProvider backup;
> @Override
> public int next() {
> if (calls++ < 100) {
> return source.nextInt();
> }
> // The input source is broken.
> if (backup == null) {
> backup = new SplitMix64(source.nextLong());
> }
> return backup.nextInt();
> }
> @Override
> public long nextLong() {
> // No specific requirements so always use the source
> return source.nextLong();
> }
> };
> return NativeSeedType.convertSeedToBytes(createMswsSeed(wrapped));
> }
> {code}
> This solution should provide backward compatibility if the input RNG is
> functional. The routine will create the same seed bytes as previous versions
> of the library with a good RNG. Exceeding 100 calls to nextInt(int) is
> extremely unlikely if the output is a good random source. Switching to a
> default generator seeded from the input ensures the method will produce the
> same seed with repeat calls of the same generator (initialised at the same
> state).
> Here a SplitMix64 is used. This is robust to any input seed. It is only 2
> equidistributed for int output. Use of a 4-equidistributed generator (in long
> output) such as XoShiRo256PlusPlus is possible (this is 8-equidistributed for
> int output). However this is not robust to any seed as it cannot accept all
> zeros. If the input source RNG is broken then the seeding of the back-up RNG
> should be robust and the 4-equidistributed requirement has already been
> broken by the faulty input RNG.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)