[
https://issues.apache.org/jira/browse/RNG-175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Herbert resolved RNG-175.
------------------------------
Fix Version/s: 1.5
Resolution: Fixed
Fixed to fall back to a default generator after 100 int values fail to create a
seed. This will occur when 100 consecutive int values are zero. (Note: A fixed
value with non-zero bits will create a seed.) Probability is 1 in 2^3200, or
approximately 5e-964:
{noformat}
jshell> new BigDecimal(2).pow(-3200, MathContext.DECIMAL32)
$77 ==> 5.058408E-964
{noformat}
Added in commit:
d01b5051799cd60f7b675e4ee1680bc35079b436
> RandomSource.MSWS createByteArraySeed will infinite loop with faulty
> UniformRandomProvider
> ------------------------------------------------------------------------------------------
>
> Key: RNG-175
> URL: https://issues.apache.org/jira/browse/RNG-175
> Project: Commons RNG
> Issue Type: Bug
> Components: simple
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Priority: Trivial
> Fix For: 1.5
>
>
> The seeding routine for the Middle Square Weyl Sequence (MSWS) generator is a
> custom routine that generates a hex permutation for part of the seed. This
> requires a source of randomness.
> The following public API method provides an input source of randomness to
> generate a byte[] seed:
> {code:java}
> public byte[] createSeed(UniformRandomProvider rng);
> {code}
> The method used by the MSWS routine will directly use the input
> UniformRandomProvider. If this is non functional then the routine will
> infinite loop. This test fails:
> {code:java}
> @Test
> void testMSWSCreateSeed() {
> final UniformRandomProvider broken = new LongProvider() {
> @Override
> public long next() {
> return 0;
> }
> };
> Assertions.assertTimeoutPreemptively(Duration.ofMillis(100), () -> {
> RandomSource.MSWS.createSeed(broken);
> });
> }
> {code}
> This occurs as the seed generation uses an unbiased random permutation that
> uses the nextInt() method from the UniformRandomProvider to implement a
> custom shuffle routine. If the values from nextInt() generate rejections then
> another value is obtained. This will infinite loop if the values are always
> rejected.
> h2. Note
> This issue was discovered when implementing RNG-174 which improves support
> for non-zero seeds. All RandomSource entries should allow construction of
> seeds with a source RNG that will may generate an all zero seed. To force
> generation of an all zero seed for testing is most easily done using a broken
> input RNG as above to the seeding routine.
> h2. Possible solutions
> # Fix the routine to be robust to a broken input RNG
> # Mark as won't fix. The use of a broken input RNG is an edge case. Any
> broken input RNG is incorrect usage of the API. If you do not provide a good
> source of randomness to create a seed then the seed routine will fail to
> create a suitable seed.
> # Fix the infinite loop by raising an exception inside the seed generation
> routine after a set number of int values have been used.
> I prefer option 1 as it satisfies the functionality to generate a seed, no
> matter how bad the source of randomness. Option 2 has the potential to
> infinite loop which should be avoided. Option3 will tell a user they have a
> bad RNG, but it will raise errors which may be unexpected.
> The bug can be fixed by switching the source of randomness when the input
> source has proven to be invalid.
> A small test provides a count of the mean, min, and max number of calls to
> nextInt(). This is 10 repeats of 2^20 calls to generate the hex permutation.
> {noformat}
> 4.217280387878418 4 6
> 4.217123985290527 4 6
> 4.216763496398926 4 6
> 4.217494010925293 4 6
> 4.217161178588867 4 6
> 4.21730899810791 4 6
> 4.217828750610352 4 6
> 4.216683387756348 4 6
> 4.21721076965332 4 6
> 4.216983795166016 4 6
> {noformat}
> A suitable generator should have at least 32*4 bits of state and ideally be
> equidistributed over 6 ints. There are suitable generators in the library to
> provide this functionality. However in the correct usage the input generator
> will not be broken and ideally should be used directly. A solution would be
> to wrap the input generator and switch to a default implementation when the
> provided generator proves to be faulty:
> {code:java}
> @Override
> protected byte[] createByteArraySeed(UniformRandomProvider source) {
> // The seed requires approximately 4-6 calls to nextInt().
> // Wrap the input and switch to a default if the input is faulty.
> final UniformRandomProvider wrapped = new IntProvider() {
> private int calls;
> private UniformRandomProvider backup;
> @Override
> public int next() {
> if (calls++ < 100) {
> return source.nextInt();
> }
> // The input source is broken.
> if (backup == null) {
> backup = new SplitMix64(source.nextLong());
> }
> return backup.nextInt();
> }
> @Override
> public long nextLong() {
> // No specific requirements so always use the source
> return source.nextLong();
> }
> };
> return NativeSeedType.convertSeedToBytes(createMswsSeed(wrapped));
> }
> {code}
> This solution should provide backward compatibility if the input RNG is
> functional. The routine will create the same seed bytes as previous versions
> of the library with a good RNG. Exceeding 100 calls to nextInt(int) is
> extremely unlikely if the output is a good random source. Switching to a
> default generator seeded from the input ensures the method will produce the
> same seed with repeat calls of the same generator (initialised at the same
> state).
> Here a SplitMix64 is used. This is robust to any input seed. It is only 2
> equidistributed for int output. Use of a 4-equidistributed generator (in long
> output) such as XoShiRo256PlusPlus is possible (this is 8-equidistributed for
> int output). However this is not robust to any seed as it cannot accept all
> zeros. If the input source RNG is broken then the seeding of the back-up RNG
> should be robust and the 4-equidistributed requirement has already been
> broken by the faulty input RNG.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)