[
https://issues.apache.org/jira/browse/RNG-174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517467#comment-17517467
]
Alex Herbert commented on RNG-174:
----------------------------------
I have updated the o.a.c.rng.simple.internal package to support checking that a
range of the seed is not all zero. This updates RandomSourceInternal and
SeedFactory.
Only one new method is public, and one existing public method has changed from
abstract to a concrete implementation (it just calls the new method). Here is
the JApiCmp report:
{noformat}
Comparing source compatibility of commons-rng-simple-1.5-SNAPSHOT.jar against commons-rng-simple-1.4.jar
**** MODIFIED ENUM: PUBLIC ABSTRACT org.apache.commons.rng.simple.internal.NativeSeedType (compatible)
=== CLASS FILE FORMAT VERSION: 52.0 <- 52.0
*** MODIFIED METHOD: PUBLIC NON_ABSTRACT (<- ABSTRACT) java.lang.Object createSeed(int)
+++* NEW METHOD: PUBLIC(+) ABSTRACT(+) java.lang.Object createSeed(int, int, int)
{noformat}
The existing method createSeed in NativeSeedType is public so adding this new
method as public is consistent. It could be changed to package-private. This
class is only used internally and could be entirely package-private. That may
not have been the case when it was created for version 1.3 but I cannot
remember and have not checked the commit history.
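The compatible change reported above (an abstract method becoming concrete and delegating to a new abstract overload) can be sketched as follows. This is a minimal illustration and not the NativeSeedType source: the enum name SeedTypeSketch, the use of ThreadLocalRandom, and the delegated range [0, min(size, 1)) are assumptions, the last chosen to mirror the previous behaviour of a non-zero first position.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical stand-in for an internal seed-type enum; not the library source. */
public enum SeedTypeSketch {
    INT_ARRAY {
        @Override
        public Object createSeed(int size, int from, int to) {
            int[] seed = ThreadLocalRandom.current().ints(size).toArray();
            // Simplification: force the first position of a non-empty range to be
            // non-zero, which guarantees the range [from, to) is not all zero.
            while (from < to && seed[from] == 0) {
                seed[from] = ThreadLocalRandom.current().nextInt();
            }
            return seed;
        }
    },
    LONG_ARRAY {
        @Override
        public Object createSeed(int size, int from, int to) {
            long[] seed = ThreadLocalRandom.current().longs(size).toArray();
            while (from < to && seed[from] == 0) {
                seed[from] = ThreadLocalRandom.current().nextLong();
            }
            return seed;
        }
    };

    /** Previously abstract; now a concrete method that calls the new overload. */
    public Object createSeed(int size) {
        // Assumed range: keep position 0 non-zero when the seed is not empty.
        return createSeed(size, 0, Math.min(size, 1));
    }

    /** New abstract overload: the seed must not be all zero in [from, to). */
    public abstract Object createSeed(int size, int from, int to);
}
```

Existing callers of the one-argument method compile and link unchanged, while each constant only has to implement the ranged overload.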
The new internal seeding routines hold a sub-range of the seed that cannot be
all zero. Work was previously done to avoid creating all-zero seeds as some
generators are sensitive to them. This used the simple approach of ensuring the
first position in array seeds is non-zero. I have used the previous tests to
identify the generators that require non-zero seeds. See
* o.a.c.rng.simple.ProvidersCommonParametricTest.testZeroIntArraySeed
* o.a.c.rng.core.RandomAssert.assertNextIntZeroOutput
* o.a.c.rng.core.RandomAssert.assertNextLongZeroOutput
* o.a.c.rng.core.RandomAssert.assertIntArrayConstructorWithSingleBitInPoolIsFunctional
Any generator identified from these tests requires a non-zero seed. In most
cases this was set as the full seed length, or one less for generators that do
not use all the bits of the seed array (WELL_19937_x,
WELL_44497_x).
Notable exceptions:
The KISS generator is reduced to a simple LCG when positions [0, 3) are all
zero. I added a test to demonstrate this. With a zero seed the KISS LCG passes
testZeroIntArraySeed. However the output will be that of a 32-bit LCG. To avoid
a poor generator the seed will be checked to be non-zero in the range [0, 3).
This prevents the KISS generator reducing to an LCG. It is consistent with
checking range [0, 1) in previous versions of the library.
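The reduction can be seen in a minimal sketch of a KISS-style update. It assumes the component recurrences of Marsaglia's published KISS (two 16-bit multiply-with-carry steps, a three-shift xorshift and a 32-bit LCG, combined as (MWC ^ CONG) + SHR3). The class name KissSketch, the seed order and the shift constants are illustrative assumptions and not the library's KISSRandom source.

```java
/** Illustrative sketch only; not the Commons RNG KISSRandom implementation. */
public final class KissSketch {
    private int z;
    private int w;
    private int jsr;
    private int jcong;

    /** Assumed seed order: positions [0, 3) are z, w, jsr; position 3 is the LCG state. */
    public KissSketch(int[] seed) {
        z = seed[0];
        w = seed[1];
        jsr = seed[2];
        jcong = seed[3];
    }

    public int next() {
        // Multiply-with-carry pair: a zero state maps to a zero state.
        z = 36969 * (z & 0xffff) + (z >>> 16);
        w = 18000 * (w & 0xffff) + (w >>> 16);
        // Xorshift: a zero state maps to a zero state.
        jsr ^= jsr << 13;
        jsr ^= jsr >>> 17;
        jsr ^= jsr << 5;
        // 32-bit LCG: functional from any state, including zero.
        jcong = lcg(jcong);
        return (((z << 16) + w) ^ jcong) + jsr;
    }

    /** Reference 32-bit LCG step with the same constants. */
    public static int lcg(int state) {
        return 69069 * state + 1234567;
    }

    public static void main(String[] args) {
        KissSketch rng = new KissSketch(new int[] {0, 0, 0, 42});
        int state = 42;
        for (int i = 0; i < 5; i++) {
            state = lcg(state);
            // Each line prints true: the output is exactly the LCG state.
            System.out.println(rng.next() == state);
        }
    }
}
```

With positions [0, 3) all zero, the only component that ever changes is the LCG, so the combined output equals the LCG state.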
The MSWS generator is sensitive to the initial state. I added a test to show
that a zero seed creates zero output. Updating RandomAssert to add an
assertLongArrayConstructorWithSingleBitInPoolIsFunctional test shows the MSWS
fails with single bit seeds. This generator is the most sensitive in the
library to poor seeding. It has a seed length of 3. The final position must be
a good increment for a Weyl sequence. It should definitely not be zero. The
second position is the initial state of the Weyl sequence. This could be zero.
The first position is generator state. If not very random, and the Weyl
increment is poor, then this state can take a long time to attain randomness
for the output. The behaviour from v1.3 would be to set the first position as
non-zero. However randomness can best be achieved through a good Weyl
increment. It makes more sense to ensure the final position (index 2) is
non-zero.
However it is still possible to create a generator that will output zeros for a
large number of cycles. So despite the native seed type being a long[] of
length 3, it would be recommended to create this generator with a single long
value and have RandomSource.MSWS create an appropriately seeded generator. An
alternative is to provide a source of randomness to create a byte[] seed. This
can use more entropy than the 64-bits of a long to create more possible seeds.
The method was fixed in RNG-175 to be robust to bad sources of randomness.
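The sensitivity to seeding can be illustrated with a minimal sketch of the middle-square Weyl sequence step. It assumes the state layout described above (generator state x, Weyl state w, Weyl increment s). The class name MswsSketch and the increment constant used in main are chosen for illustration; this is not the library's MiddleSquareWeylSequence source.

```java
/** Illustrative sketch only; not the Commons RNG MiddleSquareWeylSequence source. */
public final class MswsSketch {
    private long x;
    private long w;
    private final long s;

    /** Assumed seed order: generator state, Weyl state, Weyl increment. */
    public MswsSketch(long x, long w, long s) {
        this.x = x;
        this.w = w;
        this.s = s;
    }

    public int next() {
        x *= x;                      // square the state
        w += s;                      // advance the Weyl sequence
        x += w;                      // add the Weyl sequence to the state
        x = (x >>> 32) | (x << 32);  // swap the upper and lower halves
        return (int) x;              // output the lower 32 bits
    }

    /** Counts the zero outputs in the next n values. */
    public static int zeroOutputs(MswsSketch rng, int n) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (rng.next() == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // All-zero seed: every output is zero.
        System.out.println(zeroOutputs(new MswsSketch(0, 0, 0), 1000));
        // Non-zero but poor increment: still 1000 zero outputs.
        System.out.println(zeroOutputs(new MswsSketch(0, 0, 1), 1000));
        // An odd increment with well-mixed bits gives a non-zero first output.
        System.out.println(Integer.toHexString(new MswsSketch(0, 0, 0xb5ad4eceda1ce2a9L).next()));
    }
}
```

With an increment of 1 the Weyl state stays below 2^32 for a long time. After the swap the lower half of x is zero, the square vanishes modulo 2^64, and the output remains zero. This is why a non-zero final position alone does not guarantee good output.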
h2. Changes
This change is summarised below for all sources that ensured a seed was
non-zero in position 0 in their native array seed.
||RandomSource||Type||Length||From (inclusive)||To (exclusive)||Notes||
|WELL_512_A|int[]|16|0|16| |
|WELL_1024_A|int[]|32|0|32| |
|WELL_19937_A|int[]|624|0|623|Does not use all bits from the final seed position|
|WELL_19937_C|int[]|624|0|623|Does not use all bits from the final seed position|
|WELL_44497_A|int[]|1391|0|1390|Does not use all bits from the final seed position|
|WELL_44497_B|int[]|1391|0|1390|Does not use all bits from the final seed position|
|MT|int[]|624|0|0|Not sensitive to all-zero seeds|
|ISAAC|int[]|256|0|0|Not sensitive to all-zero seeds|
|XOR_SHIFT_1024_S|long[]|16|0|16| |
|MT_64|long[]|312|0|0|Not sensitive to all-zero seeds|
|MWC_256|int[]|257|0|257| |
|KISS|int[]|4|0|3|Last position is an LCG state and can be zero.|
|XOR_SHIFT_1024_S_PHI|long[]|16|0|16| |
|XO_RO_SHI_RO_64_S|int[]|2|0|2| |
|XO_RO_SHI_RO_64_SS|int[]|2|0|2| |
|XO_SHI_RO_128_PLUS|int[]|4|0|4| |
|XO_SHI_RO_128_SS|int[]|4|0|4| |
|XO_RO_SHI_RO_128_PLUS|long[]|2|0|2| |
|XO_RO_SHI_RO_128_SS|long[]|2|0|2| |
|XO_SHI_RO_256_PLUS|long[]|4|0|4| |
|XO_SHI_RO_256_SS|long[]|4|0|4| |
|XO_SHI_RO_512_PLUS|long[]|8|0|8| |
|XO_SHI_RO_512_SS|long[]|8|0|8| |
|PCG_XSH_RR_32|long[]|2|0|0|Not sensitive to all-zero seeds|
|PCG_XSH_RS_32|long[]|2|0|0|Not sensitive to all-zero seeds|
|PCG_RXS_M_XS_64|long[]|2|0|0|Not sensitive to all-zero seeds|
|MSWS|long[]|3|2|3|Changed to target the Weyl increment as non-zero|
|SFC_32|int[]|3|0|0|Not sensitive to all-zero seeds|
|SFC_64|long[]|3|0|0|Not sensitive to all-zero seeds|
|XO_SHI_RO_128_PP|int[]|4|0|4| |
|XO_RO_SHI_RO_128_PP|long[]|2|0|2| |
|XO_SHI_RO_256_PP|long[]|4|0|4| |
|XO_SHI_RO_512_PP|long[]|8|0|8| |
|XO_RO_SHI_RO_1024_PP|long[]|16|0|16| |
|XO_RO_SHI_RO_1024_S|long[]|16|0|16| |
|XO_RO_SHI_RO_1024_SS|long[]|16|0|16| |
h2. Functional Changes
In most use cases the change will have no functional incompatibility. Default
seeding uses an internal source of randomness. The change modifies how this
internal source was applied to create a generator. The generator created by
RandomSource should have an initial random state and produce quality output
(i.e. not all zeros).
However this change introduces functionally breaking changes to the method in
RandomSource that accepts an input source of randomness:
{code:java}
byte[] createSeed(UniformRandomProvider);{code}
Previously the method would generate the native array seed, ensure it was
non-zero in position 0, and convert the seed to bytes. If the seed was zero in
position 0 then the _input provider was ignored_ and a value was generated from
the _default source of randomness_ in the SeedFactory. This made the method
non-reproducible.
The method has been updated to create the native seed as before, then check the
sub-range in the table above is non-zero. If all zero in the sub-range then the
sub-range is filled using a robust RNG seeded from the provided input
UniformRandomProvider. The fill will ensure not all bits are zero in the
sub-range. The default source of randomness is not used. The method is now
reproducible. The same UniformRandomProvider will create the same seed, even if
the initial seed has a sub-range that is all zero.
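The check-and-fill step described above can be sketched as follows, under stated assumptions: a LongSupplier stands in for the input UniformRandomProvider, a SplitMix64-style mixer stands in for the robust RNG, and the class and method names (NonZeroRangeSketch, ensureNonZero) are hypothetical. It is not the SeedFactory implementation.

```java
import java.util.function.LongSupplier;

/** Illustrative sketch only; not the Commons RNG SeedFactory implementation. */
public final class NonZeroRangeSketch {
    private static final long GOLDEN_RATIO = 0x9e3779b97f4a7c15L;

    private NonZeroRangeSketch() {}

    /** Returns true if every element of seed[from, to) is zero. */
    static boolean isAllZero(long[] seed, int from, int to) {
        for (int i = from; i < to; i++) {
            if (seed[i] != 0) {
                return false;
            }
        }
        return true;
    }

    /** SplitMix64 output function; a bijection where only input 0 maps to 0. */
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /**
     * Ensures seed[from, to) is not all zero. An empty range requests no check.
     * Only the supplied source is used, so the result is reproducible.
     */
    public static long[] ensureNonZero(long[] seed, int from, int to, LongSupplier source) {
        if (from >= to || !isAllZero(seed, from, to)) {
            return seed;
        }
        // Seed the fill generator from the caller's source, not a default source.
        long state = source.getAsLong();
        for (int i = from; i < to; i++) {
            state += GOLDEN_RATIO;
            seed[i] = mix(state);
        }
        // A single output may be zero; advance until the first position is non-zero.
        while (seed[from] == 0) {
            state += GOLDEN_RATIO;
            seed[from] = mix(state);
        }
        return seed;
    }

    public static void main(String[] args) {
        LongSupplier zeros = () -> 0L;
        long[] a = ensureNonZero(new long[4], 0, 4, zeros);
        long[] b = ensureNonZero(new long[4], 0, 4, zeros);
        System.out.println(isAllZero(a, 0, 4));          // false
        System.out.println(java.util.Arrays.equals(a, b)); // true
    }
}
```

Because the fill depends only on the supplied source, a source that outputs all zeros still yields a usable seed, and the same source yields the same seed on every call.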
A test has been added to show that the seed created from a source of randomness
that outputs all zeros will create a functional generator for all RandomSource
values, and that the seed created is reproducible.
Functional changes summary:
* In the common use case for the method, the source of randomness to the
method will be random. The output will be functionally identical, a new random
seed is produced.
* In the uncommon use case for the method, the source of randomness is fixed
and happens to avoid a native seed with a zero in the first position. This
behaviour is unchanged except for the MSWS where the seed may be different if
it had a zero in position [2] of the long[] seed.
* In the very uncommon use case for the method, the source of randomness is
fixed and happens to create a native seed with a zero in the first position.
This occurs with a frequency of 1 in 2^32 or 1 in 2^64. These cases will have a
functionally breaking change in the byte[] seed created by the method. Previous
behaviour would generate a different random seed each call. New behaviour will
generate a (possibly different) random seed; the seed will be identical for
each call.
Given that previous behaviour for edge cases of zero seeds would have generated
random seeds, this change should not affect users.
Seeding for the MSWS has been improved to be more robust. Users generating
fixed seeds for this generator should check the seed is suitable, and
regenerate it with the new routines if required.
> Improve support for non-zero seeds
> ----------------------------------
>
> Key: RNG-174
> URL: https://issues.apache.org/jira/browse/RNG-174
> Project: Commons RNG
> Issue Type: Improvement
> Components: simple
> Affects Versions: 1.4
> Reporter: Alex Herbert
> Assignee: Alex Herbert
> Priority: Minor
> Fix For: 1.5
>
>
> The default seed arrays created by RandomSource are ensured to be non-zero in
> the first position. This is to support xor-based generators which are
> non-functional when seeded with all zeros.
> All xor-based generators in the library fill their state from position 0 in
> the input seed array. So this has worked for all current implementations.
> The new LXM family of generators have a composite seed of the state of a
> linear congruential generator (LCG) and the state of a xor-based generator
> (XBG). Ideal seeding for these generators places the LCG state first. This is
> due to the behaviour of the LXM family where the seeding of the LCG can
> create independent streams of RNG output, specifically when using a different
> LCG add parameter (which must be odd). Thus seeding with values 1, 3, 5, 7,
> which are then expanded into a full array, will create non-overlapping RNG
> sequences.
> The requirement to place the LCG state first in the seed shifts the seed for
> the XBG state. It is possible that a generated seed would be all zero in the
> XBG state. The current seed generator is 16-equidistributed and can thus
> output consecutive zeros. The RandomSource seeding behaviour should be
> updated with the option to create a seed which is non-zero in a specified
> range of the seed array.
> The public API in RandomSource is:
> {code:java}
> byte[] createSeed();
> byte[] createSeed(UniformRandomProvider rng);
> static int[] createIntArray(int n);
> static long[] createLongArray(int n);{code}
> The createSeed methods are specific to each RandomSource instance. This is
> delegated to an internal package which creates a native seed of the correct
> length and converts it to bytes. No changes to the public API should be
> required to support non-zero seeds in a range.
> Note that the seed generation method is also used by:
> {code:java}
> RestorableUniformRandomProvider create(); {code}
> So any LXM generator created by the RandomSource enum with no explicit seed
> will also obtain this functionality.
> For the array generation methods, these have no documentation on the non-zero
> behaviour. Either these methods can be left alone, or updated to add a range:
> {code:java}
> int[] RandomSource.createIntArray(int n, int from, int to);
> long[] RandomSource.createLongArray(int n, int from, int to);
> {code}
> In the interest of simplicity, and given that createSeed() is the preferred
> method for a known RandomSource, additional overloads of these methods can be
> omitted.