[
https://issues.apache.org/jira/browse/RNG-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Herbert updated RNG-173:
-----------------------------
Component/s: core
> BaseProvider state filling procedure can be improved
> ----------------------------------------------------
>
> Key: RNG-173
> URL: https://issues.apache.org/jira/browse/RNG-173
> Project: Commons RNG
> Issue Type: Improvement
> Components: core
> Reporter: Alex Herbert
> Priority: Trivial
> Fix For: 1.5
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The BaseProvider has a method to fill in remaining state if the input seed is
> too short. The fill uses existing seed values to fill the remaining.
> The next state is created using:
> {code:java}
> long n = state[i - seed.length];
> state[i] = 1812433253L * (n ^ (n >> 30)) + i{code}
> If the existing state is zero then the new state is i. When the input seed
> has no length then the filled state is a natural sequence.
> Here is a state of 10 filled from empty seeds of length 0 to 5:
> {noformat}
> 0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
> 1: [0, 1, 1812433255, 3284914298392595265, 6102061520201954364,
> -3308799481182342998, -3869692221293809580, -7101959917617921332,
> 7986832403292652032, 8936067391732911773]
> 2: [0, 0, 2, 3, 3624866510, 5437299764, 6569828598597623783,
> -8592001180344199076, 1136775338421644002, 8717367692712810396]
> 3: [0, 0, 0, 3, 4, 5, 5437299765, 7249733019, 9062166273,
> -8592001182156632327]
> 4: [0, 0, 0, 0, 4, 5, 6, 7, 7249733020, 9062166274]
> 5: [0, 0, 0, 0, 0, 5, 6, 7, 8, 9]
> {noformat}
> When the seed is zero length or close to half the length of the desired state
> and all zeros then the output state has a low number of non-zero bits.
> Note:
> This has little impact when using the Commons RNG simple module to create a
> generator. The seed is produced to the correct length using a high quality
> random source.
> A second issue is that the method to fill the state is an instance method.
> Since it uses no state it could be a static method. I would suggest a method
> to convert a seed to the correct length:
> {code:java}
> protected static long[] ensureSeedLength(long[] seed, int length); {code}
> This would allow classes that implement the following pattern:
> {code:java}
> MyRNG(long[] seed) {
> if (seed.length < SEED_SIZE) {
> final long[] state = new long[SEED_SIZE];
> fillState(state, seed);
> setState(state);
> } else {
> setState(seed);
> }
> } {code}
> To simplify to:
> {code:java}
> MyRNG(long[] seed) {
> setState(ensureSeedLength(seed, SEED_SIZE));
> }{code}
> h2. Compatibility
> The user guide states:
> {noformat}
> upon initialization, the underlying generation algorithm
> - may not use all the information contents of the seed,
> - may use a procedure (using the given seed as input) for further filling its
> internal state (in order to avoid a too uniform initial state).
> In both cases, the behavior is not standard but should not change between
> releases of the library (bugs notwithstanding).{noformat}
> Since behaviour *should not change* it would rule out changes for existing
> classes. New classes could use the new static version to fill state.
> I would suggest providing a new method to ensure the input seed is a minimum
> length. If the method seeds a SplitMix64 style generator with the first value
> of the input seed (or zero if the seed length is zero) then the filled state
> will be high quality. This type of generator only outputs zero once during
> the period and so any seed length can be ensured to be non zero when it has
> been expanded. An input seed of entirely zero values would be passed through
> unchanged. This is the default *user beware* behaviour for full length zero
> seeds.
> A 32-bit variant can be created using a similar hashing function that outputs
> only a single 0 in the period, for example MurmurHash3's 32-bit finaliser
> function.
> An example implementation for long values is:
> {code:java}
> private static final long GOLDEN_RATIO = 0x9e3779b97f4a7c15L
> protected static long[] ensureSeedLength(long[] seed, int length) {
> if (seed.length < length) {
> final long[] s = Arrays.copyOf(seed, length);
> // Fill the rest as if using a SplitMix64 RNG
> long x = s[0];
> for (int i = seed.length; i < length; i++) {
> s[i] = stafford13(x += GOLDEN_RATIO);
> }
> return s;
> }
> return seed;
> }
> private static long stafford13(long x) {
> x = (x ^ (x >>> 30)) * 0xbf58476d1ce4e5b9L;
> x = (x ^ (x >>> 27)) * 0x94d049bb133111ebL;
> return x ^ (x >>> 31);
> }
> {code}
> A 32-bit mix function for Murmur32 is:
> {code:java}
> private static int murmur32(int x) {
> x = (x ^ (x >>> 16)) * 0x85ebca6b;
> x = (x ^ (x >>> 13)) * 0xc2b2ae35;
> return x ^ (x >>> 16);
> }{code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)