[
https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Herbert reassigned RNG-123:
--------------------------------
Fix Version/s: 1.4
Assignee: Alex Herbert
Issue Type: Improvement (was: Bug)
> PCG generators may exhibit massive stream correlation
> -----------------------------------------------------
>
> Key: RNG-123
> URL: https://issues.apache.org/jira/browse/RNG-123
> Project: Commons RNG
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.3
> Reporter: Sebastiano Vigna
> Assignee: Alex Herbert
> Priority: Major
> Fix For: 1.4
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> [This is based on an issue I posted on the Rust development mailing list.]
> The documentation of the PCG generators does not state explicit that
> different seeds generate independent sequences, but the existence of a
> 128-bit seed implies somehow the idea that the whole seed is meaningful.
> The user should be made aware that the second parameter (the constant of the
> underlying LCG) is almost useless from a mathematical and statistical
> viewpoint.
> Changing constant to an LCG with power-of-two modulus and a constant like the
> one used in PCG simply adds a constant to the sequence (more precisely, there
> are two equivalence classes of constants, and in each equivalence class the
> sequences are identical, except for an additive constant).
> The minimal scrambling done by the generator usually cannot cancel this fact,
> and as a result changing the constant is equivalent to changing the initial
> state (modulo an additive constant). You can try to run this program:
>
> {noformat}
> import org.apache.commons.rng.core.source32.PcgXshRr32;
> import com.google.common.primitives.Ints;
> public class TestPCG {
> public static void main(final String[] args) {
> final long state = Long.parseLong(args[0]);
> final long c = Long.parseLong(args[1]);
> final long d = Long.parseLong(args[2]);
> if (c % 2 != d % 2) throw new IllegalArgumentException();
> final long C = c << 1 | 1;
> final long D = d << 1 | 1;
> final long r = 1314878037273365987L * ((d - c) >>> 1);
> final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c });
> final PcgXshRr32 rng1 = new PcgXshRr32(new long[] {
> 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r
> - D) - D, d });
> for(;;) {
> final int a = rng0.nextInt();
> System.out.write(Ints.toByteArray(a), 0, 4);
> final int b = rng1.nextInt();
> System.out.write(Ints.toByteArray(b), 0, 4);
> }
> }
> }{noformat}
> You can pass any state as first argument, and any two constants as the
> following two arguments, as long as they are either both even or both odd .
> The program will set up a second initial state so that the sequences
> generated by the PRNGs using the two constants as seed are based on almost
> identical underlying LCG sequences, in spite of having arbitrary, different
> constants and different initial states. The two streams should be
> independent, but if you pipe the output in PractRand you'll get immediately
> {noformat}
> rng=RNG_stdin32, seed=unknown
> length= 4 megabytes (2^22 bytes), time= 2.1 seconds
> Test Name Raw Processed Evaluation
> BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL !!!!!
> BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50 FAIL !!!!
> BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23 FAIL !!
> BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious
> DC6-9x1Bytes-1 R= +59.1 p = 4.2e-33 FAIL !!!
> DC6-6x2Bytes-1 R= +34.1 p = 9.0e-19 FAIL !
> DC6-5x4Bytes-1 R= +15.2 p = 7.7e-8 very suspicious
> [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual
> [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual
> [...]
> [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17 FAIL
> [Low8/32]FPF-14+6/4:(10,14-10) R= +16.4 p = 2.5e-9 suspicious
> [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !!!!!!
> [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!!!!!!
> [Low8/32]Gap-16:B R= +1736 p = 5e-1320 FAIL
> !!!!!!!!{noformat}
> You can also offset one of generator by hundred of iterations, but the
> sequences are so correlated that the result won't change. If you peek at the
> state of the two generators you'll see that their difference is constant.
> I think the reader should be made aware of the danger. If you start several
> generators of this kind the state is too small to guarantee that there will
> be no overlap. Once you get overlap, since there are in practice just two
> sequences, you will get a lot of unwanted correlation.
> There's a reason why nobody in the last decades ever considered creating
> "streams" using the constant part of an LCG, and it's that people realized
> very early that it doesn't work (see, e.g.,
> [https://ieeexplore.ieee.org/document/718715]).
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)