[ 
https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976484#comment-16976484
 ] 

Gilles Sadowski commented on RNG-123:
-------------------------------------

bq. personally, I would provide only the state bits as a seed and fix the 
constant

I'd tend to agree but the implementations aim to stay close to the reference.
I've no idea how to reconcile it with safe usage...


> PCG generators may exhibit massive stream correlation
> -----------------------------------------------------
>
>                 Key: RNG-123
>                 URL: https://issues.apache.org/jira/browse/RNG-123
>             Project: Commons RNG
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.3
>            Reporter: Sebastiano Vigna
>            Priority: Major
>
> [This is based on an issue I posted on the Rust development mailing list.]
> The documentation of the PCG generators does not state explicit that 
> different seeds generate independent sequences, but the existence of a 
> 128-bit seed implies somehow the idea that the whole seed is meaningful.
> The user should be made aware that the second parameter (the constant of the 
> underlying LCG) is almost useless from a mathematical and statistical 
> viewpoint.
> Changing constant to an LCG with power-of-two modulus and a constant like the 
> one used in PCG simply adds a constant to the sequence (more precisely, there 
> are two equivalence classes of constants, and in each equivalence class the 
> sequences are identical, except for an additive constant).
> The minimal scrambling done by the generator usually cannot cancel this fact, 
> and as a result changing the constant is equivalent to changing the initial 
> state (modulo an additive constant). You can try to run this program:
>  
> {noformat}
> import org.apache.commons.rng.core.source32.PcgXshRr32;
> import com.google.common.primitives.Ints;
> public class TestPCG {
>     public static void main(final String[] args) {
>         final long state = Long.parseLong(args[0]);
>         final long c = Long.parseLong(args[1]);
>         final long d = Long.parseLong(args[2]);
>         if (c % 2 != d % 2) throw new IllegalArgumentException();
>         final long C = c << 1 | 1;
>         final long D = d << 1 | 1;
>         final long r = 1314878037273365987L * ((d - c) >>> 1);
>         final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c });
>         final PcgXshRr32 rng1 = new PcgXshRr32(new long[] {
>             0xc097ef87329e28a5L  *(6364136223846793005L * (state + C) + C - r 
> - D) - D, d });
>         for(;;) {
>             final int a = rng0.nextInt();
>             System.out.write(Ints.toByteArray(a), 0, 4);
>             final int b = rng1.nextInt();
>             System.out.write(Ints.toByteArray(b), 0, 4);
>         }
>     }
> }{noformat}
> You can pass any state as first argument, and any two constants as the 
> following two arguments, as long as they are either both even or both odd . 
> The program will set up a second initial state so that the sequences 
> generated by the PRNGs using the two constants as seed are based on almost 
> identical underlying LCG sequences, in spite of having arbitrary, different 
> constants and different initial states. The two streams should be 
> independent, but if you pipe the output in PractRand you'll get immediately
> {noformat}
> rng=RNG_stdin32, seed=unknown
> length= 4 megabytes (2^22 bytes), time= 2.1 seconds
>   Test Name                         Raw       Processed     Evaluation
>   BCFN(0+0,13-5,T)                  R=+263.2  p =  3.4e-103   FAIL !!!!!
>   BCFN(0+1,13-5,T)                  R=+128.6  p =  1.5e-50    FAIL !!!!
>   BCFN(0+2,13-6,T)                  R= +65.2  p =  9.2e-23    FAIL !!
>   BCFN(0+3,13-6,T)                  R= +15.4  p =  1.0e-5   mildly suspicious
>   DC6-9x1Bytes-1                    R= +59.1  p =  4.2e-33    FAIL !!!
>   DC6-6x2Bytes-1                    R= +34.1  p =  9.0e-19    FAIL !
>   DC6-5x4Bytes-1                    R= +15.2  p =  7.7e-8   very suspicious
>   [Low4/16]BCFN(0+1,13-6,T)         R= +12.0  p =  1.5e-4   unusual
>   [Low4/16]FPF-14+6/64:(4,14-8)     R=  +9.2  p =  1.1e-6   unusual
>   [...]
>   [Low8/32]FPF-14+6/4:(9,14-9)      R= +27.4  p =  1.6e-17    FAIL
>   [Low8/32]FPF-14+6/4:(10,14-10)    R= +16.4  p =  2.5e-9   suspicious
>   [Low8/32]FPF-14+6/4:all           R=+283.4  p =  8.4e-255   FAIL !!!!!!
>   [Low8/32]Gap-16:A                 R=+414.8  p =  2.4e-336   FAIL !!!!!!!
>   [Low8/32]Gap-16:B                 R= +1736  p =  5e-1320    FAIL 
> !!!!!!!!{noformat}
> You can also offset one of generator by hundred of iterations, but the 
> sequences are so correlated that the result won't change. If you peek at the 
> state of the two generators you'll see that their difference is constant.
> I think the reader should be made aware of the danger. If you start several 
> generators of this kind the state is too small to guarantee that there will 
> be no overlap. Once you get overlap, since there are in practice just two 
> sequences, you will get a lot of unwanted correlation.
> There's a reason why nobody in the last decades ever considered creating 
> "streams" using the constant part of an LCG, and it's that people realized 
> very early that it doesn't work (see, e.g., 
> [https://ieeexplore.ieee.org/document/718715]).
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to