[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147896#comment-17147896 ] Alex Herbert commented on RNG-123: -- I have added comments to all the relevant PCG generators in the Javadoc to indicate the issue with using the 128-bit seed variant. There is now an alternative version using only the 64-bit seed and this is the recommended version. > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Improvement > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Assignee: Alex Herbert >Priority: Major > Fix For: 1.4 > > Time Spent: 0.5h > Remaining Estimate: 0h > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL > [Low8/32]FPF-14+6/4:(10,14-10)R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320FAIL > {noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976724#comment-16976724 ] Alex Herbert commented on RNG-123: -- Thank you for the link and comprehensive explanation. > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL > [Low8/32]FPF-14+6/4:(10,14-10)R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320FAIL > {noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see that their difference is constant. > I think the reader should be made aware of the danger. If you start several > generators of this kind the state is too small to guarantee that there will > be no overlap. Once you get overlap, since there are in practice just two > sequences, you will get a lot of
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976638#comment-16976638 ] Sebastiano Vigna commented on RNG-123: -- Since you're striving for precision, I'd say "the second 64 bits, with the exception of the most significant bit, which is discarded, choose...". There's a fairly long discussion here: [https://www.sciencedirect.com/science/article/pii/S0378475416300829] I don't think using the same generator on different data has any guarantee of independence though. One thing you can actually prove is that in some sense different streams generated from different starting states of a linear generator are "pairwise empirically independent" if they do not overlap. If you consider any states X and Y, and M is the transition matrix, after n steps you get M^nX and M^nY. But by linearity M^nX ⨁ M^nY = M^n(X ⨁ Y), so the xor between the streams is another empirically random stream. If the xor between two streams looks random, we might agree that the streams are independent. "Pairwise" here is crucial because as you increase the number of streams you get more and more dependent streams. But pairwise is a good start. The argument does not work always with LCGs because you get that if d is the difference between X and Y, after n step is a^nd mod m. If, for example, the modulus m is a power of 2 and 2^c | d for a large c, this will be true forever. This is the case I was highlighting above—if you change just the upper bits of state you get strongly correlated streams. In other words: linear generators have pairwise "independent" substreams—you cannot, even adversarially, find bad pairs. Power-of-two LCGs do not. LCGs with prime moduli are a completely different thing (see the multiple-stream facility of MRG32k3a). > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976619#comment-16976619 ] Alex Herbert commented on RNG-123: -- I did note about the two constants for the two different generators having to have the same oddness for this correlation to be observed. Thus we have this to add to the javadoc: {noformat} Due to the use of an underlying linear congruential generator (LCG) alterations to the 128 bit seed have the following effect: the first 64-bits alter the generator state; the second 64 bits choose between one of two alternative LCGs where the output of the chosen LCG is the same sequence except for an additive constant determined by the seed bits. The result is that seeds that differ only in the last 64-bits will have a 50% chance of producing highly correlated output sequences. Consider using the fixed increment variant where the 64-bit seed sets the generator state. @see https://ieeexplore.ieee.org/document/718715 Section 3.1: Different additive constants in a maximum potency congruential generator [Reference as provided]{noformat} We can then add a simpler variant with a fixed increment to the library. As for a section in the user guide for working with parallel computations based on jumps, random seedings and reparameterization (since the library has generators that that can use each of these strategies), is there a reference where these strategies are discussed? I see it as a major unknown for any of these strategies as to whether there will be correlations between the parallel output. My current strategy is to ensure that the parallel work is done on different data, thus even if the same generator was used the results are credible; or to use a different generator class for each set of work (this does not scale well). > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976513#comment-16976513 ] Sebastiano Vigna commented on RNG-123: -- Yes, almost. The constant simply chooses between the sequence x ← ax + 1 and the sequence x ← ax - 1 (both modulo m), up to an additive constant. So it's more like being equivalent to one of two available generators. There is an additional problem: LCGs with power-of-two moduli never propagate state changes to the right. If you take my example and XOR the upper 16 bits, say, of both initial states with two arbitrary constants, you'll see still a lot of correlation. So the "actual seed", so to speak, if you do not want to get correlation, is more like the lower 40-48 bits of the state part. That's very little, in particular for parallel applications. Jumps, random seedings and reparameterization are heavily debated as methods to get different streams. One has to be very careful though, as correlation can pop up in unexpected ways. > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL >
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976497#comment-16976497 ] Alex Herbert commented on RNG-123: -- Hi Sebastiano, thank you for the report and demonstration code. I agree that this is something we should add to the documentation. >From the paper this seems to be the crux: {noformat} X_(n+1) = a X_n + c "the two additive constants c and c + (a - 1) r produce the same sequence (except for a shift modulo 2^e) for any r" e is the power of 2 modulus for the LCG.{noformat} Thus despite the 128-bit state the likelihood of overlapping correlated output from any two PCG instances is equivalent to using a generator with a period of 2^64 from different starting points, i.e. the second seed argument serves to alter the output but does not create uncorrelated output. I am wondering if we should revise the user guide description of how to create generators for parallel computation. The section on the JumpableUniformRandomProvider is hidden in the examples. This issue with PCG indicates that a sub-section titled 'Parallel Computations' would direct users to an appropriate usage rather than creating multiple randomly seeded instances of their favourite generator. The user guide could state that the later method is subject to a likelihood of overlap that is related to the period of the generator and its state properties with a note to check the Javadoc for the generator for details. As for the removal of the second seed argument there is a Pcg32 variant that has a fixed increment (using the {{_oneseq}} nomeclature in the C++ source code). We could add this to the enumeration choices for the RandomSource. This would be in keeping with the library mission of providing Java implementations of reference generators. In effect this states that since the second seed argument is not adding substantial value you can create a generator without one. > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately >
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976484#comment-16976484 ] Gilles Sadowski commented on RNG-123: - bq. personally, I would provide only the state bits as a seed and fix the constant I'd tend to agree but the implementations aim to stay close to the reference. I've no idea how to reconcile it with safe usage... > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL > [Low8/32]FPF-14+6/4:(10,14-10)R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320FAIL > {noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see that their difference is constant. > I think the reader should be made aware of the danger. If you start several > generators of this kind
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976477#comment-16976477 ] Sebastiano Vigna commented on RNG-123: -- Seems OK (personally, I would provide only the state bits as a seed and fix the constant). Note that another undocumented danger is that the most significant bit of the second seed is not used _at all_ (it is discarded when the seed is stored in the increment field). This is just a minor issue tho. > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL > [Low8/32]FPF-14+6/4:(10,14-10)R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320FAIL > {noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see that their difference is constant. > I think the
[jira] [Commented] (RNG-123) PCG generators may exhibit massive stream correlation
[ https://issues.apache.org/jira/browse/RNG-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976462#comment-16976462 ] Gilles Sadowski commented on RNG-123: - Do you think that inserting {noformat} Although the seed size is 128 bits, only the first 64 are effective: in effect, two seeds that only differ by the last 64 bits will produce highly correlated sequences. {noformat} would appropriately convey the issue? > PCG generators may exhibit massive stream correlation > - > > Key: RNG-123 > URL: https://issues.apache.org/jira/browse/RNG-123 > Project: Commons RNG > Issue Type: Bug > Components: core >Affects Versions: 1.3 >Reporter: Sebastiano Vigna >Priority: Major > > [This is based on an issue I posted on the Rust development mailing list.] > The documentation of the PCG generators does not state explicit that > different seeds generate independent sequences, but the existence of a > 128-bit seed implies somehow the idea that the whole seed is meaningful. > The user should be made aware that the second parameter (the constant of the > underlying LCG) is almost useless from a mathematical and statistical > viewpoint. > Changing constant to an LCG with power-of-two modulus and a constant like the > one used in PCG simply adds a constant to the sequence (more precisely, there > are two equivalence classes of constants, and in each equivalence class the > sequences are identical, except for an additive constant). > The minimal scrambling done by the generator usually cannot cancel this fact, > and as a result changing the constant is equivalent to changing the initial > state (modulo an additive constant). You can try to run this program: > > {noformat} > import org.apache.commons.rng.core.source32.PcgXshRr32; > import com.google.common.primitives.Ints; > public class TestPCG { > public static void main(final String[] args) { > final long state = Long.parseLong(args[0]); > final long c = Long.parseLong(args[1]); > final long d = Long.parseLong(args[2]); > if (c % 2 != d % 2) throw new IllegalArgumentException(); > final long C = c << 1 | 1; > final long D = d << 1 | 1; > final long r = 1314878037273365987L * ((d - c) >>> 1); > final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c }); > final PcgXshRr32 rng1 = new PcgXshRr32(new long[] { > 0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r > - D) - D, d }); > for(;;) { > final int a = rng0.nextInt(); > System.out.write(Ints.toByteArray(a), 0, 4); > final int b = rng1.nextInt(); > System.out.write(Ints.toByteArray(b), 0, 4); > } > } > }{noformat} > You can pass any state as first argument, and any two constants as the > following two arguments, as long as they are either both even or both odd . > The program will set up a second initial state so that the sequences > generated by the PRNGs using the two constants as seed are based on almost > identical underlying LCG sequences, in spite of having arbitrary, different > constants and different initial states. The two streams should be > independent, but if you pipe the output in PractRand you'll get immediately > {noformat} > rng=RNG_stdin32, seed=unknown > length= 4 megabytes (2^22 bytes), time= 2.1 seconds > Test Name Raw Processed Evaluation > BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL ! > BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50FAIL > BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23FAIL !! > BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious > DC6-9x1Bytes-1R= +59.1 p = 4.2e-33FAIL !!! > DC6-6x2Bytes-1R= +34.1 p = 9.0e-19FAIL ! > DC6-5x4Bytes-1R= +15.2 p = 7.7e-8 very suspicious > [Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual > [Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual > [...] > [Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17FAIL > [Low8/32]FPF-14+6/4:(10,14-10)R= +16.4 p = 2.5e-9 suspicious > [Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !! > [Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!! > [Low8/32]Gap-16:B R= +1736 p = 5e-1320FAIL > {noformat} > You can also offset one of generator by hundred of iterations, but the > sequences are so correlated that the result won't change. If you peek at the > state of the two generators you'll see that their difference is constant. > I think the reader should be made aware of the danger. If