Sebastiano Vigna created RNG-123:
------------------------------------
Summary: PCG generators may exhibit massive stream correlation
Key: RNG-123
URL: https://issues.apache.org/jira/browse/RNG-123
Project: Commons RNG
Issue Type: Bug
Components: core
Affects Versions: 1.3
Reporter: Sebastiano Vigna
[This is based on an issue I posted on the Rust development mailing list.]
The documentation of the PCG generators does not state explicit that different
seeds generate independent sequences, but the existence of a 128-bit seed
implies somehow the idea that the whole seed is meaningful.
The user should be made aware that the second parameter (the constant of the
underlying LCG) is almost useless from a mathematical and statistical viewpoint.
Changing constant to an LCG with power-of-two modulus and a constant like the
one used in PCG simply adds a constant to the sequence (more precisely, there
are two equivalence classes of constants, and in each equivalence class the
sequences are identical, except for an additive constant).
The minimal scrambling done by the generator usually cannot cancel this fact,
and as a result changing the constant is equivalent to changing the initial
state (modulo an additive constant). You can try to run this program:
{noformat}
import org.apache.commons.rng.core.source32.PcgXshRr32;
import com.google.common.primitives.Ints;
public class TestPCG {
public static void main(final String[] args) {
final long state = Long.parseLong(args[0]);
final long c = Long.parseLong(args[1]);
final long d = Long.parseLong(args[2]);
if (c % 2 != d % 2) throw new IllegalArgumentException();
final long C = c << 1 | 1;
final long D = d << 1 | 1;
final long r = 1314878037273365987L * ((d - c) >>> 1);
final PcgXshRr32 rng0 = new PcgXshRr32(new long[] { state, c });
final PcgXshRr32 rng1 = new PcgXshRr32(new long[] {
0xc097ef87329e28a5L *(6364136223846793005L * (state + C) + C - r -
D) - D, d });
for(;;) {
final int a = rng0.nextInt();
System.out.write(Ints.toByteArray(a), 0, 4);
final int b = rng1.nextInt();
System.out.write(Ints.toByteArray(b), 0, 4);
}
}
}{noformat}
You can pass any state as first argument, and any two constants as the
following two arguments, as long as they are either both even or both odd . The
program will set up a second initial state so that the sequences generated by
the PRNGs using the two constants as seed are based on almost identical
underlying LCG sequences, in spite of having arbitrary, different constants and
different initial states. The two streams should be independent, but if you
pipe the output in PractRand you'll get immediately
{noformat}
rng=RNG_stdin32, seed=unknown
length= 4 megabytes (2^22 bytes), time= 2.1 seconds
Test Name Raw Processed Evaluation
BCFN(0+0,13-5,T) R=+263.2 p = 3.4e-103 FAIL !!!!!
BCFN(0+1,13-5,T) R=+128.6 p = 1.5e-50 FAIL !!!!
BCFN(0+2,13-6,T) R= +65.2 p = 9.2e-23 FAIL !!
BCFN(0+3,13-6,T) R= +15.4 p = 1.0e-5 mildly suspicious
DC6-9x1Bytes-1 R= +59.1 p = 4.2e-33 FAIL !!!
DC6-6x2Bytes-1 R= +34.1 p = 9.0e-19 FAIL !
DC6-5x4Bytes-1 R= +15.2 p = 7.7e-8 very suspicious
[Low4/16]BCFN(0+1,13-6,T) R= +12.0 p = 1.5e-4 unusual
[Low4/16]FPF-14+6/64:(4,14-8) R= +9.2 p = 1.1e-6 unusual
[...]
[Low8/32]FPF-14+6/4:(9,14-9) R= +27.4 p = 1.6e-17 FAIL
[Low8/32]FPF-14+6/4:(10,14-10) R= +16.4 p = 2.5e-9 suspicious
[Low8/32]FPF-14+6/4:all R=+283.4 p = 8.4e-255 FAIL !!!!!!
[Low8/32]Gap-16:A R=+414.8 p = 2.4e-336 FAIL !!!!!!!
[Low8/32]Gap-16:B R= +1736 p = 5e-1320 FAIL
!!!!!!!!{noformat}
You can also offset one of generator by hundred of iterations, but the
sequences are so correlated that the result won't change. If you peek at the
state of the two generators you'll see that their difference is constant.
I think the reader should be made aware of the danger. If you start several
generators of this kind the state is too small to guarantee that there will be
no overlap. Once you get overlap, since there are in practice just two
sequences, you will get a lot of unwanted correlation.
There's a reason why nobody in the last decades ever considered creating
"streams" using the constant part of an LCG, and it's that people realized very
early that it doesn't work (see, e.g.,
[https://ieeexplore.ieee.org/document/718715]).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)