On 07/16/2013 06:01 PM, Guy Steele wrote:
On Jul 16, 2013, at 3:07 AM, Martin Buchholz <marti...@google.com> wrote:
Read Appleby and Stafford ...
Hmmm.... mix32 has almost the same job as mix64 - have all the bits in the seed
affect all the bits in the output, so the obvious implementation is
mix32() { return (int) mix64(); }
or more conservatively
mix32() { m = mix64(); return (int)m ^ (int)(m >>> 32); }
although it would be slightly distressing to have nextInt be more expensive
than nextLong.
Yes---at first I used exactly
mix32() { return (int) mix64(); }
but was worried about the time it took, which is why I searched for a cheaper
mixing function.
There might be some clever way of doing murmurhash-style mixing when the input
is 64-bit and the output is 32-bit that can do better, but I don't think the
current mix32 is it.
Again, I was astonished that so simply a mixing function seemed to do the job
as far as Dieharder is concerned. I agree that it does appear to perform
poorly for sparse gamma values. So now the question is: do such sparse gamma
values arise in practice?
As Doug has pointed out, we carefully made the two-argument constructor private
so that the user cannot specify gamma values. The gamma values are derived
only from the gamma-seed 0, using the gamma value GAMMA_GAMMA, and always
running the seed through mix64.
I wrote a little test (code and output below) to generate the first 16 billion
(well, 2^34+1) gamma values actually generated and gather certain statistics:
(1) the number of 1-bits in the gamma value (output of Long.bitCount)
(2) the "width" of the gamma value (64 - Long.numberOfLeadingBits(gamma) -
Long.numberOfTrailingBits(gamma))
Then it histograms the bit counts, and for each possible bit count tracks the
minimum width for gammas having that same bit count.
Whenever the number of of gammas examined is one more than a power of 2, it
prints a report consisting of those two tables and the min over the minwidth
table.
Here's a brief summary:
For the first 65 gamma values, no gamma value has fewer than 21 1-bits or more
than 41 1-bits. The minimum width is 57. So I predict there will be no weird
mixing problems at all when using SplittableRandom to do a balanced split of a
stream of generated randoms.
For the first 2^16+1 = 65537 gamma values, the minimum bit count is 15, the
maximum bit count is 50 and the minimum width is 45.
For the first 2^24+1 gamma values, the minimum bit count is 12, the maximum bit
count is 52 and the minimum width is 35.
For the first 2^32+1 gamma values, the minimum bit count is 9, the maximum bit
count is 55 and the minimum width is 28.
For the first 2^34+1 gamma values, the minimum bit count is 8, the maximum bit
count is 56 and the minimum width is 28.
So I think we will not hit any really bad gamma values in practice.
--Guy
Hi,
Would it be possible to skip gammas with less than 12 or more than 52 bits?
I did some playing with dieharder a week ago, testing SR.nextInt()
(mix32) and indeed for sparse gamma values, all dieharder tests show
failure. In particular test id=1 (diehard_operm5) is the least
forgiving, requiring minimum gamma bit count being about 11 and maximum
about 54 in order for such "random" gamma to produce nextInt() bit
streams that pass the diehard_operm5 test. Here's a sample of the first
3 dieharder tests (0,1 and 2) for differently sparsed "random" gammas:
https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextInt_results_0to2.txt
The alternative:
public int nextIntAlt1() {
return (int) mix64(nextSeed());
}
... does not exhibit this weakness. It seems to be satisfied with gammas
having as few as 1 or as many as 63 bits set to pass dieharder tests:
https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt1_results_0to2.txt
But such nextInt is slower, approximately as fast as
ThreadLocalRandom.nextInt() by my measurements.
Random streams created by SplittableRandom.nextLong() (mix64) also don't
exhibit weaknesses with sparse gammas, so I thought why not using
mix64() to produce 64 bits of randomness that is then returned by two
consecutive calls to nextInt, like that:
private int nextInt;
private boolean nextIntZero;
public int nextIntAlt2() {
if (nextInt != 0) {
int result = nextInt;
nextInt = 0;
return result;
} else if (nextIntZero) {
nextIntZero = false;
return 0;
} else {
long nextLong = mix64(nextSeed());
nextInt = (int)(nextLong >>> 32);
if (nextInt == 0) nextIntZero = true;
return (int)(nextLong & 0xFFFFFFFFL);
}
}
The (little-Indian) bitstream produced by nextIntAlt2() is exactly the
same as that produced by nextLong() and 1st 3 dieharder tests pass with
sparse gammas too:
https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt2_results_0to2.txt
the measurements produced by micro benchmark show performance
somewhere between nextInt() and nextIntAlt1():
Benchmark Thr Cnt Sec Mean Mean
error Var Units
s.g.t.JmhTest.SR_nextInt 1 8 5 477779.670 1151.048
865504.410 ops/msec
s.g.t.JmhTest.SR_nextIntAlt1 1 8 5 385630.764 513.779
172439.034 ops/msec
s.g.t.JmhTest.SR_nextIntAlt2 1 8 5 432071.940 572.221
213899.640 ops/msec
s.g.t.JmhTest.TLR_nextInt 1 8 5 382334.416
28.907 545.871 ops/msec
Here's the JMH micro benchmark I used:
https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/JmhTest.java
Here's the code I used to produce dieharder reports:
https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/Test.java
which uses the following utility class to interface dieharder tool:
https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/si/pele/dieharder/DieharderTest.java
Regards, Peter