On 07/30/2013 04:25 PM, Paul Sandoz wrote:
Hi Peter,

Given that the user cannot specify gamma values and the for the first  2^16+1 
gamma values the minimum bit count is 15, the maximum bit count is 50 and the 
minimum width is 45, is that not sufficient? i.e. is it likely there will be 
more than 65537 splits?

In the streams API one would be hard pressed to reach this limit, even for 
splitting an iterator. If i calculated things correctly with an arithmetic 
progression step size of 1 the limits of the max collection size will be 
reached (currently we use a step size of 1024) and one is likely to hit other 
problems way before due to issues with reduction on right-heavy trees.

Hi Paul,

In streams API it is unlikely, but SplittableRandom is a public API and might be used in other unforeseen scenarios. It's not hard to imagine submitting way more than 2^16 tasks to an Executor, giving each it's own SplittableRandom instance generated from a chain of splits...

If it is very rare to encounter a sparse gamma then it is also very cheap to skip it, right? It only takes Long.bitCount(gamma) at each split()...

Regards, Peter

Paul.
On Jul 29, 2013, at 7:29 PM, Peter Levart <peter.lev...@gmail.com> wrote:

On 07/16/2013 06:01 PM, Guy Steele wrote:
On Jul 16, 2013, at 3:07 AM, Martin Buchholz <marti...@google.com> wrote:
Read Appleby and Stafford ...

Hmmm.... mix32 has almost the same job as mix64 - have all the bits in the seed 
affect all the bits in the output, so the obvious implementation is

mix32() { return (int) mix64(); }
or more conservatively
mix32() { m = mix64(); return (int)m ^ (int)(m >>> 32); }
although it would be slightly distressing to have nextInt be more expensive 
than nextLong.
Yes---at first I used exactly
    mix32() { return (int) mix64(); }
but was worried about the time it took, which is why I searched for a cheaper 
mixing function.

There might be some clever way of doing murmurhash-style mixing when the input 
is 64-bit and the output is 32-bit that can do better, but I don't think the 
current mix32 is it.
Again, I was astonished that so simply a mixing function seemed to do the job 
as far as Dieharder is concerned.  I agree that it does appear to perform 
poorly for sparse gamma values.  So now the question is: do such sparse gamma 
values arise in practice?

As Doug has pointed out, we carefully made the two-argument constructor private 
so that the user cannot specify gamma values.  The gamma values are derived 
only from the gamma-seed 0, using the gamma value GAMMA_GAMMA, and always 
running the seed through mix64.

I wrote a little test (code and output below) to generate the first 16 billion 
(well, 2^34+1) gamma values actually generated and gather certain statistics:
(1) the number of 1-bits in the gamma value (output of Long.bitCount)
(2) the "width" of the gamma value (64 - Long.numberOfLeadingBits(gamma) - 
Long.numberOfTrailingBits(gamma))
Then it histograms the bit counts, and for each possible bit count tracks the 
minimum width for gammas having that same bit count.
Whenever the number of of gammas examined is one more than a power of 2, it 
prints a report consisting of those two tables and the min over the minwidth 
table.

Here's a brief summary:

For the first 65 gamma values, no gamma value has fewer than 21 1-bits or more 
than 41 1-bits.  The minimum width is 57.  So I predict there will be no weird 
mixing problems at all when using SplittableRandom to do a balanced split of a 
stream of generated randoms.

For the first 2^16+1 = 65537 gamma values, the minimum bit count is 15, the 
maximum bit count is 50 and the minimum width is 45.

For the first 2^24+1 gamma values, the minimum bit count is 12, the maximum bit 
count is 52 and the minimum width is 35.

For the first 2^32+1 gamma values, the minimum bit count is 9, the maximum bit 
count is 55 and the minimum width is 28.

For the first 2^34+1 gamma values, the minimum bit count is 8, the maximum bit 
count is 56 and the minimum width is 28.

So I think we will not hit any really bad gamma values in practice.

--Guy

Hi,

Would it be possible to skip gammas with less than 12 or more than 52 bits?

I did some playing with dieharder a week ago, testing SR.nextInt() (mix32) and indeed for sparse 
gamma values, all dieharder tests show failure. In particular test id=1 (diehard_operm5) is the 
least forgiving, requiring minimum gamma bit count being about 11 and maximum about 54 in order for 
such "random" gamma to produce nextInt() bit streams that pass the diehard_operm5 test. 
Here's a sample of the first 3 dieharder tests (0,1 and 2) for differently sparsed 
"random" gammas:

https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextInt_results_0to2.txt

The alternative:

    public int nextIntAlt1() {
        return (int) mix64(nextSeed());
    }

... does not exhibit this weakness. It seems to be satisfied with gammas having 
as few as 1 or as many as 63 bits set to pass dieharder tests:

https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt1_results_0to2.txt

But such nextInt is slower, approximately as fast as 
ThreadLocalRandom.nextInt() by my measurements.

Random streams created by SplittableRandom.nextLong() (mix64) also don't 
exhibit weaknesses with sparse gammas, so I thought why not using mix64() to 
produce 64 bits of randomness that is then returned by two consecutive calls to 
nextInt, like that:

    private int nextInt;
    private boolean nextIntZero;

    public int nextIntAlt2() {
        if (nextInt != 0) {
            int result = nextInt;
            nextInt = 0;
            return result;
        } else if (nextIntZero) {
            nextIntZero = false;
            return 0;
        } else {
            long nextLong = mix64(nextSeed());
            nextInt = (int)(nextLong >>> 32);
            if (nextInt == 0) nextIntZero = true;
            return (int)(nextLong & 0xFFFFFFFFL);
        }
    }

The (little-Indian) bitstream produced by nextIntAlt2() is exactly the same as 
that produced by nextLong() and 1st 3 dieharder tests pass with sparse gammas 
too:

https://raw.github.com/plevart/lambda-hacks/SplittableRandom/SplittableRandom/nextIntAlt2_results_0to2.txt

the measurements produced by micro benchmark show performance somewhere between 
nextInt() and nextIntAlt1():

Benchmark                      Thr    Cnt  Sec         Mean Mean error          
Var    Units
s.g.t.JmhTest.SR_nextInt         1      8    5 477779.670     1151.048   
865504.410 ops/msec
s.g.t.JmhTest.SR_nextIntAlt1     1      8    5 385630.764      513.779   
172439.034 ops/msec
s.g.t.JmhTest.SR_nextIntAlt2     1      8    5 432071.940      572.221   
213899.640 ops/msec
s.g.t.JmhTest.TLR_nextInt        1      8    5 382334.416       28.907      
545.871 ops/msec


Here's the JMH micro benchmark I used:

https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/JmhTest.java


Here's the code I used to produce dieharder reports:

https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/sr/Test.java

which uses the following utility class to interface dieharder tool:

https://github.com/plevart/lambda-hacks/blob/SplittableRandom/SplittableRandom/src/si/pele/dieharder/DieharderTest.java


Regards, Peter


Reply via email to