I wrote:
> Hideki Kato wrote:
>> Gunnar Farnebäck: <[EMAIL PROTECTED]>:
>>> Hideki Kato wrote:
>>>> I wasn't arguing against you, Álvaro; I just meant it as a caution
>>>> for programmers who will use your pseudocode as is. :)
>>>>
>>>> First, I prefer SFMT (SIMD-oriented Fast Mersenne Twister) rather
>>>> than integer pseudo random number generators in practice where the
>>>> quality of play-out is important.  Modern processors execute
>>>> floating-point operations as fast as integer ones, and
>>>>     picked = mt_rand() * (double) num_candidates;
>>>> is the simplest and safest.
>>> Please note that for uniformity purists this method has exactly the
>>> same problem as good_quality_int_rand() % num_candidates.
>>
>> Mt_rand() has very good uniform distributions in [0..1)
>> while
>> good_quality_int_rand() % num_candidates
>> doesn't distribute uniformly when num_candidates is not a power of 2,
>> assuming good_quality_int_rand() ranges over [0..2^32 or so), because
>> of the modulo operation.  They are not the same, are they?
>
> Well, there's nothing magic about floating point numbers. Even a very
> good uniform distribution in some interval is implemented by
> distributing N discrete values over the interval as uniformly as
> possible. When those N values by some mapping procedure are transformed
> into a smaller range M, some of those will get at least one more hit
> than some others, unless M divides N. It doesn't matter whether the
> mapping procedure is an integer modulo operation or a floating point
> multiplication + rounding.

It seems like this short explanation was too abstract. Let's try with
a more concrete one.

The function dsfmt_genrand_close1_open2 in the file dSFMT.h from
dSFMT-src-1.3, downloadable from
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html,
generates uniformly distributed double precision floating point
numbers in the interval 1 <= x < 2.  This is done by taking uniformly
distributed 52 bit integers and placing those as the least significant
bits of a 64 bit IEEE 754 double precision floating point number with
the bit pattern

0 01111111111 y,

where y denotes the 52 bits uniformly distributed integer. This is
interpreted as the floating point number with value

x = 1 + y / 2^52

The function dsfmt_genrand_close_open generates uniformly distributed
double precision floating point numbers in the interval 0 <= x < 1
simply by subtracting one from the numbers above. These are still
exactly representable by IEEE 754 double precision floating point
numbers, although their bit representations become somewhat more
involved due precisely to the point floating around.

Thus we now have the uniform integer distribution 0, ..., 2^52-1
mapped onto the floating point numbers 0, 1*2^(-52), 2*2^(-52), ...,
(2^52-1)*2^(-52).

Next we multiply this by num_candidates and round down to the nearest
integer. Let's say that num_candidates is 7. Then the 52 bit integers
are mapped onto the values 0, 1, 2, 3, 4, 5, 6 so that

               0 -  643371375338642 are mapped to 0
 643371375338643 - 1286742750677284 are mapped to 1
1286742750677285 - 1930114126015926 are mapped to 2
1930114126015927 - 2573485501354569 are mapped to 3
2573485501354570 - 3216856876693211 are mapped to 4
3216856876693212 - 3860228252031853 are mapped to 5
3860228252031854 - 4503599627370495 are mapped to 6

This means that moves 0 and 3 have a relative chance of
643371375338643 of being chosen, whereas moves 1, 2, 4, 5, and 6 have
a relative chance of only 643371375338642. Of course this is
negligible for almost any purpose, but it is exactly the same bias you
get by taking a uniformly distributed 52-bit integer modulo 7, except
that then it's the moves 0 and 1 that are favored instead of 0 and 3.

Well, that's what naive theoretical computations give at least, but
it's usually dangerous to trust intuition too much when dealing with
floating point numbers. Actually trying this out with the attached
program gives the following result on my computer,

0 0.000000 0
643371375338642 0.142857 0
643371375338643 0.142857 1
1286742750677284 0.285714 1
1286742750677285 0.285714 2
1930114126015926 0.428571 2
1930114126015927 0.428571 3
2573485501354568 0.571429 3
2573485501354569 0.571429 4
2573485501354570 0.571429 4
3216856876693211 0.714286 4
3216856876693212 0.714286 5
3860228252031853 0.857143 5
3860228252031854 0.857143 6
4503599627370495 1.000000 6

showing that 2573485501354569 maps to 4 rather than 3, meaning that
it's moves 0 and 4 which are favored.

I could go on but I think this should be enough to demonstrate my
point.

/Gunnar
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
  /* The boundary values from the table, plus the two neighbours of
     the interesting case 2573485501354569.  uint64_t rather than
     unsigned long, which may be only 32 bits. */
  uint64_t v[] = {
    0,
    643371375338642,
    643371375338643,
    1286742750677284,
    1286742750677285,
    1930114126015926,
    1930114126015927,
    2573485501354568,
    2573485501354569,
    2573485501354570,
    3216856876693211,
    3216856876693212,
    3860228252031853,
    3860228252031854,
    4503599627370495};
  uint64_t u;
  double x;
  int k;

  for (k = 0; k < (int) (sizeof(v) / sizeof(v[0])); k++) {
    u = v[k];
    /* Sign 0, biased exponent 0x3ff: a double in [1, 2) with v[k] as
       the 52 significand bits. */
    u |= UINT64_C(0x3ff0000000000000);
    /* Reinterpret the bits as a double; memcpy avoids the strict
       aliasing violation of a pointer cast. */
    memcpy(&x, &u, sizeof x);
    x -= 1.0;
    printf("%" PRIu64 " %f %d\n", v[k], x, (int) (x * 7));
  }
  return 0;
}
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
