Hi.  This is a continuation of a month-old conversation about the
possibility that the quality of AMAF Monte Carlo can degrade as the
number of simulations increases:

Me:  "running 10k playouts can be significantly worse than running 5k playouts."

On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey <drdai...@cox.net> wrote:
> On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
>> On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
>> <michaelwilliam...@gmail.com> wrote:
>> > It doesn't make any sense to me from a theoretical perspective.  Do you 
>> > have
>> > empirical evidence?
>>
>> I used to have data on this, from a program that I think was very
>> nearly identical to Don's reference spec.  When I get a chance, I'll
>> try to reproduce it.
>
> Unless the difference is large, you will have to run thousands of games
> to back this up.
>
> - Don

I am comparing the behavior of the AMAF reference bot with 5000
playouts against its behavior with 100000 playouts, considering only
the first ten moves (five from each player) of each 9x9 game.  I
downloaded a copy of Don's reference bot, as well as a copy of Mogo,
which serves as the opponent for both settings.  gnugo version 3.7.11
judges which side (jrefgo or mogo) is ahead after each match; I used
gnugo because it is easy to set up for this sort of thing via
command-line options, and it seems plausible that it gives a
reasonably realistic assessment of the position.

jrefgo always plays black, and Mogo plays white.  Komi is set to 0.5,
so that jrefgo has a reasonable number of winning lines available to
it, although the general superiority of Mogo means that egregiously
bad individual moves will be punished.

In the games played, Mogo would occasionally crash.  (This was run
under Windows Vista; perhaps the binary I downloaded is somehow
incompatible.)  I have discarded these games (roughly 1 in 50, I
think) from the statistics.  As far as I can tell, there is no reason
to think this skews the comparison between 5k playouts and 100k
playouts.  Aside from the occasional crash, Mogo's behavior seemed
reasonable in the games I observed, and I have no reason to think it
was not playing at a relatively high level in the retained results.

Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead
after 10 moves, as estimated by gnugo) 1688 of them.  (46.4%)
Out of 2949 matches using 100k playouts, jrefgo won 785.  (26.6%)
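As a sanity check on Don's point about sample size, here is a rough
two-proportion z-test on the counts above (a Python sketch I am adding
for illustration; the choice of test is mine, not part of the test
harness).  It confirms the gap is far larger than sampling noise:

```python
import math

# Win counts reported above (jrefgo wins / total retained matches).
wins_5k, n_5k = 1688, 3637
wins_100k, n_100k = 785, 2949

p1 = wins_5k / n_5k        # ~0.464
p2 = wins_100k / n_100k    # ~0.266

# Two-proportion z-test with pooled variance.
p_pool = (wins_5k + wins_100k) / (n_5k + n_100k)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_5k + 1 / n_100k))
z = (p1 - p2) / se

print(f"5k win rate:   {p1:.3f}")
print(f"100k win rate: {p2:.3f}")
print(f"z = {z:.1f}")  # roughly 16.5, far beyond any usual threshold
```

With a z-score this large, the difference between the two settings
cannot plausibly be a fluke of sampling.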

It seems clear that increasing the number of playouts from 5k to 100k
degrades the performance of jrefgo.  Below are the commands I used to
run the tests and tally the results.

Weston


$ cat scratch5k.sh

../gogui-1.1.3/bin/gogui-twogtp -auto \
  -black "\"C:\\\\Program Files\\\\Java\\\\jdk1.6.0_06\\\\bin\\\\java.exe\" -jar jrefgo.jar 5000" \
  -games 10000 -komi 0.5 -maxmoves 10 \
  -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
  -sgffile games/jr5k-v-mogo -size 9 \
  -white C:\\\\cygwin\\\\home\\\\Experience\\\\projects\\\\go\\\\MoGo_release3\\\\mogo.exe


$ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1688

$ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1949

$ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
785

$ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
2164
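The same tally can be sketched in Python.  The sample lines below are
a hypothetical stand-in for the gogui-twogtp .dat format; only the
"B+"/"W+" markers and the "unexpected" flag matter, matching the grep
patterns above:

```python
def tally(lines):
    """Count Black and White wins, skipping aborted ("unexpected") games."""
    black = sum(1 for line in lines if "B+" in line and "unexp" not in line)
    white = sum(1 for line in lines if "W+" in line and "unexp" not in line)
    return black, white

# Hypothetical .dat lines, for illustration only.
sample = [
    "0 - B+2.5 ...",           # Black (jrefgo) win
    "1 - W+7.5 ...",           # White (mogo) win
    "2 - B+? unexpected ...",  # aborted game, discarded
]
print(tally(sample))  # -> (1, 1)
```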
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/