Re: [computer-go] More UCT / Monte-Carlo questions (Effect of rave)

Sylvain Gelly Wed, 06 Feb 2008 07:41:42 -0800

Hi Erik,

> Thanks for your reply! How do you like your new job? Do you miss CompGo? ;-)
I like it very much, thanks. Do you want to come? ;) I miss computer
go and all of you, but I read the list (not all of it) from time to
time, so I still keep up a little :). I just can't do too many things
at the same time.


> Well, since you say the improvement is marginal on 9x9 then I think we
> are actually in agreement. I also get an improvement, but it's just
> not that much. When I wrote 'spectacular' I meant the reported jump
> from 25% to over 55% winrate against gnugo.

I am sorry, I was unclear. When I said marginal in 9x9, I was talking
about the difference between the current MoGo (well, at least as it
was when I left it 6 months ago :)) and the algorithm described in the
ICML paper. The only change is in how to balance the two values you
get (UCT and Rave).
Rave in 9x9, as described in the paper, gives a big jump in
performance, and the numbers reported in the paper are accurate: they
were computed over many thousands of games against gnugo, and I
carefully repeated the experiments multiple times.
In addition, Hideki, for example, reports the same order of
improvement.
I have to point out that it is really easy to make a mistake in the
updates that makes Rave much less useful. I am definitely not saying
that you or anyone else made a mistake, but it can just happen
sometimes :).
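
To make the kind of update mistake I mean concrete, here is a minimal
sketch of a Rave (all-moves-as-first) update and of one way to balance
the UCT and Rave values. It is not MoGo's code. The class and function
names, the neutral 0.5 default, the constant k and the weighting
schedule beta = sqrt(k / (3n + k)) are all assumptions made for
illustration. The usual slips are crediting moves of the wrong colour,
or counting a point that the other side had already played first.

```python
import math
from collections import defaultdict


class Node:
    """Statistics for one position. All field names are illustrative."""

    def __init__(self):
        self.visits = defaultdict(int)       # real playouts through each move
        self.wins = defaultdict(float)       # wins among those playouts
        self.rave_visits = defaultdict(int)  # Rave sample count per move
        self.rave_wins = defaultdict(float)  # Rave wins per move


def update_rave(node, to_move, later_moves, winner):
    """Credit moves that `to_move` played later in the same playout.

    later_moves is a list of (colour, move) pairs played after `node`,
    in playout order. A point counts only the first time it is played,
    and only if the colour playing it is the side to move at `node`.
    """
    seen = set()
    won = 1.0 if winner == to_move else 0.0
    for colour, move in later_moves:
        if move in seen:
            continue
        seen.add(move)  # the point is used up, whoever played it first
        if colour != to_move:
            continue    # never credit the opponent's moves to this node
        node.rave_visits[move] += 1
        node.rave_wins[move] += won


def blended_value(node, move, k=1000):
    """Mix Rave and UCT means; k is a hypothetical tuning constant."""
    n = node.visits[move]
    rn = node.rave_visits[move]
    total = sum(node.visits.values())       # playouts through this node
    beta = math.sqrt(k / (3 * total + k))   # assumed schedule, 1 -> 0
    q_uct = node.wins[move] / n if n else 0.5
    q_rave = node.rave_wins[move] / rn if rn else 0.5
    return beta * q_rave + (1.0 - beta) * q_uct
```

With no real playouts beta is 1, so the Rave mean decides alone; as
the node collects playouts, the weight shifts toward the UCT mean.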

> >  That could be an explanation, but there are two points:
> >  - the prior you put on top of Rave often avoids sampling 1-1
> >  first, and even when you do, you very often lose just 1 playout
> >  because of the UCT value you get right away.
>
> Yes, using more prior knowledge will probably reduce the problem.

You don't even need "more": save-atari, atari, hane and cut already
give you enough in many cases to avoid 1-1. Of course, more good
knowledge is better :). My point is just that you don't need much to
avoid trying 1-1 first. And even without prior knowledge it is not
terrible. If you look at the numbers in the paper, adding this very
simple prior knowledge does not improve things that much.
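
As a rough illustration of how little is needed, here is a minimal
sketch that seeds a move's statistics with a few virtual samples when
it matches one of those simple features. The function name, the
feature labels, the number of virtual samples and the 0.7 / 0.5 prior
means are all hypothetical values chosen for the example, not the
settings used in the paper or in MoGo.

```python
# Feature labels are illustrative stand-ins for simple tactical checks.
FAVOURED = {"save_atari", "atari", "hane", "cut"}


def seed_prior(stats, move, features, n_virtual=20):
    """Initialise stats[move] = [wins, visits] with virtual samples.

    A move matching any favoured feature starts with a higher mean
    (0.7, an assumed value) than a featureless move (0.5), so a move
    like 1-1 with no feature is not the first one to be tried.
    """
    mean = 0.7 if FAVOURED & set(features) else 0.5
    stats[move] = [mean * n_virtual, n_virtual]
    return stats[move]
```

For example, after seed_prior(stats, "C3", {"atari"}) and
seed_prior(stats, "A1", set()), the move C3 starts with the higher
mean and is sampled before A1.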


> >  - I never observed a big discrepancy between the number of Rave
> >  samples for each move.
>
> I guess this is because your playout policy is more uniform than mine?
> The problem tends to disappear with uniform random playouts.
> My program has some hard-reject patterns to discard moves that are
> strictly inferior to adjacent alternatives, so in some situations I
> can easily get a large difference between the number of Rave samples
> for each move.

It is definitely a sensible hypothesis.

Best,
Sylvain