Peter Drake wrote:
I've decided to bite the bullet and implement UCT in Orego, since (a)
everyone else appears to be using it and (b) not being a statistician,
improving on this is probably not where I'm likely to make a
contribution.
Here's my understanding of UCT. It is an algorithm for choosing the
best move from a given parent node. Let
p = # of runs through parent node
r = # of runs through move in question
w = # of wins through move in question
UCT proposes that we choose the node with the highest sum
average + uncertainty
where
average = w/r
and
uncertainty = sqrt(2 * logn(p) / r)
Is this correct? Are there any things to watch out for?
You can try multiplying uncertainty by a well-chosen constant value.
This way, you can tune how selective your search is. I found that using
a constant < 1 improves the search on 9x9 for Crazy Stone (I use
1/sqrt(10), if I recall correctly). I wonder what is the experience of
other UCT programmers, by the way.
Also, are people pruning any move for which (average + uncertainty) is
less than the (average - uncertainty) of some other move?
I am not sure what you mean. In UCT, you only search the move that has
the highest average + uncertainty, and you don't search any other at all.
-- Rémi
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/