I've decided to bite the bullet and implement UCT in Orego, since (a)
everyone else appears to be using it and (b) not being a
statistician, improving on this is probably not where I'm likely to
make a contribution.
Here's my understanding of UCT. It is an algorithm for choosing the
best move from a given parent node. Let
p = # of runs through parent node
r = # of runs through move in question
w = # of wins through move in question
UCT proposes that we choose the node with the highest sum
average + uncertainty
where
average = w/r
and
uncertainty = sqrt(2 * logn(p) / r)
Is this correct? Are there any things to watch out for?
Also, are people pruning any move for which (average + uncertainty)
is less than the (average - uncertainty) of some other move?
Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/