I've decided to bite the bullet and implement UCT in Orego, since (a) everyone else appears to be using it and (b) not being a statistician, improving on this is probably not where I'm likely to make a contribution.

Here's my understanding of UCT. It is an algorithm for choosing the best move from a given parent node. Let

p = # of runs through parent node
r = # of runs through move in question
w = # of wins through move in question

UCT proposes that we choose the node with the highest sum

average + uncertainty

where

average = w/r

and

uncertainty = sqrt(2 * logn(p) / r)

Is this correct? Are there any things to watch out for?

Also, are people pruning any move for which (average + uncertainty) is less than the (average - uncertainty) of some other move?

Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/



_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to