[computer-go] UCT-MC process

compgo123 Sat, 11 Apr 2009 09:53:16 -0700

Following is a description of the nature and process of the UCT-MC method.


For a given Go board position P, let Np denote the total number of all possble 
end of game positions. Let Ei denote each of the end of game position 
(i=1,...,Np).

Let Mip denote all possible move sequences that starts from the position P and 
ends at position Ei.

Assume the most simple MC simulation is used and there is no prunning at all 
except the detection of the end of the game. Then the average of N number of MC 
simulation gives the following ratio.

F=(sum of Mip where black wins)/(sum of all Mip)

Now the question is for F > 0.5 does it mean that P is a wnning position for 
black? The anwser is not necessarily. Statistically for more than 50% of the 
cases it's true. This is the reason why the MC evaluation works. It's also the 
reason why MC alone cannot be used to evaluate the game. The reason above 
happens is that there exist narrow paths in the game space. The searching part 
of the UCT-MC method is actually trying to identify these narrow paths.

With the so called heavy playout the situationis a little dfferent. Here the 
playout itself gets involved in the identifcation of the narrow paths.

In most of the cases the rules used in the heavy playout are not perfect. This 
results in the incorrect prunning even in small number of the cases. Generally 
the searching part of the UCT-MC method can compensate for this error. However, 
one has to be carefull here, because it's not guaranted. For example,?it can 
hapen if?the playout and the searching part uses the same prunning rules. Could 
it be true that for more powerful computers one should use lighter playout?

DL

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

[computer-go] UCT-MC process

Reply via email to