Hi,
While recently implementing UCT, I've come across two cases where one may want
to increase the number of visits to a node by something greater than 1.
A) Leaf nodes - Artificially set the number of visits to some very high number.
The rationale for this is to accelerate propagation of the leaf condition "up
the tree" so as not to waste future time descending on parts of the tree which
have a known outcome. Does that make sense?
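To make A) concrete, here is a minimal Python sketch of what I mean, assuming
the usual UCB1 selection rule. The function name ucb1, the constant c and all
of the numbers are made up for illustration; they are not from the paper.

```python
import math


def ucb1(mean, nb, parent_nb, c=1.0):
    """UCB1 score: observed mean plus an exploration bonus."""
    return mean + c * math.sqrt(math.log(parent_nb) / nb)


parent_nb = 1000   # visits to the parent so far (illustrative)
leaf_nb = 10       # visits to a leaf already known to be a loss (mean 0.0)
boost = 1_000_000  # the artificial "very high number" from A)

# Score of the known-loss leaf with its honest visit count.
before = ucb1(0.0, leaf_nb, parent_nb)

# Same leaf after the boost; the extra visits are also added to the parent,
# i.e. propagated up the tree.
after = ucb1(0.0, leaf_nb + boost, parent_nb + boost)
```

With these numbers the exploration bonus of the leaf drops from roughly 0.83
to below 0.01, so the search should stop revisiting it just to "explore".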
B) In the section "UCT with Prior Knowledge" of the paper "Combining Online
and Offline Knowledge in UCT", Gelly/Silver mention seeding the visits with the
"equivalent experience" contained in the prior value function, e.g.
n(s,a) <- n-prior(s,a)
Therefore, I assume that in Gelly/Silver's updateValue(node, value), the line:
node[i].nb = node[i].nb + 1;
is replaced with something like:
node[i].nb = node[i].nb + k;
where k >= 1
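As a rough Python sketch of that reading (not the paper's code): I assume here
that each node keeps a running mean next to the visit count, and the names
Node, value and update_value, as well as the numbers, are my own. Seeding with
a prior then looks like a single update with k = n-prior on a fresh node,
followed by ordinary k = 1 updates.

```python
from dataclasses import dataclass


@dataclass
class Node:
    nb: float = 0.0     # visit count, as in node[i].nb
    value: float = 0.0  # running mean of the rewards seen so far (assumed)


def update_value(node: Node, reward: float, k: float = 1.0) -> None:
    """Fold `reward` into the running mean as if it had been seen k times."""
    node.nb += k
    node.value += k * (reward - node.value) / node.nb


# Seed a fresh node with a prior worth 50 visits at value 0.6 (made-up
# numbers), then apply two ordinary playout results with k = 1.
node = Node()
update_value(node, 0.6, k=50)
update_value(node, 1.0)
update_value(node, 0.0)
```

After these three calls the node holds 52 visits and a mean of 31/52, i.e. the
prior simply counts as 50 earlier playouts with an average result of 0.6.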
This raises two questions:
1) What is an effective way of dealing with leaf nodes? Is what I describe
above in A) typical?
2) Is it really OK to increase the number of visits and propagate them "up the
tree" this way? Does it break any of the statistical assumptions inherent in
the algorithm? I suppose it must be "OK" in that it is mentioned in the paper.
Sorry if these questions seem basic, but I think it's important to get the
details right.
Thanks, I appreciate any further insights.
-- Greg
_______________________________________________
computer-go mailing list
http://www.computer-go.org/mailman/listinfo/computer-go/