Re: [computer-go] Transpositions in Monte-carlo tree search

Jonas Kahn Tue, 31 Mar 2009 13:17:21 -0700

On Tue, 31 Mar 2009, Matthew Woodcraft wrote:

Jonas Kahn wrote:

You might be interested in this article, which gives a very complete and
tested answer. It also introduces the idea of grouping, though a good part
of that effect seems to me to come from giving a heuristic pre-value to
moves, which might be done more efficiently some other way:


eprints.pascal-network.org/archive/00004571/01/8057.pdf


Thank you (and to the others who replied).

The idea of backing a simulation's results up to all parents ('UCT3' in
that paper) seems very dangerous to me! It's a shame they didn't have
any Go results to show for that one.


No, there is no danger. That's the whole point of weighting with N_{s,a}.

N_{s,a} = number of times the node s has been visited coming from parent a.

You can write

Value of a node a = (\sum_{s \in sons(a)} N_{s,a} V_s) / (\sum_{s \in sons(a)} N_{s,a})

where V_s is ideally the «true» value of node s.
In UCT2, they use V_s = Q_{s,a}, the win average of simulations going
through a and then through s.
In UCT3, they use V_s = Q_s, the win average of all simulations through
s, whichever parent they came from.
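
The difference between the two choices of V_s can be sketched on a toy position where s is reached by transposition from another parent as well as from a. This is only an illustration under stated assumptions: the names Stats, Node and node_value are hypothetical helpers (not from the paper), all numbers are made up, and all win rates are taken from the same player's point of view.

```python
from dataclasses import dataclass, field


@dataclass
class Stats:
    """Visit and win counters (hypothetical helper, not from the paper)."""
    visits: int = 0
    wins: float = 0.0

    def mean(self) -> float:
        return self.wins / self.visits


@dataclass
class Node:
    # All simulations through this node, whatever the parent: gives Q_s.
    total: Stats = field(default_factory=Stats)
    # Son name -> counters for simulations that came through this node and
    # then through that son: gives N_{s,a} and Q_{s,a}.
    edges: dict = field(default_factory=dict)


def node_value(parent: Node, nodes: dict, mode: str) -> float:
    """Weighted average of V_s over the sons of `parent`, weights N_{s,a}.

    mode "uct2" uses V_s = Q_{s,a}; mode "uct3" uses V_s = Q_s.
    Assumes at least one son has been visited from `parent`.
    """
    if mode not in ("uct2", "uct3"):
        raise ValueError("mode must be 'uct2' or 'uct3'")
    num = 0.0
    den = 0
    for son, edge in parent.edges.items():
        if edge.visits == 0:
            continue  # unvisited sons carry zero weight
        v_s = edge.mean() if mode == "uct2" else nodes[son].total.mean()
        num += edge.visits * v_s
        den += edge.visits
    return num / den


# Made-up numbers: s was visited 10 times from a (7 wins) and 30 times
# from some other parent (15 wins); t is reachable only from a.
nodes = {
    "s": Node(total=Stats(visits=40, wins=22.0)),
    "t": Node(total=Stats(visits=10, wins=4.0)),
}
a = Node(
    total=Stats(visits=20, wins=11.0),
    edges={"s": Stats(visits=10, wins=7.0), "t": Stats(visits=10, wins=4.0)},
)

uct2 = node_value(a, nodes, "uct2")  # s contributes its a-only average, 0.70
uct3 = node_value(a, nodes, "uct3")  # s contributes its overall average, 0.55
print(f"UCT2 value of a: {uct2:.3f}")
print(f"UCT3 value of a: {uct3:.3f}")
```

In both modes s still counts for only N_{s,a} = 10 of the 20 visits of a, so the extra 30 simulations that reached s from elsewhere sharpen the estimate of V_s without inflating its weight in a.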

Assuming Markovianity (1), Q_s is a random variable with the same mean as Q_{s,a}, but lower variance. That's all.
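
The variance claim can be made concrete in an idealized setting. Assume, purely for illustration, that every simulation through s is an independent win/loss draw with the same win probability p, whichever parent it came from (real tree search changes its policy over time, so this is only a sketch). Writing N_s for the total number of visits of s, an assumed symbol not used above:

```latex
\mathbb{E}[Q_{s,a}] = \mathbb{E}[Q_s] = p,
\qquad
\operatorname{Var}(Q_{s,a}) = \frac{p(1-p)}{N_{s,a}},
\qquad
\operatorname{Var}(Q_s) = \frac{p(1-p)}{N_s},
\qquad
N_s = \sum_{a'} N_{s,a'} \ge N_{s,a}.
```

Under that assumption the two estimators agree on average, and Q_s is at least as precise because it pools the visits from every parent.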


Jonas
(1) This might be broken if you give a heuristic value to your move in
the tree based on how near it is to previous moves, but that's not
really important.