In reading Sylvain Gelly's thesis, it seems that incorporating a prior
estimate of winning percentage is very important to the practical
strength of Mogo. E.g., with 10000 trials, Mogo achieved a rating of
2110 on CGOS, whereas my program, which attempts to reproduce existing
research, is rated (maybe) 1900 with 20000 to 30000 trials. The use of
a prior is an important difference, so I want to understand it more
deeply.
Some questions:
1) When you create a node, do you initialize
    number of simulations = C
    number of wins = C * PriorEstimate()
where C is a constant > 0? In Sylvain's thesis, the optimal C = 50,
suggesting that incorporating a prior estimate was the equivalent of
50 UCT-RAVE trials.
2) Two variations were suggested. In one variation, the prior was
incorporated into the UCT statistics of the node. In the other, the
prior was incorporated into the RAVE statistics. The charts in the
thesis do not make clear which variation was actually being measured.
In some cases it appears to be the UCT version, but elsewhere it seems
to be the RAVE version. Does anyone know what was really done?
3) Elsewhere I have seen information suggesting that Mogo initializes
RAVE statistics to implement progressive widening. Does that conflict
with the use of a prior for RAVE initialization, or is it in addition
to the use of a prior for RAVE initialization?
4) When creating a node, do you estimate the prior for that node, or
for that node's children?
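
For concreteness, the initialization asked about in question 1 and the
two variants in question 2 could be sketched in Python as below. This
is only an illustration under assumptions: the Node fields and the
helpers new_node, record_result and uct_mean are hypothetical names,
not taken from Mogo or from the thesis, and the prior is passed in as
a plain number rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Node:
    visits: float = 0.0       # UCT simulation count
    wins: float = 0.0         # UCT win count
    rave_visits: float = 0.0  # RAVE simulation count
    rave_wins: float = 0.0    # RAVE win count


def new_node(prior: float, c: float = 50.0, target: str = "uct") -> Node:
    """Create a node seeded with c virtual trials at win rate `prior`.

    target="uct" puts the prior into the UCT statistics, target="rave"
    puts it into the RAVE statistics (the two variants in question 2).
    Assumption: the prior is a win probability in [0, 1].
    """
    node = Node()
    if target == "uct":
        node.visits = c
        node.wins = c * prior
    elif target == "rave":
        node.rave_visits = c
        node.rave_wins = c * prior
    else:
        raise ValueError("target must be 'uct' or 'rave'")
    return node


def record_result(node: Node, won: bool) -> None:
    """Add one real simulation result to the UCT statistics."""
    node.visits += 1
    node.wins += 1.0 if won else 0.0


def uct_mean(node: Node) -> float:
    """Mean UCT win rate; 0.5 is an arbitrary default for an empty node."""
    return node.wins / node.visits if node.visits > 0 else 0.5


node = new_node(prior=0.6, c=50.0, target="uct")
start = uct_mean(node)          # equals the prior before any real trial
for _ in range(50):
    record_result(node, won=False)
after_losses = uct_mean(node)   # 50 virtual + 50 real trials blended
```

With C = 50 and a prior of 0.6, the node starts at a mean of 0.6, and
50 real losses pull it down to 0.3. That is the sense in which the
prior would be "worth" 50 trials.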
Thanks in advance,
Brian
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/