Remi Munos wrote:
I have updated the BAST paper, providing additional comparison with UCT, as suggested by one person in the list. See: https://hal.inria.fr/inria-00150207

After reading the paper, I have two questions:

How did you deal with unexplored nodes in Flat UCB and BAST in your
experiment? If you assign infinity to their bounds, the maximum is also
infinity, so all of the 2^D leaf nodes in the tree will be explored at
least once. This doesn't matter with Flat UCB, because the regret is
O(2^D) anyway, but it should matter with BAST, right?
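
To make the Flat UCB half of this concrete, here is a minimal sketch of what I mean. It is not the code from the paper: the function name, the Bernoulli rewards, and the sqrt(2 ln t / n) exploration term are my own assumptions for illustration. It only shows that an infinite bound for unvisited leaves forces one visit to each of the 2^D leaves before any leaf is visited twice.

```python
import math
import random


def flat_ucb_distinct_leaves(depth, rounds, seed=0):
    """Run flat UCB over the 2**depth leaves of a binary tree.

    Unvisited leaves get an infinite bound (the convention asked about).
    Returns the number of distinct leaves visited after `rounds` pulls.
    """
    rng = random.Random(seed)
    n_leaves = 2 ** depth
    # Hypothetical Bernoulli reward means, one per leaf.
    means = [rng.random() for _ in range(n_leaves)]
    counts = [0] * n_leaves
    sums = [0.0] * n_leaves

    def bound(i, t):
        if counts[i] == 0:
            return math.inf  # unexplored leaf: infinite upper bound
        mean = sums[i] / counts[i]
        return mean + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(1, rounds + 1):
        # Pick the leaf with the largest upper confidence bound.
        leaf = max(range(n_leaves), key=lambda i: bound(i, t))
        reward = 1.0 if rng.random() < means[leaf] else 0.0
        counts[leaf] += 1
        sums[leaf] += reward

    return sum(1 for c in counts if c > 0)


if __name__ == "__main__":
    depth = 4
    # After exactly 2**depth pulls every leaf has been visited once.
    print(flat_ucb_distinct_leaves(depth, 2 ** depth))
```

As long as any leaf is unvisited, its bound beats every finite bound, so the first 2^D pulls are spent on 2^D different leaves. Whether the same thing happens in BAST depends on how the infinite bound is combined with the smoothness term at internal nodes, which is what I am asking about.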

Have you considered a different definition of your smoothness
assumption? I think the bound on the difference between a sub-optimal
leaf reward and the optimal reward is not a good model for Go trees. It
might be better to model smoothness by assuming a high probability that
two leaf rewards in a branch are close to each other, but there might
still be a few outliers (losing moves in a won position).
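
One possible way to write down the two readings, as a rough sketch only. The notation here is my assumption and is not taken from the paper: mu* is the optimal leaf reward, mu_j and mu_k are rewards of leaves j and k in the same branch rooted at depth d, delta_d is a depth-dependent bound, and epsilon_d is a small outlier probability.

```latex
% Deterministic reading: every leaf j in the branch is close to optimal.
\mu^* - \mu_j \le \delta_d
  \quad \text{for all leaves } j \text{ in the branch}

% Probabilistic alternative: two leaves drawn from the same branch are
% close with high probability, which leaves room for a few outliers.
\Pr\bigl( \lvert \mu_j - \mu_k \rvert > \delta_d \bigr) \le \varepsilon_d
```

The second form tolerates a losing move in a won position, because such a leaf only has to be rare, not absent.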

- Markus


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
