Why are m and n different? Isn't every playout used both to update the UCT win rate and the RAVE values for the same nodes? Won't the number of UCT simulations and the number of RAVE simulations be the same?
Each playout is indeed used to update both the UCT win rate and the RAVE values for the same node,
but from one simulation you get _one_ result for the "UCT" win rate and _many_ RAVE values (lower-quality values, but a much larger number of results). So for any given move, m and n grow at different rates.
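
To make the counting concrete, here is a minimal sketch of a per-node update. It is not taken from any particular program: the `Node` class, the `update` function and the move names are all made up for illustration, and I am assuming the usual all-moves-as-first convention where the move actually chosen at the node also counts toward its own RAVE statistics.

```python
from collections import defaultdict


class Node:
    """Hypothetical per-node statistics, indexed by candidate move."""

    def __init__(self):
        self.uct_visits = defaultdict(int)    # m: playouts that started with this move
        self.uct_wins = defaultdict(float)
        self.rave_visits = defaultdict(int)   # n: playouts in which this move appeared
        self.rave_wins = defaultdict(float)


def update(node, first_move, later_moves, won):
    """Back up one playout result (won = 1.0 or 0.0) into a single node."""
    # One UCT result: only the move actually chosen at this node is updated.
    node.uct_visits[first_move] += 1
    node.uct_wins[first_move] += won

    # Many RAVE results: every distinct move the same player made from this
    # node onward is updated (assumption: the first move is included too).
    for move in {first_move, *later_moves}:
        node.rave_visits[move] += 1
        node.rave_wins[move] += won


# One playout: the player to move chose D4 here, then later played Q16, C3, K10.
node = Node()
update(node, "D4", ["Q16", "C3", "K10"], won=1.0)
print(sum(node.uct_visits.values()), sum(node.rave_visits.values()))
```

After this single playout the node holds 1 UCT sample in total but 4 RAVE samples, which is why the two counts for a given move are generally not equal.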
