> I'm wondering whether the formula to determine the balance between RAVE and
> UCT,
>   beta = sqrt(c / (3 * parentVisits + c)),
> has any mathematical background - or is it just a best guess for something
> that starts at 1 and is 1/2 after a certain number of visits?
I guess it is simply a matter of parameter tuning. At least the constant 3
carries no meaning in the formula: with c2 = c/3, it can be rewritten as

  beta = sqrt(c2 / (parentVisits + c2))

> Another question is about the prior integration. Apparently the prior, RAVE
> and UCT values are three different estimators for the winning probability. So
> why not use the above formula for prior vs. RAVE balancing, too, instead of
> initializing RAVE with it?

Because the prior values do not change during simulations, unlike the RAVE and
UCT values. Of course there might be a more effective integration method, but
it would take a very long time to find it.

--
Yamato
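
A quick numeric check of the rescaling discussed above may help. The sketch below assumes the quoted formula is meant as sqrt(c / (3 * parentVisits + c)) and compares it with the c2 = c/3 form. The function names, the value c = 3000, and the linear blend of the RAVE and UCT values are illustrative assumptions, not taken from any particular program.

```python
import math


def beta_original(parent_visits: int, c: float) -> float:
    """Quoted form, read as beta = sqrt(c / (3 * parentVisits + c))."""
    return math.sqrt(c / (3 * parent_visits + c))


def beta_rescaled(parent_visits: int, c2: float) -> float:
    """Rescaled form with c2 = c / 3: beta = sqrt(c2 / (parentVisits + c2))."""
    return math.sqrt(c2 / (parent_visits + c2))


def blended_value(uct_value: float, rave_value: float, beta: float) -> float:
    """Assumed linear mix: beta weights RAVE, (1 - beta) weights UCT."""
    return beta * rave_value + (1.0 - beta) * uct_value


if __name__ == "__main__":
    c = 3000.0       # hypothetical tuning constant
    c2 = c / 3.0     # equivalent constant for the rescaled form
    for n in (0, 100, 1000, 3000, 30000):
        b1 = beta_original(n, c)
        b2 = beta_rescaled(n, c2)
        # The two forms should agree up to floating-point rounding.
        print(n, round(b1, 4), round(b2, 4), math.isclose(b1, b2))
```

Under this reading, beta is 1 at zero visits and reaches 1/2 when parentVisits equals c (that is, 3 * c2), so only one free constant is left to tune.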
