I use the first formula, with K equal to 500.

> -----Original Message-----
> From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Peter Drake
> Sent: Monday, September 14, 2009 10:59 AM
> To: Computer Go
> Subject: [computer-go] Conflicting RAVE formulae
>
> Gelly and Silver ("Combining Online and Offline Knowledge in UCT",
> section 6) give this formula for the weight given to RAVE values (as
> opposed to the direct MC values):
>
> sqrt(k / (3*n(s) + k))
>
> Here, k is a constant and n(s) is the number of playouts through state
> s. Clearly, as the number of playouts increases, this approaches zero.
>
> Helmbold and Parker-Wood ("All-Moves-As-First Heuristics in Monte-Carlo
> Go") cite the Gelly and Silver paper, but give a different formula!
> Adjusting for notation, they use:
>
> (k - n(s)) / k, or 0 if this expression is negative
>
> This also converges toward (and then sticks at) zero, but it is not
> the same formula.
>
> Why are they different? Does it matter? Is there an explanation
> anywhere for Gelly and Silver's more elaborate formula? Is there
> anything wrong with k / (n(s) + k)?
>
> On a related note, in a message on this list, David Silver gives a
> newer formula:
>
> http://computer-go.org/pipermail/computer-go/2009-May/018251.html
>
> Was this ever published? (Orego is using this newer formula, and it
> appears to work well.)
>
> Peter Drake
> http://www.lclark.edu/~drake/
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
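
For anyone who wants to compare the three weight schedules from the quoted message side by side, here is a rough sketch in Python. It is only an illustration, not code from any of the programs mentioned. The function names are made up, and the default k = 500 is simply the value given in the reply, on the assumption that "the first formula" means the Gelly and Silver one. The blending function assumes the usual convention that the RAVE value gets weight beta and the direct MC value gets weight 1 - beta.

```python
import math


def gelly_silver_weight(n, k=500.0):
    """RAVE weight sqrt(k / (3*n + k)), where n is the playout count through s."""
    return math.sqrt(k / (3.0 * n + k))


def helmbold_parker_wood_weight(n, k=500.0):
    """RAVE weight (k - n) / k, clamped to 0 once n exceeds k."""
    return max(0.0, (k - n) / k)


def simple_weight(n, k=500.0):
    """The simpler alternative asked about in the message: k / (n + k)."""
    return k / (n + k)


def blended_value(mc_value, rave_value, n, weight_fn=gelly_silver_weight, k=500.0):
    """Mix RAVE and direct MC values (assumed convention: beta weights RAVE)."""
    beta = weight_fn(n, k)
    return beta * rave_value + (1.0 - beta) * mc_value


if __name__ == "__main__":
    # Print the three weights at a few playout counts to see how they decay.
    for n in (0, 100, 500, 5000):
        print(
            n,
            round(gelly_silver_weight(n), 3),
            round(helmbold_parker_wood_weight(n), 3),
            round(simple_weight(n), 3),
        )
```

All three start at 1 with no playouts and fall toward 0 as n grows. The linear formula reaches exactly 0 at n = k, while the other two only approach it.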