On Tue, Jun 30, 2009 at 12:47 AM, Peter Drake wrote:
> A while back, Sylvain Gelly posted this code:
>
> ChooseMove(node, board) {
> bias = 0.015 // I put a random number here, to be tuned
> b = bias * bias / 0.25
> best_value = -1
> best_move = PASSMOVE
> for (move in board.allmoves) {
> c = node.child(move).counts
> w = node.child(move).wins
> rc = node.rave_counts[move]
> rw = node.rave_wins[move]
> coefficient = 1 - rc / (rc + c + rc * c * b)
> value = w / c * coefficient + rw / rc * (1 - coefficient) // take care of the c == 0 and rc == 0 cases here
> if (value > best_value) {
> best_value = value
> best_move = move
> }
> }
> return best_move
> }
>
> From this, it appears that each node knows about its own counts and wins, as
> well as the rave counts and wins of its children.
>
> I understand (correct me if I'm wrong!) that the "value" line is a weighted
> sum of the win rate among actual moves and the win rate among RAVE moves.
>
> My question is: what is the meaning of this line?
>
> coefficient = 1 - rc / (rc + c + rc * c * b)
>
> Why this formula?
You can look at this thread on the list:
http://computer-go.org/pipermail/computer-go/2008-February/014095.html
and, better still, at the attachment that explains the formula:
http://computer-go.org/pipermail/computer-go/attachments/20080208/6519e9c5/rave.pdf
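
As a rough illustration of how the coefficient behaves, here is a minimal Python sketch of the two lines in question. It is not the code from the attachment. The function names, the explicit handling of the c == 0 and rc == 0 cases, and the fallback value of 0.5 for a move with no statistics at all are assumptions made for this example. The coefficient is the weight on the real win rate: it is 0 when the move has no real playouts (c == 0, so the value is pure RAVE) and it climbs towards 1 as c grows, so the RAVE estimate is gradually phased out.

```python
def rave_weight(c, rc, bias=0.015):
    """Weight given to the real win rate (the 'coefficient' in the post).

    c    -- number of real playouts through the child
    rc   -- number of RAVE playouts for the move
    bias -- tunable constant, same placeholder value as in the post
    """
    b = bias * bias / 0.25
    if rc == 0:
        # Assumption: with no RAVE data, rely on the real statistics only.
        return 1.0
    return 1.0 - rc / (rc + c + rc * c * b)


def blended_value(w, c, rw, rc, bias=0.015, prior=0.5):
    """Weighted sum of the real win rate w/c and the RAVE win rate rw/rc.

    prior is a hypothetical fallback for a move with no statistics at all.
    """
    if c == 0 and rc == 0:
        return prior
    if c == 0:
        return rw / rc      # only RAVE statistics are available
    if rc == 0:
        return w / c        # only real statistics are available
    coef = rave_weight(c, rc, bias)
    return w / c * coef + rw / rc * (1.0 - coef)


if __name__ == "__main__":
    # With rc fixed, the weight on the real win rate rises with c.
    for c in (0, 10, 100, 1000, 10000):
        print(c, rave_weight(c, rc=100))
```

The loop at the bottom just prints the weight for a fixed rc and increasing c, which is a quick way to see the effect of changing bias.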
Hope it helps,
Sylvain
>
> Thanks for any help you can offer,
>
> Peter Drake
> http://www.lclark.edu/~drake/
>
>
>
> ___
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>