Re: [computer-go] Rave coefficient

2009-06-30 Thread Sylvain Gelly
On Tue, Jun 30, 2009 at 12:47 AM, Peter Drake wrote:
> A while back, Sylvain Gelly posted this code:
>
> ChooseMove(node, board) {
>  bias = 0.015  // I put a random number here, to be tuned
>  b = bias * bias / 0.25
>  best_value = -1
>  best_move = PASSMOVE
>  for (move in board.allmoves) {
>    c = node.child(move).counts
>    w = node.child(move).wins
>    rc = node.rave_counts[move]
>    rw = node.rave_wins[move]
>    coefficient = 1 - rc / (rc + c + rc * c * b)
>    // please take care of the c == 0 and rc == 0 cases here
>    value = w / c * coefficient + rw / rc * (1 - coefficient)
>    if (value > best_value) {
>      best_value = value
>      best_move = move
>    }
>  }
>  return best_move
> }
>
> From this, it appears that each node knows about its own counts and wins, as
> well as the rave counts and wins of its children.
>
> I understand (correct me if I'm wrong!) that the "value" line is a weighted
> sum of the win rate among actual moves and the win rate among RAVE moves.
>
> My question is: what is the meaning of this line?
>
>    coefficient = 1 - rc / (rc + c + rc * c * b)
>
> Why this formula?

You can look at an earlier thread on this list:
http://computer-go.org/pipermail/computer-go/2008-February/014095.html
Better still, read the attachment, which explains the formula:
http://computer-go.org/pipermail/computer-go/attachments/20080208/6519e9c5/rave.pdf
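
The attachment is the place to look for the derivation. As a rough reading
of the posted formula only, the coefficient is the weight given to the real
win rate w / c, and the RAVE win rate rw / rc gets the rest. Plain algebra
on the expression shows that the coefficient is 0 when c == 0, moves toward
1 as c grows, and, when rc is very large, approaches c * b / (1 + c * b), so
the two estimates get about equal weight near c = 1 / b. A minimal Python
sketch for trying this out (the function name real_weight and the sample
counts are made up for illustration, they are not from the posted code):

```python
def real_weight(c, rc, bias=0.015):
    """Weight on the real win rate w / c; the RAVE win rate gets 1 - this."""
    b = bias * bias / 0.25
    if c == 0 and rc == 0:
        # The expression is 0 / 0 here, so the caller has to decide what to do.
        raise ValueError("undefined with no real and no RAVE playouts")
    return 1 - rc / (rc + c + rc * c * b)


# Hypothetical counts: watch the weight shift toward the real win rate
# as the number of real playouts c grows while rc stays fixed.
for c in (0, 10, 100, 1000, 10000):
    print(c, round(real_weight(c, rc=1000), 3))
```

With the placeholder bias = 0.015, b is 0.0009, so the crossover for a
heavily sampled RAVE value sits at roughly 1100 real playouts. The bias
value was posted as something to be tuned, so that figure is only as good
as the placeholder.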

Hope this helps,
Sylvain
>
> Thanks for any help you can offer,
>
> Peter Drake
> http://www.lclark.edu/~drake/
___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


[computer-go] Rave coefficient

2009-06-30 Thread Peter Drake

A while back, Sylvain Gelly posted this code:

ChooseMove(node, board) {
  bias = 0.015  // I put a random number here, to be tuned
  b = bias * bias / 0.25
  best_value = -1
  best_move = PASSMOVE
  for (move in board.allmoves) {
    c = node.child(move).counts
    w = node.child(move).wins
    rc = node.rave_counts[move]
    rw = node.rave_wins[move]
    coefficient = 1 - rc / (rc + c + rc * c * b)
    // please take care of the c == 0 and rc == 0 cases here
    value = w / c * coefficient + rw / rc * (1 - coefficient)
    if (value > best_value) {
      best_value = value
      best_move = move
    }
  }
  return best_move
}
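
For anyone who wants to run the pseudocode above, here is a minimal Python
sketch of it. Everything outside the posted formula is an assumption made
for illustration: the Child and Node classes, the dict-based storage, the
PASS_MOVE stand-in, and in particular the handling of the zero-count cases,
which the original comment leaves to the reader. The unvisited_value
default of 0.5 is a hypothetical choice, not something from the post.

```python
from dataclasses import dataclass, field

PASS_MOVE = "pass"  # stand-in for PASSMOVE in the posted pseudocode


@dataclass
class Child:
    counts: int = 0  # real playouts through this child
    wins: int = 0    # real wins through this child


@dataclass
class Node:
    children: dict = field(default_factory=dict)     # move -> Child
    rave_counts: dict = field(default_factory=dict)  # move -> RAVE playouts
    rave_wins: dict = field(default_factory=dict)    # move -> RAVE wins


def choose_move(node, moves, bias=0.015, unvisited_value=0.5):
    b = bias * bias / 0.25
    best_value = -1.0
    best_move = PASS_MOVE
    for move in moves:
        child = node.children.get(move, Child())
        c, w = child.counts, child.wins
        rc = node.rave_counts.get(move, 0)
        rw = node.rave_wins.get(move, 0)
        if c == 0 and rc == 0:
            value = unvisited_value  # assumption: no data at all for this move
        elif c == 0:
            value = rw / rc          # the formula gives coefficient 0 here
        elif rc == 0:
            value = w / c            # the formula gives coefficient 1 here
        else:
            coefficient = 1 - rc / (rc + c + rc * c * b)
            value = w / c * coefficient + rw / rc * (1 - coefficient)
        if value > best_value:
            best_value = value
            best_move = move
    return best_move
```

The two single-zero branches simply spell out what the formula already
gives when only one of the counts is zero, while avoiding the division by
zero in the unused term. Only the case with both counts at zero needs a
real design decision.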

From this, it appears that each node knows about its own counts and wins,
as well as the rave counts and wins of its children.

I understand (correct me if I'm wrong!) that the "value" line is a weighted
sum of the win rate among actual moves and the win rate among RAVE moves.


My question is: what is the meaning of this line?

coefficient = 1 - rc / (rc + c + rc * c * b)

Why this formula?

Thanks for any help you can offer,

Peter Drake
http://www.lclark.edu/~drake/


