On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
> ChooseMove(node, board) {
>   bias = 0.015  // I put a random number here, to be tuned
>   b = bias * bias / 0.25
>   best_value = -1
>   best_move = PASSMOVE
>   for (move in board.allmoves) {
>     c = node.child(move).counts
>     w = node.child(move).wins
>     rc = node.rave_counts[move]
>     rw = node.rave_wins[move]
>     coef = 1 - rc / (rc + c + rc * c * b)
>     // take care of the c == 0 and rc == 0 cases here
>     value = w / c * coef + rw / rc * (1 - coef)
>     if (value > best_value) {
>       best_value = value
>       best_move = move
>     }
>   }
>   return best_move
> }
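
For reference, a minimal compilable sketch of the selection rule quoted above,
with one possible handling of the c == 0 and rc == 0 cases (falling back to a
first-play-urgency constant). The Child/Node layout, the kFpu value and the
return convention are illustrative assumptions, not the actual implementation:

#include <limits>
#include <vector>

struct Child {
    double wins = 0.0;        // wins accumulated through this move
    double counts = 0.0;      // visits of this child
    double rave_wins = 0.0;   // AMAF/RAVE wins for this move
    double rave_counts = 0.0; // AMAF/RAVE visits for this move
};

struct Node {
    std::vector<Child> children;  // one entry per legal move
};

constexpr double kBias = 0.015;  // "a random number, to be tuned"
constexpr double kFpu  = 1.0;    // first-play urgency for unvisited moves (assumed)

// Returns the index of the selected child; pass handling is omitted.
int ChooseMove(const Node& node) {
    const double b = kBias * kBias / 0.25;
    double best_value = -std::numeric_limits<double>::infinity();
    int best_move = -1;
    for (int m = 0; m < static_cast<int>(node.children.size()); ++m) {
        const Child& ch = node.children[m];
        const double c  = ch.counts;
        const double rc = ch.rave_counts;
        double value;
        if (c == 0.0 && rc == 0.0) {
            value = kFpu;               // no statistics at all: urgency value
        } else if (c == 0.0) {
            value = ch.rave_wins / rc;  // only RAVE statistics available
        } else if (rc == 0.0) {
            value = ch.wins / c;        // only direct statistics available
        } else {
            const double coef = 1.0 - rc / (rc + c + rc * c * b);
            value = ch.wins / c * coef + ch.rave_wins / rc * (1.0 - coef);
        }
        if (value > best_value) {
            best_value = value;
            best_move = m;
        }
    }
    return best_move;
}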

Hi,

it seems to me that, when you select a move in the tree, you don't have an
exploration component; you use just a weighted average of the score and the
RAVE score.
So, if:
  - the best move is good only if played immediately and very bad if
    played later in the game, and
  - the first playout for this move resulted in a loss,
then both its score and its RAVE score will be very low, and this move will
not be considered again for a very long time.

Is this simplified code, where in reality you replace w/c and rw/rc with
scores that include an exploration component, or did you really use it as is?
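
To make the question concrete, the kind of value I have in mind would add a
UCB-style exploration bonus on top of the blended score, roughly like the
sketch below (the exploration constant and the parent visit count are just
placeholders, not a claim about your implementation):

#include <cmath>

// Blended RAVE/MC value plus a UCB1-style exploration bonus.
// The c == 0 and rc == 0 cases are omitted here for brevity.
double ExploringValue(double wins, double counts,
                      double rave_wins, double rave_counts,
                      double b, double parent_counts,
                      double k_explore) {
    const double coef = 1.0 - rave_counts /
        (rave_counts + counts + rave_counts * counts * b);
    const double blended = wins / counts * coef
                         + rave_wins / rave_counts * (1.0 - coef);
    // Bonus grows for rarely tried moves relative to their parent's visits.
    const double bonus = k_explore * std::sqrt(std::log(parent_counts) / counts);
    return blended + bonus;
}

With such a bonus, a move whose first playout happened to be a loss is still
revisited once its visit count lags far enough behind its siblings'.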

Tom

-- 
Thomas Lavergne                    "Entia non sunt multiplicanda praeter
                                     necessitatem." (Guillaume d'Ockham)
thomas.laver...@reveurs.org                            http://oniros.org
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
