Hi again ;)

I found some time to actually implement this, and it has raised a few
small questions. In this part of the code:

  for (j = index; j < moves_played_in_tree.size(); j += 2) {
    //stuff....
  }
  for (j = 0; j < moves_played_out_tree.size(); ++j) {
    //more stuff

    // If it is either not the right color or the intersection is
    // already played we ignore that move for that node
    if (move < 0 || already_played.AlreadyPlayed(move)) continue;

    already_played.Play(move);
    //stuff
  }

1. Shouldn't the first loop start at j=index+1? Starting at j=index means
that the RAVE value of the node is updated with the node's own move,
doesn't it? It makes more sense to me to start at the first child of the
node that is being backed up. Correct me if I'm wrong.
2. Shouldn't the order in the second loop be:
- if (already played): continue;
- update already played;
- if (wrong color): continue;
Otherwise, moves of the wrong color are never marked as already played
(because the update is skipped for them). I'm not sure whether it makes a
difference in this case, because you check in the playouts too, but maybe
it does.
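
To make the ordering in question 2 concrete, here is a minimal sketch of what I have in mind. It is not the code from the thread: the Move struct, the right_color flag, and the RaveCandidates function are made-up names for illustration, and I assume each move carries its intersection index even when it belongs to the other color.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical move record (an assumption; the original code encodes
// a wrong-color move as a negative value instead).
struct Move {
  int point;         // intersection index on the board
  bool right_color;  // true if played by the color this node updates for
};

// Returns the intersections whose RAVE statistics would be updated,
// using the proposed order: check, mark, then filter by color.
std::vector<int> RaveCandidates(const std::vector<Move>& moves,
                                std::size_t board_points) {
  std::vector<bool> already_played(board_points, false);
  std::vector<int> updated;
  for (const Move& m : moves) {
    assert(m.point >= 0 &&
           static_cast<std::size_t>(m.point) < board_points);
    if (already_played[m.point]) continue;  // 1. intersection seen before
    already_played[m.point] = true;         // 2. mark it for either color
    if (!m.right_color) continue;           // 3. wrong color: no update
    updated.push_back(m.point);
  }
  return updated;
}
```

With the moves (3, wrong color), (3, right color), (5, right color), (5, right color), this order updates only intersection 5. The order in the quoted loop would also update intersection 3, because the first move on 3 is skipped before it is marked as played.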

And a final question: you calculate the (beta) coefficient as
  c = rc / (rc+c+rc*c*BIAS);
which looks similar to the formula proposed by David Silver (if I recall
his name correctly). However, in his formula, the last term looks like
  rc*c*BIAS/(q_ur*(1-q_ur))
Is it correct that we could get q_ur from the current UCT-RAVE mean value,
and that it is used like that?
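
For comparison, the two variants side by side as a rough sketch. The function names and the clamping of q_ur away from 0 and 1 are my own assumptions (the clamp is only there to avoid dividing by zero), and whether q_ur should really come from the current UCT-RAVE mean is exactly the open question; the code only shows where that value would plug in.

```cpp
#include <algorithm>
#include <cassert>

// Weight as quoted above: rc / (rc + c + rc*c*bias),
// where rc is the RAVE count and c is the normal visit count.
double BetaSimple(double rc, double c, double bias) {
  assert(rc >= 0.0 && c >= 0.0 && bias >= 0.0);
  const double denom = rc + c + rc * c * bias;
  return denom > 0.0 ? rc / denom : 0.0;  // no data yet: weight 0
}

// Variant where the last term is divided by q_ur*(1-q_ur).
double BetaWithVariance(double rc, double c, double bias, double q_ur) {
  assert(rc >= 0.0 && c >= 0.0 && bias >= 0.0);
  const double eps = 1e-3;  // assumed guard value, not from the thread
  const double q = std::max(eps, std::min(q_ur, 1.0 - eps));
  const double denom = rc + c + rc * c * bias / (q * (1.0 - q));
  return denom > 0.0 ? rc / denom : 0.0;
}
```

The two agree when q_ur*(1-q_ur) equals 1. For mean values near 0 or 1 the extra divisor makes the last term larger, so the RAVE weight drops off faster as the visit count grows.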

Regards,
Isaac Deutsch