Hi Isaac,

2009/1/9 Isaac Deutsch <i...@gmx.ch>
> Hi Sylvain,
>
> Thanks for your quick answer.
>
> > in a nutshell RAVE is basically AMAF adapted for Monte Carlo Tree Search.
> > The original paper describing it is
> > http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf and a
> > paper for "broader audience" can be found here:
> > http://www.lri.fr/~gelly/paper/MoGoNectar.pdf (the picture you posted
> > comes from that paper).
>
> Yes, I took a screenshot. Another paper I looked at was
> http://www.lri.fr/~teytaud/eg.pdf
>
> > There are two important parts in the algorithms: the backup and the use of
> > the RAVE value. The second is the hardest to tune and to make it right.
> > The proposed way of using the values in the original paper is not optimal
> > (while already very useful). A much better way (especially in 19x19) has
> > been described on that list by David Silver.
>
> Do you mean the calculation of the factor beta that the RAVE value is
> multiplied with?
>
> > For the backup (as it is your original question), for each node traversed
> > by the simulation, back up the values exactly as it would be done in AMAF
> > *if* the playout began at that node. Note that I call playout the whole
> > simulation starting from the root and going to the end of the game.
>
> I see what you mean with the playout going from the root to the end of the
> game.
> How do you mean "back up the values ... if the playout began at that node"?
> Since every playout starts at the root (in my program, the root is the
> (previous) move of the opponent player), wouldn't that mean only updating
> the RAVE statistics for the root? I'm sorry if this question sounds stupid.

Sorry for being unclear. I wish we had a whiteboard where we could discuss
this; it would be sorted out in a few minutes :). In your quote of my
sentence you left out the "as" of "as if the playout began..." (the "as" and
the "if" were separated by part of the sentence, which did not make things
clearer, sorry...)
What I meant is that when you do the backup for a given node, you look at
the part of the playout that happens after that node (including that node),
and you do a normal AMAF backup for that part of the playout. Does that make
sense?

> > - Count only moves that happen after the node.
>
> How do you measure if a move is "after" another move? The amount of moves
> taken from the root (i.e. the depth of the node in the tree/the playout)? Or
> do you mean that the node is effectively a (grand-grand-...) father of the
> move, so the playout has visited that node?

By "after" I mean after in the sequence. If the playout is E5, A7, C4, D8,
then C4 is after E5, but not after D8.

I hope we are making progress and that I am not making things more
confusing :). I should write some pseudocode I guess, but for today I am too
lazy :).

Sylvain

> > I hope it will help you write a correct implementation. Don't hesitate to
> > ask for precisions.
>
> I really appreciate your help.
>
> -ibd
> --
> Psst! Have you heard about the new GMX MultiMessenger yet? It works with
> all of them:
> http://www.gmx.net/de/go/multimessenger
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
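
For reference, the backup described in this message (for each node on the
path, do a normal AMAF backup over the part of the playout that comes after
that node) might be sketched roughly as below. This is only an illustration,
not code from any program discussed on the list. The names Node, rave_backup
and the field names are hypothetical. Two details are assumptions not stated
in the message: only moves played by the colour to move at the node are
credited, and a move is counted once per playout.

```python
from collections import defaultdict


class Node:
    """Hypothetical tree node; all field names are illustrative."""

    def __init__(self):
        self.visits = 0                       # ordinary Monte Carlo count
        self.wins = 0.0                       # wins for the player to move here
        self.rave_visits = defaultdict(int)   # move -> AMAF count
        self.rave_wins = defaultdict(float)   # move -> AMAF wins


def rave_backup(path, playout, winner):
    """Back up one playout into every node it traversed.

    path[d]  : node reached after the first d moves (path[0] is the root)
    playout  : full move sequence from the root to the end of the game;
               playout[k] is played by colour k % 2
    winner   : 0 if the colour to move at the root won, otherwise 1
    """
    for depth, node in enumerate(path):
        to_move = depth % 2
        reward = 1.0 if winner == to_move else 0.0

        # Ordinary Monte Carlo statistics.
        node.visits += 1
        node.wins += reward

        # AMAF backup as if the playout began at this node: only moves at
        # index >= depth are considered. Assumption: credit only moves of
        # the colour to move here, and only their first occurrence.
        seen = set()
        for move in playout[depth::2]:
            if move in seen:
                continue
            seen.add(move)
            node.rave_visits[move] += 1
            node.rave_wins[move] += reward
```

With the playout E5, A7, C4, D8 from the example above, the node reached
after E5 and A7 would receive an AMAF update for C4 but none for E5, since E5
was played before that node.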