Re: [computer-go] How to "properly" implement RAVE?

Sylvain Gelly Fri, 09 Jan 2009 14:55:26 -0800

Hi Isaac,

2009/1/9 Isaac Deutsch <i...@gmx.ch>


> Hi Sylvain,
>
> Thanks for your quick answer.
>
>
> > in a nutshell RAVE is basically AMAF adapted for Monte Carlo Tree Search.
> > The original paper describing it is
> > http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf and a
> > paper for "broader audience" can be found here:
> > http://www.lri.fr/~gelly/paper/MoGoNectar.pdf (the picture you posted comes
> > from that paper).
>
> Yes, I took a screenshot. Another paper I looked at was
> http://www.lri.fr/~teytaud/eg.pdf
>
>
> > There are two important parts in the algorithms: the backup and the use of
> > the RAVE value. The second is the hardest to tune and to make it right. The
> > proposed way of using the values in the original paper is not optimal (while
> > already very useful). A much better way (especially in 19x19) has been
> > described on that list by David Silver.
>
> Do you mean the calculation of the factor beta that the RAVE value is
> multiplied with?
>
>
> > For the backup (as it is your original question), for each node traversed by
> > the simulation, back up the values exactly as it would be done in AMAF *if*
> > the playout began at that node. Note that I call playout the whole
> > simulation starting from the root and going to the end of the game.
>
> I see what you mean with the playout going from the root to the end of the
> game. How do you mean "back up the values ... if the playout began at that
> node"? Since every playout starts at the root (in my program, the root is the
> (previous) move of the opponent player), wouldn't that mean only updating the
> RAVE statistics for the root? I'm sorry if this question sounds stupid.


Sorry for being unclear. I wish we had a whiteboard where we could discuss
this; it would be sorted out in a few minutes :).
In quoting my sentence you left out the "as" of "as if the playout began..."
(the "as" and the "if" were separated by part of the sentence, which did not
make things clearer, sorry...).
What I meant is that when you do the backup for a given node, you look at the
part of the playout that happens after that node (including that node), and
you do a normal AMAF backup for that part of the playout.
Does that make sense?
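
One way to read the backup described above, as a minimal Python sketch. The
names (Node, rave_backup, the "B"/"W" colour tags) are hypothetical and not
taken from MoGo. It also assumes that statistics are stored from the point of
view of the player to move at each node, and that only the first occurrence of
a point by that player is counted; neither detail is stated in this thread.

```python
from collections import defaultdict


class Node:
    """Tree node holding ordinary and RAVE (AMAF) statistics."""

    def __init__(self):
        self.visits = 0
        self.wins = 0.0
        # Per-move AMAF counters, keyed by board point such as "C4".
        self.rave_visits = defaultdict(int)
        self.rave_wins = defaultdict(float)


def rave_backup(path, moves, black_won):
    """Back up one playout.

    path[i] is the tree node in which moves[i] was chosen.
    moves is the whole playout from the root to the end of the game,
    as (colour, point) pairs; it is at least as long as path.
    """
    for i, node in enumerate(path):
        to_move = moves[i][0]
        won = 1.0 if (to_move == "B") == black_won else 0.0

        # Ordinary Monte Carlo update for the node itself.
        node.visits += 1
        node.wins += won

        # AMAF update restricted to the part of the playout that starts
        # at this node: moves[i:], never the moves played before it.
        seen = set()
        for colour, point in moves[i:]:
            if colour != to_move or point in seen:
                continue  # assumption: same colour, first occurrence only
            seen.add(point)
            node.rave_visits[point] += 1
            node.rave_wins[point] += won
```

The only RAVE-specific part is the slice moves[i:]: each node on the path gets
an AMAF update as if the playout had started there, so deeper nodes see a
shorter suffix of the same playout.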


> >   - Count only moves that happen after the node.
>
> How do you measure if a move is "after" another move? The amount of moves
> taken from the root (i.e. the depth of the node in the tree/the playout)? Or
> do you mean that the node is effectively a (grand-grand-...) father of the
> move, so the playout has visited that node?

By "after" I mean after in the sequence.
If the playout is E5, A7, C4, D8, by "after" I mean that C4 is after E5, but
not after D8.
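
The same example as a short Python sketch. The helper names is_after and
moves_from are made up for illustration, and the code assumes each point
appears only once in the playout, since list.index returns the first match.

```python
playout = ["E5", "A7", "C4", "D8"]


def is_after(playout, a, b):
    """True if move a comes later in the playout sequence than move b."""
    return playout.index(a) > playout.index(b)


def moves_from(playout, move):
    """The part of the playout starting at the given move, inclusive."""
    return playout[playout.index(move):]


c4_after_e5 = is_after(playout, "C4", "E5")  # C4 is after E5
c4_after_d8 = is_after(playout, "C4", "D8")  # C4 is not after D8
suffix = moves_from(playout, "C4")           # the moves counted for C4's node
```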

I hope we are making progress and I am not making things more confusing :).
I should write some pseudocode I guess, but for today I am too lazy :).

Sylvain

> > I hope it will help you write a correct implementation. Don't hesitate to
> > ask for clarification.
>
> I really appreciate your help.
>
> -ibd
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>

