Re: [computer-go] How to "properly" implement RAVE?

Daniel Waeber Wed, 14 Jan 2009 16:55:30 -0800

Ok, I still have some questions about the refbot code.

On 10:29 Wed 14 Jan     , Mark Boon wrote:
> 
> On Jan 14, 2009, at 9:36 AM, Daniel Waeber wrote:
> 
> > I have a question about the part where the stats are updated
> > (l.15-25). Having an array of AMAF values in every node seems very
> > memory intensive, and I don't get how you would access these values.
> 
> You are right, it is memory intensive. I believe this is one of the  
> reasons that most implementations wait a certain number of playouts  
> before creating the next level of nodes.


> > from http://pastie.org/pastes/357231 :
> >     node[visitedNode[i]].AMAFSum[visitedPos[j]]+=result; 
> >     node[visitedNode[i]].AMAFPlayed[visitedPos[j]]+=1;

I had some problems with these lines. It looks to me as if *every* node
holds arrays of boardsize*boardsize entries, which increases the memory
used by the tree by a factor of more than 10.
But, correct me if I'm wrong, the refbot code just adds two simple
values to each node for RAVE.
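
To make the comparison concrete, here is a rough sketch of the two layouts as I understand them. The class and field names are hypothetical (neither the pastie code nor the refbot defines them), and the int/double field types are my assumption; only the AMAF payload is counted, not object headers or the rest of the node.

```java
// Hypothetical node layouts for illustration only; not taken from the
// pastie code or from the refbot.
final class NodeLayouts {
    static final int BOARD_POINTS = 19 * 19; // 361 points on a 19x19 board

    /** Layout A: every node carries AMAF arrays indexed by board point. */
    static final class ArrayNode {
        final int[] amafSum = new int[BOARD_POINTS];
        final int[] amafPlayed = new int[BOARD_POINTS];
    }

    /** Layout B: every node carries AMAF totals for its own move only. */
    static final class ScalarNode {
        double virtualWins;     // weighted AMAF wins for this node's move
        double virtualPlayouts; // weighted AMAF playout count for this node's move
    }

    /** AMAF payload of one layout-A node, in bytes (two int arrays). */
    static long amafBytesPerArrayNode() {
        return 2L * BOARD_POINTS * Integer.BYTES;
    }

    /** AMAF payload of one layout-B node, in bytes (two doubles). */
    static long amafBytesPerScalarNode() {
        return 2L * Double.BYTES;
    }
}
```

With these assumptions the payload is 2888 bytes per node for layout A and 16 bytes for layout B. In layout B the AMAF value for a move only exists once the child node for that move has been created.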

> Accessing the AMAF values depends on your implementation. The  
> following is a code-snippet from my MCTS reference implementation that  
> updates the AMAF values in the tree:
> 
> if (_useAMAF)
> {
>       IntStack playoutMoves = _searchAdministration.getMoveStack();
>       byte color = _monteCarloAdministration.getColorToMove();
>       int start = _monteCarloAdministration.getMoveStack().getSize();
>       int end = playoutMoves.getSize();
>       double weight = 1.0;
>       double weightDelta = 1.0 / (end - start + 1); // Michael Williams' idea to use decreasing weights
>       GoArray.clear(_weightMap);
>       GoArray.clear(_colorMap);
>       for (int i=start; i<end; i++)
>       {
>               int moveXY = playoutMoves.get(i);
>               if (_colorMap[moveXY]==0)
>               {
>                       _colorMap[moveXY] = color;
>                       _weightMap[moveXY] = weight;
>               }
>               weight -= weightDelta;
>               color = opposite(color);
>       }

Up to here it's clear to me.
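
Just to check my reading of this first part, here is a stand-alone sketch of the map-building loop. The class `AmafMaps`, the color constants and the plain `int[]` move list are stand-ins of mine for the refbot's `IntStack` and `GoArray` helpers, and I assume every move is a valid board index (no pass moves).

```java
import java.util.Arrays;

// Stand-alone sketch of the first-play color/weight maps.
// All names here are hypothetical stand-ins, not the refbot's API.
final class AmafMaps {
    static final byte EMPTY = 0, BLACK = 1, WHITE = 2;

    static byte opposite(byte color) {
        return color == BLACK ? WHITE : BLACK;
    }

    /**
     * Records, for each point played in moves[start..end), the color that
     * played there first and a weight that decreases linearly with the
     * move's position in the playout.
     */
    static void fill(int[] moves, int start, int end, byte firstColor,
                     byte[] colorMap, double[] weightMap) {
        Arrays.fill(colorMap, EMPTY);
        Arrays.fill(weightMap, 0.0);
        double weight = 1.0;
        double weightDelta = 1.0 / (end - start + 1);
        byte color = firstColor;
        for (int i = start; i < end; i++) {
            int xy = moves[i];
            if (colorMap[xy] == EMPTY) { // only the first play at a point counts
                colorMap[xy] = color;
                weightMap[xy] = weight;
            }
            weight -= weightDelta;
            color = opposite(color);
        }
    }
}
```

So for a playout of four moves at points 3, 5, 3, 7 with black to move first, point 3 is recorded for black with weight 1.0, point 5 for white with weight 0.8, the second play at 3 is ignored, and point 7 is recorded for white with weight 0.4.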

>       while (playoutNode!=null)
>       {
>               color = opposite(playoutNode.getContent().getMove().getColor());
>               boolean playerWins = (blackWins && color==BLACK) || (!blackWins && color==WHITE);
>               double score = playerWins ? MonteCarloTreeSearchResult.MAX_SCORE : MonteCarloTreeSearchResult.MIN_SCORE;
>               for (int i=0; i<playoutNode.getChildCount(); i++)
>               {
>                       TreeNode<MonteCarloTreeSearchResult<MoveType>> nextNode = playoutNode.getChildAt(i);
>                       MonteCarloTreeSearchResult<MoveType> result = nextNode.getContent();
>                       GoMove move = (GoMove) result.getMove();
>                       int xy = move.getXY();
>                       if (_colorMap[xy]==color)
>                               result.increaseVirtualPlayouts(_weightMap[xy]*score,_weightMap[xy]);

If I understand this code correctly, it updates the AMAF values of all
direct children of the playoutNode according to the weight/color maps.

And that update is done for all nodes on the selected path.
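
Written out with a minimal tree of my own, my reading of that loop is roughly the following. The `Node` class and its fields are hypothetical, and I assume a win scores 1.0 and a loss 0.0, since I don't know the actual values of MAX_SCORE and MIN_SCORE.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the AMAF update along the selected path. The Node class is a
// hypothetical stand-in for the refbot's TreeNode/MonteCarloTreeSearchResult.
final class AmafBackup {
    static final byte BLACK = 1, WHITE = 2;

    static final class Node {
        final Node parent;
        final int xy;      // point played to reach this node (-1 for the root)
        final byte color;  // color that played xy
        final List<Node> children = new ArrayList<>();
        double virtualWins, virtualPlayouts;

        Node(Node parent, int xy, byte color) {
            this.parent = parent;
            this.xy = xy;
            this.color = color;
            if (parent != null) parent.children.add(this);
        }
    }

    static byte opposite(byte c) {
        return c == BLACK ? WHITE : BLACK;
    }

    /**
     * Walks from playoutNode up to the root. At each node, every direct
     * child whose move was first played in the playout by the side to move
     * at that node receives a weighted virtual playout.
     */
    static void backup(Node playoutNode, boolean blackWins,
                       byte[] colorMap, double[] weightMap) {
        for (Node node = playoutNode; node != null; node = node.parent) {
            byte toMove = opposite(node.color);
            // Assumed scores: 1.0 if the side to move won, 0.0 otherwise.
            double score = (blackWins == (toMove == BLACK)) ? 1.0 : 0.0;
            for (Node child : node.children) {
                if (colorMap[child.xy] == toMove) {
                    double w = weightMap[child.xy];
                    child.virtualWins += w * score;
                    child.virtualPlayouts += w;
                }
            }
        }
    }
}
```

In this form each node on the path touches only its own children, so the cost per playout is the path length times the branching factor, and nodes further down the tree are never visited.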

>               }
>               playoutNode = playoutNode.getParent();

First of all, I am missing a weight/colorMap update for xy here. Shouldn't
the move of the current playoutNode be considered an AMAF move for all
the nodes below this one?

>       }
> }

But most of all, what I am missing is this: the code only updates the
AMAF values of the direct children, and not of all nodes n below the
playoutNode for which there is no play at n.move on the path between n
and the playoutNode.

Finding all these nodes n would be a costly thing to do, but wouldn't
that be the "right" thing to do? Implementing a realistic subset of RAVE
is another story, but first of all I want to understand the pure concept
of RAVE.

> playoutNode is the move-node from which the playout was done. The amaf  
> values are stored in its children by the increaseVirtualPlayout()  
> method. Note that it goes up the tree by assigning the parent to  
> playoutNode until it gets to the root.
> 
> For more context it would be better to lookup the whole source at 
> http://plug-and-go.dev.java.net
> If you think some more comments in the code could clarify things  
> better I'm open to suggestions.

Thanks for the code, I didn't know AMAF is already inside the MCTS refbot.

> 
> Good luck.
> 
>      Mark

Regards,
   wabu


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
