Hi, I'm currently trying to write a bot using UCT/RAVE, but I'm somewhat confused about the correct implementation of RAVE. In particular, I'm unsure how to calculate the exploration ("variance") term for the nodes in the tree search. From reading Gelly and Silver's original paper on RAVE, I believe that the UCB exploration term for a node x is calculated as

    sqrt(log(N) / n)

where n is the number of times that x was chosen/updated, and N is the sum of the n's for x and all siblings of x, which is also equal to the n of x_parent.
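To make sure the rest of this message is unambiguous, here is how I currently compute the plain UCT value, as a minimal Python sketch (the Node class and all field names are my own, not taken from the paper or from any engine):

    import math

    class Node:
        """A node in the search tree (my own minimal layout)."""
        def __init__(self, parent=None):
            self.parent = parent
            self.wins = 0.0    # total reward from real playouts through x
            self.visits = 0    # n: number of real updates for x
            self.children = []

    def uct_value(node, c=1.0):
        """Mean reward plus the UCB exploration term sqrt(log(N) / n).

        N is the parent's visit count (equal to the sum of n over x and
        its siblings); c is the usual exploration constant.
        """
        if node.visits == 0:
            return float('inf')  # always try unvisited children first
        n = node.visits
        N = node.parent.visits
        return node.wins / n + c * math.sqrt(math.log(N) / n)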
According to the paper (or at least as I understand it), the AMAF exploration term for RAVE is calculated as

    sqrt(log(M) / m)

where m is the number of times that x was given an AMAF "virtual" update, and M is the sum of the m's for x and all of its siblings.

However, I've also downloaded the source for the TesujiRef engine from the plug-and-go svn repository, and RAVE seems to be implemented differently there. In particular, I've noticed that the AMAF exploration term is calculated as

    sqrt(log(N) / m)

where N is the number of real updates for x_parent and m is the number of virtual updates for x (consistent with the terminology used throughout this email). Tesuji's AMAF exploration term thus seems to grow only when x is not chosen for a real update, rather than for a virtual update.

Which implementation is best? Am I simply misunderstanding one of the above?

I've also noticed that the beta value used to mix the real and AMAF estimates is calculated differently: the beta in Gelly and Silver's paper seems to decrease at an inverse-sqrt rate, whereas the beta in TesujiRefBot decreases logarithmically. Is the implementation in Tesuji more "modern" than the one described in Gelly and Silver's paper?

Thanks a lot for your help.
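P.S. In case it helps to see exactly what I mean, here is a sketch of the two AMAF variants and the beta schedule as I read the paper (same caveat as above: all names are my own, each node is assumed to carry extra rave_wins/rave_visits fields, and I haven't tried to reproduce Tesuji's logarithmic beta formula from memory):

    import math

    # Assumed extra fields on each node (my assumption, not from Tesuji):
    #   node.rave_wins   - total reward from AMAF "virtual" updates of x
    #   node.rave_visits - m: number of virtual updates of x

    def amaf_term_paper(node, c=1.0):
        """AMAF exploration term as I read Gelly & Silver:
        sqrt(log(M) / m), M = sum of m over x and its siblings."""
        m = node.rave_visits
        M = sum(s.rave_visits for s in node.parent.children)
        return c * math.sqrt(math.log(M) / m)

    def amaf_term_tesuji(node, c=1.0):
        """AMAF exploration term as I read TesujiRef:
        sqrt(log(N) / m), N = number of real updates of x_parent."""
        return c * math.sqrt(math.log(node.parent.visits) / node.rave_visits)

    def beta_paper(node, k=1000.0):
        """Beta schedule as I read the paper: beta = sqrt(k / (3n + k)),
        with k the "equivalence" parameter (beta = 1/2 when n = k).
        For large n this decays like 1/sqrt(n), hence "inverse sqrt"."""
        return math.sqrt(k / (3.0 * node.visits + k))

    def rave_value(node, c=1.0, k=1000.0):
        """Beta-weighted mix of the real mean and the AMAF mean, plus
        the paper-style exploration term (again, my reading)."""
        if node.visits == 0 or node.rave_visits == 0:
            return float('inf')
        beta = beta_paper(node, k)
        mc_mean = node.wins / node.visits
        amaf_mean = node.rave_wins / node.rave_visits
        return (1.0 - beta) * mc_mean + beta * amaf_mean \
            + amaf_term_paper(node, c)

If my beta_paper above is already a misreading of the paper, that could well be the source of my confusion about the two schedules.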
