No, because some strong programs use winrate + RAVE + prior bias.  The RAVE
term can provide enough exploration to avoid the need for the UCT term.
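
To make this concrete, here is a minimal sketch (not any particular program's
code) of a move-selection value that blends winrate, RAVE, and a prior, with
the UCT term switched off by default (c = 0). The names MoveStats, move_value,
select_move, and the constant k are hypothetical, and the weighting
beta = sqrt(k / (3n + k)) is just one schedule from the literature; real
programs tune these differently. The prior is modelled as virtual wins and
visits, which is also an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    wins: float = 0.0          # playout wins through this move
    visits: int = 0            # playouts through this move
    rave_wins: float = 0.0     # all-moves-as-first (RAVE) wins
    rave_visits: int = 0       # all-moves-as-first (RAVE) playouts
    prior_wins: float = 0.0    # virtual wins from the prior (assumed form)
    prior_visits: float = 0.0  # virtual visits from the prior (assumed form)


def move_value(s, parent_visits, c=0.0, k=1000.0):
    """Blend winrate and RAVE; add a UCT-style bonus only if c > 0."""
    n = s.visits + s.prior_visits
    winrate = (s.wins + s.prior_wins) / n if n > 0 else 0.5
    rave = s.rave_wins / s.rave_visits if s.rave_visits > 0 else winrate
    # RAVE dominates while the move has few real visits, then fades out.
    beta = math.sqrt(k / (3.0 * s.visits + k))
    value = (1.0 - beta) * winrate + beta * rave
    if c > 0 and s.visits > 0 and parent_visits > 1:
        # Exploration term in the form quoted in this thread.
        value += math.sqrt(c * math.log(parent_visits) / s.visits)
    return value


def select_move(children, parent_visits, c=0.0):
    """Pick the move name with the highest blended value."""
    return max(children, key=lambda m: move_value(children[m], parent_visits, c))


children = {
    "a": MoveStats(wins=6, visits=10),
    "b": MoveStats(rave_wins=9, rave_visits=10),  # never visited directly
}
print(select_move(children, parent_visits=10))
```

In this toy position the never-visited move "b" is chosen even with c = 0,
because its RAVE statistics look good. That is the sense in which RAVE can
stand in for UCB-driven exploration.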

David

> -----Original Message-----
> From: [email protected] [mailto:computer-go-
> [email protected]] On Behalf Of ???? ??????
> Sent: Tuesday, October 26, 2010 10:14 PM
> To: computer-go
> Subject: Re: [Computer-go] Monte Carlo (upper confidence bounds applied to
> trees)
> 
> > With uniformly distributed playouts, it would be something around
> > c=0.2 in sqrt(c*ln(N)/M). With much more sophisticated heuristics,
> > good prior biasing of the node values, and then RAVE, c will approach 0
> > as the need for UCB-driven exploration decreases.
> 
> 
> Thank you.
> 
> But the exploration coefficient C can't be equal to 0, can it? If it were,
> we would be back in the situation described in the first post of this
> thread (we use only WinRate).
> 
> 
> Another question: what should we do when the game ends in the Tree Policy
> rather than in the Default Policy? Do we have to stop the program from
> selecting this node again (that is, not call PlaySimulation for this node)?
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

