Re: [Computer-go] Exploration formulas for UCT

2011-01-02 Thread Petr Baudis
  Hi!

On Sun, Jan 02, 2011 at 03:53:32PM +0800, Aja wrote:
 I guess it should be / 3000, not * 3000.
 
 Zen also uses this type of formula, but the constant value is rather
 small. I use 400 for the latest version of Zen.
 
   If you are right, then it makes sense. For /3000, bias is around 0.009.
   I use 600 for Erica, similar to Zen.

  Yes, I am sorry, it's /3000.
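
  For reference, a minimal sketch of the corrected weight with / 3000.
The function name rave_weight and the parameter k are made up for
illustration; they are not taken from Pachi, Zen or Erica, and the
zero-visit guard is an assumption.

```python
def rave_weight(rave_visits, real_visits, k=3000.0):
    """Weight given to the RAVE estimate, using the corrected '/ k' form.

    k is the constant discussed in the thread (3000 here; 400 and 600
    are the values quoted for Zen and Erica).
    """
    if rave_visits == 0:
        # Assumption: with no RAVE samples, rely on real visits only.
        return 0.0
    return rave_visits / (
        rave_visits + real_visits + rave_visits * real_visits / k
    )


if __name__ == "__main__":
    # The weight starts near 1 and decays as real visits accumulate.
    for n in (0, 100, 1000, 10000):
        print(n, rave_weight(rave_visits=3000, real_visits=n))
```

  A smaller k makes the last term grow faster, so the weight moves
from the RAVE estimate to the real one after fewer real visits.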

Petr Pasky Baudis
___
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


Re: [Computer-go] Exploration formulas for UCT

2011-01-01 Thread Aja

  Hi Petr,


 We use the Silver formula:

rave_visits / (rave_visits + real_visits + rave_visits * real_visits * 3000)


The figure of 3000 is surprisingly resilient. Even with radically
different heuristics and playouts, it stays the empirical optimum.


  Interesting. According to Sylvain's original post here, that means you 
set bias to sqrt(3000/4)=27.386... But shouldn't bias be in the range 
[0,1]?
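
  The arithmetic can be checked with a short sketch. It assumes the
last term of the formula has the form 4 * bias^2 * rave_visits *
real_visits, so that the constant equals 4 * bias^2. The helper name
bias_from_constant is hypothetical.

```python
import math


def bias_from_constant(c):
    """Bias implied by a constant c, assuming c = 4 * bias**2."""
    return math.sqrt(c / 4.0)


if __name__ == "__main__":
    # Reading the constant as a multiplier (* 3000).
    print(bias_from_constant(3000.0))
    # Reading it as a divisor (/ 3000), i.e. c = 1/3000.
    print(bias_from_constant(1.0 / 3000.0))
```

  Under that assumption, only the / 3000 reading gives a bias inside
[0,1] (roughly 0.009); the * 3000 reading gives about 27.4.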


 Aja

