On Tue, Mar 11, 2008 at 09:05:01AM +0100, Magnus Persson wrote:
> Quoting Don Dailey <[EMAIL PROTECTED]>:
>>
>> When the child nodes are allocated, they are done all at once with
>> this code - where cc is the number of fully legal child nodes:
>
> In valkyria3 I have "supernodes" that contain an array of "moveinfo" for 
> all possible moves. In the moveinfo I also store win/visit counts and end 
> position ownership statistics, so my data structures are memory intensive. 
> As a consequence I expand each move individually, and my threshold seems to 
> be best at 7-10 visits in tests against Gnugo. 40 visits could be possible, 
> but at 100 there is a major loss in playing strength.
>
> Valkyria3 is also superselective, using my implementation of mixing AMAF 
> with UCT as the Mogo team recently described. The UCT constant is 0.01 
> (outside of the square root).
>
> When it comes to parameters, please remember that they may not have 
> independent effects on the playing strength. If one parameter is changed a 
> lot, then the best values for other parameters may also change. What makes 
> things worse is that the best parameters probably also change as a function 
> of the number of playouts. I believe that, ideally, the better the MC 
> evaluation is, the more selectively one can expand the tree, for example.

Typically, how many parameters do you have to tune? Real-valued or two-level?

If you restrict yourself to a reasonably small subset of parameters, an
efficient way to estimate them is to use a fractional factorial design for
the linear part, and a central composite design for the quadratic part
(once you know you are already in the right area). These designs are much
more precise than change-one-parameter-at-a-time strategies when there are
no interactions between parameters, and they can also detect interactions
when they exist.
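To make the idea concrete, here is a minimal sketch of a two-level
fractional factorial design in Python. The parameter names, bounds, and
the measure_winrate() function are all hypothetical stand-ins (a real run
would play a test match at each setting); the point is that a half
fraction estimates three main effects with only 4 runs instead of 8:

```python
from itertools import product

def measure_winrate(uct_c, expand_visits, amaf_weight):
    # Hypothetical stand-in for an expensive match against a reference
    # opponent; a made-up smooth response, for illustration only.
    return 0.5 - 0.02 * uct_c + 0.001 * expand_visits + 0.03 * amaf_weight

# Coded levels for each parameter: -1 = low, +1 = high.
low_high = {
    "uct_c":         (0.01, 0.5),
    "expand_visits": (7, 40),
    "amaf_weight":   (0.5, 1.0),
}

# Half fraction 2^(3-1): vary A and B fully, alias C with the generator C = AB.
runs = []
for a, b in product((-1, 1), repeat=2):
    runs.append((a, b, a * b))

results = []
for levels in runs:
    args = [lh[0] if lvl < 0 else lh[1]
            for lh, lvl in zip(low_high.values(), levels)]
    results.append(measure_winrate(*args))

# Main effect of each factor: mean response at +1 minus mean response at -1.
effects = {}
for i, name in enumerate(low_high):
    plus = sum(r for run, r in zip(runs, results) if run[i] > 0)
    minus = sum(r for run, r in zip(runs, results) if run[i] < 0)
    effects[name] = (plus - minus) / 2
    print(f"{name}: estimated main effect {effects[name]:+.4f}")
```

Because the toy response is purely linear, the four runs recover the true
effects exactly; with noisy win rates you would replicate each run.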

Since computers are used anyway, it might be possible to choose new
parameter values sequentially and automatically, so that each new
experiment is as informative as possible. That is a very interesting
problem, with much work on it in the statistical community, but I do not
know that area very well (nor the designs above, but those are easy).
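One very crude instance of choosing new values sequentially is a
golden-section search over a single parameter, assuming the strength
response is unimodal. The measure_strength() function and the 1-100
visit range are hypothetical, purely for illustration:

```python
def measure_strength(expand_threshold):
    # Hypothetical unimodal strength response, peaking near 8.5 visits.
    return -(expand_threshold - 8.5) ** 2

def golden_section_max(f, lo, hi, tol=0.1):
    """Sequentially shrink [lo, hi] around the maximum of a unimodal f."""
    phi = (5 ** 0.5 - 1) / 2          # ~0.618
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:                   # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = f(c)
        else:                         # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = f(d)
    return (a + b) / 2

best = golden_section_max(measure_strength, 1.0, 100.0)
print(f"estimated best threshold: {best:.2f}")
```

Each new evaluation point depends on the outcomes of the previous ones,
which is the defining feature of a sequential design; the serious
statistical methods generalize this to many parameters and noisy responses.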

Alternatively, especially with a very large number of real-valued
parameters, derivatives of Monte-Carlo techniques can be efficient and
easy to implement: particle filtering or particle swarm optimization in
particular.
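For instance, a minimal particle swarm optimization sketch over two
real-valued parameters might look like the following. The objective, the
parameter names, and the bounds are made-up stand-ins for an expensive
win-rate measurement, not something from the engines discussed above:

```python
import random

def objective(params):
    uct_c, amaf_weight = params
    # Toy smooth objective with a known optimum at (0.01, 0.8).
    return -((uct_c - 0.01) ** 2 + (amaf_weight - 0.8) ** 2)

def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, best_val = pso(objective, bounds=[(0.0, 1.0), (0.0, 2.0)])
print("best parameters:", best, "objective:", best_val)
```

The appeal for parameter tuning is that it needs only objective
evaluations, no gradients, and it copes with interactions between
parameters automatically.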

Jonas
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/