Quoting Jonas Kahn <[EMAIL PROTECTED]>:
> On Tue, Mar 11, 2008 at 09:05:01AM +0100, Magnus Persson wrote:
>> Quoting Don Dailey <[EMAIL PROTECTED]>:
>>> When the child nodes are allocated, they are done all at once with
>>> this code - where cc is the number of fully legal child nodes:
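(Don's code block did not survive the quoting here. Just to keep the
thread readable, a minimal Python sketch of the idea - allocating all cc
children in one batch - could look like the following. Node, legal_moves
and fully_legal are placeholder names of mine, not Don's actual code:)

  class Node:
      # Placeholder tree node; the real structure surely differs.
      def __init__(self, move):
          self.move = move
          self.wins = 0
          self.visits = 0
          self.children = None  # filled in by expand(), all at once

  def expand(node, board):
      # Collect the fully legal moves, then allocate every child in
      # a single batch rather than one node per visit.
      moves = [m for m in board.legal_moves() if board.fully_legal(m)]
      cc = len(moves)
      node.children = [Node(m) for m in moves]
      return cc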
>> In Valkyria3 I have "supernodes" that contain an array of "moveinfo" for
>> all possible moves. In the moveinfo I also store win/visit counts and
>> end-position ownership statistics, so my data structures are memory
>> intensive. As a consequence I expand each move individually, and my
>> threshold seems to be best at 7-10 visits in tests against GNU Go. 40
>> visits could be possible, but at 100 there is a major loss in playing
>> strength.
>> Valkyria3 is also superselective, using my implementation of mixing AMAF
>> with UCT as the MoGo team recently described. The UCT constant is 0.01
>> (outside of the square root).
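(For readers who have not seen the MoGo report: a rough Python sketch of
what such a selection rule can look like. The beta schedule below is the
one from Gelly and Silver's RAVE paper, the field names are placeholders,
and this illustrates the kind of mix, not Valkyria's actual code:)

  import math

  class MoveInfo:
      # One slot per possible move inside a "supernode": UCT statistics,
      # AMAF statistics and end-position ownership counts.
      def __init__(self):
          self.wins = 0.0
          self.visits = 0
          self.amaf_wins = 0.0
          self.amaf_visits = 0
          self.owner_black = 0  # how often this point ends up black

  UCT_C = 0.01  # the constant sits outside the square root

  def selection_value(parent_visits, mi, k=1000.0):
      if mi.visits == 0:
          # Unvisited moves ranked by AMAF alone here; the real
          # first-play policy is a tuning question in itself.
          return mi.amaf_wins / max(1, mi.amaf_visits)
      uct = mi.wins / mi.visits
      amaf = mi.amaf_wins / max(1, mi.amaf_visits)
      # Weight shifts from AMAF toward UCT as visits accumulate.
      beta = math.sqrt(k / (3.0 * parent_visits + k))
      mixed = beta * amaf + (1.0 - beta) * uct
      return mixed + UCT_C * math.sqrt(math.log(parent_visits) / mi.visits)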
>> When it comes to parameters, please remember that they may not have
>> independent effects on playing strength. If one parameter is changed a
>> lot, then the best values for the other parameters may also change. What
>> makes things worse is that the best parameter values probably change as
>> a function of the number of playouts. I believe that, ideally, the better
>> the MC evaluation is, the more selectively one can expand the tree, for
>> example.
> Typically, how many parameters do you have to tune? Real-valued or two-level?
I guess I have 10 real-valued and 10 binary ones. There is probably a
lot of stuff that is hard-coded and could be parameterized.
Here I am also completely ignoring the playouts, which have hundreds of
hand-tuned parameters.
> If you consider a reasonably sized subset of parameters, an efficient way
> to estimate them is to use a fractional factorial design for the linear
> part, and a central composite design for the quadratic part (once you know
> you are already in the right area). You are much more precise than with
> change-one-at-a-time strategies if there is no interaction between
> parameters, and you can detect interactions.
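(To make the suggestion concrete: a tiny 2^(3-1) fractional factorial
design in Python. Three two-level parameters, four runs instead of eight,
with the third column generated as the product of the first two. The
measure_winrate stub is of course imaginary:)

  import itertools
  import random
  import numpy as np

  def measure_winrate(a, b, c):
      # Stand-in for playing a batch of games at setting (a, b, c).
      return 0.5 + 0.04 * a + 0.02 * b + random.gauss(0.0, 0.01)

  # Generator C = A*B: 4 runs estimate all three main effects,
  # each confounded with one two-way interaction.
  runs = [(a, b, a * b) for a, b in itertools.product((-1, +1), repeat=2)]
  y = np.array([measure_winrate(*r) for r in runs])
  X = np.column_stack([np.ones(len(runs)), np.array(runs)])
  coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
  # coeffs = [mean, effect of A, effect of B, effect of C(=AB)]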
I once met this guy:
http://meche.mit.edu/people/faculty/index.html?id=27
His research is a mix of testing formal methods and how well they work
in practice, and of studying how engineers (who often do not use these
methods) actually do in practice.
He seemed to argue that doing parameter optimization intuitively by
hand is not as bad, compared to fractional factorial design, as one
might think. So I use that as an excuse for just doing it the way I
always have.
For me it is important to keep a careful record of what I do and to plot
the results with confidence intervals, to avoid tricking myself.
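(The interval itself is nothing fancy; for a win rate it can be as simple
as the normal approximation below - a sketch, not necessarily the exact
interval I use:)

  import math

  def winrate_ci(wins, games, z=1.96):
      # 95% normal-approximation confidence interval for a win rate.
      p = wins / games
      half = z * math.sqrt(p * (1.0 - p) / games)
      return p - half, p + half

  # Example: 540 wins in 1000 games gives roughly (0.509, 0.571),
  # so the improvement is unlikely to be noise.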
> Alternatively, especially with a very high number of real parameters,
> derivatives of MC techniques can be efficient and easy to implement:
> particle filtering or swarm optimization in particular.
That would be tempting (I once implemented a fitting method inspired
by simulated annealing and it was very efficient), but it would
require a completely different test setup than the one I use right now.
It is also a matter of time and patience: I want new results every day.
If I tested all parameters at once using formal methods, I would
still have to wait for weeks.
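(For completeness, since Jonas mentioned it: a bare-bones particle swarm
sketch in Python, maximizing a noisy objective over [0, 1]^dim. The
constants are textbook defaults, there is no clamping to keep it minimal,
and evaluating f would mean playing a whole batch of games, which is
exactly the time cost I am wary of:)

  import random

  def pso(f, dim, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5):
      # Each particle keeps a position, a velocity and its personal best;
      # the swarm shares one global best.
      xs = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
      vs = [[0.0] * dim for _ in range(n_particles)]
      pbest = [x[:] for x in xs]
      pval = [f(x) for x in xs]
      gbest = max(zip(pval, pbest))[1][:]
      gval = max(pval)
      for _ in range(iters):
          for i, x in enumerate(xs):
              for d in range(dim):
                  r1, r2 = random.random(), random.random()
                  vs[i][d] = (w * vs[i][d]
                              + c1 * r1 * (pbest[i][d] - x[d])
                              + c2 * r2 * (gbest[d] - x[d]))
                  x[d] += vs[i][d]
              v = f(x)  # one evaluation = one batch of test games
              if v > pval[i]:
                  pbest[i], pval[i] = x[:], v
                  if v > gval:
                      gbest, gval = x[:], v
      return gbest, gval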
-Magnus
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/