On Jan 23, 2008 7:39 PM, Jason House <[EMAIL PROTECTED]> wrote:
> On Wed, 2008-01-23 at 18:57 -0500, Eric Boesch wrote:
> > I am curious if any of those of you who have heavy-playout programs
> > would find a benefit from the following modification:
> >
> > > exp_param = sqrt(0.2); // sqrt(2) times the original parameter value.
> > > uct = exp_param * sqrt( log(sum of all children playout)
> > >                         * (child-win-rate-2) /
> > >                         (number of child playout) );
> > > uct_value = (child winning rate) + uct;
> >
> > where child-win-rate-2 is defined as
> >
> > (#wins + 1) / (#wins + #losses + 2)
>
> I'm surprised to see that this works as listed, because the math looks
> all wrong to me...
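For anyone who wants to experiment with it, the quoted modification can be written out roughly as below. This is only a sketch of the formula as quoted: the function name `uct_value` and its argument names are assumptions made for illustration and are not taken from libego or any other program, and the child is assumed to have at least one playout.

```python
import math

def uct_value(wins, losses, parent_playouts, exp_param=math.sqrt(0.2)):
    """UCT value of one child, following the quoted formula.

    All names here are hypothetical; parent_playouts stands for
    "sum of all children playout" in the quoted pseudocode.
    """
    child_playouts = wins + losses          # number of child playouts (must be > 0)
    win_rate = wins / child_playouts        # "child winning rate"
    # "child-win-rate-2": (#wins + 1) / (#wins + #losses + 2)
    win_rate_2 = (wins + 1) / (wins + losses + 2)
    uct = exp_param * math.sqrt(
        math.log(parent_playouts) * win_rate_2 / child_playouts
    )
    return win_rate + uct
```

For example, `uct_value(5, 5, 100)` scores a child with 5 wins and 5 losses under a parent with 100 playouts in total.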
Argh. I have to retract the claim that this helps. I didn't optimize
the libego parameters correctly before I tested it. Sorry about that --
I thought I did. There's a lot more I could add, but I thought I'd get
that out there before anyone wasted any more time on my error.

By the way, does anybody know of any nifty tools or heuristics for
efficient probabilistic multi-parameter optimization? In other words,
like multi-dimensional optimization, except that instead of returning a
deterministic value, your function returns the result of a Bernoulli
trial, and the heuristic uses those trial results to converge as
rapidly as possible to parameter values that roughly maximize the
success probability.

The obvious approach is to cycle through all dimensions in sequence,
treating each one as a one-dimensional optimization problem -- though
the best way to optimize in one dimension isn't obvious to me either --
but just as with the deterministic version of optimization, I assume
it's possible to do better than that. It might be a fun problem to play
with, but if good tools already exist then it wouldn't be very
productive for me to waste time reinventing the broken wheel.

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
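The "obvious approach" described above (cycling through the dimensions and treating each as a one-dimensional problem) might look something like the sketch below. It is a naive baseline, not a recommended tool. The fixed candidate grid per dimension, the fixed number of trials per candidate, and all of the names (`coordinate_search`, `estimate_success_rate`, `make_demo_trial`) are assumptions made for illustration. The demo trial function is a made-up stand-in for something like "play one game with these parameters and report whether it was won".

```python
import random

def estimate_success_rate(trial, params, n_trials):
    """Empirical success frequency of a Bernoulli trial at the given parameters."""
    wins = sum(1 for _ in range(n_trials) if trial(params))
    return wins / n_trials

def coordinate_search(trial, start, candidates, n_trials=2000, sweeps=2):
    """Cycle through the dimensions, optimizing one at a time.

    trial(params) returns True or False (one Bernoulli trial).
    candidates[d] lists the values to try for dimension d (an assumed grid).
    """
    params = list(start)
    for _ in range(sweeps):
        for dim in range(len(params)):
            def score(value):
                # Vary only the current dimension, keep the others fixed.
                probe = params[:dim] + [value] + params[dim + 1:]
                return estimate_success_rate(trial, probe, n_trials)
            # Keep the candidate with the best observed success frequency.
            params[dim] = max(candidates[dim], key=score)
    return params

def make_demo_trial(rng):
    """Hypothetical trial whose success probability peaks at (0.25, 0.75)."""
    def trial(params):
        x, y = params
        p = 0.9 - 0.8 * abs(x - 0.25) - 0.8 * abs(y - 0.75)
        return rng.random() < max(0.0, min(1.0, p))
    return trial
```

A possible call is `coordinate_search(make_demo_trial(random.Random(0)), [1.0, 0.0], [grid, grid])` with `grid = [0.0, 0.25, 0.5, 0.75, 1.0]`. Spending the same number of trials on every candidate is wasteful, since clearly bad candidates could be abandoned early, and that is presumably where a better heuristic would gain over this baseline.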
