Re: [computer-go] 19x19 MC improvement
Eric Boesch wrote: By the way, does anybody know of any nifty tools or heuristics for efficient probabilistic multi-parameter optimization? In other words, like multi-dimensional optimization, except instead of your function returning a deterministic value, it returns the result of a Bernoulli trial, and the heuristic uses those trial results to converge as rapidly as possible to parameter values that roughly maximize the success probability. The obvious approach is to cycle through all dimensions in sequence, treating it as a one-dimensional optimization problem -- though the best way to optimize in one dimension isn't obvious to me either -- but just as with the deterministic version of optimization, I assume it's possible to do better than that. It might be fun problem to play with, but if good tools already exist then it wouldn't be very productive for me to waste time reinventing the broken wheel. RSPSA: http://www.sztaki.hu/~szcsaba/papers/rspsa_acg.pdf Levente Kocsis, Csaba Szepesvari, Mark H.M. Winands Abstract. Most game programs have a large number of parameters that are crucial for their performance. While tuning these parameters by hand is rather di±cult, successful applications of automatic optimisation al- gorithms in game programs are known only for parameters that belong to certain components (e.g. evaluation-function parameters). The SPSA (Simultaneous Perturbation Stochastic Approximation) algorithm is an attractive choice for optimising any kind of parameters of a game pro- gram, both for its generality and its simplicity. It's disadvantage is that it can be very slow. In this article we propose several methods to speed up SPSA, in particular, the combination with RPROP, using common random numbers, antithetic variables and averaging. We test the result- ing algorithm for tuning various types of parameters in two domains, poker and LOA. From the experimental study, we conclude that using SPSA is a viable approach for optimisation in game programs, especially if no good alternative exists for the types of parameters considered. Rémi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
On Jan 23, 2008 7:39 PM, Jason House [EMAIL PROTECTED] wrote: On Wed, 2008-01-23 at 18:57 -0500, Eric Boesch wrote: I am curious if any of those of you who have heavy-playout programs would find a benefit from the following modification: exp_param = sqrt(0.2); // sqrt(2) times the original parameter value. uct = exp_param * sqrt( log(sum of all children playout) * (child-win-rate-2) / (number of child playout) ); uct_value = (child winning rate) + uct; where child-win-rate-2 is defined as (#wins + 1) / (#wins + #losses + 2) I'm surprised to see that this works as listed, because the math looks all wrong to me... Argh. I have to retract the claim that this helps. I didn't optimize the libego parameters correctly before I tested it. Sorry about that -- I thought I did. There's a lot more I could add, but I thought I'd get that out there before anyone wasted (probably) any more time on my error. By the way, does anybody know of any nifty tools or heuristics for efficient probabilistic multi-parameter optimization? In other words, like multi-dimensional optimization, except instead of your function returning a deterministic value, it returns the result of a Bernoulli trial, and the heuristic uses those trial results to converge as rapidly as possible to parameter values that roughly maximize the success probability. The obvious approach is to cycle through all dimensions in sequence, treating it as a one-dimensional optimization problem -- though the best way to optimize in one dimension isn't obvious to me either -- but just as with the deterministic version of optimization, I assume it's possible to do better than that. It might be fun problem to play with, but if good tools already exist then it wouldn't be very productive for me to waste time reinventing the broken wheel. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
Eric Boesch This is probably massive overkill, but one of the most successful techniques for multi-parameter optimization is Taguchi methods. http://en.wikipedia.org/wiki/Taguchi_methods However, in my experience, starting with the decisions that make the biggest difference, and then adding in the next-biggest decision, and so on, usually gets close enough to optimal. In my cgbg performance tests, most reasonable decisions affect total program performance by a factor of 2 at most, which translates to 1 rank at most. Likewise, I doubt that tweaking of existing MC parameters will bring more than 1 rank. Further improvement will require new ideas. Michael Wing By the way, does anybody know of any nifty tools or heuristics for efficient probabilistic multi-parameter optimization? In other words, like multi-dimensional optimization, except instead of your function returning a deterministic value, it returns the result of a Bernoulli trial, and the heuristic uses those trial results to converge as rapidly as possible to parameter values that roughly maximize the success probability. The obvious approach is to cycle through all dimensions in sequence, treating it as a one-dimensional optimization problem -- though the best way to optimize in one dimension isn't obvious to me either -- but just as with the deterministic version of optimization, I assume it's possible to do better than that. It might be fun problem to play with, but if good tools already exist then it wouldn't be very productive for me to waste time reinventing the broken wheel. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
i recommend: http://www.research.att.com/~njas/gosset/index.html s. - Original Message From: [EMAIL PROTECTED] [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Saturday, January 26, 2008 2:35:01 PM Subject: Re: [computer-go] 19x19 MC improvement Eric Boesch This is probably massive overkill, but one of the most successful techniques for multi-parameter optimization is Taguchi methods. http://en.wikipedia.org/wiki/Taguchi_methods However, in my experience, starting with the decisions that make the biggest difference, and then adding in the next-biggest decision, and so on, usually gets close enough to optimal. In my cgbg performance tests, most reasonable decisions affect total program performance by a factor of 2 at most, which translates to 1 rank at most. Likewise, I doubt that tweaking of existing MC parameters will bring more than 1 rank. Further improvement will require new ideas. Michael Wing By the way, does anybody know of any nifty tools or heuristics for efficient probabilistic multi-parameter optimization? In other words, like multi-dimensional optimization, except instead of your function returning a deterministic value, it returns the result of a Bernoulli trial, and the heuristic uses those trial results to converge as rapidly as possible to parameter values that roughly maximize the success probability. The obvious approach is to cycle through all dimensions in sequence, treating it as a one-dimensional optimization problem -- though the best way to optimize in one dimension isn't obvious to me either -- but just as with the deterministic version of optimization, I assume it's possible to do better than that. It might be fun problem to play with, but if good tools already exist then it wouldn't be very productive for me to waste time reinventing the broken wheel. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ Looking for last minute shopping deals? Find them fast with Yahoo! Search. http://tools.search.yahoo.com/newsearch/category.php?category=shopping ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
By the way, does anybody know of any nifty tools or heuristics for efficient probabilistic multi-parameter optimization? In other words, like multi-dimensional optimization, except instead of your function returning a deterministic value, it returns the result of a Bernoulli trial, and the heuristic uses those trial results to converge as rapidly as possible to parameter values that roughly maximize the success probability. I recommend evolutionary algorithms because they are robust on noise and don't require a quadratic or linear model for the function they optimize. I would go as simple as a ES(1+1) algorithm (a glorified name for a simple hill climber that probes randomly for its next step). I would also use restarts: run it once until no more improvement is apparent, then run it again and again (restarts) a few times (5-10) and take the overall optimum found. You'd be surprised how far you can get with this method! Adrian ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] 19x19 MC improvement
My program Aya637_4CPU is top on 19x19 CGOS now. It is temporary, but I'm so happy! Please don't run full power MoGo, Crazy Stone and Leela :-) My main improvement was 1. Do not move dead stones. Before doing playout, Aya does string capture search up to 7 plies(ladders are extended). And set the probability of dead stones's liberty to 1/1. This changed winning rate 0.17 to 0.33 (both are 200 games, 4po against GnuGo 3.7.10) But in 9x9, this method changed nothing. There is no ladder search in playout. 2. Change UCT exploration parameter exp_param = sqrt(2.0); uct = exp_param * sqrt( log(sum of all children playout) / (number of child playout) ); uct_value = (child winning rate) + uct; I changed exp_param sqrt(2.0) to sqrt(0.1). This changed winning rate 0.10(660 games) to 0.22(140 games). (1po, against GnuGo 3.7.10) I was surprised such a small parameter change made big difference. AyaMC's sytle is similar to Crazy Stone. - progressive windening. - pattern is only 3x3 from 1 professional player's game. Uct code is like this, http://www.yss-aya.com/uct_aya_e.cpp In 1po of the root initial position, 11 nodes are created, and 18 candidates are searched Regards, Hiroshi Yamashita ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [computer-go] 19x19 MC improvement
Congratulations! I'm really impressed with how fast the AyaMC improved. Regards, David -Original Message- From: [EMAIL PROTECTED] [mailto:computer-go- [EMAIL PROTECTED] On Behalf Of Hiroshi Yamashita Sent: Wednesday, January 23, 2008 12:12 AM To: computer-go Subject: [computer-go] 19x19 MC improvement My program Aya637_4CPU is top on 19x19 CGOS now. It is temporary, but I'm so happy! Please don't run full power MoGo, Crazy Stone and Leela :-) My main improvement was 1. Do not move dead stones. Before doing playout, Aya does string capture search up to 7 plies(ladders are extended). And set the probability of dead stones's liberty to 1/1. This changed winning rate 0.17 to 0.33 (both are 200 games, 4po against GnuGo 3.7.10) But in 9x9, this method changed nothing. There is no ladder search in playout. 2. Change UCT exploration parameter exp_param = sqrt(2.0); uct = exp_param * sqrt( log(sum of all children playout) / (number of child playout) ); uct_value = (child winning rate) + uct; I changed exp_param sqrt(2.0) to sqrt(0.1). This changed winning rate 0.10(660 games) to 0.22(140 games). (1po, against GnuGo 3.7.10) I was surprised such a small parameter change made big difference. AyaMC's sytle is similar to Crazy Stone. - progressive windening. - pattern is only 3x3 from 1 professional player's game. Uct code is like this, http://www.yss-aya.com/uct_aya_e.cpp In 1po of the root initial position, 11 nodes are created, and 18 candidates are searched Regards, Hiroshi Yamashita ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
On Jan 23, 2008 3:11 AM, Hiroshi Yamashita [EMAIL PROTECTED] wrote: 2. Change UCT exploration parameter exp_param = sqrt(2.0); uct = exp_param * sqrt( log(sum of all children playout) / (number of child playout) ); uct_value = (child winning rate) + uct; I changed exp_param sqrt(2.0) to sqrt(0.1). 2.0 to 0.1? That's a pretty big step. I did notice that libego worked better with lower constants than recommended by the formula -- in libego, the original UCB1 formula corresponds to an explore_rate coefficient of 8, but the default explore_rate coefficient of 1 seemed to be about the best. I am curious if any of those of you who have heavy-playout programs would find a benefit from the following modification: exp_param = sqrt(0.2); // sqrt(2) times the original parameter value. uct = exp_param * sqrt( log(sum of all children playout) * (child-win-rate-2) / (number of child playout) ); uct_value = (child winning rate) + uct; where child-win-rate-2 is defined as (#wins + 1) / (#wins + #losses + 2) If it happens that you do the equivalent of initializing #wins to 1 and #losses to 1 (in libego, setting the initial bias to 2), then you can just use your original (child winning rate) value. (W+1)/(W+L+2) is the mean of a beta(W+1,L+1) random variable, which is what you get when you start with a uniform(0,1) distribution and condition it by the observation of a W-L record. The explore parameter inside the square root is doubled so that when you have an average child-win-rate-2 value of 0.5, the formula returns the same value as before. (Using an initial bias of 2.0 seems like a good thing anyhow.) Adding this extra term seemed to help a bit (57% +/- 4.5% win rate over unmodified program) when the basic playouts were uniform. This change tends to make the formula stick with proven winners more than before, so you might need to increase the explore rate by a little more than sqrt(2). Your mileage may vary -- I'd also like to know if you try this change and find it unhelpful. The justification is that the standard deviation of a beta(1,21) is lower than the standard deviation of a beta(11,11) variable. The error term of a beta(21,1) is lower too, but applying the (L+1)/(W+L+2) term to the UCB1 formula can yield nonsensical results where when two moves have the same bias, the one with the worse win-loss record is assigned the higher UCB1 value. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] 19x19 MC improvement
On Wed, 2008-01-23 at 18:57 -0500, Eric Boesch wrote: I am curious if any of those of you who have heavy-playout programs would find a benefit from the following modification: exp_param = sqrt(0.2); // sqrt(2) times the original parameter value. uct = exp_param * sqrt( log(sum of all children playout) * (child-win-rate-2) / (number of child playout) ); uct_value = (child winning rate) + uct; where child-win-rate-2 is defined as (#wins + 1) / (#wins + #losses + 2) I'm surprised to see that this works as listed, because the math looks all wrong to me... I usually think of UCT as being based on the sample variance. It looks like you're using: sqrt(child_win_rate_2/number_child_playouts) Standard bernouli trials yield: sqrt(child_win_rate*(1-child_win_rate)/number_child_playouts) Beta distribution yields: sqrt(win_rate_2*(1-win_rate_2)/(number_child_playouts+3)) When using the full beta distribution theory, uct_value would be: child_win_rate_2 + uct Is it possible you have a typo in your uct calculation? If not, you're really favoring high win rates over low win rates. Or maybe with inversions of wins and losses between ply, you're favoring exploration of the less probable moves? I'd really be interested in hearing more about what you did! PS: I'm glad to see someone else using the beta distribution theory. I posted it to the mailing list long ago, but didn't think anyone found it very interesting/useful. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/