> I have been told that bots that are based on MC play better when they only
> record the result of each roll out (W or L)
> rather than the margin of victory.
>
> To me this is counter-intuitive.
>
> Does anyone have an intelligible reason why it should be so?

The search then optimizes for the probability of winning, rather than
for the largest margin of victory.

Imagine two stock traders. Their goal is to beat the market over the
next 12 months, and they both choose 10 companies from the FTSE100.
Trader A randomly chooses 10 companies whose dividends are above the
FTSE100 average. Trader B chooses the 10 companies with the highest
dividends. Intuitively trader B should have earned more by the end of
the year, but there is a good chance that at least one of his companies
will go bankrupt, and another will cut its dividend. Maybe the other 8
choices will do well enough to keep him ahead overall, but chances are
that trader A will come out ahead.

Games of go tend to be dominated by life and death battles. There may be
a way for black to kill white's group and win big, but it is awfully
complicated and we don't have time for an exhaustive search. If we can
still win by letting white's group live small, that is a much safer
path.

There is also a pragmatic reason: each rollout passes just one bit of
information up the tree, so it is very easy to turn the results into a
single number for the chance of winning. With margin of victory you end
up with the problem of how to pass a probability distribution up the
tree, and then what to do with it at the top. (The presence of the
life/death battles means the distribution tends to have multiple peaks,
rather than being nice and Gaussian.)

Darren

-- 
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
http://shop.oreilly.com/product/0636920053170.do
_______________________________________________
Computer-go mailing list
http://computer-go.org/mailman/listinfo/computer-go
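
A toy sketch of the win-rate-versus-margin point made above, in Python.
Everything in it is an assumption made up for illustration: the two
candidate moves ("kill" and "safe"), their payoffs in points, and their
success probabilities. No real go engine or rollout policy is involved;
the hypothetical rollout_margin() just draws from a two-peaked
distribution for the risky line and a narrow one for the safe line.

```python
import random

def rollout_margin(move, rng):
    """Return the score margin (positive = we win) of one made-up rollout."""
    if move == "kill":
        # Risky line (assumed numbers): big win if the kill works, big loss if not.
        return 40.0 if rng.random() < 0.45 else -20.0
    # Safe line (assumed numbers): let the group live small, usually win by a little.
    return 2.5 if rng.random() < 0.70 else -1.5

def evaluate(move, n_rollouts, seed):
    """Run n_rollouts for a move and return (mean margin, win rate)."""
    rng = random.Random(seed)
    margins = [rollout_margin(move, rng) for _ in range(n_rollouts)]
    mean_margin = sum(margins) / n_rollouts
    win_rate = sum(1 for m in margins if m > 0) / n_rollouts
    return mean_margin, win_rate

def best_move(stats, index):
    """Pick the move with the highest value of stats[move][index]."""
    return max(stats, key=lambda move: stats[move][index])

stats = {move: evaluate(move, 10_000, seed=1) for move in ("kill", "safe")}
by_margin = best_move(stats, 0)    # what a margin-averaging search prefers
by_win_rate = best_move(stats, 1)  # what a W/L-recording search prefers

for move, (mean_margin, win_rate) in stats.items():
    print(f"{move}: mean margin {mean_margin:+.2f}, win rate {win_rate:.2f}")
print("chosen by mean margin:", by_margin)
print("chosen by win rate:", by_win_rate)
```

With these assumed numbers the risky line has the larger average margin
but loses more often than it wins, so averaging margins picks "kill"
while recording only W or L picks "safe". Note also that the win rate is
a single number per node, whereas the margins of the "kill" line fall
into two separate peaks that a mean summarizes poorly.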
