> I have been told that bots that are based on MC play better when they only 
> record the result of each roll out (W or L)
> rather than the margin of victory.
> 
> To me this is counter-intuitive.
> 
> Does anyone have an intelligible reason why it should be so?

The search then optimizes for the probability of winning, rather than
optimizing for the largest margin of victory.

Imagine two stock traders. Their goal is to beat the market over the
next 12 months, and they both choose 10 companies from the FTSE100.
Trader A randomly chooses 10 companies that pay above-average dividends
for the FTSE100. Trader B chooses the 10 companies with
the highest dividends. Intuitively trader B should have earned more by
the end of the year, but there is a good chance that at least one
company will go bankrupt, and another will cut its dividend. Maybe the
other 8 choices will do well enough to keep him ahead overall, but
chances are that trader A will come out ahead.

Games of go tend to be dominated by life and death battles. There may be
a way for black to kill white's group, and win big, but it is awfully
complicated and we don't have time for exhaustive search. If we can
still win by letting white's group live small, that is a much safer path.
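To make the difference concrete, here is a minimal sketch in Python. The
two moves and their rollout outcomes are numbers I made up purely for
illustration (they do not come from any real bot or game): "kill_group"
wins big when it works but fails often, while "let_live_small" wins by a
little almost every time. Ranking by mean margin and ranking by win
probability pick different moves.

```python
# Hypothetical rollout outcomes for two candidate moves, given as
# (probability, margin) pairs. All numbers are invented for illustration.
ROLLOUTS = {
    "kill_group": [(0.55, 40.5), (0.45, -20.5)],    # risky, wins big
    "let_live_small": [(0.90, 2.5), (0.10, -1.5)],  # safe, wins small
}


def mean_margin(outcomes):
    """Expected margin of victory over the rollout outcomes."""
    return sum(p * margin for p, margin in outcomes)


def win_probability(outcomes):
    """Probability that a rollout ends with a positive margin (a win)."""
    return sum(p for p, margin in outcomes if margin > 0)


# Pick the best move under each criterion.
best_by_margin = max(ROLLOUTS, key=lambda mv: mean_margin(ROLLOUTS[mv]))
best_by_winrate = max(ROLLOUTS, key=lambda mv: win_probability(ROLLOUTS[mv]))

print("best by mean margin:    ", best_by_margin)
print("best by win probability:", best_by_winrate)
```

With these made-up numbers the margin criterion prefers the risky kill,
even though it loses 45% of the time, while the win/loss criterion
prefers the safe move.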

There is also a pragmatic reason: each rollout yields just one bit of
information to pass up the tree, so it is very easy to combine the
results into a single number for the chance of winning. With margin of
victory you end up with the problem of how to pass a probability
distribution up the tree, and then what to do with it at the top.
(The presence of the life/death battles means the distribution tends to
have multiple peaks, rather than being nice and Gaussian.)
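
As a rough sketch of how little bookkeeping the win/loss approach needs,
here is one way it could look in Python. The Node class and backup
function are hypothetical names of my own, not taken from any particular
program, and I have simplified by counting every win from Black's point
of view instead of flipping the result at each level.

```python
class Node:
    """A search tree node that stores only a visit count and a win count."""

    def __init__(self, parent=None):
        self.parent = parent
        self.visits = 0
        self.wins = 0  # wins for Black (a simplifying assumption)

    def win_rate(self):
        """Fraction of rollouts through this node that Black won."""
        return self.wins / self.visits if self.visits else 0.0


def backup(leaf, black_won):
    """Pass a single win/loss bit from the leaf up to the root."""
    node = leaf
    while node is not None:
        node.visits += 1
        if black_won:
            node.wins += 1
        node = node.parent


# Usage: two candidate moves under the root, four rollouts in total.
root = Node()
move_a = Node(parent=root)
move_b = Node(parent=root)
for result in (True, True, False):
    backup(move_a, result)
backup(move_b, False)
print(move_a.win_rate(), move_b.win_rate(), root.win_rate())
```

Each node needs two counters and the statistic at the top is a plain
ratio. Doing the same with margins would mean storing a histogram (or
some other summary of the distribution) at every node.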

Darren



-- 
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
  http://shop.oreilly.com/product/0636920053170.do
