So what percent seems to work best in Pebbles?

On Mon, Jul 4, 2011 at 11:01 PM, Brian Sheppard <[email protected]> wrote:
> Related to the "perfect endgame" thread, but different...
>
> Fuego claims that adding a few percent of point differential to the result
> of a trial results in a stronger player. Pachi later confirmed that result,
> and I have confirmed it in Pebbles as well.
>
> The standard explanation for this is that the small bias (just 2% in Fuego,
> 4% in Pachi) helps to avoid losses by endgame blunders. Well, that might be,
> but I see something more fundamental.
>
> When you score a game in a Win/Loss dimension, there is only one player who
> can make an error: the side that is winning. For the loser, all moves are
> losing. So a playout stumbles to the right conclusion if it contains an even
> number of errors, and if it contains an odd number of errors then it reaches
> the wrong conclusion. If you take a probability P of making an error and
> model the probability of making an even number of errors then you will find
> out that this is a daunting model. You might doubt that MCTS could ever work.
>
> But MCTS works quite well, and I think that it is because of point
> differential.
>
> In a point differential model, *both* players can make errors. So the point
> differential takes a random walk from the leaf of the tree to the terminal
> position. The trial reaches the right conclusion if the random walk crosses
> zero an even number of times.
>
> In a random walk, the probability of crossing zero depends on how far from
> zero you start. So if one tree node is better (that is, higher point
> differential) than another, it is more likely that a simulation trial will
> result in a win.
>
> To get back to Fuego's finding: why does adding in some point differential
> help? Because the larger the point differential of the terminal position,
> the higher (on average) was the point differential of the leaf node.
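
A rough way to put numbers on the two models described so far. If each of
n playout moves is independently an error with probability P (an assumption;
real playout errors are surely correlated), the chance of an even number of
errors is (1 + (1 - 2P)^n) / 2, which sinks toward 1/2 quickly. The sketch
below also treats the point differential as a walk of +/-1 per move, which is
a simplification assumed here and not taken from Fuego, Pachi, or Pebbles.
The function names are hypothetical.

```python
import random


def prob_even_errors(p: float, n: int) -> float:
    """Chance of an even number of errors in n independent moves,
    each of which is an error with probability p (assumed model)."""
    return 0.5 * (1.0 + (1.0 - 2.0 * p) ** n)


def walk_win_rate(start: int, steps: int, trials: int, rng: random.Random) -> float:
    """Fraction of +/-1 random walks that begin at `start` points ahead
    and finish above zero. A walk ending at exactly zero counts as not won."""
    wins = 0
    for _ in range(trials):
        diff = start
        for _ in range(steps):
            diff += rng.choice((-1, 1))
        if diff > 0:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    rng = random.Random(2011)
    for n in (10, 50, 100):
        print("P(even errors), p=0.05, n=%d: %.3f" % (n, prob_even_errors(0.05, n)))
    for start in (1, 5, 9):
        print("win rate from +%d over 100 steps: %.3f"
              % (start, walk_win_rate(start, 100, 5000, rng)))
```

With an odd start and an even number of steps the walk can never finish at
exactly zero, so ties do not arise in this toy setup.
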
>
> The random walk that takes a leaf node to a terminal node is invertible, so
> the same probability distribution relates the leaf and terminal positions.
> Accordingly, we can use the terminal point differential to compute a
> probability distribution of the leaf node, and that distribution implies a
> probability that the leaf node is winning.
>
> So, without doubting the standard theory about how point differential could
> reduce yose errors, I see point differential as a factor in opening and
> middle play, too.
>
> Brian
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
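
The inversion argument in the quoted message can be checked with a small
simulation. This is only a sketch under stated assumptions: leaf
differentials are drawn uniformly from odd values in [-max_start, max_start],
and the playout moves the differential by +/-1 per move. Neither assumption
comes from any of the programs mentioned, and the function name is
hypothetical. The code groups trials by terminal differential and reports how
often the leaf was actually winning in each group.

```python
import random
from collections import defaultdict


def leaf_win_fraction_by_terminal(trials, steps, max_start, rng):
    """Map terminal-differential bucket -> fraction of trials whose leaf
    differential was positive. Buckets are 10 points wide and keyed by
    their lower bound (for example, key 10 covers 10..19)."""
    if max_start % 2 == 0:
        raise ValueError("max_start must be odd so that no leaf is exactly zero")
    starts = list(range(-max_start, max_start + 1, 2))  # odd values only
    tally = defaultdict(lambda: [0, 0])  # bucket -> [leaf wins, trial count]
    for _ in range(trials):
        leaf = rng.choice(starts)
        terminal = leaf + sum(rng.choice((-1, 1)) for _ in range(steps))
        bucket = (terminal // 10) * 10
        tally[bucket][0] += leaf > 0
        tally[bucket][1] += 1
    return {b: wins / total for b, (wins, total) in sorted(tally.items())}


if __name__ == "__main__":
    fractions = leaf_win_fraction_by_terminal(20000, 100, 15, random.Random(4))
    for bucket, frac in fractions.items():
        print("terminal in [%d, %d]: leaf was winning in %.2f of trials"
              % (bucket, bucket + 9, frac))
```

In this toy model the fraction rises with the terminal differential, which is
the sense in which a larger final margin is evidence that the leaf was ahead.
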
