So what percent seems to work best in Pebbles?

On Mon, Jul 4, 2011 at 11:01 PM, Brian Sheppard <[email protected]> wrote:

> Related to the "perfect endgame" thread, but different...
>
> Fuego claims that adding a few percent of point differential to the result
> of a trial results in a stronger player. Pachi later confirmed that result,
> and I have confirmed it in Pebbles as well.
>
> The standard explanation for this is that the small bias (just 2% in Fuego,
> 4% in Pachi) helps to avoid losses by endgame blunders. Well, that might be,
> but I see something more fundamental.
>
> When you score a game in a Win/Loss dimension, there is only one player who
> can make an error: the side that is winning. For the loser, all moves are
> losing. So a playout stumbles to the right conclusion if it contains an even
> number of errors, and if it contains an odd number of errors then it reaches
> the wrong conclusion. If you take a probability P of making an error and
> model the probability of making an even number of errors then you will find
> out that this is a daunting model. You might doubt that MCTS could ever work.
>
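
To put a number on the parity model above, here is a minimal sketch. It assumes (my simplification, not something stated in the post) that a playout has n moves on which an error is possible and that each one is an independent error with probability p. Under that assumption the chance of an even number of errors is (1 + (1 - 2p)^n) / 2, which decays toward 1/2 as n grows. The function names are illustrative only.

```python
def p_even_errors(p, n):
    """Closed form: probability of an even number of errors in n
    independent moves, each an error with probability p (assumed model)."""
    return (1 + (1 - 2 * p) ** n) / 2


def p_even_errors_dp(p, n):
    """Same quantity, computed by tracking the parity one move at a time."""
    even = 1.0  # before any move, zero errors have been made (even)
    for _ in range(n):
        # Parity stays even with no error, or flips from odd with an error.
        even = even * (1 - p) + (1 - even) * p
    return even
```

With p = 0.1 the value is already very close to 1/2 after a hundred moves, which is the "daunting" part: under this model a pure win/loss playout carries almost no signal.
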
> But MCTS works quite well, and I think that it is because of point
> differential.
>
> In a point differential model, *both* players can make errors. So the point
> differential takes a random walk from the leaf of the tree to the terminal
> position. The trial reaches the right conclusion if the random walk crosses
> zero an even number of times.
>
> In a random walk, the probability of crossing zero depends on how far from
> zero you start. So if one tree node is better (that is, higher point
> differential) than another, it is more likely that a simulation trial will
> result in a win.
>
> To get back to Fuego's finding: why does adding in some point differential
> help? Because the larger the point differential of the terminal position,
> the higher (on average) was the point differential of the leaf node.
>
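
For concreteness, one way a trial result could mix win/loss with a small share of point differential is sketched below. This is a hypothetical formula and not Fuego's or Pachi's actual code: the 2% weight echoes the figure quoted above, while `max_margin`, the clamping, and the linear blend are my assumptions.

```python
def trial_result(margin, weight=0.02, max_margin=361.0):
    """Blend win/loss with a small share of point differential.

    margin: final score difference from the scored player's point of view
            (assumed never exactly zero, e.g. with a non-integer komi).
    weight: fraction of the result taken from the margin (hypothetical).
    """
    win = 1.0 if margin > 0 else 0.0
    # Map the margin into [0, 1], clamping at +/- max_margin.
    scaled = 0.5 + 0.5 * max(-1.0, min(1.0, margin / max_margin))
    return (1 - weight) * win + weight * scaled
```

The result stays in [0, 1], a win always outranks a loss, and among wins (or among losses) a larger margin scores slightly higher.
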
> The random walk that takes a leaf node to a terminal node is invertible, so
> the same probability distribution relates the leaf and terminal positions.
> Accordingly, we can use the terminal point differential to compute a
> probability distribution of the leaf node, and that distribution implies a
> probability that the leaf node is winning.
>
> So, without doubting the standard theory about how point differential could
> reduce yose errors, I see point differential as a factor in opening and
> middle play, too.
>
> Brian
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
