The complex formula at the end of that article is a lower confidence bound
on the success probability of a Bernoulli distribution with independent
trials (AKA a biased coin flip) and no prior knowledge. At a leaf of your
search tree, that is the most appropriate distribution. Higher up in the
search tree, I'm not so sure it is the correct one. For a sufficiently
large number of samples, most averaging processes converge to a Normal
distribution (by the central limit theorem). For a Bernoulli distribution
with a mean near 50%, the number of samples required is very low.
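
For reference, here is a minimal sketch of that lower bound (the Wilson
score interval) applied to playout counts. The function name, the argument
names, and the default z = 1.96 (roughly a 95% interval) are my own
choices for illustration, not something taken from the article or from any
Go program.

```python
import math


def wilson_lower_bound(wins, playouts, z=1.96):
    """Lower end of the Wilson score interval for a win probability.

    `wins` and `playouts` are raw counts; `z` is the normal quantile
    for the desired confidence level (1.96 is an assumed default).
    """
    if playouts == 0:
        # No data at all: the bound carries no information.
        return 0.0
    p = wins / playouts
    z2 = z * z
    centre = p + z2 / (2 * playouts)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * playouts)) / playouts)
    return (centre - margin) / (1 + z2 / playouts)
```

With the same observed win rate, a move backed by more playouts gets a
higher bound: wilson_lower_bound(600, 1000) is larger than
wilson_lower_bound(6, 10), even though both have a 60% win rate.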

I believe a lower confidence bound is probably best for final move
selection, but UCT uses an upper confidence bound for tree exploration. I
recommend reading the paper, but in short it uses a gradually widening
confidence interval, which was shown to be an optimal solution for the
multi-armed bandit problem. I don't think that's the best model for
computer Go, but the success of the method cannot be denied.
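
A rough sketch of the upper-bound selection rule, using the standard UCB1
form (mean plus an exploration bonus that grows with the log of the
parent's visit count). The names ucb1 and select_child, the (wins, visits)
tuple layout, and the constant c = sqrt(2) are assumptions for
illustration; real programs tune the constant.

```python
import math


def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """Upper confidence bound for one child node (UCB1 form)."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return wins / visits + exploration


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 value.

    `children` is a list of (wins, visits) tuples (hypothetical layout).
    """
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits, c),
    )
```

For example, select_child([(60, 100), (5, 10)]) picks the second child:
its win rate is lower, but it has been sampled far less, so its bonus is
much larger.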

The strongest programs have good "prior knowledge" with which to
initialize wins and losses. My understanding is that they use the average
win rate directly (the article's incorrect solution #2) instead of any
kind of confidence bound.
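
One way to picture that, as a sketch only: treat the prior as virtual
playouts that are added to the real counts before taking the plain
average. The function name, the parameter names, and the 0.5 fallback for
a node with no data are hypothetical; I am not claiming any particular
program does it exactly this way.

```python
def win_rate_with_prior(wins, playouts, prior_wins=0.0, prior_playouts=0.0):
    """Plain average win rate, seeded with virtual (prior) playouts."""
    total = playouts + prior_playouts
    if total == 0:
        # No real or virtual data: assume an even game.
        return 0.5
    return (wins + prior_wins) / total
```

With prior_wins=5 and prior_playouts=10, a single real win moves the
estimate from 5/10 to 6/11 rather than jumping straight to 100%.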

TL;DR: Use UCT until your program matures.
On Mar 30, 2015 8:06 AM, "folkert" <folk...@vanheusden.com> wrote:

> Hi,
>
> When performing a Monte Carlo search, we end up with a number of wins
> and a number of losses for a position on the board.
>
> What is now the proven methodology for comparing these values?
>
> I tried the method described here:
>         http://www.evanmiller.org/how-not-to-sort-by-average-rating.html
> ...which not only looks at how many more wins than losses there were
> for a position but also at the total number for that position. So that a
> difference x with 10 playouts may be evaluated lower than a difference
> y with 11 playouts, even if x > y.
> This does not seem to give good results, although other factors may be
> in play here (my program is in its infancy).
>
>
> Folkert van Heusden
>
> --
> Finally want to win in the MegaMillions lottery? www.smartwinning.info
> ----------------------------------------------------------------------
> Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
