BTW: have you tried other distributional difference metrics, or does K-L
have properties that you like?
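
For concreteness, here is a minimal sketch of the kind of comparison I mean.
The two distributions are made-up numbers, not taken from the paper, and the
function names are mine. It only illustrates the well-known properties: K-L is
asymmetric and can be infinite, while Jensen-Shannon and total variation are
symmetric and bounded.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in nats; infinite if q has zero mass where p does not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def total_variation(p, q):
    """Total variation distance: half the L1 distance, in [0, 1]."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical distributions over three candidate moves (illustrative only).
p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]

print("KL(p||q) =", kl_divergence(p, q))
print("KL(q||p) =", kl_divergence(q, p))  # differs from KL(p||q)
print("JS(p, q) =", js_divergence(p, q))
print("TV(p, q) =", total_variation(p, q))
```

Which of these properties matters presumably depends on how the divergence is
used inside the search.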

Thanks,

steve
On Sep 5, 2015 1:35 AM, "Hideki Kato" <[email protected]> wrote:

> djhbrown .: <
> capsify9fub60pd3lzdyhdpupffgyenv4t+m47okwphzrb4q...@mail.gmail.com>:
> >thank you for sharing the paper.
> >
> >"the Maximum Frequency method is based on the
> >maximization of the difference between the expected reward of
> >the optimal move and that of others"
> >
> >intuitively it feels that biasing random search towards the optimal route
> >would yield reduced failure rates, yet it does seem to depend on knowing
> >what the optimal route is beforehand.
>
> UCT is not a random search; it is deterministic.
>
> Maximizing KL-divergence just speeds up the convergence of the iterative
> algorithm.
>
> Hideki
>
> >if i knew the optimal route to get from A to B, i wouldn't bother doing a
> >random search, but just follow it.
> >
> >"This property [“bias in suboptimal moves”] means that the impact of
> >missing the optimal move is much greater for one player than it is for the
> >opponent."
> >
> >i find this conclusion puzzling because Go is a zero-sum game, so what is
> >good for one side is equally bad for the other, not variably so.  I have
> >not checked the statistical inference calculations to see whether there is
> >an error in them.
> >_______________________________________________
> >Computer-go mailing list
> >[email protected]
> >http://computer-go.org/mailman/listinfo/computer-go
> --
> Hideki Kato <mailto:[email protected]>