BTW: have you tried other distributional difference metrics, or does K-L have properties that you like?
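To make the question concrete, here is a minimal sketch comparing K-L with two other common distributional difference measures, Jensen-Shannon divergence and total variation distance. The two move distributions `p` and `q` are made-up numbers for illustration only, not values from the paper, and the function names are my own. The property it highlights is that K-L is asymmetric and unbounded, while the other two are symmetric and bounded.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats. Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def total_variation(p, q):
    """Total variation distance: symmetric, bounded above by 1."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical move-selection distributions over three candidate moves.
p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]

print("KL(p||q) =", kl_divergence(p, q))
print("KL(q||p) =", kl_divergence(q, p))   # differs from KL(p||q)
print("JS(p,q)  =", js_divergence(p, q))   # same value as JS(q,p)
print("TV(p,q)  =", total_variation(p, q))
```

Whether that asymmetry is an advantage or a nuisance depends on which distribution is treated as the reference, which is part of what I am asking about.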
Thanks,
steve

On Sep 5, 2015 1:35 AM, "Hideki Kato" <[email protected]> wrote:

> djhbrown .: <capsify9fub60pd3lzdyhdpupffgyenv4t+m47okwphzrb4q...@mail.gmail.com>:
> >thank you for sharing the paper.
> >
> >"the Maximum Frequency method is based on the
> >maximization of the difference between the expected reward of
> >the optimal move and that of others"
> >
> >intuitively it feels that biasing random search towards the optimal route
> >would yield reduced failure rates, yet it does seem to depend on knowing
> >what the optimal route is beforehand.
>
> UCT is never a random search but deterministic.
>
> Maximizing KL-divergence just speeds up the convergence of the iterative
> algorithm.
>
> Hideki
>
> >if i knew the optimal route to get from A to B, i wouldn't bother doing a
> >random search, but just follow it.
> >
> >"This property [“bias in suboptimal moves”] means that the impact of
> >missing the optimal move is much greater for one player than it is for the
> >opponent."
> >
> >i find this conclusion puzzling because Go is a zero-sum game, so what is
> >good for one side is equally bad for the other, not variably so. I have
> >not checked the statistical inference calculations to see whether there is
> >an error in them.
> >_______________________________________________
> >Computer-go mailing list
> >[email protected]
> >http://computer-go.org/mailman/listinfo/computer-go
> --
> Hideki Kato <mailto:[email protected]>
