djhbrown .: <capsify9fub60pd3lzdyhdpupffgyenv4t+m47okwphzrb4q...@mail.gmail.com>:
>thank you for sharing the paper.
>
>"the Maximum Frequency method is based on the
>maximization of the difference between the expected reward of
>the optimal move and that of others"
>
>intuitively it feels that biasing random search towards the optimal route
>would yield reduced failure rates, yet it does seem to depend on knowing
>what the optimal route is beforehand.

UCT is never a random search; it is deterministic. Maximizing the
KL divergence just speeds up the convergence of the iterative algorithm.

Hideki

>if i knew the optimal route to get from A to B, i wouldn't bother doing a
>random search, but just follow it.
>
>"This property [bias in suboptimal moves] means that the impact of
>missing the optimal move is much greater for one player than it is for the
>opponent."
>
>i find this conclusion puzzling because Go is a zero-sum game, so what is
>good for one side is equally bad for the other, not variably so. I have
>not checked the statistical inference calculations to see whether there is
>an error in them.
>---- inline file
>_______________________________________________
>Computer-go mailing list
>Computer-go@computer-go.org
>http://computer-go.org/mailman/listinfo/computer-go
-- 
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>
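
The point above, that UCT's choice is deterministic, can be illustrated with a minimal sketch of the in-tree selection step. It assumes the usual UCB1 rule is what is meant; the function name ucb1_select, the (wins, visits) layout and the exploration constant c are made up for illustration and are not taken from the paper or from any particular Go program. The sketch covers only selection inside the tree, nothing else.

```python
import math

def ucb1_select(stats, c=math.sqrt(2)):
    """Return the index of the child to descend into.

    stats: list of (wins, visits) pairs, one per child (hypothetical layout).
    c:     exploration constant (assumed value, tune as needed).
    """
    total = sum(visits for _, visits in stats)
    best_index, best_score = 0, -math.inf
    for i, (wins, visits) in enumerate(stats):
        if visits == 0:
            # Unvisited children are tried first, lowest index first.
            return i
        # Mean reward plus an exploration bonus that shrinks with visits.
        score = wins / visits + c * math.sqrt(math.log(total) / visits)
        if score > best_score:
            best_index, best_score = i, score
    return best_index

stats = [(6, 10), (3, 4), (1, 2)]
first = ucb1_select(stats)
second = ucb1_select(stats)  # same statistics, same answer
```

No random number is drawn anywhere in the function, so the same statistics always give the same child. With c=0 the rule reduces to picking the best mean reward; a larger c shifts the choice towards rarely visited children.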