djhbrown .: 
<capsify9fub60pd3lzdyhdpupffgyenv4t+m47okwphzrb4q...@mail.gmail.com>:
>thank you for sharing the paper.
>
>"the Maximum Frequency method is based on the
>maximization of the difference between the expected reward of
>the optimal move and that of others"
>
>intuitively it feels that biasing random search towards the optimal route
>would yield reduced failure rates, yet it does seem to depend on knowing
>what the optimal route is beforehand.

UCT is not a random search; it is deterministic.
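
To illustrate what I mean by deterministic, here is a minimal sketch of the usual UCB1 selection step used inside the tree. The function name, the argument layout and the constant c are my own assumptions for illustration, not taken from the paper. Given the same win/visit counts it always returns the same child; no random number is drawn at this step.

```python
import math

def ucb1_select(wins, visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    wins[i] and visits[i] are the statistics of child i (hypothetical layout).
    """
    total = sum(visits)
    best, best_score = 0, -math.inf
    for i, (w, n) in enumerate(zip(wins, visits)):
        if n == 0:
            # Unvisited children are tried first, in fixed index order.
            return i
        # Mean reward plus an exploration bonus that shrinks with visits.
        score = w / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

With equal visit counts the child with the higher mean wins, while a rarely visited child can still be chosen because of the exploration bonus.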

Maximizing the KL-divergence just speeds up the convergence of the iterative 
algorithm.

Hideki

>if i knew the optimal route to get from A to B, i wouldn't bother doing a
>random search, but just follow it.
>
>"This property [“bias in suboptimal moves”] means that the impact of
>missing the optimal move is much greater for one player than it is for the
>opponent."
>
>i find this conclusion puzzling because Go is a zero-sum game, so what is
>good for one side is equally bad for the other, not variably so.  I have
>not checked the statistical inference calculations to see whether there is
>an error in them.
>_______________________________________________
>Computer-go mailing list
>Computer-go@computer-go.org
>http://computer-go.org/mailman/listinfo/computer-go
-- 
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>