It seems that the UCB1-Tuned algorithm uses a variance estimate based on a normal distribution; however, we believe it would be better to use the variance of a beta distribution. Has any work been done in this area? Are people still using UCB1-Tuned to guide their exploration of moves?
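
For concreteness, here is a minimal sketch of the two variance terms we have in mind, assuming 0/1 rewards (win = 1, loss = 0). The function names are our own. The first function follows the UCB1-Tuned index as given by Auer et al. (2002); the second is only one possible reading of "variance from a beta distribution", assuming a Beta(wins + 1, losses + 1) posterior under a uniform prior.

```python
import math


def ucb1_tuned_index(wins, visits, total_visits):
    """UCB1-Tuned index for one move, assuming 0/1 rewards."""
    mean = wins / visits
    # With 0/1 rewards the mean of squared rewards equals the mean,
    # so the empirical variance reduces to mean - mean^2.
    sample_var = mean - mean * mean
    # Variance upper bound V_j from the UCB1-Tuned definition.
    v = sample_var + math.sqrt(2.0 * math.log(total_visits) / visits)
    # 1/4 is the largest possible variance of a 0/1 variable.
    return mean + math.sqrt(math.log(total_visits) / visits * min(0.25, v))


def beta_variance(wins, visits):
    """Variance of a Beta(wins + 1, losses + 1) posterior (assumed prior: uniform)."""
    a = wins + 1.0
    b = (visits - wins) + 1.0
    return a * b / ((a + b) ** 2 * (a + b + 1.0))
```

The question is then whether something like beta_variance could stand in for the empirical variance term inside the index.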

Thanks,
John Stogin
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
