On 7/11/2017 19:07, Imran Hendley wrote:
> Am I understanding this correctly?

Yes. It's possible they had in-betweens or experimented with variations
at some point, then settled on the simplest case. You can vary the
randomness if you define the policy as a softmax with varying
temperature; that's harder if you only define it as "select best" or
"select proportionally".

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
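
For illustration, here is a minimal Python sketch of the softmax-with-temperature idea mentioned above. It is not taken from any particular engine: the function names, the example weights, and the choice to use log-weights as scores are all assumptions made for the example. With scores equal to the log of the move weights, temperature 1 reproduces "select proportionally", and a temperature near 0 approaches "select best".

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Turn raw move scores into probabilities (hypothetical helper).

    A lower temperature sharpens the distribution towards the best
    score; a higher temperature flattens it towards uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_move(scores, temperature, rng=random):
    """Sample a move index from the tempered distribution."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(range(len(scores)), weights=probs, k=1)[0]


# Made-up weights for three candidate moves (e.g. visit counts).
weights = [10, 30, 60]
scores = [math.log(w) for w in weights]

proportional = softmax_with_temperature(scores, 1.0)   # same ratio as weights
near_greedy = softmax_with_temperature(scores, 0.05)   # almost always move 2
near_uniform = softmax_with_temperature(scores, 100.0) # almost flat
```

Sweeping the temperature between these extremes gives the in-between behaviours, which is the knob you don't have if the policy is only defined as one of the two fixed rules.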