> After the program plays all simulations, which move should it choose?
>
> (Wins/Visits) + SQRT(ln(...))
>
> or
>
> (Wins+Draw/2)/Visits + SQRT(ln(...))
>
>
Neither of these two formulas :-)
These formulas are for choosing which moves to simulate. For turn-based games,
when all simulations are finished, we should choose

move = argmax_m number_of_simulations(m)

or something like that (you can introduce a bias built from the success
rate...).
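
A minimal sketch of that final choice, in Python. The function name
choose_final_move and the stats layout (move -> (wins, visits)) are only
assumptions for illustration, not taken from any particular program:

```python
def choose_final_move(stats):
    """Return the move with the largest number of simulations.

    stats is a hypothetical dict: move -> (wins, visits).
    The success rate is deliberately ignored here; a bias built
    from it could be added to the key function if desired.
    """
    return max(stats, key=lambda move: stats[move][1])


# Made-up numbers: "B" has the best success rate, but "A" was simulated most.
example_stats = {"A": (60, 100), "B": (30, 40), "C": (5, 10)}
final_move = choose_final_move(example_stats)
```

With these made-up numbers the most-visited move "A" is chosen, even though
"B" has a higher success rate.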

But this is not related to draws :-)

For the formula to be used for simulations, I think:
(Wins/Visits) + SQRT(ln(...))     should lead to good results asymptotically
if the game is a win when playing optimally,
(Wins+Draw/2)/Visits + SQRT(ln(...))    should lead to good results
asymptotically if the game is a draw when playing optimally.
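
To make the two variants concrete, here is a rough Python sketch. The
argument of ln(...) is not spelled out above, so the exploration term below
assumes the usual UCB1 form sqrt(ln(parent_visits) / visits); that form, the
constant c, and the names selection_score and draw_value are all assumptions
for illustration:

```python
import math


def selection_score(wins, draws, visits, parent_visits, draw_value=0.5, c=1.0):
    """Score used to pick which move to simulate next.

    draw_value=0.0 gives  Wins/Visits + exploration
    draw_value=0.5 gives  (Wins + Draws/2)/Visits + exploration
    The exploration term is an ASSUMED UCB1-style term; c is a
    hypothetical tuning constant.
    """
    if visits == 0:
        return math.inf  # unvisited moves get tried first
    mean = (wins + draw_value * draws) / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + exploration


# Same node statistics, scored with draws counted as 0 and as 1/2.
score_win_only = selection_score(3, 4, 10, 100, draw_value=0.0)
score_half_draw = selection_score(3, 4, 10, 100, draw_value=0.5)
```

The two scores differ only in the mean term (0.3 versus 0.5 here); the
exploration term is the same in both.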

Non-asymptotically, life is too complicated :-)

Best regards,
Olivier
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
