> Pachi's?), you can gain without risk by choosing the move with the most
> wins rather than the most visits. Anyhow, if you play more simulations,
> UCB ensures that it will be more visited than the currently most visited
> at some point in time (of course others could pass it by then, and the
> "first" move might get back ahead later, but your current knowledge
> says that the "most wins" move is better than the most visited).
>
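To make the two final-move rules in the quote concrete, here is a minimal sketch in Python. It is not taken from Pachi or any other program: the ChildStats record, the function names, and the numbers are all hypothetical, chosen only so that the two rules disagree.

```python
from dataclasses import dataclass


@dataclass
class ChildStats:
    """Hypothetical per-move statistics at the root of the search tree."""
    move: str
    visits: int   # number of simulations run through this move
    wins: float   # number of those simulations that ended in a win


def most_visited(children):
    """Final-move rule: pick the move with the most simulations."""
    return max(children, key=lambda c: c.visits)


def most_wins(children):
    """Final-move rule: pick the move with the most won simulations."""
    return max(children, key=lambda c: c.wins)


# Made-up root statistics where the two rules pick different moves.
children = [
    ChildStats("A", visits=1000, wins=480.0),
    ChildStats("B", visits=900, wins=495.0),
    ChildStats("C", visits=100, wins=40.0),
]

by_visits = most_visited(children).move
by_wins = most_wins(children).move
print(by_visits, by_wins)
```

With these numbers the most-visited rule picks A while the most-wins rule picks B. Note that most_wins only makes sense when every simulation ends in a win or a loss.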
I agree that the number of wins is OK, but it restricts your implementation to games with wins and losses (I don't know which game Petr is considering). For Go it excludes only minor cases (jigo, Go with loops...). I think the largest number of simulations is fine for all full-information, zero-sum, turn-based games (with simultaneous actions it is no longer a good idea to use the most simulated action, nor the move with the most wins).

Olivier
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
