On Sun, 31 Oct 2010, Olivier Teytaud wrote:
> move = argmax_m number_of_simulations(m)
Olivier, I'm still confused. What should the formula for
the final decision look like? This is very important, because if we don't
choose the right move to play, all previous work is useless.
Basically I would say that you just take the most simulated move. As
others said, you can get some improvement
by taking other elements into account, but there's little to win and much
to lose.
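
As a minimal sketch of that rule (move = argmax_m number_of_simulations(m)),
assuming the root statistics are kept in a plain dict from move to simulation
count. The function name, move labels and counts are made up for illustration
and do not come from any particular engine:

```python
def most_simulated_move(visits):
    """Return the move with the highest simulation count.

    `visits` is a hypothetical dict: move -> number of simulations
    that went through this move at the root.
    """
    return max(visits, key=visits.get)


# Made-up root statistics, for illustration only.
root_visits = {"D4": 620, "Q16": 310, "C3": 70}
print(most_simulated_move(root_visits))
```

If two moves are tied, max() returns the first one it meets in the dict's
iteration order; a real engine would probably want an explicit tie-break.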
Well, as mentioned some time ago by Pebbles' author (or was it
Pachi's?), you can gain without risk by choosing the move with the most
wins rather than the most visits. Anyhow, if you play more simulations,
UCB ensures that this move will be more visited than the currently most
visited one at some point in time (of course others could pass it by then,
and the "first" move might get back ahead later, but your current knowledge
says that the "most wins" move is better than the most visited).
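
A small sketch of the difference between the two rules, assuming each root
move stores a (wins, visits) pair. The names and numbers are hypothetical,
chosen so that the two rules disagree:

```python
def most_visited_move(stats):
    """Move with the largest visit count; stats maps move -> (wins, visits)."""
    return max(stats, key=lambda m: stats[m][1])


def most_wins_move(stats):
    """Move with the largest absolute number of won simulations."""
    return max(stats, key=lambda m: stats[m][0])


# Made-up statistics: Q16 has more wins than D4 but fewer visits.
stats = {"D4": (300, 620), "Q16": (310, 500), "C3": (30, 70)}

print(most_visited_move(stats))
print(most_wins_move(stats))

# A move with more wins and fewer visits necessarily has the higher
# win rate, which is why further simulations would favour it.
for move, (wins, visits) in stats.items():
    print(move, round(wins / visits, 3))
```

Here the most visited move is D4 while the most-wins move is Q16, and Q16's
win rate (310/500) is higher than D4's (300/620).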
Jonas
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go