I agree that the number of wins is ok - but it restricts your
implementation to games with wins and losses (I don't know
the game considered by Petr). For Go it excludes only minor cases (jigo,
go with loops...).
Well, the same logic applies for using the first term in the formula for
choosing which nodes to be simulated. If we write it as maximising
value_of_node + uncertainty_of_node
we select the node with highest value in the end.
For UCB1:
value_of_nodes = number of wins
uncertainty_of_node = 2 \sqrt{n}
With draws, this was suggested:
value_of_nodes = number of wins + (number of draws)/2
Of course, it is necessary for the idea to work that value_of_nodes
cannot decrease when adding simulations.
The biggest number of simulations is ok I think for all full information
zero-sum turn-based games (with simultaneous
actions it's not a good idea anymore to use the most simulated action,
nor the move with most wins).
Do you have an example of such games, and how to use MCTS for it? I have
trouble picturing it.
Cheers
Jonas
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go