Re: [computer-go] Transpositions in Monte-carlo tree search
Erik van der Werf wrote: Jonas Kahn wrote: No there is no danger. That's the whole point of weighting with N_{s,a}. N_{s,a} = number of times the node s has been visited, starting with parent a. You can write Value of a node a = (\sum_{s \in sons} N_{s,a} V_s) / (\sum N_{s,a}) where V_s is ideally the «true» value of node s. In UCT2, they use V_s = Q_{s,a} the win average of simulations going through a, and then through s. In UCT3, they use V_s = Q_s the win average of all simulations through s. There is a danger. The problem is that the selection policy also implements the soft-max like behavior that ensures convergence to the minimax result. If the you backup to all possible parents, including those for which the child would have been an inferior choice, you may get into trouble. That's what I was worried about. But I think it's ok the way Jonas describes above: you don't add anything to the false-parent node's simulation count, and you don't change the weight of the false-child in its value; you just change the evaluation of the false-child. (This means that the effect of backing up to alternate parents will be smaller than the effect of backing up to the 'true' parent, which is presumably part of the reason why this variant is less attractive.) -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
On Wed, Apr 1, 2009 at 9:03 PM, Matthew Woodcraft matt...@woodcraft.me.uk wrote: Erik van der Werf wrote: Jonas Kahn wrote: No there is no danger. That's the whole point of weighting with N_{s,a}. N_{s,a} = number of times the node s has been visited, starting with parent a. You can write Value of a node a = (\sum_{s \in sons} N_{s,a} V_s) / (\sum N_{s,a}) where V_s is ideally the «true» value of node s. In UCT2, they use V_s = Q_{s,a} the win average of simulations going through a, and then through s. In UCT3, they use V_s = Q_s the win average of all simulations through s. There is a danger. The problem is that the selection policy also implements the soft-max like behavior that ensures convergence to the minimax result. If the you backup to all possible parents, including those for which the child would have been an inferior choice, you may get into trouble. That's what I was worried about. But I think it's ok the way Jonas describes above: you don't add anything to the false-parent node's simulation count, and you don't change the weight of the false-child in its value; you just change the evaluation of the false-child. (This means that the effect of backing up to alternate parents will be smaller than the effect of backing up to the 'true' parent, which is presumably part of the reason why this variant is less attractive.) Ok, but I would not call that a back up; nothing goes up to the alternative parents. Unless I missed something, with this you only make adjustments to the statistics representing transposed occurrences of the same position. I don't see that this is how we should interpret UCT3. Erik ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
Jonas Kahn wrote: You might be interested by this article, for a very complete and tested answer. Plus the idea of grouping, but a good part of the effect seems to me to be giving a heuristic pre-value to moves, which might be done more efficiently otherwise: eprints.pascal-network.org/archive/4571/01/8057.pdf Thank you (and to the others who replied). The idea of backing a simulation's results up to all parents ('UCT3' in that paper) seems very dangerous to me! It's a shame they didn't have any Go results to show for that one. -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
On Tue, 31 Mar 2009, Matthew Woodcraft wrote: Jonas Kahn wrote: You might be interested by this article, for a very complete and tested answer. Plus the idea of grouping, but a good part of the effect seems to me to be giving a heuristic pre-value to moves, which might be done more efficiently otherwise: eprints.pascal-network.org/archive/4571/01/8057.pdf Thank you (and to the others who replied). The idea of backing a simulation's results up to all parents ('UCT3' in that paper) seems very dangerous to me! It's a shame they didn't have any Go results to show for that one. No there is no danger. That's the whole point of weighting with N_{s,a}. N_{s,a} = number of times the node s has been visited, starting with parent a. You can write Value of a node a = (\sum_{s \in sons} N_{s,a} V_s) / (\sum N_{s,a}) where V_s is ideally the «true» value of node s. In UCT2, they use V_s = Q_{s,a} the win average of simulations going through a, and then through s. In UCT3, they use V_s = Q_s the win average of all simulations through s. Assuming Markovianity (1), Q_s is a random variable with same mean as Q_{s,a}, but lower variance. That's all. Jonas (1) This might be broken if you give a heuristic value to your move in the tree based on how near it is to previous moves, but that's not really important. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] Transpositions in Monte-carlo tree search
How are transpositions normally handled in monte-carlo tree search? I have been assuming that the natural thing would be to have a single shared node for each board position, so that simulations which reach the same position will use the same set of statistics (but when backing up the result, to only update the nodes for the simulation actually played). But I see in some of the Mogo papers that some of the contributions to the heuristic value of a node depend on the position of the previous move. So do MCTS programs not recognise transpositions at all? Or are the heuristics from the time when the node was first created allowed to stand, no matter what the simulation route is next time? -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
Which Mogo paper(s)? Not all Mogo papers contain ideaa used by Mogo in public releases / competitions. Some things are just research. I remember hearing that the grandfather heuristic isn't good. Sent from my iPhone On Mar 30, 2009, at 5:36 PM, Matthew Woodcraft matt...@woodcraft.me.uk wrote: How are transpositions normally handled in monte-carlo tree search? I have been assuming that the natural thing would be to have a single shared node for each board position, so that simulations which reach the same position will use the same set of statistics (but when backing up the result, to only update the nodes for the simulation actually played). But I see in some of the Mogo papers that some of the contributions to the heuristic value of a node depend on the position of the previous move. So do MCTS programs not recognise transpositions at all? Or are the heuristics from the time when the node was first created allowed to stand, no matter what the simulation route is next time? -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
Hi Mattew, I cannot answer for the current version of Mogo but I can for the one 1.5 years ago. Maybe it still holds. We had a transposition table as it was designed like that from the beginning. However the prior value of the node, was initialized when the node was created and indeed was depending of the previous move played in the current sequence. In practice, I think it does not change much. Sylvain On Mon, Mar 30, 2009 at 11:36 PM, Matthew Woodcraft matt...@woodcraft.me.uk wrote: How are transpositions normally handled in monte-carlo tree search? I have been assuming that the natural thing would be to have a single shared node for each board position, so that simulations which reach the same position will use the same set of statistics (but when backing up the result, to only update the nodes for the simulation actually played). But I see in some of the Mogo papers that some of the contributions to the heuristic value of a node depend on the position of the previous move. So do MCTS programs not recognise transpositions at all? Or are the heuristics from the time when the node was first created allowed to stand, no matter what the simulation route is next time? -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Transpositions in Monte-carlo tree search
You might be interested by this article, for a very complete and tested answer. Plus the idea of grouping, but a good part of the effect seems to me to be giving a heuristic pre-value to moves, which might be done more efficiently otherwise: eprints.pascal-network.org/archive/4571/01/8057.pdf Jonas On Mon, 30 Mar 2009, Matthew Woodcraft wrote: How are transpositions normally handled in monte-carlo tree search? I have been assuming that the natural thing would be to have a single shared node for each board position, so that simulations which reach the same position will use the same set of statistics (but when backing up the result, to only update the nodes for the simulation actually played). But I see in some of the Mogo papers that some of the contributions to the heuristic value of a node depend on the position of the previous move. So do MCTS programs not recognise transpositions at all? Or are the heuristics from the time when the node was first created allowed to stand, no matter what the simulation route is next time? -M- ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/