Dear all,

The original MCTS paper is by Rémi Coulom (to the best of my knowledge at least...). It's clear to us that we did not invent MCTS, and we have always referenced Rémi's paper.

*1) For the UCB-like formula:*

- On the theoretical side, the consistency proof of MCTS without the UCT-like exploration also comes from MoGo people (Berthier et al.). Instead of saying that MoGo's contribution is the introduction of UCT (which is good for other games but not for Go), I would have said that MoGo's contribution is the analysis of MCTS without UCT (even if MCTS without UCT existed before MoGo).
- I'd like to point out that UCB formulas are, I think, good for games with a random part (e.g. with random transitions, or with hidden information, which leads to randomized strategies). But not for Go :-)

*2) For works on the Monte-Carlo part:*

MoGo's contributions on the Monte-Carlo part came in particular with Yizao's patterns (with other people as well). There was a significant difference from previous attempts at designing good Monte-Carlo parts: a good Monte-Carlo part is not an MC part which plays well as a standalone player, but an MC part which plays well with an MCTS on top of it. This is unfortunately not very convenient as a criterion for designing an MC part... I think the main idea was balancing: the situation should not be better for one of the two players after one move by each player. This was already claimed in Sylvain's thesis and (I think) earlier than that.

*3) Other MoGo contributions (I might forget many things...):*

- the fillboard option (which has a great impact for us on MoGo in 19x19);
- the nakade (maybe other methods for that were published simultaneously);
- the RAVE part (Brugmann, Gelly, Silver; Aja said that RAVE was invented by David and I have no idea about that, but I'm sure Sylvain contributed a lot to this and Brugmann did something);
- the parallelization (Tristan Cazenave and others have published similar ideas; the Bourki et al. paper has shown clear limitations in terms of scalability, along with counter-examples);
- the automatic building of patterns by direct policy search (J.-B. Hoock's papers), which can be used far from Go;
- the simultaneous use of patterns designed by supervised learning, patterns designed by policy search, RAVE values, and expert knowledge, with a dirty complicated formula :-)

MoGo was also, I think, the first use of never-ending learning for automatically designing an opening book by MCTS. This was moderately good at first because we did not want to use expert knowledge at all, whereas human expertise was really necessary for guiding the search... After months of work on a grid, this provides very good results for 9x9 Go.

The parallelization is only efficient for moderate time settings; otherwise we hit the scalability plateau. For games with very expensive transitions it might be different...

Best regards,
Olivier

_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go