Hi Olivier,

Now I know Rémi was the first to use MCTS. I guess I need to read papers
more carefully. I do have a question, though. I thought UCT was the
foundation of the current strong programs. I know that a RAVE term is added
to the original UCB term, i.e. sqrt(log(t_total)/t_i), but the UCB term is
still there, right? Could you elaborate a bit on why you say "UCT is not
good for Go"? This seems to contradict a lot of material on the internet
about the latest breed of Go programs.
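
To make sure we are talking about the same thing, here is a minimal Python
sketch of the kind of move score I have in mind: a Monte-Carlo mean blended
with a RAVE mean, plus a UCB1-style exploration bonus c*sqrt(ln(t_total)/t_i).
The function names, the constants c and k, and the RAVE weighting schedule
k/(k + t_i) are all my own illustrative assumptions, not taken from MoGo or
any other program.

```python
import math


def ucb_bonus(total_visits, move_visits, c=1.0):
    """UCB1-style exploration bonus: c * sqrt(ln(t_total) / t_i)."""
    if c == 0:
        return 0.0  # exploration term switched off entirely
    if move_visits == 0:
        return math.inf  # unvisited moves are tried first
    return c * math.sqrt(math.log(total_visits) / move_visits)


def rave_weight(move_visits, k=1000.0):
    """Hypothetical schedule: weight on RAVE goes from 1 toward 0 as visits grow."""
    return k / (k + move_visits)


def move_score(wins, move_visits, rave_wins, rave_visits,
               total_visits, c=1.0, k=1000.0):
    """Blend of Monte-Carlo mean and RAVE mean, plus the exploration bonus."""
    mc_mean = wins / move_visits if move_visits else 0.0
    rave_mean = rave_wins / rave_visits if rave_visits else 0.0
    beta = rave_weight(move_visits, k)
    value = (1.0 - beta) * mc_mean + beta * rave_mean
    return value + ucb_bonus(total_visits, move_visits, c)
```

Calling move_score(..., c=0.0) drops the exploration bonus and ranks moves by
the blended value alone, which is how I understand "MCTS without UCT".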

Regards,
Fuming

On Fri, Dec 31, 2010 at 5:51 PM, Olivier Teytaud <[email protected]> wrote:

>
>
> Dear all,
>   the original MCTS paper is by Rémi Coulom (to the best of my knowledge, at
> least...). It is clear to us that we did not invent MCTS,
> and we have always referenced Rémi's paper.
>
> *1) For UCB-like formula:*
> - On the theoretical side, the consistency proof of MCTS without the
> UCT-like exploration also comes from the MoGo people (Berthier et al.).
> Instead of saying that MoGo's contribution is the introduction of UCT
> (which is good for other games but not for Go), I would have said that
> MoGo's contribution is the analysis of MCTS without UCT (even if MCTS
> without UCT existed before MoGo).
> - I'd like to point out that UCB formulas are, I think, good for games with
> a random part (e.g. with random transitions, or with hidden information
> that leads to randomized strategies). But not for Go :-)
>
> *2) For works on the Monte-Carlo part:*
> MoGo's contributions on the Monte-Carlo part came in particular through
> Yizao's patterns (with other people as well). There
> was a significant difference from previous attempts at designing good
> Monte-Carlo parts, in the sense that a good Monte-Carlo part
> is not an MC part that plays well as a standalone player, but an MC part
> that plays well with an MCTS on top of it. This is unfortunately not a very
> convenient criterion for designing an MC part... I think the main idea
> was balancing: the situation should not be better for one of
> the two players after one move by each player. This was already claimed in
> Sylvain's thesis and (I think) earlier than that.
>
> *3) Other MoGo contributions (I might forget many things...)*
> - the fillboard option (which has a great impact for us on MoGo in 19x19);
> - the nakade (maybe other methods for that were published simultaneously);
> - the RAVE part (Brugmann, Gelly, Silver; Aja said that RAVE was invented
> by David and I have no idea about that, but I'm sure Sylvain contributed a
> lot to this and Brugmann did something);
> - the parallelization (Tristan Cazenave and others have published similar
> ideas; the Bourki et al. paper has shown clear limitations in terms of
> scalability and counter-examples);
> - the automatic building of patterns by direct policy search (J.-B. Hoock's
> papers), which can be used far from Go;
> - the simultaneous use of patterns designed by supervised learning,
> patterns designed by policy search, RAVE values, and expert knowledge,
> with a dirty complicated formula :-)
>
> MoGo was also, I think, the first use of never-ending learning for
> automatically designing an opening book by MCTS. This was
> moderately good at first because we did not want to use expert knowledge at
> all, whereas human expertise was really necessary
> for guiding the search; after months of work on a grid, this provides
> very good results for 9x9 Go.
>
> The parallelization is only efficient for moderate time settings -
> otherwise we have the scalability plateau. For games with very expensive
> transitions it might be different...
>
> Best regards,
> Olivier
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>