Re: [Computer-go] Fwd: News on Tromp-Cook ?
On Mon, Jan 03, 2011 at 08:24:43AM +0800, Aja wrote: Zen uses sequence-like AND probabilistic simulation. Basically it plays around the previous move randomly like MoGo, and these moves are biased by gamma values like Crazy Stone. I am also trying to use probabilistic simulation on the whole board, but it does not yet succeed. The main problem is how to combine the semeai, seki and useless-move detection with it. Very surprised that Zen is using partly sequence-like simulation like Mogo. Erica uses probabilistic simulation completely. I am not sure if my implementation is good enough to be called succeed , but I am sure that you will succeed. Hmm, I thought based on your paper that Erica uses a sort of a hybrid - probabilistic simulation completely, but with all but 3x3 feature filled in only around the last move anyway. So, one could say, Zen-like, but with 3x3-based tenuki sampling or whatever you want to call it. :-) Has this changed recently? Petr Pasky Baudis ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Petr, Hmm, I thought based on your paper that Erica uses a sort of a hybrid - probabilistic simulation completely, but with all but 3x3 feature filled in only around the last move anyway. So, one could say, Zen-like, but with 3x3-based tenuki sampling or whatever you want to call it. :-) Has this changed recently? No, at the time of our simulation balancing paper, Erica was using probabilistic simulation completely, without seki knowledge. That also surprised Remi and me much, that balanced simulation is so powerful: achieved 80% against Fuego 0.4 without dealing with seki completely. I don't understand what do you mean by 3x3-based tenuki sampling. Right now Erica is using a difference approach. I have already described to Yamato in private. Aja ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi; C*RAVE+(1-C)*UCT This has been my understanding. However, I am surprized to find out that people have been setting C close to one, according to Petr and Oliver's postings, which is essentially AMAF. MF apparently is doing something different. As explained by Aja, I did not mean that; I meant that we use UCT with a coefficient 0 in front of the UCB term +C_{uct} sqrt ( log( ...) / ...). Best regards, Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Olivier, Thanks for all your corrections and detailed replies. I agree with them. If you think it's OK, I would like to describe Mogo's contributions in my thesis according to what you enumerated and explained. Considering and responding to the previous move in simulation, especially seeing the 3x3 patterns around (default policy of Mogo) is indeed a big contribution (so called Mogo's magic formula). These days, programs of Crazy Stone thread, like Zen, Erica, Aya...etc, do stochastic simulation, different with Mogo-type, fixed-sequence simulation. But the most important features, like save by capturing, save by extension are indeed inspired from Mogo. If I realize correctly, it was invented by Yizao, who is also a KGS 2d Go player. It's REALLY a pity that Sylvain and Yizao left Mogo team so soon, after Computer Olympiad 2007. I have only one more question. You mentioned this, -- the idea of balancing - the situation should not be better for one of the two players after one move by each player. This was claimed already in Sylvain's thesis and (I think) earlier than that. -- Can you point out where did Sylvain claimed the idea of balancing of simulation in his thesis (http://www.lri.fr/~gelly/paper/SylvainGellyThesis.pdf)? I check the thesis, Sylvain talked about balancing of exploration and exploitation, but not simulation (maybe he was not using such wording balancing?). David also didn't cite his thesis in his simulation balanacing paper. If Sylvain really proposed such concept, I would like to cite it in my thesis. Thanks, Aja ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
These days, programs of Crazy Stone thread, like Zen, Erica, Aya...etc, do stochastic simulation, different with Mogo-type, fixed-sequence simulation. I think CrazyStone's update formula allows more flexibility than the pattern-based approach of MoGo. So it has better potential; I have no idea of whether one can really benefit from this flexibility, but there is a nice flexibility, without big computational cost, and this representation (by marginal probabilities) is a good algorithmic choice in CrazyStone and related programs. On the other hand, the probability involved in the Monte-Carlo part, as originally defined in CrazyStone, were not very efficient, in spite of the fact that they play Go better (still the apparent contradiction that a Monte-Carlo which plays well is not necessarily a good part of a MCTS). The modified versions were, I think, influenced by MoGo's patterns as defined by Yizao (I think other persons helped a lot for this, for the experimental setup and the organization of the code - I consider Sylvain as important as Yizao for this even if Sylvain did not design the patterns - I only consider myself of very moderate influence :-) ). We could win a lot on the Monte-Carlo part with fillboard and nakade and a few handcrafted patterns. No idea of whether better formula could be designed using the more flexible formalism of CrazyStone-like formula. Can you point out where did Sylvain claimed the idea of balancing of simulation in his thesis ( http://www.lri.fr/~gelly/paper/SylvainGellyThesis.pdfhttp://www.lri.fr/%7Egelly/paper/SylvainGellyThesis.pdf)? I check the thesis, Sylvain talked about balancing of exploration and exploitation, but not simulation (maybe he was not using such wording balancing?). David also didn't cite his thesis in his simulation balanacing paper. If Sylvain really proposed such concept, I would like to cite it in my thesis. Sylvain posted in Computer-Go an explanation of that a long time ago - but without the word balancing, probably. See page 154. Best regards, Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Olivier, I forgot that you also contributed to Mogo's pattern design, besides Sylvain and Yizao. Sorry for neglecting your contribution. :) Actually I make fillboard and nakade as features in Erica and they work fine. But indeed, efficiency is a problem in our probabilistic simulation. Thanks for reminding of Sylvain's claim. Page 154 is right the page. Aja___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Sorry for neglecting your contribution. :) You can neglect mine, no problem with that, it's a reality :-) Actually I make fillboard and nakade as features in Erica and they work fine. But indeed, efficiency is a problem in our probabilistic simulation. For us fillboard = around 80% with reasonnable time settings (increasing with comp. time); I'm afraid the improvement is smaller in other implementations (and I don't know why). Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Aja, On Sun, Jan 2, 2011 at 1:29 AM, Aja ajahu...@gmail.com wrote: Hi Fuming, C*RAVE+(1-C)*UCT C is computed dynamically in search, but not set to a fixed value. Maybe you mean UCT_C, UCT=UCT_mean+UCT_C*exploration_term What Petr and Olivier do, I think, is set UCT_C to 0, to disable the exploration_term, not the weight of RAVE. Without the exploring term, the UCT is just mean win rate, so there's no point in calling it UCT or UCB. Basically, what people have been saying is that currently the tree search is based on combination of sequence dependent rate (average win rate) and sequence independent/almost independent (rave rate) instead of combination of exploitation (win rate) and exploration (UCB term). Is this understanding close? Fuming ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
80% is a big improvement. Indeed, I don't get that much in Erica. Aja - Original Message - From: Olivier Teytaud To: Aja ; computer-go Sent: Sunday, January 02, 2011 9:04 PM Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? Sorry for neglecting your contribution. :) You can neglect mine, no problem with that, it's a reality :-) Actually I make fillboard and nakade as features in Erica and they work fine. But indeed, efficiency is a problem in our probabilistic simulation. For us fillboard = around 80% with reasonnable time settings (increasing with comp. time); I'm afraid the improvement is smaller in other implementations (and I don't know why). Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Fuming, I agree with your understanding. As for the naming of UCT, I think it's fine, because we can just view such formula as assigned with zero coefficient of the exploration term. Aja - Original Message - From: Fuming Wang To: Aja Cc: computer-go@dvandva.org Sent: Sunday, January 02, 2011 10:35 PM Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? Hi Aja, On Sun, Jan 2, 2011 at 1:29 AM, Aja ajahu...@gmail.com wrote: Hi Fuming,  C*RAVE+(1-C)*UCT  C is computed dynamically in search, but not set to a fixed value. Maybe you mean UCT_C,  UCT=UCT_mean+UCT_C*exploration_term  What Petr and Olivier do, I think, is set UCT_C to 0, to disable the exploration_term, not the weight of RAVE. Without the exploring term, the UCT is just mean win rate, so there's no point in calling it UCT or UCB. Basically, what people have been saying is that currently the tree search is based on combination of sequence dependent rate (average win rate) and sequence independent/almost independent (rave rate) instead of combination of exploitation (win rate) and exploration (UCB term). Is this understanding close? Fuming ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Yamato and I have independently started development of Zen and FudoGo (resp) with sharing the same idea*, combining MoGo's sequence-like simulation policy and CrazyStone's larger patterns, though later Zen uses more complicated policy. *This is one of the reasons I have started the joint project DeepZen with Yamato. If Zen is not using probabilisic simulation (even with a lot of rules to make it less-random, like I do in Erica), I will be very surprised with its performance in few simulations. But anyway, I feel sorry to say anything about Zen that I am not really sure. Aja ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On Sun, Jan 2, 2011 at 10:31 AM, Olivier Teytaud olivier.teyt...@lri.fr wrote: mogo's MC was implemented in many strong programs, and influenced the MC of all other strong programs. I think there's no exception to this. I was personally of little influence on that, but I was here for clearly seeing that this is a big contribution. The playout policy I used in the 2007 version of Steenvreter was developed independently of the Mogo policy. However, some time after the olympiad I also implemented Mogo's version for comparison. IIRC the policy with Yizao's patterns was of comparable quality at equal numbers of simulations but a bit faster so therefore over-all better. I think Yizao's policy is the primary reason why Mogo got strong so fast in 2006. Erik ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Aja wrote: Yamato and I have independently started development of Zen and FudoGo (resp) with sharing the same idea*, combining MoGo's sequence-like simulation policy and CrazyStone's larger patterns, though later Zen uses more complicated policy. *This is one of the reasons I have started the joint project DeepZen with Yamato. If Zen is not using probabilisic simulation (even with a lot of rules to make it less-random, like I do in Erica), I will be very surprised with its performance in few simulations. But anyway, I feel sorry to say anything about Zen that I am not really sure. Zen uses sequence-like AND probabilistic simulation. Basically it plays around the previous move randomly like MoGo, and these moves are biased by gamma values like Crazy Stone. I am also trying to use probabilistic simulation on the whole board, but it does not yet succeed. The main problem is how to combine the semeai, seki and useless-move detection with it. -- Yamato ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On Sun, Jan 2, 2011 at 4:41 PM, Olivier Teytaud olivier.teyt...@lri.fr wrote: The playout policy I used in the 2007 version of Steenvreter was developed independently of the Mogo policy. However, some time after the olympiad I also implemented Mogo's version for comparison. IIRC the policy with Yizao's patterns was of comparable quality at equal numbers of simulations but a bit faster so therefore over-all better. Thanks for this interesting comment. Is your policy described somewhere, and was this comparison in 9x9 or 13x13 or 19x19 ? No, I never published anything on it. The comparison was only for 9x9. Erik ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On 01/02/2011 10:17 AM, Erik van der Werf wrote: The playout policy I used in the 2007 version of Steenvreter was developed independently of the Mogo policy. Did this policy include the idea of sequences (playing near the last move), and if so, was that developed independently? Memories are notoriously faulty. Priority has to be given to published papers and established results. It is not enough to say some months after MoGo established itself and published the ideas behind the implementation that you would have gotten there on your own with Steenvreter. ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On Sun, Jan 2, 2011 at 8:52 PM, Jeff Nowakowski j...@dilacero.org wrote: On 01/02/2011 10:17 AM, Erik van der Werf wrote: The playout policy I used in the 2007 version of Steenvreter was developed independently of the Mogo policy. Did this policy include the idea of sequences (playing near the last move), and if so, was that developed independently? Yes, the probabilities for playing near the last move were higher. This was btw a feature I already had in my move predictor back in 2001 (which was published at CG 2002). Memories are notoriously faulty. Memory is not an issue here. I can simply check my code at that time because I have it archived and under version control (and this is exactly what I did because I was also wondering when exactly I had done the comparison with Mogo's policy). Priority has to be given to published papers and established results. Agreed, as long as you don't deny things that were out there long before someone wrote 'the' paper. I have seen a rather natural progression from work of Brugman, Kaminski, Bouzy, Helmstetter, Hamlen, etc. to where we are today. Pinpointing the invention of MCTS in time or to a single person is simply not that easy. It is not enough to say some months after MoGo established itself and published the ideas behind the implementation that you would have gotten there on your own with Steenvreter. My reason for posting was not to brag. It was my own choice not to publish, so I accept it when laymen forget about Steenvreter. However, I think a guy like Jan Willemson, who's now almost entirely written out of the history of MCTS, really deserved some credit. Erik ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Erik, My reason for posting was not to brag. It was my own choice not to publish, so I accept it when laymen forget about Steenvreter. However, I think a guy like Jan Willemson, who's now almost entirely written out of the history of MCTS, really deserved some credit. Many people will be very interested if you publish something about Steenvreter. Beating Mogo without learning Mogo's paper at that time? That's incredible. :) Aja ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Aja, On Mon, Jan 3, 2011 at 1:28 AM, Aja ajahu...@gmail.com wrote: Many people will be very interested if you publish something about Steenvreter. Beating Mogo without learning Mogo's paper at that time? That's incredible. :) Well, that's not exactly true either. I did of course read some of their reports before the 2007 olympiad and some of it even directly contributed. One example is UCB-tuned. Another is how they parallellized the search. Also indirectly, discussions with various people and ideas that came by on this list must have influenced me somehow. The Mogo team definitely did some great work. The playouts just happened to be something for which I already had my own code. Perhaps it was more important that I had a bit different philosophy on what the playouts were supposed to do than most other people at that time. I wanted my playouts to evaluate positions well (and not necessarily play strong moves). So that's what I optimized them for, focusing especially on errors in human final positions, which was easy because I already had a large collection of scored 9x9 games from my time in Maastricht. As a consequence Steenvreter got most of the nakade and seki issues right long before others figured those things out... Best, Erik ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On 01/02/2011 05:24 PM, Erik van der Werf wrote: Agreed, as long as you don't deny things that were out there long before someone wrote 'the' paper. I have seen a rather natural progression from work of Brugman, Kaminski, Bouzy, Helmstetter, Hamlen, etc. to where we are today. Pinpointing the invention of MCTS in time or to a single person is simply not that easy. I agree completely, and stated as such in my first message that touched off this long thread (and repeated many times since). I never credited MoGo with inventing MCTS or UCT. ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Aja wrote: Zen uses sequence-like AND probabilistic simulation. Basically it plays around the previous move randomly like MoGo, and these moves are biased by gamma values like Crazy Stone. I am also trying to use probabilistic simulation on the whole board, but it does not yet succeed. The main problem is how to combine the semeai, seki and useless-move detection with it. Very surprised that Zen is using partly sequence-like simulation like Mogo. Erica uses probabilistic simulation completely. I am not sure if my implementation is good enough to be called succeed , but I am sure that you will succeed. I suppose that probabilistic simulation on the whole board is not suitable for some sort of heuristics, like seki, long semeai or ladder-safe. Does Erica handle those in the playouts correctly? -- Yamato ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Erik, From your words, I think the main reason that Steenvreter defeated Mogo and Crazy Stone is that Steenvreter handled life and death, seki and nakade much better than them. Still, many people will be interested if you publish something about I had a bit different philosophy on what the playouts were supposed to do than most other people at that time. :) Aja - Original Message - From: Erik van der Werf erikvanderw...@gmail.com To: Aja ajahu...@gmail.com; computer-go@dvandva.org Sent: Monday, January 03, 2011 9:16 AM Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? Hi Aja, On Mon, Jan 3, 2011 at 1:28 AM, Aja ajahu...@gmail.com wrote: Many people will be very interested if you publish something about Steenvreter. Beating Mogo without learning Mogo's paper at that time? That's incredible. :) Well, that's not exactly true either. I did of course read some of their reports before the 2007 olympiad and some of it even directly contributed. One example is UCB-tuned. Another is how they parallellized the search. Also indirectly, discussions with various people and ideas that came by on this list must have influenced me somehow. The Mogo team definitely did some great work. The playouts just happened to be something for which I already had my own code. Perhaps it was more important that I had a bit different philosophy on what the playouts were supposed to do than most other people at that time. I wanted my playouts to evaluate positions well (and not necessarily play strong moves). So that's what I optimized them for, focusing especially on errors in human final positions, which was easy because I already had a large collection of scored 9x9 games from my time in Maastricht. As a consequence Steenvreter got most of the nakade and seki issues right long before others figured those things out... Best, Erik ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Aja wrote: I suppose that probabilistic simulation on the whole board is not suitable for some sort of heuristics, like seki, long semeai or ladder-safe. Does Erica handle those in the playouts correctly? It's not true. Erica handle seki, semeai and ladder in the playouts combined with probabilistic simulation. Everything Mogo-type, fixed-sequence simulation can do, should also be able to be done in probabilistic simulation. Probabilistic simulation has more flexibility, as Olivier said. That means, you track seki or semeai status incrementally? Or, do you use a resampling algorithm that avoids to play seki-dame or something useless? -- Yamato ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Fuming, Most of the current strong programs are using UCT combined with RAVE (a kind of AMAF). The formula is like this (there are many variants), C*RAVE+(1-C)*UCT C is the weight of RAVE. As far as I know, there are at least two useful formula to compute C: 1. The first formula was proposed in the famous paper Combining Online and Offline Knowledge in UCT of Sylvain G. and David S (please refer to Section 6).(http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf). 2. The second (newer) was posted here in the past. Formal reference will be David Silver's PhD thesis. This new formula, according to my testing, is 70 elo stronger than the first one. RAVE is really a big invention. It's a big contribution of Mogo. We must thank Sylvain and David for bringing such powerful method to us. :) Aja - Original Message - From: Fuming Wang To: computer-go@dvandva.org Sent: Saturday, January 01, 2011 10:16 PM Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? So, the current strong programs are more like AMAF instead of UCT, right? Fuming On Sat, Jan 1, 2011 at 11:32 AM, David Fotland fotl...@smart-games.com wrote: I still have a UCB term, but it's probably because I depend more on Many Face's move generator. I have a rave term, but it's contribution is small. It seems that if the RAVE term is large, then Rave creates enough exploration by itself. David -Original Message- From: computer-go-boun...@dvandva.org [mailto:computer-go- boun...@dvandva.org] On Behalf Of Petr Baudis Sent: Friday, December 31, 2010 6:27 PM To: computer-go@dvandva.org Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? Hi! On Fri, Dec 31, 2010 at 07:02:35PM +0800, Fuming Wang wrote: Now I know Remi is the first to utilize MCTS. Guess I need to read papers more carefully. I do have a question though. I thought UCT is the foundation of the current strong programs, I know that a RAVE term is added to the original UCB term, i.e. sqrt(t_total/t_i), but the UCB term is still there right? Could you eleborate a bit on why do you say UCT is not good for Go? This is quite contradictory to a lot of material on the internet regarding the lastest bread of go programs. Most likely not all (e.g. it seems not ManyFaces?), but at least many programs use exploration coefficients that are either zero or negligibly small. In Pachi, I'm using 0 as the exploration coefficient in the end, it seems to work the best. But this probably also depends on the fact that I have slight forceful randomization of playouts. 0.02 can work well on 9x9 too, but it also depends on the priors, etc. Overally, it is a question of the overall tuning of the program. But right now, reasonably strong play with only RAVE and no UCB1 is certainly possible. -- Petr Pasky Baudis Computer science education cannot make an expert programmer any more than studying brushes and pigment can make an expert painter. --esr ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go -- ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Dear all, the original MCTS paper is by Rémi Coulom (to the best of my knowledge at least...). It's clear for us that we did not invent MCTS and always referenced Remi's paper. *1) For UCB-like formula:* - On the theoretical side, the consistency proof of MCTS without the UCT-like exploration also comes from mogo-people (Berthier et al); instead of saying that mogo's contribution is the introduction of UCT (which is good for other games but not for Go), I would have said that mogo's contribution is the analysis of MCTS without UCT (even if MCTS without UCT existed before mogo). - I'd like to point out that UCB-formulas are, I think, good for games with random part (e.g. with random transition, or with hidden information which leads to randomized strategies). But not for Go :-) *2) For works on the Monte-Carlo part:* For MoGo's contributions on the Monte-Carlo part, in particular with Yizao's patterns (with other people as well). There was a significant difference with previous attempts of designing good Monte-Carlo parts in the sense that a good Monte-Carlo part is not a MC-part which plays well as a standalone player, but a MC-part which plays well with a MCTS on top of it; this is unfortunately not very convenient as a criterion for designing a MC part... I think the main idea was the idea of balancing - the situation should not be better for one of the two players after one move by each player. This was claimed already in Sylvain's thesis and (I think) earlier than that. *3) Other mogo's contributions (I might forget many things...)* are around the fillboard option (which has a great impact for us on MoGo in 19x19), the nakade (maybe there were other simultaneous published methods for that), and the RAVE part (Brugmann, Gelly, Silver - Aja said that RAVE was invented by David and I have no idea on that, but I'm sure Sylvain contributed a lot on this and Brugmann did something ), the parallelization (Tristan Cazenave and others have published similar ideas; the Bourki et al paper has shown clear limitations in terms of scalability and counter-examples), the automatic building of patterns by direct policy search (J.-B. Hoock's papers) which can be used far from Go, the simultaneous use of patterns designed by supervised learning, patterns designed by policy search, rave values, expert knowledge with a dirty complicated formula :-) MoGo was also, I think, the first use of never-ending learning for designing automatically an opening book by MCTS (this was moderately good at first because we did not want to use expert knowledge at all, whereas human expertise was really necessary for guiding the search...), after months of work on a grid this provides very good results for 9x9 Go. The parallelization is only efficient for moderate time settings - otherwise we have the scalability plateau. For games with very expensive transitions it might be different... Best regards, Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi, I'd like to advise against using the exact algorithm I described in my 2006 paper. I compared it to UCT at that time, and UCT performed better. I am sorry I don't have a reference to my data any more. I posted the results to the mailing list. It used to be archived at that link: http://computer-go.org/pipermail/computer-go/2006-July/005737.html I hope somebody kept an archive an can forward it to the list. The title of that message was Experiments with UCT. It is too bad that the archive of that time is gone. Is it available anywhere? The google archive starts later. The biggest difference between the general MCTS idea I proposed and the UCT-like algorithms everybody is using today is that the backup operator was not the mean of playouts. I tried to consider selectivity and backup independently. I still believe it is a powerful idea. My feeling is that a good backup operator should be able to make a good decision independently of selectivity. That is to say, even if all the moves are explored uniformly, the backup operator should converge to min/max. Backing up the mean forces current programs to be extremely selective so that they rapidly converges to min/max. The only part of the tree where it is not important to converge rapidly to min/max is the root. I am aware of some experiments that indicate that being less selective at the root may improve strength. The efficiency of root parallelization may also be an indication that current algorithms are too selective at the root. So my vague intuition is that it may be worth trying to find a good backup operator that works with less selective search. Anyway, tweaking the MCTS formula you are using will never make your program very strong. Clever playouts are immensely more important, especially for 19x19. Rémi On 31 déc. 2010, at 07:02, Fuming Wang wrote: -- Forwarded message -- From: Fuming Wang fuming...@gmail.com Date: Fri, Dec 31, 2010 at 1:50 PM Subject: Re: [Computer-go] News on Tromp-Cook ? To: Aja ajahu...@gmail.com Hi Aja, Remi and S. Gelly's paper both come out in 2006,and I just checked that they did not reference each other. I just read Remi's paper again, and realized that CrazyStone's tree search approach is actually different from the popular UCT method. Similar to you, I haven't been able to get good results from the popular UCT method, so I might try CrazyStone's method for a change. In Remi's paper, CrazyStone is only having around 30% winning rate against Gnu Go 3.6, and now Erica is winning world competitions,this actually proves that high quality MC simulation (realized for the first time in MoGo) is more important than tree search algorithms. Best regards, Fuming ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On 31.12.2010 13:31, Rémi Coulom wrote: being less selective at the root may improve strength So what about P; P ca. 3 ply, at each of these plies, a) firstly filter obviously bad moves, b) secondly consider each still available move's children by MC / UCT? Or, more generally, dynamically increase P subject to hardware power. -- robert jasiek ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Olivier, Thanks for your clarification. Apologize for my wrong description about the invention of RAVE. I show the mail next again, Sorry, I might be wrong at RAVE. Maybe it should be: Sylvain proprosed the idea of RAVE and David Silver proposed a new formula for RAVE. Aja - Original Message - From: Olivier Teytaud To: Aja ; computer-go Sent: Friday, December 31, 2010 5:51 PM Subject: Re: [Computer-go] Fwd: News on Tromp-Cook ? Dear all, the original MCTS paper is by Rémi Coulom (to the best of my knowledge at least...). It's clear for us that we did not invent MCTS and always referenced Remi's paper. 1) For UCB-like formula: - On the theoretical side, the consistency proof of MCTS without the UCT-like exploration also comes from mogo-people (Berthier et al); instead of saying that mogo's contribution is the introduction of UCT (which is good for other games but not for Go), I would have said that mogo's contribution is the analysis of MCTS without UCT (even if MCTS without UCT existed before mogo). - I'd like to point out that UCB-formulas are, I think, good for games with random part (e.g. with random transition, or with hidden information which leads to randomized strategies). But not for Go :-) 2) For works on the Monte-Carlo part: For MoGo's contributions on the Monte-Carlo part, in particular with Yizao's patterns (with other people as well). There was a significant difference with previous attempts of designing good Monte-Carlo parts in the sense that a good Monte-Carlo part is not a MC-part which plays well as a standalone player, but a MC-part which plays well with a MCTS on top of it; this is unfortunately not very convenient as a criterion for designing a MC part... I think the main idea was the idea of balancing - the situation should not be better for one of the two players after one move by each player. This was claimed already in Sylvain's thesis and (I think) earlier than that. 3) Other mogo's contributions (I might forget many things...) are around the fillboard option (which has a great impact for us on MoGo in 19x19), the nakade (maybe there were other simultaneous published methods for that), and the RAVE part (Brugmann, Gelly, Silver - Aja said that RAVE was invented by David and I have no idea on that, but I'm sure Sylvain contributed a lot on this and Brugmann did something ), the parallelization (Tristan Cazenave and others have published similar ideas; the Bourki et al paper has shown clear limitations in terms of scalability and counter-examples), the automatic building of patterns by direct policy search (J.-B. Hoock's papers) which can be used far from Go, the simultaneous use of patterns designed by supervised learning, patterns designed by policy search, rave values, expert knowledge with a dirty complicated formula :-) MoGo was also, I think, the first use of never-ending learning for designing automatically an opening book by MCTS (this was moderately good at first because we did not want to use expert knowledge at all, whereas human expertise was really necessary for guiding the search...), after months of work on a grid this provides very good results for 9x9 Go. The parallelization is only efficient for moderate time settings - otherwise we have the scalability plateau. For games with very expensive transitions it might be different... Best regards, Olivier ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
On 12/31/2010 07:31 AM, Rémi Coulom wrote: I'd like to advise against using the exact algorithm I described in my 2006 paper. I compared it to UCT at that time, and UCT performed better. I am sorry I don't have a reference to my data any more. I posted the results to the mailing list. It used to be archived at that link: http://computer-go.org/pipermail/computer-go/2006-July/005737.html I hope somebody kept an archive an can forward it to the list. The title of that message was Experiments with UCT. It is too bad that the archive of that time is gone. Is it available anywhere? The google archive starts later. Here: http://thread.gmane.org/gmane.games.devel.go/6366 Archives and current messages for this list can be found at: http://blog.gmane.org/gmane.games.devel.go I don't like the web interface, however. It's very hard to browse the threads for a particular month without a lot of clicking. The search is OK, though. What I do is use the traditional NTTP news interface to gmane. Thunderbird (my newsreader) has an option to download all the messages, which is nice for browsing messages, both old and new. ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Remi, Thanks for the suggestions. Sorry for inaccuracies in my previous statements. Now I have read your paper more carefully, I find in the appendex many discussions related to improvements on playout move selections. On another note, I find that formula you gave on variance calculation very interesting. I am a little confused on the meaning of symble sigma in the formula, is it the number of wins of S trials? If that's the case, then the meaning of sigma_sup(2) can not be easily understood. Best regards, Fuming On Fri, Dec 31, 2010 at 8:31 PM, Rémi Coulom remi.cou...@free.fr wrote: Hi, I'd like to advise against using the exact algorithm I described in my 2006 paper. I compared it to UCT at that time, and UCT performed better. I am sorry I don't have a reference to my data any more. I posted the results to the mailing list. It used to be archived at that link: http://computer-go.org/pipermail/computer-go/2006-July/005737.html I hope somebody kept an archive an can forward it to the list. The title of that message was Experiments with UCT. It is too bad that the archive of that time is gone. Is it available anywhere? The google archive starts later. The biggest difference between the general MCTS idea I proposed and the UCT-like algorithms everybody is using today is that the backup operator was not the mean of playouts. I tried to consider selectivity and backup independently. I still believe it is a powerful idea. My feeling is that a good backup operator should be able to make a good decision independently of selectivity. That is to say, even if all the moves are explored uniformly, the backup operator should converge to min/max. Backing up the mean forces current programs to be extremely selective so that they rapidly converges to min/max. The only part of the tree where it is not important to converge rapidly to min/max is the root. I am aware of some experiments that indicate that being less selective at the root may improve strength. The efficiency of root parallelization may also be an indication that current algorithms are too selective at the root. So my vague intuition is that it may be worth trying to find a good backup operator that works with less selective search. Anyway, tweaking the MCTS formula you are using will never make your program very strong. Clever playouts are immensely more important, especially for 19x19. Rémi On 31 déc. 2010, at 07:02, Fuming Wang wrote: -- Forwarded message -- From: Fuming Wang fuming...@gmail.com Date: Fri, Dec 31, 2010 at 1:50 PM Subject: Re: [Computer-go] News on Tromp-Cook ? To: Aja ajahu...@gmail.com Hi Aja, Remi and S. Gelly's paper both come out in 2006,and I just checked that they did not reference each other. I just read Remi's paper again, and realized that CrazyStone's tree search approach is actually different from the popular UCT method. Similar to you, I haven't been able to get good results from the popular UCT method, so I might try CrazyStone's method for a change. In Remi's paper, CrazyStone is only having around 30% winning rate against Gnu Go 3.6, and now Erica is winning world competitions,this actually proves that high quality MC simulation (realized for the first time in MoGo) is more important than tree search algorithms. Best regards, Fuming ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go ___ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
Re: [Computer-go] Fwd: News on Tromp-Cook ?
Hi Fuming, Remi's CG2006 paper is published/released half an year earlier. CG2006 was on 2006/5/29-31 at Turin, Italy. Mogo's paper (http://hal.inria.fr/docs/00/12/15/16/PDF/RR-6062.pdf) was released on 2006/11. That's why Mogo's paper cited Remi's paper (please see reference [9] in this report). As for MCTS, except the UCB formula, can you point out what is the difference between Remi's MCTS and Mogo's UCT? In my understand, UCT is exactly a selection strategy of MCTS. This is how Chaslot classified in his paper Progressive Strategies for Monte-Carlo Tree Search. Aja - Original Message - From: Fuming Wang To: computer-go@dvandva.org Sent: Friday, December 31, 2010 2:02 PM Subject: [Computer-go] Fwd: News on Tromp-Cook ? -- Forwarded message -- From: Fuming Wang fuming...@gmail.com Date: Fri, Dec 31, 2010 at 1:50 PM Subject: Re: [Computer-go] News on Tromp-Cook ? To: Aja ajahu...@gmail.com Hi Aja, Remi and S. Gelly's paper both come out in 2006,and I just checked that they did not reference each other. I just read Remi's paper again, and realized that CrazyStone's tree search approach is actually different from the popular UCT method. Similar to you, I haven't been able to get good results from the popular UCT method, so I might try CrazyStone's method for a change. In Remi's paper, CrazyStone is only having around 30% winning rate against Gnu Go 3.6, and now Erica is winning world competitions,this actually proves that high quality MC simulation (realized for the first time in MoGo) is more important than tree search algorithms. Best regards, Fuming On Fri, Dec 31, 2010 at 12:41 PM, Aja ajahu...@gmail.com wrote: Hi Fuming, The idea of improving the quality of simulation is more earlier, than Mogo’s paper, in the Appendix A of Remi Coulom’s CG2006 paper “Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search”(http://remi.coulom.free.fr/CG2006/CG2006.pdf): The choice of a more clever probability distribution can improve the quality of the Monte-Carlo estimation I am not sure if Remi was the first one proposing this concept in Computer Go field, but Mogo definitely was not. I was in Amstertam attending Computer Olympiad 2007, in the team of Chinese chess program “Deep Elephant”. I played with Crazy Stone, Mogo and was very surprised to see they beat me. Afterwards, Mogo’s paper is so easy to understand/implement for me that trigger me to work on Computer Go. Indeed, Mogo has huge contributions, especially in the popularization of MCTS. I don’t mean to weaken or deny it, but just want to point out Crazy Stone’s great contributions. In Erica, I use CrazyStone-like simulations successfully. Mogo-type simulation almost does not help Erica at all. If we want to numerate the strongest programs, we cannot forget Fuego(2010 UEC Cup winner) and MyGoFriend(Computer Olympiad 2010, 9x9 winner). For academic progress, we cannot forget Crazy Stone. For practical development usage, we cannot forget GnuGo and GoGui released by Fuego team. There were really too many contributors in the past. Happy New Years to all. Aja From: Fuming Wang Sent: Friday, December 31, 2010 10:16 AM To: Aja ; computer-go@dvandva.org Subject: Re: [Computer-go] News on Tromp-Cook ? This is certainly a good time to sit back and look at what got us here. The following key ideas have been mentioned so far: UCB, MCTS, RAVE, Pattern and Go knowledge during MC simulation.These ideas are all essential to a strong MC based Go program.If we want to pick the most important idea that got us here, I would say it is the realization that adding Go Pattern and Go Knowledge to MC simulation can significantly improve the quality of board evaluation. This is amount to the important sampling concept in MC integration, which is very import for Monte Carlo applications in many fields. MC simulation with importance sampling give us for the first time a reasonablly accurate evaluation function for Go. UCB, MCTS, RAVE are certainly very important, however, it is still possible that new approaches that can achieve good results with just importantly samples MC simulation. So, I think MoGo is the most important break-through. Happy New Year, everyone! Fuming On Fri, Dec 31, 2010 at 9:20 AM, Aja ajahu...@gmail.com wrote: Hi Jeff, When, do you think, did Mogo started dominating all the KGS computer events and CGOS, and also was the first to extend that dominance from 9x9 to 19x19.? In Computer Olympiad 2007, Steenvreter was gold medal on 9x9. At the final match of 19x19, it's easily to see that Mogo and Crazy Stone were close (finally Mogo 1st and CS 2ed). But, at the end of 2007, Crazy Stone defeated Mogo and won the UEC Cup (19x19). Afterwards, Many Faces won 9x9 and 19x19 on 2008. Zen and Erica won 2009 and 2010, both continuing Crazy