Re: [computer-go] Monte-Carlo Go Misnomer?
Hello, Is there any known (by theory or tests) function of how much a increase in the strength of the simulation policy increases the strength of the MC/UCT Program as a whole? I think that is a very interesting question. In our work on MoGo we found that there could be a decrease of the strength of the MC/UCT program while using a stronger simulation policy. It is why in MoGo it is more the sequence idea, than the strength idea. Our best simulation policy is quite weak compared to others we tested. But we have further experiments, in a work with David Silver from the university of Alberta. We found out that the relation strong simulation policy = strong MC program is wrong at a much larger scale. So the intransivity is true even with much much stronger simulation policies. Of course there is the simple counter example of a deterministic player. But our results hold even if we randomise (in a lot of manners, and tuning as best as we can the parameters) the much stronger policy. I have some theory about this phenomenon in general, but not enough polished for the moment. I really think that understanding deeply this experimental evidence, deeper than some intuition, would help going further. But maybe some already did. Sylvain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
In our work on MoGo we found that there could be a decrease of the strength of the MC/UCT program while using a stronger simulation policy. It is why in MoGo it is more the sequence idea, than the strength idea. Our best simulation policy is quite weak compared to others we tested. Could you elaborate on the sequence idea? Or point me to somewhere where you have already done so? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Could you elaborate on the sequence idea? Or point me to somewhere where you have already done so? Sure. We already describe it in our research report that you can find there: http://hal.inria.fr/inria-00117266 or there if you prefers the web interface in english but with a longer URL :-) http://hal.inria.fr/index.php?halsid=f72d067037f3b8f161448b400d117210view_this_doc=inria-00117266version=3 Sylvain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Hello, David Doshay wrote: On 3, Feb 2007, at 2:51 AM, Sylvain Gelly wrote: the speed of the best simulation policy in MoGo is 0.6 * the speed of the uniform random one. I think that this is very good. You give up less than a factor of 2 from uniform random and you probably get better than a factor of 2 in the % of relevant moves. Is there any known (by theory or tests) function of how much a increase in the strength of the simulation policy increases the strength of the MC/UCT Program as a whole? Imagine you're using a almost random playout that has a strength, if playing alone, of say 10 ELO (just a number, I don't know how strong it really is). Using some thousand MC simulations with that 10 ELO playout and some other clever tricks like UCT, and you get a program that plays with 1500 ELO. Now what if you change the playout to one that has a strength, if playing alone, of 20 ELO, 50 ELO or 100 ELO. How whould that affect the strength of your overall programm, if you run the same number of MCs? Any known numbers on that? [I understand that a non-random playout would decrease the speed, so couln't run as many simulations as before in the same time, but let's imagine we get it for free.] eph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Unfortunately, it's just not that simple, because it depends far more on _how_ the playout is improved, rather than, say, the ELO numbers that measure that improvement. For example, programs exist that are good (in part) because they entirely disregard some lines of play. They may be correct to disregard these lines in almost every case, which generally makes the playout program stronger. However, for the few cases where this heuristic pruning is not correct, the calling program will suffer greatly, because these lines of play become completely invisible to the random playouts, no matter how many playouts are performed. As an extreme example, consider programs that play completely deterministic strategies. These are obviously useless as random players, yet it is in principle possible to construct ones that play arbitrarily well. Weston On 2/7/07, Ephrim Khong [EMAIL PROTECTED] wrote: Is there any known (by theory or tests) function of how much a increase in the strength of the simulation policy increases the strength of the MC/UCT Program as a whole? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Hello Don, I have the same situation with an older program that has a more sophisticated data strucutre. If I was building a smarter program I would rather use the older but slower code but I cannot optimize it any further for simple randomly uniform play-outs. If I compared the old code doing simple play-outs to the old code doing the sophisticated evaluation, it might not look so bad because the older program is doing uneeded work that is of no benefit to a simple program. You are 100% right. However, you could design an optimized version from scratch for the sophisticated evaluation that would run much faster. I believe in a *2,*3 speed. Sylvain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
From: Sylvain Gelly [EMAIL PROTECTED] I truly believe that we can make big steps in this direction and getting much better (real not simulation) players that way. Hi all, Thanks for answering Sylvain. I find MoGo already very impressive, so it surprised me that you think that more big steps are possible. The number of playouts per second for MoGo is very low (maybe the lowest of all MC players?). Is this caused by the large number of calculations necessary to get a good simulation policy? Or is it possible for MoGo to get a speedup like Lukasz Lew has accieved with his library? Edward _ Geef jouw Hotmail kleur met Windows Live Mail! Stap nu over! http://imagine-windowslive.com/mail/launch/default.aspx?Locale=nl-nl ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Hi Edward and hi all, I truly believe that we can make big steps in this direction and getting much better (real not simulation) players that way. I find MoGo already very impressive, so it surprised me that you think that more big steps are possible. I do really think that it is only a first step, and the great improvements are to do and not done. I am really optimistic for the community (and not for MoGo as I don't now the time I can spend on it for the next months). The number of playouts per second for MoGo is very low (maybe the lowest of all MC players?). Is this caused by the large number of calculations necessary to get a good simulation policy? Or is it possible for MoGo to get a speedup like Lukasz Lew has accieved with his library? The slowness of MoGo's simulations come from two parts. Of course improving the simulation policy make it slower. But also we designed MoGo so that we can add and test quickly new ideas. So there are a lot of unnecessary computations during the simulations. I believe that making specific optimisations by a good developer, like Lukasz Lew did, we could achieve a *2, *3 in the speed, perhaps even more? But I also think that it should be done at the end, when we think that the simulation policy is the best we can get, and I don't think it is time. To give a idea, the speed of the best simulation policy in MoGo is 0.6 * the speed of the uniform random one. Sylvain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
I seriously doubt a highly optimized MoGo would be able to stay this close to uniform random in speed. It's already been pointed out that a lot of MoGo is infrastructure to support interesting experiments.I'm guessing here, but my guess is that the uniform random implementation is suffering more from the baggage of this infrastructure than the best simulation policy version that I would guess is benefitting more from this infrastructure. Am I right or wrong Sylvain? - Don On Sat, 2007-02-03 at 10:31 -0800, David Doshay wrote: On 3, Feb 2007, at 2:51 AM, Sylvain Gelly wrote: the speed of the best simulation policy in MoGo is 0.6 * the speed of the uniform random one. I think that this is very good. You give up less than a factor of 2 from uniform random and you probably get better than a factor of 2 in the % of relevant moves. This has been the biggest reason we have delayed adding MC to SlugGo: how do you keep the randomly selected moves anywhere near the relevant moves? With the high branching factor we face in Go, this seems most important to me. And MoGo has made huge strides in that direction. Cheers, David ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Your english is fine, it's my grammer that could use some help. I made it more complicated than it needed to be. You mentioned that you have a slow but flexible data structure that supports lots of experimentation but makes the program slower. But it occured to me that a simple version that only does uniformly random play-outs - can't make any use of this extra sophistication - therefore it must suffer more of a speed penalty. I have the same situation with an older program that has a more sophisticated data strucutre. If I was building a smarter program I would rather use the older but slower code but I cannot optimize it any further for simple randomly uniform play-outs. If I compared the old code doing simple play-outs to the old code doing the sophisticated evaluation, it might not look so bad because the older program is doing uneeded work that is of no benefit to a simple program. - Don On Sat, 2007-02-03 at 20:26 +0100, Sylvain Gelly wrote: I seriously doubt a highly optimized MoGo would be able to stay this close to uniform random in speed. We certainly could not achieve 0.6 the speed Lukasz can achieve with uniform random. is suffering more from the baggage of this infrastructure than the best simulation policy version that I would guess is benefitting more from this infrastructure. Sorry I don't understand :-(. Is the second more of the sentence a mistake, or my english too poor? If I remove the second more, do you mean benefitting for the speed or the accuracy? Sylvain On Sat, 2007-02-03 at 10:31 -0800, David Doshay wrote: On 3, Feb 2007, at 2:51 AM, Sylvain Gelly wrote: the speed of the best simulation policy in MoGo is 0.6 * the speed of the uniform random one. I think that this is very good. You give up less than a factor of 2 from uniform random and you probably get better than a factor of 2 in the % of relevant moves. This has been the biggest reason we have delayed adding MC to SlugGo: how do you keep the randomly selected moves anywhere near the relevant moves? With the high branching factor we face in Go, this seems most important to me. And MoGo has made huge strides in that direction. Cheers, David ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
David Doshay wrote: I am a physics guy, and my thesis project was a large MC simulation. The clusters that run SlugGo are usually busy doing MC simulations when not playing Go. In general MC needs to sample according to the proper distribution for the problem. For some problems in quantum mechanics and most in statistical mechanics, the distribution cleanly partitions into percentages of the total, and in those simulations it is easy to do things like generate random numbers and then see what range the random number is in. For Go I could easily argue that sampling random points on the board is clearly the wrong distribution, and those programs using some kind of pattern knowledge are really doing something much closer to MC simulations rather than true random playout. So, I do not think that MC is the misnomer. Thinking that pure random playout is the same as MC is the mistake. Got it. I was thinking Monte Carlo (name based on the gambling city) meant it must be random, but looking into it deeper other statistical sampling methods designed specifically for the problem to reduce the number of simulations can and are often used. Looks like there is some level of confusion about this out on the net too. Wikipedia perhaps needs updating for one. Thanks for clarifying that. Go requires a pretty complex simulation to be run and using more selective moves for the play-outs, if not done very carefully, could have an adverse effect by being too restrictive or sampling the wrong distribution. I'm sure random is not the best distribution also, but at least it is not biased. Are there any methods from other MC research areas to help find good sampling distributions and reduce variance or is Go just wildly different than most domains where MC is used? Matt ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
I agree with what you say here and I'll try to make my point better. First, my limited experience working with Monte-Carlo simulations involved photons traveling through scattering media. The sequences of moves described in the Mogo paper are pleasantly reminiscent of this. I did not mean to suggest that heavy playouts, incorporating go knowledge, makes the Monte-Carlo description incorrect. Rather, the term is increasingly insufficient as a Go engine description because it lumps together such very different algorithms. The earliest MC engines were extremely simple and easily described. It seems inevitable that someone new to the field will seize on this description, and then combine it with the success of current Monte-Carlo engines, leading to unnecessary confusion. On a tangential note: MC/UCT with light playouts and MC/UCT with heavy playouts are different beasts. If SlugGo's mixture of experts work expands to include MC/UCT, you might want to consider adding one of each. - Dave Hillis -Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Fri, 2 Feb 2007 12:30 AM Subject: Re: [computer-go] Monte-Carlo Go Misnomer? I am a physics guy, and my thesis project was a large MC simulation. The clusters that run SlugGo are usually busy doing MC simulations when not playing Go. In general MC needs to sample according to the proper distribution for the problem. For some problems in quantum mechanics and most in statistical mechanics, the distribution cleanly partitions into percentages of the total, and in those simulations it is easy to do things like generate random numbers and then see what range the random number is in. For Go I could easily argue that sampling random points on the board is clearly the wrong distribution, and those programs using some kind of pattern knowledge are really doing something much closer to MC simulations rather than true random playout. So, I do not think that MC is the misnomer. Thinking that pure random playout is the same as MC is the mistake. Cheers, David On 1, Feb 2007, at 8:33 PM, [EMAIL PROTECTED] wrote: I think of it as a continuum going from light to heavy. Pure random playouts are the lightest. But then you add some rules about filling in eyes, then maybe discourage self-atari,... and the playouts keep getting heavier. I agree with you that the current crop of MC engines are not nearly as go-knowledge naive as the name implies. - Dave Hillis -Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Thu, 1 Feb 2007 11:22 PM Subject: [computer-go] Monte-Carlo Go Misnomer? Is MC Go a misnomer for programs in this genre not using simple random playouts and combining with other techniques like pattern matching? Technically, does the general Monte-Carlo method require random or pseudo-random sampling? If so, should we dub a new name for these non-random deep play-out sampling based go programs? Maybe Quasi-MC or Directed Sampling... ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ Check Out the new free AIM(R) Mail -- 2 GB of storage and industry- leading spam and email virus protection. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ Check Out the new free AIM(R) Mail -- 2 GB of storage and industry-leading spam and email virus protection. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
On 2, Feb 2007, at 9:08 AM, [EMAIL PROTECTED] wrote: I agree with what you say here and I'll try to make my point better. First, my limited experience working with Monte-Carlo simulations involved photons traveling through scattering media. The sequences of moves described in the Mogo paper are pleasantly reminiscent of this. I did not mean to suggest that heavy playouts, incorporating go knowledge, makes the Monte-Carlo description incorrect. Rather, the term is increasingly insufficient as a Go engine description because it lumps together such very different algorithms. It is clear that a wide variety of MC attempts are in use and under construction, each with their own way of attempting to construct a proper distribution (and some with no idea that this is even an important part of the procedure). I think that MoGo is being so successful because they have gone the farthest is getting the sampling closest to a distribution appropriate for Go. Unfortunately, there is nothing like a Boltzman function that helps us with Go. The earliest MC engines were extremely simple and easily described. It seems inevitable that someone new to the field will seize on this description, and then combine it with the success of current Monte-Carlo engines, leading to unnecessary confusion. I am not sure what you mean by that. Do you mean that different people will use the term MC in different ways and cause confusion in the minds of third parties? In other words, some people are using the simplest pure random playout (no consideration of distribution at all) and calling that MC while others are trying hard to keep the moves searched Go-like. and thus get different results. Or do you mean that naive programmers will try pure random playout and wonder why the MoGo folks are doing so very much better, not realizing the importance of getting a decent distribution for MC to be effective? On a tangential note: MC/UCT with light playouts and MC/UCT with heavy playouts are different beasts. If SlugGo's mixture of experts work expands to include MC/UCT, you might want to consider adding one of each. I am quite open minded about what kind of experts we add to SlugGo. We are limited more by the time required to implement and integrate than anything else. Integration is a problem that can be quite severe for experts (or move suggesters) with completely different evaluation functions. How does one arbitrate between suggestions that come to you with evaluations of move quality that are on scales that are not co-linear? This point has kept SlugGo as a pure GNU Go dependent engine for longer than we had originally expected. But I am not sure what the value is in what you are calling light playouts. As per the above, it seems to me that light playout is simply ignoring any proper distribution, and thus is just a much more inefficient way to sample. Cheers, David ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
David Doshay wrote: But I am not sure what the value is in what you are calling light playouts. As per the above, it seems to me that light playout is simply ignoring any proper distribution, and thus is just a much more inefficient way to sample. Well maybe Sylvain is willing to answer some related questions of mine. Mogo seems to be much stronger then the rest, so I am more then curious what is more important: 1) A clever random playaout (I assume this is called a proper distribution now?) OR 2) Clever things within the UTC tree and/or above that. (Go knowlegde, selectivity etc.) And: is the answer the same for 9x9 and 19x19 boards? Thanks, Edward de Grijs _ Nieuw: Live Mail. Mis het niet en profiteer direct van de voordelen! http://imagine-windowslive.com/mail/launch/default.aspx?Locale=nl-nl ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
-Original Message- From: [EMAIL PROTECTED] ... The earliest MC engines were extremely simple and easily described. It seems inevitable that someone new to the field will seize on this description, and then combine it with the success of current Monte-Carlo engines, leading to unnecessary confusion. I am not sure what you mean by that. Do you mean that different people will use the term MC in different ways and cause confusion in the minds of third parties? In other words, some people are using the simplest pure random playout (no consideration ^of distribution at all) and calling that MC while others are trying hard to keep the moves searched Go-like. and thus get different ^results. Or do you mean that naive programmers will try pure random playout and wonder why the MoGo folks are doing so very much better, not realizing the importance of getting a decent distribution for MC to be effective? I mean both. But IMHO the most serious distinction is whether the MC is combined with tree search or not. I'm not embarking on a nomenclature crusade. Just remarking that when people say Monte-Carlo does this or that, it's often ambiguous what algorithm they are actually talking about. On a tangential note: MC/UCT with light playouts and MC/UCT with heavy playouts are different beasts. If SlugGo's mixture of experts work expands to include MC/UCT, you might want to consider adding one of each. I am quite open minded about what kind of experts we add to SlugGo. We are limited more by the time required to implement and integrate than anything else. Integration is a problem that can be quite severe for experts (or move suggesters) with completely different evaluation functions. How does one arbitrate between suggestions that come to you with evaluations of move quality that are on scales that are not co-linear? This point has kept SlugGo as a pure GNU Go dependent engine for longer than we had originally expected. Well this is a common problem when combining heterogeneous experts and there is no shortage of approaches, just as you point out, a shortage of time and energy to experiment with them. One very simple approach that sometimes works in some domains is to gather a number of experts, have them rank the choices and combine at that level. In this case, it's usually good to have experts that are qualitatively different. But I am not sure what the value is in what you are calling light playouts. As per the above, it seems to me that light playout is simply ignoring any proper distribution, and thus is just a much more inefficient way to sample. AntIgo with heavy playouts is about 300 ELO points stronger on CGOS than AntIgo with light playouts. If you're going to choose just one, I don't think it's a close call. - Dave Hillis Check Out the new free AIM(R) Mail -- 2 GB of storage and industry-leading spam and email virus protection. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
[EMAIL PROTECTED] wrote: -Original Message- From: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] ... The earliest MC engines were extremely simple and easily described. It seems inevitable that someone new to the field will seize on this description, and then combine it with the success of current Monte-Carlo engines, leading to unnecessary confusion. I am not sure what you mean by that. Do you mean that different people will use the term MC in different ways and cause confusion in the minds of third parties? In other words, some people are using the simplest pure random playout (no consideration ^of distribution at all) and calling that MC while others are trying hard to keep the moves searched Go-like. and thus get different ^results. Or do you mean that naive programmers will try pure random playout and wonder why the MoGo folks are doing so very much better, not realizing the importance of getting a decent distribution for MC to be effective? I mean both. But IMHO the most serious distinction is whether the MC is combined with tree search or not. I'm not embarking on a nomenclature crusade. Just remarking that when people say Monte-Carlo does this or that, it's often ambiguous what algorithm they are actually talking about. This is the condition that led me to first think of and submit the original question. Sometimes you can't tell what people are talking about when they mention MC go. So does anyone have any thoughts on the follow-up comment and question? That selective move playouts could have adverse effects, and that random is probably not very good but not biased. Also, what MC distribution techniques (from general or specific research areas) might be used to help create better sampling distributions and reduce variance, improving overall results? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] Monte-Carlo Go Misnomer?
Is MC Go a misnomer for programs in this genre not using simple random playouts and combining with other techniques like pattern matching? Technically, does the general Monte-Carlo method require random or pseudo-random sampling? If so, should we dub a new name for these non-random deep play-out sampling based go programs? Maybe Quasi-MC or Directed Sampling... ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/