Re: [computer-go] New scalability study : show uncertainty ?
On Tue, Jan 22, 2008 at 09:51:11PM +0100, Alain Baeckeroot wrote:

> On Tuesday 22 January 2008, Michael Williams wrote:
> > ... perhaps only uniformly random playouts will scale to perfection.
> > The reason that MC/UCT scales to perfection is because of the UCT part, not the MC (playout) part. People seem to forget this a lot.
>
> I agree on this _only_ if the UCT checks all possible moves. If not, one can be limited by the quality of the playout.

I think we may be confusing two different things here:

a) Using all possible moves in the playouts to evaluate a leaf in the UCT tree
b) Making the UCT search all possible moves in a position

These two are related, and I suspect people often use the same code for listing the possible moves, so they tend to be the same in many programs. Theoretically speaking, errors and bias in these two may well result in different things.

Most MC implementations (that I know of) avoid playing in one-point eyes. That is already a deviation from all legal moves, but one that makes perfectly good sense. Yet there is at least one exception, where playing into a one-point eye can create a nakade and kill a surrounding group...

The selection of possible moves for a node in the UCT tree can be somewhat slower, since it is not done nearly as often. Also, adding bad moves here costs less than in the MC playout, since the UCT algorithm can see that they will not lead anywhere and will not give them so much attention.

I don't (yet?) have a UCT program, so I cannot test this. Some day when I have one, I will try to see how much it helps or hurts to try all legal moves in the UCT portion... If someone else tries it first, let us all know!

- Heikki

-- Heikki Levanto "In Murphy We Turst" heikki (at) lsd (dot) dk

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
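The one-point-eye pruning discussed above can be sketched as a playout move filter. Everything here (the board representation, `SIZE`, and the helper names) is a hypothetical illustration, not code from any program mentioned in the thread; real programs also check suicide, ko, and false eyes.

```python
# Sketch of the common playout restriction: never fill your own
# one-point eye.  Board is a dict from (x, y) to a stone character;
# all names and the representation are illustrative assumptions.

SIZE = 5
EMPTY, BLACK, WHITE = '.', 'X', 'O'

def neighbors(pt):
    x, y = pt
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < SIZE and 0 <= ny < SIZE:
            yield (nx, ny)

def is_one_point_eye(board, pt, color):
    """True if pt is empty and every on-board neighbor is `color`."""
    return board.get(pt, EMPTY) == EMPTY and all(
        board.get(n) == color for n in neighbors(pt))

def playout_moves(board, color):
    """All empty points except one-point eyes of the side to move."""
    empties = [(x, y) for x in range(SIZE) for y in range(SIZE)
               if board.get((x, y), EMPTY) == EMPTY]
    return [pt for pt in empties if not is_one_point_eye(board, pt, color)]

board = {}
for n in neighbors((0, 0)):          # black stones around the corner point
    board[n] = BLACK
print(is_one_point_eye(board, (0, 0), BLACK))   # True
print((0, 0) in playout_moves(board, BLACK))    # False: pruned in playouts
```

This is exactly the rule that breaks perfect play in the nakade exception Heikki mentions: the filter can never be 100% safe as long as filling the eye is sometimes the only winning move.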
Re: [computer-go] New scalability study : show uncertainty ?
Quoting Heikki Levanto [EMAIL PROTECTED]:

> > I agree on this _only_ if the UCT checks all possible moves. If not, one can be limited by the quality of the playout.
>
> I think we may be confusing two different things here:
> a) Using all possible moves in the playouts to evaluate a leaf in the UCT tree
> b) Making the UCT search all possible moves in a position
> These two are related, and I suspect often people use the same code for listing the possible moves, so they tend to be the same in many programs.

Indeed, Valkyria uses the same code to prune moves both in the playouts and in the UCT tree. This pruning is supposed to be 100% safe and applies to really bad and ugly moves. But it is really hard to get this right, and I still find a lot of bugs of this kind. There is an advantage to using the same code in the UCT part: when I watch the program play, I can see mistakes in pruning which would otherwise go unseen in the playouts.

In the latest version of Valkyria I make the exception that all moves are allowed if there is a ko fight, but I do not know whether this is necessary at all.

-Magnus
Re: [computer-go] New scalability study : show uncertainty ?
On Wed, Jan 23, 2008 at 11:18:37AM +0100, Magnus Persson wrote:

> Indeed, Valkyria uses the same code to prune moves both in the playouts and in the UCT tree. This pruning is supposed to be 100% safe and applies to really bad and ugly moves.

But you still prune moves like filling a one-point eye. We know that there is a pathological case where that is indeed the correct move. So Valkyria will never converge to perfect play, even with unlimited CPU power.

> But it is really hard to get this right, and I still find a lot of bugs of this kind. There is an advantage to using the same code in the UCT part: when I watch the program play, I can see mistakes in pruning which would otherwise go unseen in the playouts.

That is a valid point. Not to the theoretical discussion, but in practical everyday life!

-H
Re: [computer-go] New scalability study : show uncertainty ?
Quoting Heikki Levanto [EMAIL PROTECTED]:

> But you still prune moves like filling a one-point eye. We know that there is a pathological case where that is indeed the correct move. So Valkyria will never converge to perfect play, even with unlimited CPU power.

Yes, this is a known bug. :) But I will not fix it until I have fixed some more urgent things (there is a very long list with higher priority). Also, I feel that against a really strong program attempting perfect play with this bug as its only defect, the opponent would have to be extremely strong (or extremely lucky) to exploit it. It might not even be exploitable against otherwise perfect play, because there may always be simpler ways to kill the group than making an eye and then filling it, or alternatively letting it live small and winning on points.

MoGo is a good example of this. I think Valkyria has an edge against MoGo in many life-and-death situations, but often MoGo just puts territorial pressure on the groups, leading to positions where the opponent either defends and loses on points, or takes a gamble, tenukis, and dies without tactical complications.

Magnus
Re: [computer-go] New scalability study : show uncertainty ?
> I don't think only uniformly random playouts will scale to perfection, because what we need for playouts is not just a simple average of final scores but a maximum (in the negamax sense) score. It should be the perfect evaluation function. In other words, since MC simulation is a way to get an average of a value, when applying it to optimization problems we need some way to focus the simulations on the _peak_ in the state space. This may be obvious when one considers life-and-death problems, where the best move, which leads to the maximum score (life), is the only one and all other moves are bad. At such positions it makes almost no sense to simulate all legal moves with the same probability. So, IMHO, biasing simulations is not just a speed-up technique but is essentially important.

I agree, but what I meant about uniformly random playouts is the following: what makes a move outstanding is being unpredictable. For a total novice, playing at the key point of a bulky five may look like a touch of genius, but when you learn a little, it's an obvious move. The difference between a 5p and a 9p may be one or two moves nobody can predict (except a 9p). When we add knowledge, we find the _ordinary_ good moves faster and we make weaker moves less probable, but that comes at a price: the price of making outstanding, unpredictable moves less probable as well. Perhaps that introduces a ceiling. I thought that was what you were also pointing at.

Of course, I don't claim uniformly random playouts are good. I just claim that they should (purely as an infeasible theoretical argument) scale to perfection, though that scaling doesn't have to be linear.

Jacques.
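Hideki's point that a simple average is not a maximum "in the negamax sense" can be seen in a toy example. The two-move game tree and the payoff numbers below are my own construction for illustration, not from the thread:

```python
# Toy demonstration: averaging over uniformly random opponent replies can
# prefer a move that best play refutes.  Payoffs are our score in [0, 1];
# the opponent picks the reply that minimizes it.

tree = {
    'A': [1.0, 1.0, 0.0],   # opponent has a refutation (the 0.0 reply)
    'B': [0.5, 0.5, 0.5],   # no refutation; always 0.5 for us
}

def mc_value(replies):
    """Uniformly random opponent: simple average of outcomes."""
    return sum(replies) / len(replies)

def minimax_value(replies):
    """Perfect opponent: picks the reply worst for us."""
    return min(replies)

best_by_average = max(tree, key=lambda m: mc_value(tree[m]))
best_by_minimax = max(tree, key=lambda m: minimax_value(tree[m]))
print(best_by_average, best_by_minimax)   # A B
```

Raw averaging prefers A (mean 2/3) even though the opponent can refute it, while B is the minimax-correct choice. This is what the UCT tree repairs: as the refutation of A is visited more often, A's sampled value converges toward its minimax value, but biased simulations get there far sooner.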
Re: [computer-go] New scalability study : show uncertainty ?
Filling such an eye does not require an extremely strong opponent; I am rated a mere 9 kyu AGA, and use this method often. My opponents also use it every chance they get. When teaching 20-kyu players, these known-dead shapes are right at the top of the list. However, several otherwise strong programs appear to have a blind spot regarding these situations.

- Original Message From: Magnus Persson [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Wednesday, January 23, 2008 4:03:26 AM Subject: Re: [computer-go] New scalability study : show uncertainty ?

> It might not even be exploitable against otherwise perfect play, because there may always be simpler ways to kill the group than making an eye and then filling it, or alternatively letting it live small and winning on points.
Re: [computer-go] New scalability study : show uncertainty ?
Hideki Kato wrote:

> > It's rather odd. I'm checking the log file and then I will check the source code to see if I have some artificial limits in there.
>
> Why odd? It all depends on the bias or policy of the simulations. If there is a flaw in the policy, the score will converge with some error, which will introduce some limit on scalability, won't it?

That is a very good point. Perhaps it is not the case with FatMan, but that may surely happen. In this study no program is playing with uniformly random playouts, and perhaps only uniformly random playouts will scale to perfection. Of course, I can imagine that reaching the strength of Mogo_13 with uniformly random playouts could require an infeasible number of simulations. So I don't have any idea how to improve the study, but this is a serious limitation that has to be considered: if you find some ceiling, the ceiling may be attributable to the playout policy, not to UCT.

Jacques.
Re: [computer-go] New scalability study : show uncertainty ?
Hi Jacques,

Jacques Basaldúa wrote:
> Hideki Kato wrote:
> > Why odd? It all depends on the bias or policy of the simulations. If there is a flaw in the policy, the score will converge with some error, which will introduce some limit on scalability, won't it?
>
> That is a very good point. Perhaps it is not the case with FatMan, but that may surely happen. In this study no program is playing with uniformly random playouts, and perhaps only uniformly random playouts will scale to perfection.

I don't think only uniformly random playouts will scale to perfection, because what we need from playouts is not just a simple average of final scores but a maximum (in the negamax sense) score. It should be the perfect evaluation function. In other words, since MC simulation is a way to get an average of a value, when applying it to optimization problems we need some way to focus the simulations on the _peak_ in the state space. This may be obvious when one considers life-and-death problems, where the best move, which leads to the maximum score (life), is the only one and all other moves are bad. At such positions it makes almost no sense to simulate all legal moves with the same probability. So, IMHO, biasing simulations is not just a speed-up technique but is essentially important.

> Of course, I can imagine that reaching the strength of Mogo_13 with uniformly random playouts can require a number of simulations that is not feasible.

I guess that should be done only by UCT, but guiding UCT to the best path requires good simulations. It may also be that uniformly random playouts never reach the strength of MoGo_13.

> So I don't have any idea how to improve the study, but this is a serious limitation that has to be considered: if you find some ceiling, the ceiling may be attributed to the playout policy, not to UCT.

Agree.

-Hideki

-- [EMAIL PROTECTED] (Kato)
Re: [computer-go] New scalability study : show uncertainty ?
Jacques Basaldúa wrote:

> In this study no program is playing with uniformly random playouts, and perhaps only uniformly random playouts will scale to perfection. [...] If you find some ceiling, the ceiling may be attributed to the playout policy, not to UCT.

I think there is a performance bug in FatMan causing the lack of scalability. FatMan should play perfectly given enough time, but it looks like it has stopped improving. For instance, one problem that would make it stop improving is an arbitrary limit on depth. I do have an arbitrary limit of 30 ply, but I don't think this is a problem at these time controls. In fact, I ran a version off-line where I instrumented this, and it did not exceed 25 ply in any line over a whole game. There are other things that would put a hard limit on how strong it could potentially play, but I haven't found the cause yet.

- Don
Re: [computer-go] New scalability study : show uncertainty ?
I might add that an unexpected benefit of running this study is that I'm now aware of a scalability issue in FatMan. I probably should have put Lazarus in the study instead - it's a good bit stronger, and now I would like to know if it has a similar problem!

- Don

Don Dailey wrote:
> I think there is a performance bug in FatMan causing the lack of scalability. FatMan should play perfectly given enough time, but it looks like it has stopped improving.
Re: [computer-go] New scalability study : show uncertainty ?
How hard would it be to add Lazarus at some point?

Terry McIntyre [EMAIL PROTECTED]

"Wherever is found what is called a paternal government, there is found state education. It has been discovered that the best way to insure implicit obedience is to commence tyranny in the nursery." Benjamin Disraeli, Speech in the House of Commons [June 15, 1874]

- Original Message From: Don Dailey [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Tuesday, January 22, 2008 7:59:26 AM Subject: Re: [computer-go] New scalability study : show uncertainty ?

> I might add that an unexpected benefit of running this study is that I'm now aware of a scalability issue in FatMan. I probably should have put Lazarus in the study instead - it's a good bit stronger, and now I would like to know if it has a similar problem!
Re: [computer-go] New scalability study : show uncertainty ?
> ... perhaps only uniformly random playouts will scale to perfection.

The reason that MC/UCT scales to perfection is because of the UCT part, not the MC (playout) part. People seem to forget this a lot.
Re: [computer-go] New scalability study : show uncertainty ?
On Tue, 22 Jan 2008, Don Dailey wrote:

> I probably should have put Lazarus in the study instead - it's a good bit stronger, and now I would like to know if it has a similar problem!

Why don't you just add Lazarus, and Mogo_14? It appears you've got enough CPU power...

Christoph
Re: [computer-go] New scalability study : show uncertainty ?
I encourage anybody with an extra computer running Linux, or with a duo or quad core, to consider running an instance of Don's study. I have a duo and a quad available, and each instance of the study maxes out one core, so I started four instances and still have two cores available, leaving me plenty of computing power. (I sure love Moore's Law!) It's all very low-maintenance: start the programs, send a signature to Don, and leave it alone.

Terry McIntyre [EMAIL PROTECTED]

- Original Message From: Don Dailey [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Tuesday, January 22, 2008 7:59:26 AM Subject: Re: [computer-go] New scalability study : show uncertainty ?

> I might add that an unexpected benefit of running this study is that I'm now aware of a scalability issue in FatMan. I probably should have put Lazarus in the study instead - it's a good bit stronger, and now I would like to know if it has a similar problem!
Re: [computer-go] New scalability study : show uncertainty ?
- Original Message

> > ... perhaps only uniformly random playouts will scale to perfection.
> The reason that MC/UCT scales to perfection is because of the UCT part, not the MC (playout) part. People seem to forget this a lot.

Playouts can limit scalability. I asked recently about Mogo not being able to detect that four liberties in a square do not two eyes make. The only explanation offered was that the Mogo playouts rejected the possibility of a move inside the eyespace; this skewed the UCT evaluation, leading Mogo to mistakenly believe it had a won game when it was actually doomed even with perfect play on its part.

Apparently, it is not easy to tune the playouts to obtain both speed and accuracy. I think some higher-level information will need to be propagated to the playouts. Current patterns can be brittle. A move which is urgent when an isolated group must make two eyes in a small space could be wasted effort if the group can reliably connect to another eye. UCT can help distinguish the two, if the playouts give the vital points and the connection moves high probabilities. At the same time, unlikely moves must not be totally rejected; they should be explored if other attempts fail.
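The policy Terry describes, where vital points and connections get high probability but nothing is rejected outright, is just weighted sampling over move candidates. A minimal sketch; the move names and weights are invented for illustration, not MoGo's actual pattern values:

```python
# Sketch of a non-uniform playout policy: moves carry weights (e.g. from
# patterns), urgent moves get large weights, but every move keeps a small
# nonzero weight so unlikely moves are never totally rejected.
import random

def pick_move(weighted_moves, rng):
    """Sample one move with probability proportional to its weight."""
    moves = list(weighted_moves)
    total = sum(w for _, w in moves)
    r = rng.uniform(0, total)
    for move, w in moves:
        r -= w
        if r <= 0:
            return move
    return moves[-1][0]          # guard against floating-point rounding

# Hypothetical candidates in a life-and-death race:
weights = {'vital_point': 10.0, 'connect': 5.0, 'random_edge': 0.1}
rng = random.Random(42)
picks = [pick_move(weights.items(), rng) for _ in range(1000)]
print(picks.count('vital_point') > picks.count('random_edge'))   # True
```

The key design point matches the paragraph above: `random_edge` is sampled rarely but its weight is not zero, so with enough simulations the search can still discover lines the patterns consider unlikely.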
Re: [computer-go] New scalability study : show uncertainty ?
On Jan 22, 2008 2:50 PM, Michael Williams [EMAIL PROTECTED] wrote:

> > Playouts can limit scalability.
> No, I don't think so.

Actually, the given example seems legit. I would rephrase it as "move pruning policies can limit scalability." Depending on the severity, the bot may be completely blind to some lines of play. On the other hand, with soft pruning (un-pruning?) methods, it may just take a really, really long time to come to a realization. Of course, that doesn't really mean that things don't scale; they just don't scale as quickly as some would like in certain situations. And that's a trade-off that all bots must make...
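The "soft pruning (un-pruning)" idea mentioned above is often realized as progressive widening: a node searches only its top-k prior-ranked children, and k grows slowly with the visit count, so no move is pruned forever. A minimal sketch under assumed names and an assumed growth schedule (real programs tune this):

```python
# Hypothetical sketch of progressive widening ("soft pruning"): a UCT
# node considers only a prefix of its prior-ordered moves, and the
# prefix length grows roughly logarithmically with the visit count.
import math

def widened_candidates(moves_by_prior, visits):
    """Return the prefix of prior-ordered moves the node may search now."""
    k = max(1, int(math.log(visits + 1, 2)))   # assumed schedule: k ~ log2
    return moves_by_prior[:k]

# Moves sorted by a (hypothetical) prior, best first:
moves = ['urgent', 'big', 'quiet', 'ugly', 'eye-fill']
print(widened_candidates(moves, 1))      # ['urgent']
print(widened_candidates(moves, 1000))   # all five moves are now allowed
```

This makes the trade-off in the paragraph concrete: `eye-fill` is invisible at low visit counts, so the bot is temporarily blind to lines through it, but given enough simulations it is eventually considered, which is why soft pruning slows scaling rather than capping it.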
Re: [computer-go] New scalability study : show uncertainty ?
On Tue, Jan 22, 2008 at 12:33:26PM -0500, Michael Williams wrote:

> The reason that MC/UCT scales to perfection is because of the UCT part, not the MC (playout) part. People seem to forget this a lot.

For some level of perfection, of course. Although UCT is a new search algorithm, it is just one example of best-first search. It is quite possible that some other form of search might perform equally well, or even better.

-H

(yes, I have some ideas. I will need to test them before I say much more here. Unfortunately I have a daytime job, and other hobbies, getting in the way of go programming. Progress is slow, but it does happen!)
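For reference, the best-first rule that UCT adds on top of plain Monte-Carlo evaluation is UCB1 child selection. This is a generic sketch of the standard formula, not any particular program's code; the exploration constant (1.4, near sqrt(2)) and the (wins, visits) statistics are illustrative assumptions:

```python
# Minimal sketch of UCB1 selection, the heart of UCT: pick the child
# maximizing mean value plus an exploration bonus that shrinks as the
# child accumulates visits.
import math

def ucb1_select(children, total_visits, c=1.4):
    """children: list of (wins, visits); returns the index to descend."""
    def score(child):
        wins, visits = child
        if visits == 0:
            return float('inf')    # always try unvisited moves first
        return wins / visits + c * math.sqrt(math.log(total_visits) / visits)
    return max(range(len(children)), key=lambda i: score(children[i]))

# (wins, visits) statistics for three candidate moves at one node:
stats = [(6, 10), (3, 4), (0, 0)]
print(ucb1_select(stats, sum(v for _, v in stats)))   # 2: unvisited first
```

The bonus term is what makes the search best-first without ever giving a move probability zero, which is the sense in which the scaling-to-perfection property belongs to this part rather than to the playouts.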
Re: [computer-go] New scalability study : show uncertainty ?
> > Playouts can limit scalability.
> No, I don't think so.
> > I asked recently about Mogo not being able to detect that four liberties in a square do not two eyes make. The only explanation offered was that the Mogo playouts rejected the possibility of a move inside the eyespace; this skewed the UCT evaluation, leading Mogo to mistakenly believe it had a won game when it was actually doomed even with perfect play on its part.

Try increasing the size of the tree and the number of playouts. And/or try the problem on a smaller board.
Re: [computer-go] New scalability study : show uncertainty ?
The only problem is that it IS more work for me. It doesn't quite run itself - however, we are getting better at it, and it's not too bad now that we have ironed out most of the problems. So if anyone else wants to help - let me know. Currently we have 17 computers helping out:

Games:   8  Dailey P4
Games:  23  Hideki site 5
Games:  29  MWilliams1
Games:  30  terry duo_1
Games:  30  terry duo_2
Games:  31  terry quad_1
Games:  32  MWilliams2
Games:  37  Heikki AMD64
Games:  40  Jeff sp
Games:  53  terry quad_2
Games:  84  Jeff cn
Games:  94  Hideki site 2
Games: 100  Hideki site 1
Games: 105  Frank site 2
Games: 124  Dailey site 1
Games: 132  Dailey site 2
Games: 173  Frank site 1

- Don

terry mcintyre wrote:
> I encourage anybody with an extra computer running Linux, or with a duo or quad core, to consider running an instance of Don's study. It's all very low-maintenance: start the programs, send a signature to Don, and leave it alone.
Re: [computer-go] New scalability study : show uncertainty ?
On Tuesday 22 January 2008, Michael Williams wrote:

> > ... perhaps only uniformly random playouts will scale to perfection.
> The reason that MC/UCT scales to perfection is because of the UCT part, not the MC (playout) part. People seem to forget this a lot.

I agree on this _only_ if the UCT checks all possible moves. If not, one can be limited by the quality of the playout.

Pure random MC playout has the great advantage of being totally neutral and unbiased. If you use gnugo for the playouts, since it has systematic errors (like wrong estimation of life and death for groups surrounded at some distance), your playout will be biased toward a wrong solution, far from perfect. I think Sluggo showed this very clearly.

Alain
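The "totally neutral" evaluator Alain describes is just a uniformly random playout averaged over many runs. Here is a self-contained sketch on a toy countdown game (the game, class, and method names are invented for illustration; a Go playout would plug in a real board in place of `CountdownGame`):

```python
# Sketch of uniformly random playouts as an unbiased Monte-Carlo
# evaluator.  The toy game: players alternately take 1 or 2 tokens;
# whoever takes the last token wins.
import random

class CountdownGame:
    def __init__(self, tokens=7):
        self.tokens = tokens
        self.to_move = 0        # player 0 moves first
        self.winner = None

    def over(self):
        return self.tokens == 0

    def legal_moves(self):
        return [n for n in (1, 2) if n <= self.tokens]

    def play(self, n):
        self.tokens -= n
        self.winner = self.to_move   # last mover so far
        self.to_move ^= 1

    def score(self):
        return 1.0 if self.winner == 0 else 0.0   # from player 0's view

def uniform_playout(game, rng):
    """Play uniformly random legal moves to the end; return the score."""
    while not game.over():
        game.play(rng.choice(game.legal_moves()))
    return game.score()

rng = random.Random(0)
estimate = sum(uniform_playout(CountdownGame(), rng) for _ in range(2000)) / 2000
print(round(estimate, 2))   # Monte-Carlo estimate of player 0's win rate
```

The estimator has no knowledge-induced bias of the kind Alain attributes to a gnugo-based playout: its only error is sampling noise, which shrinks with more playouts, plus the averaging-versus-minimax gap discussed elsewhere in this thread.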
Re: [computer-go] New scalability study : show uncertainty ?
It would be easy. I didn't put Lazarus in because it is less pure in some sense. FatMan is very straightforward and generic, and I thought it would be representative of basic UCT. Lazarus has all kinds of kludges and it's not clear to me whether it's as scalable. I probably don't really want to run Lazarus. - Don Terry McIntyre wrote: How hard would it be to add Lazarus at some point? Terry McIntyre [EMAIL PROTECTED] “Wherever is found what is called a paternal government, there is found state education. It has been discovered that the best way to insure implicit obedience is to commence tyranny in the nursery.” Benjamin Disraeli, Speech in the House of Commons [June 15, 1874] - Original Message From: Don Dailey [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Tuesday, January 22, 2008 7:59:26 AM Subject: Re: [computer-go] New scalability study : show uncertainty ? I might add that an unexpected benefit of running this study is that I'm now aware of a scalability issue in FatMan. I probably should have put Lazarus in the study instead - it's a good bit stronger and now I would like to know if it has a similar problem! - Don Don Dailey wrote: Jacques Basaldúa wrote: Hideki Kato wrote: It's rather odd. I'm checking the log file and then I will check the source code to see if I have some artificial limits in there. Why odd? It all depends on the bias or policy of the simulations. If there is a flaw in the policy, the score will converge with some error, which will introduce some limit on scalability, won't it? That is a very good point. Perhaps it is not the case with FatMan, but that may surely happen. In this study no program is playing with uniformly random playouts, and perhaps only uniformly random playouts will scale to perfection. Of course, I can imagine that reaching the strength of Mogo_13 with uniformly random playouts could require a number of simulations that is not feasible. 
So I don't have any idea about how to improve the study, but this is a serious limitation that has to be considered: if you find some ceiling, the ceiling may be attributable to the playout policy, not to UCT. I think there is a performance bug in FatMan causing the lack of scalability. FatMan should play perfectly given enough time, but it looks like it has stopped improving. For instance, one problem that would make it stop improving is an arbitrary limit on depth. I do have an arbitrary limit of 30 ply, but I don't think this is a problem at these time controls. In fact I ran a version off-line where I instrumented this, and it did not exceed 25 ply in any line over one whole game. There are other things that would put a hard limit on how strong it could potentially play, but I haven't found the problem yet. - Don Jacques. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
It was a 9x9 board; any smaller, and I'm not sure it would be meaningful. It had a generous allotment of time, and 4 threads. I'll try some scalability experiments to see whether it would ever discover the problem. What happened in the game was that, after I played a stone inside the square four, Mogo realized it was way behind, and resigned. Another poster reported similar experiences - after a redundant throw-in, Mogo became aware that the situation was unwinnable. Terry McIntyre [EMAIL PROTECTED] “Wherever is found what is called a paternal government, there is found state education. It has been discovered that the best way to insure implicit obedience is to commence tyranny in the nursery.” Benjamin Disraeli, Speech in the House of Commons [June 15, 1874] - Original Message From: Michael Williams [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Tuesday, January 22, 2008 11:50:46 AM Subject: Re: [computer-go] New scalability study : show uncertainty ? Playouts can limit scalability. No, I don't think so. I asked recently about Mogo not being able to detect that four liberties in a square do not two eyes make. The only explanation offered was that the Mogo playouts rejected the possibility of a move inside the eyespace; this skewed the UCT evaluation, leading Mogo to mistakenly believe it had a won game when it was actually doomed even with perfect play on its part. Try increasing the size of the tree and the number of playouts. And/or try the problem on a smaller board. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
Alain Baeckeroot wrote: On Tuesday 22 January 2008, Michael Williams wrote: ... perhaps only uniformly random playouts will scale to perfection. The reason that MC/UCT scales to perfection is the UCT part, not the MC (playout) part. People seem to forget this a lot. I agree with this _only_ if the UCT checks all possible moves. If not, one can be limited by the quality of the playout. Well, yeah. If the UCT part is not built in a scalable way, then your program will not be scalable. This still has nothing to do with the playouts. If your tree expansion policy is broken, then it does not matter how great your playouts are. Eventually you will not be able to scale any more. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
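[Editor's note] For readers unfamiliar with what "the UCT part" does: plain UCT selects moves inside the tree with the UCB1 rule, and it is this rule that guarantees every legal move at a node is eventually examined. A minimal sketch of generic UCB1 selection (not Valkyria's or Mogo's actual code):

```python
import math

def ucb1_select(children, exploration=1.4):
    """Pick the index of the child maximising the UCB1 score used by
    plain UCT: win_rate + C * sqrt(ln(parent_visits) / child_visits).
    `children` is a list of (wins, visits) pairs; an unvisited child is
    always tried first, which is what lets UCT eventually explore every
    legal move at a node -- the scalability property discussed above."""
    parent_visits = sum(v for _, v in children)
    best, best_score = None, float('-inf')
    for i, (wins, visits) in enumerate(children):
        if visits == 0:
            return i  # unvisited moves get infinite priority
        score = wins / visits + exploration * math.sqrt(
            math.log(parent_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best
```

The exploration term shrinks as a child is visited more, so even moves pruned as "bad" by their early results keep getting occasional attention rather than being excluded forever.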
Re: [computer-go] New scalability study : show uncertainty ?
I wonder if FatMan's trend near level 13 is an artefact or if the curve shows it has reached some limit in scalability. It's rather odd. I'm checking the log file and then I will check the source code to see if I have some artificial limits in there. - Don Alain. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
I checked the log files and see nothing bizarre. I looked for sudden score changes or cases where the program is winning and suddenly loses. I also looked for early search aborting and saw none of that. FatMan will abort the search if the memory pool runs low. There is a very good chance this is merely due to low sample size. Each player has played fewer than 50 games, and a couple of freak losses in a small sample can make a large difference in your rating. You probably noticed that the graph for Mogo was wildly up and down just a few games ago. However, I'm still going to look at the source code - I haven't studied FatMan in many months. FatMan has early exit, but even that is implemented in a scalable way. If the score is extremely negative or extremely positive it will stop searching and play a move, but only when it has searched at least 1/10 of the nodes (or time) it had intended to search. In the log files of this study that I have access to, this has not yet changed the expected result of a game. But it's not a feature that should affect its scalability. - Don Don Dailey wrote: I wonder if FatMan's trend near level 13 is an artefact or if the curve shows it has reached some limit in scalability. It's rather odd. I'm checking the log file and then I will check the source code to see if I have some artificial limits in there. - Don Alain. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
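[Editor's note] The early-exit rule Don describes can be sketched as follows (illustrative names and thresholds, not FatMan's actual source). Because the trigger is a fraction of the intended budget rather than a fixed node count, it does not cap search effort as the budget grows, which is why it should not hurt scalability:

```python
def should_abort(score, nodes_searched, node_budget,
                 threshold=0.95, min_fraction=0.1):
    """Scalable early exit: stop the search and play a move only when
    the root score (here a win probability in [0, 1]) is extreme AND at
    least `min_fraction` (1/10) of the intended search effort has been
    spent.  All numbers are assumptions for illustration."""
    if nodes_searched < min_fraction * node_budget:
        return False  # never abort before 1/10 of the budget is used
    return score >= threshold or score <= 1.0 - threshold
```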
Re: [computer-go] New scalability study : show uncertainty ?
It seems like our CPU cycles would be better used testing even higher levels of MoGo than the current levels of FatMan. MoGo appears to be at least an order of magnitude more efficient. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
Michael Williams wrote: It seems like our CPU cycles would be better used testing even higher levels of MoGo than the current levels of FatMan. MoGo appears to be at least an order of magnitude more efficient. The same thought occurred to me too. I want to see if Mogo keeps going. But this experiment is going very well with all the help we are getting - I'm not ready to stop yet. Let's give it a few more days and see how long the programs keep climbing. 50 games is not a satisfying statistical sample. When this test is completed, let's use the infrastructure I created here to run a self-play Mogo test by pushing Mogo to higher levels. I can keep the Mogo vs Mogo data intact from this study and transfer it to the next (it's in a database, so this is easy.) - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
On Jan 21, 2008, at 5:50 PM, Don Dailey wrote: When this test is completed, let's use the infrastructure I created here to run a self-play mogo test by pushing mogo to higher levels. I can keep the mogo vs mogo data intact from this study and transfer it to the next (it's in a database and so this is easy.) Why stop the test? Just add Mogo_14 ... Christoph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
I agree. I see no reason to discard the mogo vs fatman games when mogo moves to higher levels. Just add to the current database, as is. Christoph Birk wrote: On Jan 21, 2008, at 5:50 PM, Don Dailey wrote: When this test is completed, let's use the infrastructure I created here to run a self-play mogo test by pushing mogo to higher levels. I can keep the mogo vs mogo data intact from this study and transfer it to the next (it's in a database and so this is easy.) Why stop the test? Just add Mogo_14 ... Christoph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
Don Dailey: [EMAIL PROTECTED]: I wonder if FatMan's trend near level 13 is an artefact or if the curve shows it has reached some limit in scalability. It's rather odd. I'm checking the log file and then I will check the source code to see if I have some artificial limits in there. Why odd? It all depends on the bias or policy of the simulations. If there is a flaw in the policy, the score will converge with some error, which will introduce some limit on scalability, won't it? -Hideki - Don Alain. -- [EMAIL PROTECTED] (Kato) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] New scalability study : show uncertainty ?
On Saturday 19 January 2008, Don Dailey wrote: The new scalability study is in progress. It will be very slow going - only a few games a day can be played - but we are trying to get more computers utilized. I will update the data a few times a day for all to see. This includes a crosstable and rating graphs. The games will be made available for anyone who wants them. Although it's not on the graph itself, Gnugo-3.7.11 level 10 is set to 1800.0 ELO. The bayeselo program is used to calculate ratings. Results can be found here: http://cgos.boardspace.net/study/index.html It would be nice to also plot uncertainty bars around the estimated ratings. I wonder if FatMan's trend near level 13 is an artefact or if the curve shows it has reached some limit in scalability. Alain. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
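[Editor's note] The uncertainty bars Alain asks for can be roughly approximated from each pairing's win rate alone. A sketch (a normal approximation on the win rate, mapped through the logistic Elo curve D = -400*log10(1/p - 1); bayeselo computes its intervals differently and more carefully):

```python
import math

def elo_interval(wins, games, z=1.96):
    """Approximate 95% error bar on an Elo difference estimated from a
    sample of games.  Returns (low, point_estimate, high) in Elo.  A
    rough illustration only: it uses a normal approximation on the win
    rate p, then maps the interval endpoints through the logistic Elo
    curve D = -400 * log10(1/p - 1)."""
    p = wins / games
    half = z * math.sqrt(p * (1 - p) / games)  # normal-approx. half-width
    lo, hi = max(p - half, 1e-6), min(p + half, 1 - 1e-6)  # clamp to (0, 1)
    to_elo = lambda q: -400.0 * math.log10(1.0 / q - 1.0)
    return to_elo(lo), to_elo(p), to_elo(hi)
```

With only about 50 games per player, as in the study, the resulting interval spans roughly 200 Elo, which supports the point that freak losses can move a rating a long way.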