Re: [computer-go] The physics of Go playing strength.
On Mon, Apr 16, 2007 at 09:40:46AM +0900, Darren Cook wrote: I have one interesting test that I do, which I take with a grain of salt, but I use as a first guess estimate. I search from the opening position a few hundred times and average the time required to find the move e5. ... Yes, I noticed libego usually switches to e5 at some point between 100K and 200K playouts. Like Don, though, I'd take that with a grain of salt; it is quite possible that there is no single best first move (e.g. 5-5, 4-4 and 3-4 might all be joint best). That's true - but all the 3-4 points should get about equal scores on an empty board, as should all 4-4 points, etc. Would that not be a good indication that the program 'understands' the situation, instead of just randomly hitting on a good move? - Heikki -- Heikki Levanto In Murphy We Turst heikki (at) lsd (dot) dk ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Now I don't feel so bad -- my UCT prog also sucks ass, only slower. On 4/13/07, Darren Cook [EMAIL PROTECTED] wrote: I've been trying the libego program out of the box, and am up to 200,000 UCT playouts, but still gnugo 3.6 on level 6 is winning 10 out of 10. ... If 200,000 play-outs is being beat, something is broken. Even a bad implementation should do better than that. Does libego provide a full UCT implementation out of the box? Thanks for the reply Don. I've just reviewed the code again and it seems to be the same algorithm as described at http://senseis.xmp.net/?UCT The way it is implemented differs from the pseudo-code in a few minor ways, but notably: it (libego) maintains a float win-rate for the node, which is updated, rather than storing wins and recalculating win-rate each time. I suppose there could be a rounding error creeping in there. I've tried changing it to use two ints, for total visits and black wins, and then calculate the winning rate on the fly. But it still cannot win a game against gnugo on level 6: it lost 9 out of 9 at 100K playouts, and has lost 4 out of 4 at 500K playouts. Yet there is no major bug: when I played it a game at each of 100K and 500K it played reasonable go, and I could notice that 500K was stronger. Has anyone else tried studying or using libego's UCT implementation? Is there another open source implementation of UCT I can compare against? One thing that I think most people are doing is related to how you select a move once you have finished the search. Do you pick the highest scoring move? ... No, currently it chooses the most visited node, ignoring score. Darren ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Libego played at old CGOS with name sth like UCT-107-???k 100k was about 1550k 200k about 1650k I don't remember and I can't find the rating list anymore. Łukasz On 4/14/07, Darren Cook [EMAIL PROTECTED] wrote: I've been trying the libego program out of the box, and am up to 200,000 UCT playouts, but still gnugo 3.6 on level 6 is winning 10 out of 10. ... If 200,000 play-outs is being beat, something is broken. Even a bad implementation should do better than that. Does libego provide a full UCT implementation out of the box? Thanks for the reply Don. I've just reviewed the code again and it seems to be the same algorithm as described at http://senseis.xmp.net/?UCT The way it is implemented differs from the pseudo-code in a few minor ways, but notably: it (libego) maintains a float win-rate for the node, which is updated, rather than storing wins and recalculating win-rate each time. I suppose there could be a rounding error creeping in there. I've tried changing it to use two ints, for total visits and black wins, and then calculate the winning rate on the fly. But it still cannot win a game against gnugo on level 6: it lost 9 out of 9 at 100K playouts, and has lost 4 out of 4 at 500K playouts. Yet there is no major bug: when I played it a game at each of 100K and 500K it played reasonable go, and I could notice that 500K was stronger. Has anyone else tried studying or using libego's UCT implementation? Is there another open source implementation of UCT I can compare against? One thing that I think most people are doing is related to how you select a move once you have finished the search. Do you pick the highest scoring move? ... No, currently it chooses the most visited node, ignoring score. Darren ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Sat, 2007-04-14 at 20:13 +0200, Łukasz Lew wrote: Libego played at old CGOS with name sth like UCT-107-???k 100k was about 1550k 200k about 1650k I don't remember and I can't find the rating list anymore. Łukasz I have posted the cross-tables, so you should be able to figure out what you need from this: http://cgos.boardspace.net/public/cross-AbyNormal-0.02.html http://cgos.boardspace.net/public/cross-AmiGoGtp.html http://cgos.boardspace.net/public/cross-AnchorMan.html http://cgos.boardspace.net/public/cross-Another10K.html http://cgos.boardspace.net/public/cross-AntIgo-2.html http://cgos.boardspace.net/public/cross-AntIgo-3.html http://cgos.boardspace.net/public/cross-AntIgo-4.html http://cgos.boardspace.net/public/cross-AntIgo-5.html http://cgos.boardspace.net/public/cross-AntIgo-6.html http://cgos.boardspace.net/public/cross-AntIgo-7.html http://cgos.boardspace.net/public/cross-AntIgo-8.html http://cgos.boardspace.net/public/cross-AverageLib.html http://cgos.boardspace.net/public/cross-Aya-553.html http://cgos.boardspace.net/public/cross-Aya1second.html http://cgos.boardspace.net/public/cross-Aya582WP.html http://cgos.boardspace.net/public/cross-Aya585cWP.html http://cgos.boardspace.net/public/cross-AyaBot.html http://cgos.boardspace.net/public/cross-AyaMC1.html http://cgos.boardspace.net/public/cross-Blitzkrieg.html http://cgos.boardspace.net/public/cross-Brown.html http://cgos.boardspace.net/public/cross-Brown_II.html http://cgos.boardspace.net/public/cross-Cataract.html http://cgos.boardspace.net/public/cross-Chuk.html http://cgos.boardspace.net/public/cross-ControlBoy.html http://cgos.boardspace.net/public/cross-Corsair.html http://cgos.boardspace.net/public/cross-CrazyRandom.html http://cgos.boardspace.net/public/cross-CrazyStone.html http://cgos.boardspace.net/public/cross-CrazyStoneQuad.html http://cgos.boardspace.net/public/cross-Curry.html http://cgos.boardspace.net/public/cross-DingBat-2.0.html http://cgos.boardspace.net/public/cross-DingBat-3.0.html http://cgos.boardspace.net/public/cross-DingBat-3.1.html http://cgos.boardspace.net/public/cross-DingBat-3.2.html http://cgos.boardspace.net/public/cross-DingBat-3.3.html http://cgos.boardspace.net/public/cross-DingBat-3.5.html http://cgos.boardspace.net/public/cross-DingBat.html http://cgos.boardspace.net/public/cross-Dingus-1.0.html http://cgos.boardspace.net/public/cross-Dingus-1.1.html http://cgos.boardspace.net/public/cross-Dingus-1.2.html http://cgos.boardspace.net/public/cross-Dingus-1.3.html http://cgos.boardspace.net/public/cross-Dingus-1.31.html http://cgos.boardspace.net/public/cross-Dino.html http://cgos.boardspace.net/public/cross-DumbIdea1.html http://cgos.boardspace.net/public/cross-DumbIdea2.html http://cgos.boardspace.net/public/cross-DumbIdea3.html http://cgos.boardspace.net/public/cross-DumbIdea4.html http://cgos.boardspace.net/public/cross-DumbTactic.html http://cgos.boardspace.net/public/cross-Dynamite-1000.html http://cgos.boardspace.net/public/cross-Explorer.html http://cgos.boardspace.net/public/cross-Fat-20.html http://cgos.boardspace.net/public/cross-Fat-25.html http://cgos.boardspace.net/public/cross-Foo-0.1.html http://cgos.boardspace.net/public/cross-Frostburg-0.1.html http://cgos.boardspace.net/public/cross-Frostburg-0.3.html http://cgos.boardspace.net/public/cross-Frostburg-0.4.html http://cgos.boardspace.net/public/cross-Frostburg-0.5.html http://cgos.boardspace.net/public/cross-Frostburg.html http://cgos.boardspace.net/public/cross-GL7test.html http://cgos.boardspace.net/public/cross-GNUGo-1.2.html http://cgos.boardspace.net/public/cross-GNUGo-2.0.html http://cgos.boardspace.net/public/cross-GenericMC_1.html http://cgos.boardspace.net/public/cross-GenericMC_100K.html http://cgos.boardspace.net/public/cross-GenericMC_200K.html http://cgos.boardspace.net/public/cross-GenericMC_VAR.html http://cgos.boardspace.net/public/cross-GoGalaxy.html http://cgos.boardspace.net/public/cross-GoGalaxy0.1.html http://cgos.boardspace.net/public/cross-GoJin-0.1.html http://cgos.boardspace.net/public/cross-GoJin-0.2.html http://cgos.boardspace.net/public/cross-GoJin-0.3.html http://cgos.boardspace.net/public/cross-GoJin-0.4.html http://cgos.boardspace.net/public/cross-GoJin-0.5.html http://cgos.boardspace.net/public/cross-GoJin-0.6.html http://cgos.boardspace.net/public/cross-GoJin-0.8.html http://cgos.boardspace.net/public/cross-GoJin-1.0.html http://cgos.boardspace.net/public/cross-GoJin-1.01.html http://cgos.boardspace.net/public/cross-GoJin-1.02.html http://cgos.boardspace.net/public/cross-GoJin-1.05.html http://cgos.boardspace.net/public/cross-GoJin-1.06.html http://cgos.boardspace.net/public/cross-GoJin-1.07.html http://cgos.boardspace.net/public/cross-GoJin-1.08.html http://cgos.boardspace.net/public/cross-GoJin-1.09.html http://cgos.boardspace.net/public/cross-GoJin-1.11.html http://cgos.boardspace.net/public/cross-GoJin-1.12.html http://cgos.boardspace.net/public/cross-GoJin-1.13.html
Re: [computer-go] The physics of Go playing strength.
Maybe try this test with libego? Don Dailey: I have one interesting test that I do, which I take with a grain of salt, but I use as a first guess estimate. I search from the opening position a few hundred times and average the time required to find the move e5.My assumption is that e5 is the best move (the program always settles on it eventually if I let it think long enough.) http://computer-go.org/pipermail/computer-go/2006-October/006830.html On 4/13/07, Darren Cook [EMAIL PROTECTED] wrote: I've been trying the libego program out of the box, and am up to 200,000 UCT playouts, but still gnugo 3.6 on level 6 is winning 10 out of 10. ... If 200,000 play-outs is being beat, something is broken. Even a bad implementation should do better than that. Does libego provide a full UCT implementation out of the box? Thanks for the reply Don. I've just reviewed the code again and it seems to be the same algorithm as described at http://senseis.xmp.net/?UCT The way it is implemented differs from the pseudo-code in a few minor ways, but notably: it (libego) maintains a float win-rate for the node, which is updated, rather than storing wins and recalculating win-rate each time. I suppose there could be a rounding error creeping in there. I've tried changing it to use two ints, for total visits and black wins, and then calculate the winning rate on the fly. But it still cannot win a game against gnugo on level 6: it lost 9 out of 9 at 100K playouts, and has lost 4 out of 4 at 500K playouts. Yet there is no major bug: when I played it a game at each of 100K and 500K it played reasonable go, and I could notice that 500K was stronger. Has anyone else tried studying or using libego's UCT implementation? Is there another open source implementation of UCT I can compare against? One thing that I think most people are doing is related to how you select a move once you have finished the search. Do you pick the highest scoring move? ... No, currently it chooses the most visited node, ignoring score. Darren ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
http://greencheeks.homelinux.org:8015/~drd/public/study.jpg ... I'm actually testing 2 programs - both of them UCT style go programs, but one of those programs does uniformly random play-outs and the other much stronger one is similar to Mogo, as documented in one of their papers. Hi Don, Can you describe the implementation of heavy and lite in more detail; especially lite? I've been trying the libego program out of the box, and am up to 200,000 UCT playouts, but still gnugo 3.6 on level 6 is winning 10 out of 10. According to your chart, lite with 128K playouts should be about 50 ELO points higher than gnugo 3.7 (on level 10 I assume?). (256K playouts should be 130 points higher.) I'm wondering if there is a standard set of UCT enhancements that everyone is doing? TIA, Darren ___ computer-go mailing list [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
According to the report on MoGo (RR-6062), its playout part seems pruning not interesting moves using patterns. -gg Darren Cook: [EMAIL PROTECTED]: With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranteed for heavy playout. As Don pointed out before, the reason it converges to perfect play is because of the UCT part, not because of the playout part. If the playout part prunes some moves, nothing is guaranteed. I believe the point is that UCT never prunes moves. The playouts performed at UCT leaf nodes are just to give an estimate to help UCT decide which part of the tree to explore next. I.e. heavy vs. light playouts are like intelligent vs. random move ordering in alpha-beta. Darren -- [EMAIL PROTECTED] (Kato) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Tue, 10 Apr 2007, Hideki Kato wrote: According to the report on MoGo (RR-6062), its playout part seems pruning not interesting moves using patterns. Yes, but the UCT part will (sooner or later) explore EVERY path. Christoph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Tue, 10 Apr 2007, Christoph Birk wrote: On Tue, 10 Apr 2007, Hideki Kato wrote: According to the report on MoGo (RR-6062), its playout part seems pruning not interesting moves using patterns. Yes, but the UCT part will (sooner or later) explore EVERY path. But then again, if you had the resources to explore every path MC/UCT would be very inefficient compared to alpha/beta. Christoph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
-Original Message- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Tue, 10 Apr 2007 2:43 PM Subject: Re: [computer-go] The physics of Go playing strength. On Tue, 10 Apr 2007, Christoph Birk wrote: On Tue, 10 Apr 2007, Hideki Kato wrote: According to the report on MoGo (RR-6062), its playout part seems pruning not interesting moves using patterns. Yes, but the UCT part will (sooner or later) explore EVERY path. Well, really it could be either way. In the simplest case, UCT builds a tree without skipping any legal nodes. In that case, the playout type doesn't matter and the engine's convergence in the limit is assured. (Of course we'll all be dead by then...) But some implementations do decline to explore some legal nodes in the tree, particularly for larger board sizes. For instance, some versions of Mogo constrained some internal nodes in the tree to those with moves within a local neighborhood of the previous move. IIRC their RR-6062 report touched on this. In the current version of AntIgo, I use patterns learned from the (heavy) playouts to bias the selection of moves in lightly visited internal nodes. And there are a lot of other little tricks that can potentially alter the asymptotic properties of the algorithm. I doubt it matters, because any such trick I can think of, could be massaged into a form where the engine would converge anyway. It all comes down to the terminology we're using being not so precise or universally accepted. - Dave Hillis Check Out the new free AIM(R) Mail -- 2 GB of storage and industry-leading spam and email virus protection. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
I doubt it matters, because any such trick I can think of, could be massaged into a form where the engine would converge anyway. It all comes down to the terminology we're using being not so precise or universally accepted. And we can be sure that as the hardware improves, engine writers will do whatever they can to make it perform best on current hardware while still remaining very scalable (even if not provably infinitely scalable). ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Christoph Birk: [EMAIL PROTECTED]: On Tue, 10 Apr 2007, Hideki Kato wrote: According to the report on MoGo (RR-6062), its playout part seems pruning not interesting moves using patterns. Yes, but the UCT part will (sooner or later) explore EVERY path. Yes, but the estimated score could be wrong. Then, where to converge? -gg Christoph ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- [EMAIL PROTECTED] (Kato) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote: These programs, in theory, will play perfect GO given enough time. ... and space. I doubt that your current programs would be capable of storing a large enough game tree to actually converge to the alpha-beta value. So in practice, it really would level out somewhere below optimal play, unless you also increase the memory usage, right? I think that it is still valid to present your chart as representing the lower portions of curves that do not do this, because you would presumably scale up the amount of memory used as well as the time, if you could. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Mon, 2007-04-09 at 05:30 -0400, Weston Markham wrote: On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote: These programs, in theory, will play perfect GO given enough time. ... and space. I doubt that your current programs would be capable of storing a large enough game tree to actually converge to the alpha-beta value. So in practice, it really would level out somewhere below optimal play, unless you also increase the memory usage, right? I think that it is still valid to present your chart as representing the lower portions of curves that do not do this, because you would presumably scale up the amount of memory used as well as the time, if you could. Weston Yes, you would need more time and memory. But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. But if it's true that heavy is increasing faster, that will proabably stop being the case MUCH sooner. They will both likely approach perfect play on a very similar slope (like a jet coming in for a landing) which means the distance beetween the two plot lines must start getting closer much sooner. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Don, here is a question. Your curves plotted the playing level vs the computer speed. By computer speed you mean the number of MC simulations per node with all other factors fixed. Is this correct? If it is, it's legitimate for people to speculate that the curve could level off beyond some number of simulations per node. Actually this is an important question, because it relates to the nature of the MC simulation. Daniel Liu -Original Message- From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: computer-go@computer-go.org Sent: Mon, 9 Apr 2007 7:06 AM Subject: Re: [computer-go] The physics of Go playing strength. On Mon, 2007-04-09 at 05:30 -0400, Weston Markham wrote: On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote: These programs, in theory, will play perfect GO given enough time. ... and space. I doubt that your current programs would be capable of storing a large enough game tree to actually converge to the alpha-beta value. So in practice, it really would level out somewhere below optimal play, unless you also increase the memory usage, right? I think that it is still valid to present your chart as representing the lower portions of curves that do not do this, because you would presumably scale up the amount of memory used as well as the time, if you could. Weston Yes, you would need more time and memory. But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. But if it's true that heavy is increasing faster, that will proabably stop being the case MUCH sooner. They will both likely approach perfect play on a very similar slope (like a jet coming in for a landing) which means the distance beetween the two plot lines must start getting closer much sooner. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ AOL now offers free email to everyone. Find out more about what's free from AOL at AOL.com. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Mon, 2007-04-09 at 14:46 +0100, Tom Cooper wrote: Perhaps it would be possible to infer how the lines would look as perfect play was approached from what the curves looked like for a smaller board size. I thought that too, but the studies on 5x5 and 7x7 break down very quickly. The problem is that the programs already play close to perfect on these board sizes. I can say that because on 7x7 using a fractional komi, either white wins most of the games or black, depending on how you set komi. - Don At 13:06 09/04/2007, Don wrote: On Mon, 2007-04-09 at 05:30 -0400, Weston Markham wrote: On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote: These programs, in theory, will play perfect GO given enough time. ... and space. I doubt that your current programs would be capable of storing a large enough game tree to actually converge to the alpha-beta value. So in practice, it really would level out somewhere below optimal play, unless you also increase the memory usage, right? I think that it is still valid to present your chart as representing the lower portions of curves that do not do this, because you would presumably scale up the amount of memory used as well as the time, if you could. Weston Yes, you would need more time and memory. But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. But if it's true that heavy is increasing faster, that will proabably stop being the case MUCH sooner. They will both likely approach perfect play on a very similar slope (like a jet coming in for a landing) which means the distance beetween the two plot lines must start getting closer much sooner. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- This email has been verified as Virus free Virus Protection and more available at http://www.plus.net ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Le lundi 9 avril 2007 14:06, Don Dailey a écrit : But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. Is there any proof that heavy player converge toward the same solution as the pure random playout ? With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranted for heavy playout. Alain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Tue, 2007-04-10 at 00:06 +0200, alain Baeckeroot wrote: Le lundi 9 avril 2007 14:06, Don Dailey a écrit : But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. Is there any proof that heavy player converge toward the same solution as the pure random playout ? I don't really know, but I suspect it converges as long as you score correctly at the end of the game. If you don't score correctly, then I suppose anything can happen. - Don With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranted for heavy playout. Alain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On 4/10/07, alain Baeckeroot [EMAIL PROTECTED] wrote: Le lundi 9 avril 2007 14:06, Don Dailey a écrit: But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. Is there any proof that heavy player converge toward the same solution as the pure random playout ? With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranted for heavy playout. With infinite resources the MC part won't have to make any move (heavy or light), so it does not matter. OC, this is all just theoretical throughout most of the game for any board of reasonable size. BTW pure random would fill it's own eyes... E. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranteed for heavy playout. As Don pointed out before, the reason it converges to perfect play is because of the UCT part, not because of the playout part. If the playout part prunes some moves, nothing is guaranteed. I believe the point is that UCT never prunes moves. The playouts performed at UCT leaf nodes are just to give an estimate to help UCT decide which part of the tree to explore next. I.e. heavy vs. light playouts are like intelligent vs. random move ordering in alpha-beta. Darren -- Darren Cook http://dcook.org/mlsn/ (English-Japanese-German-Chinese free dictionary) http://dcook.org/work/ (About me and my work) http://dcook.org/work/charts/ (My flash charting demos) ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
With a badly designed play-out algorithm you may have a horribly inefficent search - but it would eventually still find the best move in principle. - Don On Tue, 2007-04-10 at 09:16 +0900, Darren Cook wrote: With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranteed for heavy playout. As Don pointed out before, the reason it converges to perfect play is because of the UCT part, not because of the playout part. If the playout part prunes some moves, nothing is guaranteed. I believe the point is that UCT never prunes moves. The playouts performed at UCT leaf nodes are just to give an estimate to help UCT decide which part of the tree to explore next. I.e. heavy vs. light playouts are like intelligent vs. random move ordering in alpha-beta. Darren ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Erik van der Werf wrote: On 4/10/07, alain Baeckeroot [EMAIL PROTECTED] wrote: Le lundi 9 avril 2007 14:06, Don Dailey a écrit: But the point is that as long as you can provide time and memory you will get improvement until perfect play is reached. Is there any proof that heavy player converge toward the same solution as the pure random playout ? With infinite resource, i agree that random playout will find the best move. But it seems that nothing is guaranted for heavy playout. With infinite resources the MC part won't have to make any move (heavy or light), so it does not matter. OC, this is all just theoretical throughout most of the game for any board of reasonable size. BTW pure random would fill it's own eyes... The benefit that MC/UCT has is that it can be stopped any time and get a reasonable current selected move. The probability that it will find better moves tends to increase with more resources/playouts. But wouldn't a simple brute force search of the game tree be more efficient than MC/UCT at finding the certain best move, given the hypothetical hyper super computer we are talking about? So if ever this hyper computer exists with near infinite speed and resources we can just run a methodical brute force search and be done with it. Why would one bother running a googolplex random simulations to approach perfect play? So to be practical we have to find ways to improve the search beyond pure scalability like the MoGo developers are doing. Does this make sense? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Le dimanche 8 avril 2007 03:05, Don Dailey a écrit : A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg Thanks for this interesting study. [snip] Feel free to interpret the data any way you please, but here are my own observations: 1. Scalability is almost linear with each doubling. 2. But there appears to be a very gradual fall-off with time - which is what one would expect (ELO improvements cannot be infinite so they must be approaching some limit.) Could'nt the inflexion of heavy curve also mean that the advantage of heavy play-out disappears when the number of simulation is very high ? With huge number of simulation the heavy player could become weaker than the light player, due to the wrong knowledge introduced in the play-out. Sadly it seems hard to test this (12-13-14) without huge computing power, a distributed [EMAIL PROTECTED], or a big amount of patience :-) 3. The heavy-playout version scales at least as well, if not better, than the light play-out version. (You can see the rating gap between them gradually increase with the number of play-outs.) between 10 and 11 the trend changes. Regards. Alain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
Thanks dons for producing these fascinating results. I hope that when you have finished the study, you will show us not just this graph, but also the game results (number of wins) that it is derived from. At 02:05 08/04/2007, you wrote: A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
According these results the slope is considerable greater than in chess. In the classical experiment of Ken Thompons searching 1 ply deeper is worth about 200 Elo. 1 ply corresponds to 5-6 times longer/faster. In 9x9 already a factor of 2 gives the same improvement. This is really remarkable. Another explanation would be, that 100 Elo have in Go a different meaning than in chess. It is often argued that the distance between week and stronger player is much greater in Go than in Chess. In chess the distance between an average club player and top humans is about 1000 Elo. Maybe in Go its 2000 Elo?? In chess the green level-11 version would have world-champion level. Is it just enough to make a 2 million playouts version to beat the top-Dans in 9x9? Is it that easy? Just build a special purpose chip like ChipTest aka Deep Blue. Or implement it on a cluster. Or just wait a few years on do it on the PC. Or a playstation. Chrilly Is there any notion of the Elo rating of a professional Go player. In chess terms the - Original Message - From: Don Dailey [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Sunday, April 08, 2007 3:05 AM Subject: [computer-go] The physics of Go playing strength. A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg Although I am still collecting data, I feel that I have enough samples to report some results - although I will continue to collect samples for a while. This study is designed to measure the improvement in strength that can be expected with each doubling of computer resources. I'm actually testing 2 programs - both of them UCT style go programs, but one of those programs does uniformly random play-outs and the other much stronger one is similar to Mogo, as documented in one of their papers. Dave Hillis coined the terminolgoy I will be using, light play-outs vs heavy play-outs. For the study I'm using 12 versions of each program. The weakest version starts with 1024 play-outs in order to produce a move. The next version doubles this to 2048 play-outs, and so on until the 12th version which does 2 million (2,097,152) playouts. This is a substantial study which has taken weeks so far to get to this point. Many of the faster programs have played close to 250 games, but the highest levels have only played about 80 games so far. The scheduling algorithm is very similar to the one used by CGOS. An attempt is made not to waste a lot of time playing seriously mis-matched opponents. The games were rated and the results graphed. You can see the result of the graph here (which I also included near the top of this message): http://greencheeks.homelinux.org:8015/~drd/public/study.jpg The x-axis is the number of doublings starting with 1024 play-outs and the y-axis is the ELO rating. The public domain program GnuGo version 3.7.9 was assigned the rating 2000 as a reference point. On CGOS, this program has acheived 1801, so in CGOS terms all the ratings are about 200 points optimistic. Feel free to interpret the data any way you please, but here are my own observations: 1. Scalability is almost linear with each doubling. 2. But there appears to be a very gradual fall-off with time - which is what one would expect (ELO improvements cannot be infinite so they must be approaching some limit.) 3. The heavy-playout version scales at least as well, if not better, than the light play-out version. (You can see the rating gap between them gradually increase with the number of play-outs.) 4. The curve is still steep at 2 million play-outs, this is convincing empirical evidence that there are a few hundred ELO points worth of improvement possible beyond this. 5. GnuGo 3.7.9 is not competive with the higher levels of Lazarus. However, what the study doesn't show is that Lazarus needs 2X more thinking time to play equal to GnuGo 3.7.9. This graph explains why I feel that absolute playing strength is a poor conceptual model of how humans or computers play go. If Lazarus was running on the old Z-80 processors of a few decades ago, it would be veiewed as an incredibly weak program, but running on a supercomputer it's a very strong program. But in either case it's the SAME program. The difference is NOT the amount of work each system is capable of, it's just that one takes longer to accomplish a given amount of work. It's much like the relationships between power, work, force, time etc. in physics. Based on this type of analysis and the physics analogy, GnuGo 3.7.9 is a stronger program than Lazarus (even at 9x9 go). Lazarus requires about 2X more time to equalize. So Lazarus plays with less force (if you use the physics analogy) and needs more TIME to get the same amount of work done. ELO
Re: [computer-go] The physics of Go playing strength.
The discussion here http://senseis.xmp.net/?EloRating suggests that the difference between beginners and top players in go is about 3000 ELO on a 19x19 board. This difference is very dependent on the board size. I can think of a naive argument that this difference should scale linearly with the (linear) size of the board, that is as the square-root of the area of the board. At 08:56 08/04/2007, you wrote: According these results the slope is considerable greater than in chess. In the classical experiment of Ken Thompons searching 1 ply deeper is worth about 200 Elo. 1 ply corresponds to 5-6 times longer/faster. In 9x9 already a factor of 2 gives the same improvement. This is really remarkable. Another explanation would be, that 100 Elo have in Go a different meaning than in chess. It is often argued that the distance between week and stronger player is much greater in Go than in Chess. In chess the distance between an average club player and top humans is about 1000 Elo. Maybe in Go its 2000 Elo?? In chess the green level-11 version would have world-champion level. Is it just enough to make a 2 million playouts version to beat the top-Dans in 9x9? Is it that easy? Just build a special purpose chip like ChipTest aka Deep Blue. Or implement it on a cluster. Or just wait a few years on do it on the PC. Or a playstation. Chrilly Is there any notion of the Elo rating of a professional Go player. In chess terms the - Original Message - From: Don Dailey [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Sunday, April 08, 2007 3:05 AM Subject: [computer-go] The physics of Go playing strength. A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg Although I am still collecting data, I feel that I have enough samples to report some results - although I will continue to collect samples for a while. This study is designed to measure the improvement in strength that can be expected with each doubling of computer resources. I'm actually testing 2 programs - both of them UCT style go programs, but one of those programs does uniformly random play-outs and the other much stronger one is similar to Mogo, as documented in one of their papers. Dave Hillis coined the terminolgoy I will be using, light play-outs vs heavy play-outs. For the study I'm using 12 versions of each program. The weakest version starts with 1024 play-outs in order to produce a move. The next version doubles this to 2048 play-outs, and so on until the 12th version which does 2 million (2,097,152) playouts. This is a substantial study which has taken weeks so far to get to this point. Many of the faster programs have played close to 250 games, but the highest levels have only played about 80 games so far. The scheduling algorithm is very similar to the one used by CGOS. An attempt is made not to waste a lot of time playing seriously mis-matched opponents. The games were rated and the results graphed. You can see the result of the graph here (which I also included near the top of this message): http://greencheeks.homelinux.org:8015/~drd/public/study.jpg The x-axis is the number of doublings starting with 1024 play-outs and the y-axis is the ELO rating. The public domain program GnuGo version 3.7.9 was assigned the rating 2000 as a reference point. On CGOS, this program has acheived 1801, so in CGOS terms all the ratings are about 200 points optimistic. Feel free to interpret the data any way you please, but here are my own observations: 1. Scalability is almost linear with each doubling. 2. But there appears to be a very gradual fall-off with time - which is what one would expect (ELO improvements cannot be infinite so they must be approaching some limit.) 3. The heavy-playout version scales at least as well, if not better, than the light play-out version. (You can see the rating gap between them gradually increase with the number of play-outs.) 4. The curve is still steep at 2 million play-outs, this is convincing empirical evidence that there are a few hundred ELO points worth of improvement possible beyond this. 5. GnuGo 3.7.9 is not competive with the higher levels of Lazarus. However, what the study doesn't show is that Lazarus needs 2X more thinking time to play equal to GnuGo 3.7.9. This graph explains why I feel that absolute playing strength is a poor conceptual model of how humans or computers play go. If Lazarus was running on the old Z-80 processors of a few decades ago, it would be veiewed as an incredibly weak program, but running on a supercomputer it's a very strong program. But in either case it's the SAME program. The difference is NOT the amount of work each system is capable of, it's just that one takes longer
Re: [computer-go] The physics of Go playing strength.
On Sun, 2007-04-08 at 09:56 +0200, Chrilly wrote: Is it just enough to make a 2 million playouts version to beat the top-Dans in 9x9? Is it that easy? Of course the ELO numbers are arbitrary. I assigned GnuGo 3.7.9 a level of 2000 but on CGOS it is 1800.But CGOS numbers are arbitrary too. If you can estimate the ELO rating of GnuGo 3.7.9 compared to say a 1 Dan player, then you can estimate the winning percentages of any version against a human of a given strength. Mogo of course is about 200-300 higher at the same levels. So I believe that if special purpose hardware could bump MoGo up a few times faster, you would really would have a professional strength 9x9 player. It might be easier to do this with a lite play-out version if you can get substantially more speed, say 4X faster due to the much simpler move logic. I don't know hardware, but it could be easier (more opportunities for parallelism to do it with a heavy version.) - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Sun, 2007-04-08 at 09:36 +0100, Tom Cooper wrote: Thanks dons for producing these fascinating results. I hope that when you have finished the study, you will show us not just this graph, but also the game results (number of wins) that it is derived from. I have all games and all data if anyone want them. I would probably build a crosstable of results for each pairing in a final report on a web page. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
On Sun, 2007-04-08 at 14:44 +0200, Heikki Levanto wrote: Aren't you being a bit optimistic here? It is quite conceivable that the curves will flatten out and reach a maximum level somewhat below perfect play. I don't see how we can predict the difference between them at that time. UCT has been proven to converge to optimum play. Read some of the papers. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
The question here is not about UCT(yes, it gaves the same rusults as alpha-beta). It's about MC scoring. It has not been proved that MC score will generate the optimum play with large enough simulation. Now the best super computer uses about 500,000 CPUs, which is 2 to the 18th power of computation increase. Don's curve can be tested to the number 18 now. -Original Message- From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: computer-go@computer-go.org Sent: Sun, 8 Apr 2007 7:49 AM Subject: Re: [computer-go] The physics of Go playing strength. On Sun, 2007-04-08 at 14:44 +0200, Heikki Levanto wrote: Aren't you being a bit optimistic here? It is quite conceivable that the curves will flatten out and reach a maximum level somewhat below perfect play. I don't see how we can predict the difference between them at that time. UCT has been proven to converge to optimum play. Read some of the papers. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ AOL now offers free email to everyone. Find out more about what's free from AOL at AOL.com. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [computer-go] The physics of Go playing strength.
On Sun, 2007-04-08 at 18:11 +0200, Edward de Grijs wrote: Hello Don, A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: Your work and insight keeps on amazing me. If I understand is correctly the playouts are made all from the root position? I am very interested in the results if the larger amount of playouts are not made from the root position, but from the moment the UCT part is over, and the playouts begin, especially with larger amount of playouts. I remember some statement of you that it made no significant difference if the playouts are multiplied from that position, instead of the root position, even with hundreds of playouts... Do those statements still hold? Edward de Grijs. Hi Edward, Thank you for the kind remarks. I count the total play-outs - version 0 does 1024 play-outs period and then the search is stopped cold and a more returned. It makes the program weaker if you just do N play-outs from the leaf position of UCT. But it does not seem to hurt the program at all to not expand the existing CHILDREN of a node, until the parent has been visited, say 100 times. I have not tried anything larger than 100, but if it's weaker with 100, I haven't been able to measure it.This really is huge benefit in memory usage, so I like 100. Of course this means some children will get 2 or more visits before getting expanded.There is no need for this if you have memory to burn. I'm not sure what experiment you are proposing, but at some point it might be interesting to see how other values work. If you have a computer with very little memory to spare, you could probably use much larger numbers although at some point you would surely experience a noticable weakening. If you are proposing doing 2, 3, 4 or more play-outs at the point where you normally do one, I think it strengthens the program - but not enough to justfy the extra work. In other words doing 2 play-outs doubles the amount of time spent on a move and it does not increase the strength enough to justify this. - Don From: Don Dailey [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED], computer-go computer-go@computer-go.org To: computer-go computer-go@computer-go.org Subject: [computer-go] The physics of Go playing strength. Date: Sat, 07 Apr 2007 21:05:19 -0400 A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg Although I am still collecting data, I feel that I have enough samples to report some results - although I will continue to collect samples for a while. This study is designed to measure the improvement in strength that can be expected with each doubling of computer resources. I'm actually testing 2 programs - both of them UCT style go programs, but one of those programs does uniformly random play-outs and the other much stronger one is similar to Mogo, as documented in one of their papers. Dave Hillis coined the terminolgoy I will be using, light play-outs vs heavy play-outs. For the study I'm using 12 versions of each program. The weakest version starts with 1024 play-outs in order to produce a move. The next version doubles this to 2048 play-outs, and so on until the 12th version which does 2 million (2,097,152) playouts. This is a substantial study which has taken weeks so far to get to this point. Many of the faster programs have played close to 250 games, but the highest levels have only played about 80 games so far. The scheduling algorithm is very similar to the one used by CGOS. An attempt is made not to waste a lot of time playing seriously mis-matched opponents. The games were rated and the results graphed. You can see the result of the graph here (which I also included near the top of this message): http://greencheeks.homelinux.org:8015/~drd/public/study.jpg The x-axis is the number of doublings starting with 1024 play-outs and the y-axis is the ELO rating. The public domain program GnuGo version 3.7.9 was assigned the rating 2000 as a reference point. On CGOS, this program has acheived 1801, so in CGOS terms all the ratings are about 200 points optimistic. Feel free to interpret the data any way you please, but here are my own observations: 1. Scalability is almost linear with each doubling. 2. But there appears to be a very gradual fall-off with time - which is what one would expect (ELO improvements cannot be infinite so they must be approaching some limit.) 3. The heavy-playout version scales at least as well, if not better, than the light play-out version. (You can see the rating gap between them gradually increase with the number of play-outs
Re: [computer-go] The physics of Go playing strength.
On Sun, 2007-04-08 at 10:09 -0400, [EMAIL PROTECTED] wrote: The question here is not about UCT(yes, it gaves the same rusults as alpha-beta). It's about MC scoring. It has not been proved that MC score will generate the optimum play with large enough simulation. MC is obviously wrong as an evaluation function - it is not guaranteed to return a correct or even a good score no matter how many simulations. However, UCT eventually turns into a pure tree where MC is not a factor.These programs, in theory, will play perfect GO given enough time. - Don Now the best super computer uses about 500,000 CPUs, which is 2 to the 18th power of computation increase. Don's curve can be tested to the number 18 now. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] The physics of Go playing strength.
I think you are right, because MC score becomes precise when only a few available moves left. However, do you search to the depth of the end of the game, or to the extent that MC score becomes precise? -Original Message- From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: computer-go@computer-go.org Sent: Sun, 8 Apr 2007 2:22 PM Subject: Re: [computer-go] The physics of Go playing strength. On Sun, 2007-04-08 at 10:09 -0400, [EMAIL PROTECTED] wrote: The question here is not about UCT(yes, it gaves the same rusults as alpha-beta). It's about MC scoring. It has not been proved that MC score will generate the optimum play with large enough simulation. MC is obviously wrong as an evaluation function - it is not guaranteed to return a correct or even a good score no matter how many simulations. However, UCT eventually turns into a pure tree where MC is not a factor.These programs, in theory, will play perfect GO given enough time. - Don Now the best super computer uses about 500,000 CPUs, which is 2 to the 18th power of computation increase. Don's curve can be tested to the number 18 now. AOL now offers free email to everyone. Find out more about what's free from AOL at AOL.com. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] The physics of Go playing strength.
A few weeks ago I announced that I was doing a long term scalability study with computer go on 9x9 boards. I have constructed a graph of the results so far: http://greencheeks.homelinux.org:8015/~drd/public/study.jpg Although I am still collecting data, I feel that I have enough samples to report some results - although I will continue to collect samples for a while. This study is designed to measure the improvement in strength that can be expected with each doubling of computer resources. I'm actually testing 2 programs - both of them UCT style go programs, but one of those programs does uniformly random play-outs and the other much stronger one is similar to Mogo, as documented in one of their papers. Dave Hillis coined the terminolgoy I will be using, light play-outs vs heavy play-outs. For the study I'm using 12 versions of each program. The weakest version starts with 1024 play-outs in order to produce a move. The next version doubles this to 2048 play-outs, and so on until the 12th version which does 2 million (2,097,152) playouts. This is a substantial study which has taken weeks so far to get to this point. Many of the faster programs have played close to 250 games, but the highest levels have only played about 80 games so far. The scheduling algorithm is very similar to the one used by CGOS. An attempt is made not to waste a lot of time playing seriously mis-matched opponents. The games were rated and the results graphed. You can see the result of the graph here (which I also included near the top of this message): http://greencheeks.homelinux.org:8015/~drd/public/study.jpg The x-axis is the number of doublings starting with 1024 play-outs and the y-axis is the ELO rating. The public domain program GnuGo version 3.7.9 was assigned the rating 2000 as a reference point. On CGOS, this program has acheived 1801, so in CGOS terms all the ratings are about 200 points optimistic. Feel free to interpret the data any way you please, but here are my own observations: 1. Scalability is almost linear with each doubling. 2. But there appears to be a very gradual fall-off with time - which is what one would expect (ELO improvements cannot be infinite so they must be approaching some limit.) 3. The heavy-playout version scales at least as well, if not better, than the light play-out version. (You can see the rating gap between them gradually increase with the number of play-outs.) 4. The curve is still steep at 2 million play-outs, this is convincing empirical evidence that there are a few hundred ELO points worth of improvement possible beyond this. 5. GnuGo 3.7.9 is not competive with the higher levels of Lazarus. However, what the study doesn't show is that Lazarus needs 2X more thinking time to play equal to GnuGo 3.7.9. This graph explains why I feel that absolute playing strength is a poor conceptual model of how humans or computers play go. If Lazarus was running on the old Z-80 processors of a few decades ago, it would be veiewed as an incredibly weak program, but running on a supercomputer it's a very strong program. But in either case it's the SAME program. The difference is NOT the amount of work each system is capable of, it's just that one takes longer to accomplish a given amount of work. It's much like the relationships between power, work, force, time etc. in physics. Based on this type of analysis and the physics analogy, GnuGo 3.7.9 is a stronger program than Lazarus (even at 9x9 go). Lazarus requires about 2X more time to equalize. So Lazarus plays with less force (if you use the physics analogy) and needs more TIME to get the same amount of work done. ELO is treated numerically as if it were work in physics because when it's measured by playing games, both players get the same amount of time. The time factor cancels out but it cause us to ignore that it's part of the equation. On CGOS, Lazarus and FatMan are the same program, but one does much more work and they have ELO ratings that differ by almost 300 ELO points. Even though they are the same program you will look on CGOS and believe Lazarus is much stronger because you have not considered the physics of Go playing strength. - Don ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/