Re: [computer-go] Dynamic komi
I'm curious to find out what is meant by "lazy". If, as I am led to believe by your report, Monte Carlo strategies applied to Double Step Races are lazy, yet they converge to perfect play, then I'm not sure why we are meant to worry. I certainly understand that the strategies can converge faster in some cases than in others. I would expect that this could be used to one's advantage, by using Monte Carlo to evaluate a related game that has similar correct moves, but for which the Monte Carlo evaluation is more accurate. But it isn't clear how one is supposed to transform games like his 6-vs-6 into 6-vs-5 without already knowing enough to completely solve the game! Specifically, is there a way to do this that doesn't _also_ convert 6-vs-5 into 6-vs-4?

Switching back to go, I would like to point out that it seems to me to be a serious mistake to apply UCT to any game other than the one at hand. If you do some dynamic fiddling with komi, you really ought to do it only in the playouts, and you should ensure that as the length of the playout decreases, the amount of fiddling approaches zero. That way, once UCT has expanded a deep line of play, it will be working with accurate win/loss values, instead of some fantasy based on a globally-applied dynamic komi. This seems very much in line with what others have already suggested regarding the insertion of passes or other ways of degrading the play in the playouts. I suspect that adjusting komi in the above manner may be an even better solution.

Weston

On Wed, Aug 19, 2009 at 1:45 PM, Ingo Althöfer <3-hirn-ver...@gmx.de> wrote:
> Don wrote:
>> But how do you create the required tension in a way that produces a
>> program that plays the game better?
>
> At least in high handicap go on 19x19 (with the dynamic bot being the
> stronger player) it seems to work when the bot is kept in some 35-45%
> corridor, as long as it is clearly behind.
>
> Ingo.

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
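A minimal sketch of the playout-only komi schedule described above (the function and parameter names are mine, not from any real bot): the komi offset fades to zero as the playout nears the end of the game, so deep UCT nodes are scored against the true komi.

```python
# Sketch of playout-only dynamic komi: the offset shrinks with the
# remaining playout length, so short (deep) playouts see the real komi.

def effective_komi(base_komi, max_offset, moves_left, max_moves):
    """Komi used to score a playout that starts with `moves_left` moves
    remaining out of `max_moves` total.  Offset fades linearly."""
    fade = moves_left / max_moves  # 1.0 at the root, ~0 near game end
    return base_komi + max_offset * fade

# At the root of a 9x9 game (say 81 moves), the full offset applies:
root = effective_komi(7.5, 3.0, 81, 81)   # 10.5
# A playout launched 75 moves into the game is scored almost truthfully:
deep = effective_komi(7.5, 3.0, 6, 81)
assert root == 10.5
assert abs(deep - 7.5) < 0.3
```

Linear fading is just one choice; any schedule that reaches zero as the playout length reaches zero satisfies the constraint argued for above.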
Re: [computer-go] New CGOS
On Fri, Jun 5, 2009 at 5:59 PM, Christoph Birk <b...@ociw.edu> wrote:
> Handicap games are for humans ... they get frustrated losing over and
> over. Computers have no problems with that.
>
> 2009/6/5 terry mcintyre <terrymcint...@yahoo.com>:
>> The purpose of a handicap game is to allow a 50% chance of either
>> contestant winning.

A handicap system based on time controls could be appropriate for many computer programs. Not that I expect Don to implement any such thing any time soon. I just think that it is worth mentioning.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Mon, Dec 15, 2008 at 11:10 PM, Mark Boon <tesujisoftw...@gmail.com> wrote:
> It would have been much more persuasive if you had simply run a 5K
> playout bot against a 100K bot and see which wins more.

In 200 games, 100k beat 5k a total of 127 times. So that's about a 63.5% win rate.
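A quick check of the 127-out-of-200 result quoted above (my own arithmetic, not part of the thread): the win rate and a rough binomial standard error show the margin over 50% is real, though not enormous at 200 games.

```python
# Win rate and binomial standard error for 127 wins in 200 games.
import math

wins, games = 127, 200
rate = wins / games                              # 0.635
se = math.sqrt(rate * (1 - rate) / games)        # ~0.034
# Two standard errors below the point estimate still clears 50%:
assert rate - 2 * se > 0.5
```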
Re: [computer-go] RefBot (thought-) experiments
On Mon, Dec 15, 2008 at 5:47 PM, Don Dailey <drdai...@cox.net> wrote:
> Is Jrefgo the pure version that does not use tricks like the futures
> map? If you use things like that, all bets are off - I can't be sure
> this is not negatively scalable.

I don't know, although I was under the impression that I had downloaded the pure version. I found a reference to the source here on the list, and downloaded and compiled that. When I get back home, how would I quickly determine which is the case?

> You cannot draw any reasonable conclusions by stopping after 10 moves
> and letting gnugo judge the game either. Why didn't you play complete
> games?

I think that complete games would have to be at least one of:

1. Against a similarly weak opponent. This casts doubt on whether the results apply against other opponents.
2. Unlikely to be won by an AMAF program. This makes their differences hard to measure.
3. Played with handicap stones. The granularity seems too coarse on 9x9. Nevertheless, it might be worthwhile to try this.
4. Played with a komi that is very far from an even game. In 9x9, this would mean that the better player must control the entire board in order to win. At that point, komi is no longer useful as a means of providing a handicap.

Originally (about two years ago) I ran studies such as this in order to tune parameters that affected the playouts, and that I thought could plausibly have different optimum values at different points in the game. Playing against a generally stronger opponent makes it more likely that the improvements I find will apply to opponents in general, rather than simply tuning my program against one particular opponent. Playing against a close relative of the same program (e.g., pitting 5k against 100k directly) gives misleading results, in my experience. Often, both programs will be blind to the same lines of play, allowing genuinely bad moves to go unpunished.
On Mon, Dec 15, 2008 at 6:10 PM, Mark Boon <tesujisoftw...@gmail.com> wrote:
> It would have been much more persuasive if you had simply run a 5K
> playout bot against a 100K bot and see which wins more. It shouldn't
> take much more than a day to gather a significant number of games.

I may do that, although personally I would be far more cautious about drawing conclusions from those matches, as compared to ones played against a strong reference opponent. But I guess other people feel differently about this. Anyway, the results would still be interesting to me no matter which way they went, even if they failed to convince me of anything.

> I did in fact put up a 100K+ ref-bot on CGOS for a little while, and it
> ended up with a rating slightly (possibly insignificantly) higher than
> the 2K ref-bot. Maybe I didn't put it there long enough, certainly not
> for thousands of games. But it didn't look anywhere near to supporting
> your findings.

That doesn't particularly disagree with my conclusions either. For example, I would guess that the best overall performance is somewhere around 5k-10k, so a program with a setting in that range would obtain a higher rating than either your 2K or your 100K+. I could easily be wrong about that, though.

> I say 100K+ because I didn't set it to a specific number, just run as
> many as it could within the time allowed. Generally it would reach well
> over 100K per move, probably more like 250K-500K. That should only make
> things worse according to your hypothesis.

Yes, this is what sparked the conversation originally. When you reported that a while ago, my reaction was, "Of course that won't work very well; you're running way too many simulations." I was actually a bit surprised that no one else thought that this was as bad as I think it is.

> So although I think the result of your experiment is very curious, I
> think it might be a bit hasty to draw your conclusion.

Yes, it very well may be.

As I mentioned, I ran a number of similar experiments a couple of years ago, for which I unfortunately lost the results. My recollection is that they typically indicated the same thing, across a number of variations on my own program. Performance would improve up to a point, then degrade as the program's behavior became essentially deterministic. But I may have made mistakes in those tests, or I could be misremembering.

On Tue, Dec 16, 2008 at 12:20 PM, Don Dailey <dailey@gmail.com> wrote:
> A monte carlo bot like refbot, in most positions, is going to converge
> on some specific move. I think in the starting position it wants to
> play e5 and it is going to play e5 with an infinite number of playouts,
> whether that is the best move or not. There will be many situations
> where the move it wants to play is not the best, and so you can surmise
> that it's more likely to play a good move with fewer playouts.

Incidentally, when I get home, I'll post the line of play that follows those moves with the highest (asymptotic) Monte Carlo values, according to jrefgo. I have about 18 moves calculated with high accuracy.
Re: [computer-go] RefBot (thought-) experiments
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham <weston.mark...@gmail.com> wrote:
> And I believe that current Monte Carlo methods only really manage to
> avoid the very worst of the bad moves, regardless of how many playouts
> they run.

Um, perhaps I should qualify that as "pure" Monte Carlo, meaning without any form of tree search.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Wed, Dec 17, 2008 at 12:34 AM, Weston Markham <weston.mark...@gmail.com> wrote:
> I don't know, although I was under the impression that I had downloaded
> the pure version. I found a reference to the source here on the list,
> and downloaded and compiled that. When I get back home, how would I
> quickly determine which is the case?

The program reports 081016-2022 to the GTP version command.
Re: [computer-go] RefBot (thought-) experiments
On Wed, Dec 17, 2008 at 12:51 AM, Darren Cook <dar...@dcook.org> wrote:
> I'd also like to second Mark Boon's statement that Gnugo is not an
> expert judge, especially not after only 10 moves. One experiment I did,
> a couple of years ago, was scoring lots of terminal or almost-terminal
> 9x9 positions with gnugo and crazy stone, and they disagreed a lot of
> the time. (Sorry, I don't remember what "a lot" was, maybe 10% or so.)

Hmm. I agree as well. I see this as the biggest weakness in the experiment. The weaknesses of gnugo could very well favor 5k's end positions over 100k's, for irrelevant reasons. The earlier experiments I had run also used gnugo.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Wed, Dec 17, 2008 at 1:32 AM, Mark Boon <tesujisoftw...@gmail.com> wrote:
> By the way, what does scratch100k.sh look like?

../gogui-1.1.3/bin/gogui-twogtp -auto -black "java -jar jrefgo.jar 100000" \
    -games 1 -komi 0.5 -maxmoves 10 \
    -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" \
    -sgffile games/jr100k-v-mogo-10m -size 9 \
    -white "`cygpath -w /home/Experience/projects/go/MoGo_release3/mogo`"
Re: [computer-go] RefBot (thought-) experiments
On Wed, Dec 17, 2008 at 2:07 AM, Mark Boon <tesujisoftw...@gmail.com> wrote:
> Thanks. I just realized that you set the komi to 0.5. That doesn't
> sound like a good idea. I wanted to make sure you had the same for the
> 100k version. Were your earlier experiments also with 0.5 komi? MC
> programs are highly sensitive to komi, so I'd use something more
> reasonable, like 6.5 or 7.5.

Ah. I'm sorry if I failed to draw attention to that in my original description. I believe that some of my earlier experiments were very similar to this one. I don't really recall the details, although I did pick the values used here of 5k, 100k, and 10 moves based on my recollection. I think that originally I may have measured the difference between the two over moves 5-10 or 5-15. And of course the Monte Carlo program was my own, which had some minor differences. More significantly, I would probably have used gnugo as the reference program for those experiments.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Tue, Dec 16, 2008 at 7:34 PM, Weston Markham <weston.mark...@gmail.com> wrote:
> Incidentally, when I get home, I'll post the line of play that follows
> those moves with the highest (asymptotic) Monte Carlo values, according
> to jrefgo.

I have about 18 moves calculated with high accuracy. Here is a .sgf for the first 17 moves. They are:

E5 D5 D6 E6 F6 E7 F5 D4 C6 F7 G7 G6 G5 D7 C7 C8 D8

The 18th move is very nearly a tie between E8 and F8, although I think F8 eventually wins.

(;FF[4]CA[UTF-8]AP[GoGui:1.1.3]SZ[9]
KM[6.5]PB[JrefBot:081016-2022]PW[JrefBot:081016-2022]DT[2008-12-04]RE[B+34.5]
C[Black command: java -jar jrefgo.jar 10
White command: java -jar jrefgo.jar 10
Black version: 081016-2022
White version: 081016-2022
Opening: openings/opening2.sgf
Result[Black\]: ?
Result[White\]: ?
Referee: gnugo --level 0 --mode gtp --score estimate --capture-all-dead --chinese-rules
Result[Referee\]: B+34.5
Host: localhost.localdomain (Pentium III (Coppermine))
Date: December 5, 2008 2:05:47 PM EST]
;B[ee];W[de];B[dd];W[ed];B[fd];W[ec];B[fe];W[df];B[cd];W[fc]
;B[gc];W[gd];B[ge];W[dc];B[cc];W[cb];B[db]
)
Re: [computer-go] RefBot (thought-) experiments
On Wed, Dec 17, 2008 at 2:38 AM, Don Dailey <dailey@gmail.com> wrote:
> Is it the java version? I believe there is only one version of that
> and it's the pure reference bot. I did make modifications to a C
> version but I think I kept that private.

Yes, it is the Java version.
Re: [computer-go] RefBot (thought-) experiments
Hi. This is a continuation of a month-old conversation about the possibility that the quality of AMAF Monte Carlo can degrade as the number of simulations increases:

Me:
> running 10k playouts can be significantly worse than running 5k
> playouts.

On Tue, Nov 18, 2008 at 2:27 PM, Don Dailey <drdai...@cox.net> wrote:
> On Tue, 2008-11-18 at 14:17 -0500, Weston Markham wrote:
>> On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams
>> <michaelwilliam...@gmail.com> wrote:
>>> It doesn't make any sense to me from a theoretical perspective. Do
>>> you have empirical evidence?
>>
>> I used to have data on this, from a program that I think was very
>> nearly identical to Don's reference spec. When I get a chance, I'll
>> try to reproduce it.
>
> Unless the difference is large, you will have to run thousands of games
> to back this up.
>
> - Don

I am comparing the behavior of the AMAF reference bot with 5000 playouts against its behavior with 100k playouts, and I am only considering the first ten moves (five from each player) of the (9x9) games. I downloaded a copy of Don's reference bot, as well as a copy of Mogo, which is used as an opponent for each of the two settings. gnugo version 3.7.11 is also used, in order to judge which side won (jrefgo or mogo) after each individual match. gnugo was used because it is simple to set up for this sort of thing via command-line options, and it seems plausible that it should give a somewhat realistic assessment of the situation.

jrefgo always plays black, and Mogo plays white. Komi is set to 0.5, so that jrefgo has a reasonable number of winning lines available to it, although the general superiority of Mogo means that egregiously bad individual moves will be punished.

In the games played, Mogo would occasionally crash. (This was run under Windows Vista; perhaps there is some incompatibility with the binary I downloaded.) I have discarded these games (about 1 out of 50, I think) from the statistics gathered. As far as I know, there would be no reason to think that this would skew the comparison between 5k playouts and 100k playouts. Other than the occasional crashes, the behavior of Mogo seemed reasonable in the games that I observed. I have no reason to think that it was not playing at a relatively high level in the retained results.

Out of 3637 matches using 5k playouts, jrefgo won (i.e., was ahead after 10 moves, as estimated by gnugo) 1688 of them. (46.4%)
Out of 2949 matches using 100k playouts, jrefgo won 785. (26.6%)

It appears clear to me that increasing the number of playouts from 5k to 100k degrades the performance of jrefgo.

Below are the commands that I used to run the tests and tally the results.

Weston

$ cat scratch5k.sh
../gogui-1.1.3/bin/gogui-twogtp -auto -black "\"C:\Program Files\Java\jdk1.6.0_06\bin\java.exe\" -jar jrefgo.jar 5000" -games 1 -komi 0.5 -maxmoves 10 -referee "gnugo --mode gtp --score aftermath --chinese-rules --positional-superko" -sgffile games/jr5k-v-mogo -size 9 -white C:\cygwin\home\Experience\projects\go\MoGo_release3\mogo.exe
$ grep B+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1688
$ grep W+ games/jr5k-v-mogo.dat | grep -v unexp | wc -l
1949
$ grep B+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
785
$ grep W+ games/jr100k-v-mogo.dat | grep -v unexp | wc -l
2164
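As a sanity check on the result above (my own calculation, not part of the original experiment), a two-proportion z-test on 1688/3637 wins versus 785/2949 wins shows the gap is far outside sampling noise:

```python
# Two-proportion z-test for the 5k-vs-100k win rates reported above.
import math

def two_proportion_z(w1, n1, w2, n2):
    p1, p2 = w1 / n1, w2 / n2
    p = (w1 + w2) / (n1 + n2)                       # pooled win rate
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2)) # pooled standard error
    return (p1 - p2) / se

z = two_proportion_z(1688, 3637, 785, 2949)
# z comes out around 16 standard errors: the degradation is not a fluke.
assert z > 10
```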
Re: [computer-go] RefBot (thought-) experiments
On Mon, Nov 17, 2008 at 11:13 PM, Michael Williams [EMAIL PROTECTED] wrote:
> No one ever alleged that pure AMAF or pure MC was infinitely scalable.

My point is that in many cases, they don't even keep all of their benefits after some number of trials have been run. So, running 10k playouts can be significantly worse than running 5k playouts.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Tue, Nov 18, 2008 at 12:02 PM, Michael Williams [EMAIL PROTECTED] wrote:
> It doesn't make any sense to me from a theoretical perspective. Do you
> have empirical evidence?

I used to have data on this, from a program that I think was very nearly identical to Don's reference spec. When I get a chance, I'll try to reproduce it.

Weston
Re: [computer-go] RefBot (thought-) experiments
On Mon, Nov 17, 2008 at 2:30 PM, Don Dailey [EMAIL PROTECTED] wrote:
> On Mon, 2008-11-17 at 16:04 -0200, Mark Boon wrote:
>> On another note, as an experiment I have a bot running on CGOS that is
>> the ref-bot but instead of using a fixed number of simulations I use a
>> fixed amount of time that slowly diminishes towards the end of the
>> game. The result is it does about 200K simulations per move for most
>> of the game on a single processor. Its rating is currently stuck
>
> After 2k playouts you are already facing serious diminishing returns -
> but there is still a bit more to be had. But after 5-10K playouts you
> get almost nothing.

I think that I have seen this sort of thing with Monte Carlo programs, and I think it is possible to get even less than almost nothing. You may be getting overly precise measurements of the Monte Carlo values of the moves near the beginning of the game, so that the played moves are biased toward lines of play where the Monte Carlo values are unrealistically good. (This could be thought of as somewhat analogous to a horizon effect.)

Put another way, the Monte Carlo values do tend to accurately distinguish the relatively good moves from the relatively poor moves (which of course makes them very useful), but at any given position, you can't expect them to give the best score to the best move, even in the limit. As you run additional playouts, you can be more and more confident that your program has identified the move with the best Monte Carlo value. But suppose that there are other moves that are equally good (or better) under perfect play (or against a particular opponent). Then any supposed superiority of the program's selected move over those alternatives is entirely due to inaccuracies of the Monte Carlo value. So once you are running enough playouts to detect those differences, it also becomes more likely that subsequent positions will encounter these same inaccuracies.

Weston
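A toy illustration of the point above (my own construction, not from the thread): take two moves that are equally good under perfect play, but whose asymptotic Monte Carlo values differ slightly (0.52 versus 0.50). With few playouts the bot picks between them with some variety; with many playouts it almost always picks the one with the higher Monte Carlo value, whether or not that preference reflects anything real.

```python
# With more playouts, the choice between two equally good moves hardens
# toward the move with the (slightly) higher asymptotic MC value.
import random

def picks_a(n_playouts, rng):
    """True if the move with asymptotic MC value 0.52 beats the 0.50 one
    on this batch of simulated playouts."""
    a = sum(rng.random() < 0.52 for _ in range(n_playouts)) / n_playouts
    b = sum(rng.random() < 0.50 for _ in range(n_playouts)) / n_playouts
    return a > b

rng = random.Random(1)
trials = 200
p_small = sum(picks_a(100, rng) for _ in range(trials)) / trials
p_large = sum(picks_a(10000, rng) for _ in range(trials)) / trials
# ~60% preference at 100 playouts, near-certainty at 10000.
assert p_large > p_small
```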
Re: [computer-go] small study
On Sun, Oct 26, 2008 at 4:22 PM, Don Dailey [EMAIL PROTECTED] wrote:
> I imagine that it approaches some hypothetical level in an asymptotic
> way.

For most board positions, it is reasonable to expect that there exists a single move for which the asymptotic Monte Carlo value is higher than the rest. Even when this is not the case, the different moves with the same value are generally symmetrical, and the only possible strategic advantage of one of these moves over another would be due to superko. So I expect that the asymptotic behavior is very nearly deterministic.

In the past, I have had difficulties in evaluating the overall strength of such players, because I would not get a good variety of positions in the games played. If you continue with higher numbers of playouts, you might want to ensure that you either include at least one strong player that plays a variety of openings, or else play games with a variety of starting positions.

Weston
Re: [computer-go] From zero to playing on CGOS in 10 minutes
On Thu, Oct 23, 2008 at 1:00 PM, Mark Boon [EMAIL PROTECTED] wrote:
> This is still something I don't understand. Are there others who
> implemented the same thing and got 111 moves per game on average? I
> tried to look through some posts on this list but didn't see any other
> numbers published.

111 matches my recollection from past experiments that I did with playouts that probably match Don's. I think that this number has been posted previously on the mailing list.

Weston
Re: [computer-go] Go with modified scores
In the context of Monte Carlo, a win or loss by a large margin is quite likely (at least in any close game) to be due to large blunders. (For example, allowing a large, safe group of stones to be captured.) Given this, it does not make sense to weight it more strongly than any other win or loss. It would not even surprise me if a better evaluation could be obtained by _penalizing_ wins by greater margins. (Something like 100-k for a win and -(100+k) for a loss, exactly contrary to what you propose.)

As a hopefully concrete example, consider the following 5x5 position, with black to play:

.XO..
.XOO.
.XXOO
XXXO.
.XO.O

Komi is such that black can win by taking and filling the ko. Black _could_ still capture the whole board, by playing in the upper-right corner, but only if white fails to defend its group. One needs to avoid making this move appear overly attractive, because the winning move does not win by such a large margin.

Weston

On Thu, Oct 9, 2008 at 3:23 AM, Ingo Althöfer [EMAIL PROTECTED] wrote:
> Some of you may want to stone me for this heresy, but read carefully
> before. When you have MCST-/UCT for Go that can work for real-valued
> scores (or at least a version that can work for three-valued scores:
> win, draw, loss), you may look at Go with different scoring systems.
>
> Example: A win by 0.5+k points is worth 100+k. A loss by 0.5+k points
> gives score -100-k. Let's call b=100 the base score.
>
> Question: How does the playing style change with b? (Of course, for
> very large b you should have almost the normal playing style.)
>
> Ingo.
>
> PS: Such evaluations may also help to reduce the problem of MC's
> laziness.
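The two scoring rules being contrasted above can be written out concretely (a sketch in my own framing; the function names are mine): Ingo's rule rewards larger winning margins, while the contrarian variant discounts them.

```python
# Ingo's margin-weighted score vs. the margin-penalizing variant.
# margin > 0 is a win by that many points, margin < 0 a loss.

def ingo_score(margin, base=100.0):
    """Win by 0.5+k scores base+k; loss by 0.5+k scores -(base+k)."""
    k = abs(margin) - 0.5
    return (base + k) if margin > 0 else -(base + k)

def penalized_score(margin, base=100.0):
    """Contrarian variant: wins by larger margins score *less*."""
    k = abs(margin) - 0.5
    return (base - k) if margin > 0 else -(base + k)

assert ingo_score(0.5) == 100.0
assert ingo_score(10.5) == 110.0       # big win rewarded
assert penalized_score(10.5) == 90.0   # big win discounted
assert penalized_score(-10.5) == -110.0
```

With a large `base`, both rules approach ordinary win/loss counting, which matches the remark that a very large b should give almost the normal playing style.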
Re: [computer-go] a few more questions
On Tue, May 13, 2008 at 7:08 PM, Gunnar Farnebäck [EMAIL PROTECTED] wrote:
> And I agree, don't even think of doing this with floating point
> numbers.

This is a bit tangential to computer go, but you have piqued my curiosity. Offhand, that sounds rather extreme to me. Perhaps I haven't really thought this through completely, but it seems as if there is a simple modification to the stated algorithm that would ensure that it works fine with floating point numbers. Or at least well enough for any conceivable purpose:

Make sure that you select each random number from a slightly larger range than prescribed by the exact totals. (For example, always round up while calculating the totals. If you do not have control over the rounding mode, just add on some small fraction of the largest weight.) Then, in the event that subtracting the weights of the points/rows/buckets never reduces the selected number to a non-positive value, just select a new random number and start over.

Would this work fine, or is there some flaw in my thinking that Álvaro and Gunnar have already considered?

Weston
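The retry scheme described above can be sketched as follows (the names are mine): the total is inflated by a small fraction of the largest weight, so floating-point rounding can never push the cumulative subtraction past the end of the weights; a draw that lands in the slack is simply discarded and redrawn.

```python
# Weighted random selection that tolerates floating-point rounding by
# drawing from a slightly inflated total and retrying on overshoot.
import random

def weighted_choice(weights, rng=random):
    slack = max(weights) * 1e-9        # small fraction of largest weight
    total = sum(weights) + slack
    while True:                        # retry is astronomically rare
        r = rng.uniform(0.0, total)
        for i, w in enumerate(weights):
            r -= w
            if r <= 0.0:
                return i
        # fell off the end: r landed in the rounding slack, redraw

counts = [0, 0, 0]
rng = random.Random(7)
for _ in range(30000):
    counts[weighted_choice([1.0, 2.0, 7.0], rng)] += 1
assert counts[2] > counts[1] > counts[0]   # roughly 10% / 20% / 70%
```

The cost of the extra slack is one wasted draw with probability on the order of the slack divided by the total, which is negligible next to the cost of an exact-arithmetic implementation.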
Re: endgame (Was [computer-go] Re: Should 9x9 komi be 8.0 ?])
On 3/6/08, Christoph Birk [EMAIL PROTECTED] wrote:
> On Thu, 6 Mar 2008, Don Dailey wrote:
>> advantageous to give away stones than not. Despite what many people
>> believe, MC programs don't normally believe it's better to win small,
>> and they are not hell-bent on giving away stones in order to try to
>> make the score come out to be exactly a 0.5 win.
>
> You are correct that it is not explicitly programmed into the MC
> programs to win by 0.5 pts, but since most of them don't care about the
> margin, in practice they often do.

You are right, but I think that you may also be misconstruing the nakade problem as a lack of concern about margin, when it is really a fundamental failure to understand (i.e., a failure to explore sufficiently) the nakade shapes. The tendency for your final scores on lost games to be 0.5 really reflects a motivation on your own part to lose by the smallest margin that you can. After the point when the game is resolved, this doesn't conflict with the program's goal, so it lets you do just that.

As an interesting thought, I think that it might actually be informative for people to try out programs that _do_ try to win by exactly 0.5! Not as a fundamental goal, but rather as a slight preference in the endgame. If authors can do this in a manner that does not degrade the playing ability much, then human players might be able to see even more clearly what life-and-death errors a program makes, and at what point it is able to discover the correct solution. Clearly the programs would be weaker like this, but I think it could expose some systematic problems very quickly, so that they could be focused on.

Weston
Re: [computer-go] Is Rémi correct?
I know that other people have mentioned this sort of thing already, but the result of level 8 being better than level 10 matches my own experience with slightly older versions of gnugo. As I recall, 8 was the best, 9 a little worse, and 10 worse again. Increasing the level seems to improve play after that, but it dramatically increases the time.

Weston

On Feb 6, 2008 12:48 PM, Don Dailey [EMAIL PROTECTED] wrote:
> Here is an update from the new 1000 game test using gnugo at level 8
> instead of 10.
>
> Rank Name          Elo    +    -  games  score  oppo.  draws
>    1 Gnugo-3.7.11 1800   34   30   2186    97%   1137     0%
>    2 Mogo_03      1507   48   56    186    16%   1800     0%
>    3 Mogo_02      1202   43   51   1000     3%   1800     0%
>    4 Mogo_01      1003   70   96   1000     1%   1800     0%
>
> The test, at this point, seems to indicate that gnugo at level 8 is
> stronger than at level 10, because mogo is not doing as well as in the
> previous test. It will be more meaningful when we get to levels close
> to gnugo's strength.
>
> - Don
>
> As promised, to answer Rémi, I did a study with mogo vs Gnu at various
> levels. There is NO self play involved; Gnugo-3.7.11 is the only
> opponent for progressively higher rated versions of Mogo. Here are the
> raw results so far:
>
> Rank Name          Elo    +    -  games  score  oppo.  draws
>    1 Mogo_10      2319   72   60    500    95%   1800     0%
>    2 Mogo_11      2284   94   74    259    94%   1800     0%
>    3 Mogo_09      2234   57   49    500    92%   1800     0%
>    4 Mogo_08      2124   43   39    500    87%   1800     0%
>    5 Mogo_07      2016   35   33    500    78%   1800     0%
>    6 Mogo_06      1961   32   30    500    72%   1800     0%
>    7 Mogo_05      1814   28   28    500    52%   1800     0%
>    8 Gnugo-3.7.11 1800   13   13   5259    44%   1823     0%
>    9 Mogo_04      1711   29   29    500    37%   1800     0%
>   10 Mogo_03      1534   35   38    500    18%   1800     0%
>   11 Mogo_02      1281   60   72    500     5%   1800     0%
>   12 Mogo_01      1004  115  178    500     1%   1800     0%
>
> The issue is whether self-play results distort the ratings of programs.
> In this case, we are only testing whether it distorts the ratings of
> Mogo, since no other programs were tested. In the following table, I
> played up to 500 games between Gnugo and Mogo at various levels. The
> levels are the exact levels that correspond to the big scalability
> study.
>
> In the middle column I listed the ratings as computed by bayeselo in
> games against ONLY Gnugo, with the default rating of Gnugo set to 1800,
> just as in the study. Unfortunately, I used level 10 in the gnugo-only
> games, but in the big study we use level 8. It's my understanding there
> is little difference between these 2, but we can probably assume Mogo
> might be a little better than indicated relative to the big scalability
> study.
>
> It looks like there indeed is a lot of distortion at the low end of the
> scale. Mogo seems much stronger at low levels than the larger
> scalability study indicated. At the higher levels, we also get a
> mismatch, where Mogo's rating doesn't seem as high when playing only
> Gnugo. This is as Rémi claims.
>
> One thing to note is that at higher levels it's more difficult to get
> an accurate rating. Mogo_10 is winning 95% of its games against Gnugo,
> and an extra win or loss every few games can make a lot of difference.
> However, I am inclined to believe this is real, since it seems to hold
> for several upper levels. At level 7 it's only 42 ELO, but at levels
> beyond this it's over 100 ELO.
>
> I've never doubted that there is some intransitivity between programs,
> but I am a little surprised that it is this much. Even if the
> comparison is slightly unfair due to Mogo playing a stronger version of
> Gnugo in this study, it still seems like it must be at least 100 ELO.
>
> vers  vs Gnu  Study
> ----  ------  -----
>  01     1004    688
>  02     1281   1093
>  03     1534   1331
>  04     1711   1554
>  05     1814   1751
>  06     1961   1971
>  07     2016   2058
>  08     2124   2270
>  09     2234   2347
>  10     2319   2470
>
> My suggestion to improve this situation is to play a few thousand games
> against a well rated Gnugo and set up mogo as a second anchor.
>
> - Don
Re: Re : [computer-go] Is MC-UCT really scalable ... is a troll
(I typed the following up earlier today, before other people cast some doubts about what infinite scalability means. Perhaps it is helpful.) I think that Alain was specifically referring to a property of UCT, whereby given that a winning line of play exists, the probability that the algorithm has discovered one such line settled upon it approaches 1 as time approaches infinity. I believe that this has been proven. And no, this theoretical result does not represent any intrinsic advantage over a calculation of the minimax value. (Someone should feel free to correct me if there is a stronger theoretical result that I am not aware of, though.) I think it is important to understand that unless you restrict UCT from exploring some correct lines of play, this infinite scalability will still be true. And, it is true regardless of what moves have been preferred/pruned/whatever within the MC playouts. (Or whatever one does at the leaf nodes; it need not be MC at all, as long as the UCT tree eventually gets expanded as the node is revisited.) This is not to say that this particular property of UCT is sufficient in order for us to claim that UCT is scalable in practice. My point is rather that noone is claiming infinite scalability of MC, but rather scalability of UCT. When you combine that with the fact that UCT has been very successful at 9x9 go, (and scalable across a wide range of time limits) it seems to be reasonable to expect more of the same in 19x19. (Which is what concerned the original poster.) Also, one should expect that the UCT portion of MC-UCT will tend to eventually fix up any systematic errors that are made by the MC playouts. I have one other point I'd like to make, in regard to light versus heavy playouts: Even light playouts do not actually use a uniform distribution, due to the quasi-eye rule that is generally used. 
I think that there is every reason to expect that other nonuniform playout policies further outperform light playouts in every practical way. Granted, it is possible to introduce severe misconceptions when one incorporates knowledge into one's playouts. But even in that case, one can still fall back on the fact that UCT is cleaning up after one's mistakes: the eventual behavior, given enough time, is still perfect play. (And of course it is not as if people blindly adjust the Monte Carlo policies without checking the revised versions for severe misconceptions.)

Weston

On 1/22/08, ivan dubois [EMAIL PROTECTED] wrote: I think there is some misconception about this infinite scalability of MC stuff. Theory is only useful when the model it is based on shares important aspects with reality. In theory, 19x19 go is a finite game, which can be analysed in a finite amount of time. So ANY algorithm that solves the game at some point is, in theory, infinitely scalable. For example, the following algorithm is infinitely scalable: analyse the complete minimax tree of the game. If there is enough time to finish, return the correct move; if not, return a random move. Now, is this algorithm really scalable in practice? Of course not. I support the idea that MC is infinitely scalable (in reality) ONLY if the playout part does not have severe misconceptions. So I think that currently, only MC based on uniform playouts is infinitely scalable. It is well known that even Mogo has trouble with big eyes (it thinks a big eye gives life, which is false). This problem cannot be solved with more computing power (except absurdly much; of course you can always minimax to the end).

- Original message From: Alain Baeckeroot [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Tuesday, 22 January 2008, 22:13:26 Subject: Re: [computer-go] Is MC-UCT really scalable ... is a troll

On Tuesday 22 January 2008, David Fotland wrote: The UCT-MC programs do particularly well against traditional programs because they expose the brittleness inherent in the pattern databases they use. Strong humans are not so easily beaten by playing unconventional and somewhat inferior moves.

I think Remi posted a game of CrazyStone on 19x19 commented by one pro who said this move is 2 dan level. So far no go program has received such an analysis. Also, the huge novelty provided by MC/UCT programs is that they have a _real_ strength with their own style, like humans: play solid when winning, and play hamete (tricky moves) when losing. On KGS, MC programs have played hundreds (if not thousands) of games against humans, and no doubt they fully merit their rank, and no doubt they are improving. Infinite scalability is a theoretically proven property, so please don't feed the troll :-) Alain ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Simplified MC evaluator ¿explained?
The second explanation was no clearer to me. I'll try to criticize in more detail:

1. Uniform playouts, as used in practice, are not really uniform over all legal go moves. Generally, pass moves are excluded until necessary, and moves that fill eyelike points are excluded. So, I assume that when you use the word "legal", you mean admissible within this sort of playout.

2. "That variance depends on the length of the playout." It is difficult for me to make sense of this statement, simply because not all playouts from a given position have the same length. My best guess is that you are claiming that the longer a playout is, the more likely it is that its result differs from the result under correct play. However, I strongly doubt that this is true for all starting positions. (Imagine that the first player needs to prevent the second player from forming two eyes in a large group. After doing this, that group will eventually be captured, allowing playouts to continue longer by filling the intersections that it once occupied. Failing to kill this group may allow the playouts to complete much more quickly, but gives inaccurate results.)

2.5. "The variance of the stochastic process is not to be mixed up with the distribution of the error of a repeated Bernoulli experiment!" Perhaps I have mixed them up. Can you explain more clearly or precisely what "the variance of the stochastic process" is? Do you perhaps mean some measurement of variation across different starting points, rather than across different Bernoulli trials from the same starting position? Or, do you mean to distinguish the probability that a playout's outcome differs from the outcome under correct play, from the probability that a playout results in a win? (Although those are just two different Bernoulli experiments, right?) Or is there some subtlety that I have missed?

3. "p is a biased towards 1/2 estimator of W". Consider the game:

      o
     / \
    o   o
   / \  |
  1   0 0

(1 is a win by the first player, and 0 is a loss.)
There is a move that could allow the first player to win, if the second player does not respond to it correctly. This sounds like a realistic scenario for go.

W = 1/3
p = 1/4

p is further from 1/2 than W. Does this game violate the condition that the number of legal moves for each side is balanced? (It is still not clear to me what this condition is that you are attempting to impose.) Or was I supposed to calculate a statistic across multiple game trees where W=1/3, in order to interpret p as an estimator of W?

4. Even if we can compute W exactly, do we have any reason to think that its value is a good estimate of the minimax value of the game? Is it even a better estimate than p, which we can already estimate accurately? (Note that in the game tree above, it is not.) My offhand guess is that it would not be as good.

Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
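The two quantities in this counterexample can be checked mechanically. In the sketch below (all names are mine), a tree is a nested list whose leaves are 1 for a first-player win and 0 for a loss; W is the fraction of winning leaves, and p is the probability that a uniformly random playout ends in a win:

```python
from fractions import Fraction

def leaf_stats(tree):
    """Return (winning_leaves, total_leaves) for a nested-list game tree."""
    if isinstance(tree, int):
        return tree, 1
    wins = leaves = 0
    for child in tree:
        w, n = leaf_stats(child)
        wins, leaves = wins + w, leaves + n
    return wins, leaves

def playout_win_prob(tree):
    """Probability that a uniformly random playout reaches a winning leaf."""
    if isinstance(tree, int):
        return Fraction(tree)
    return sum(playout_win_prob(child) for child in tree) / len(tree)

# The tree from the message: the first move leads to a position with one
# winning and one losing reply; the second move loses outright.
game = [[1, 0], [0]]
wins, leaves = leaf_stats(game)
W = Fraction(wins, leaves)   # 1/3
p = playout_win_prob(game)   # (1/2)*(1/2) = 1/4
```

Here |p - 1/2| = 1/4 exceeds |W - 1/2| = 1/6, so on this tree p is further from 1/2 than W, as the message claims.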
Re: [computer-go] The physics of Go playing strength.
On 4/8/07, Don Dailey [EMAIL PROTECTED] wrote: These programs, in theory, will play perfect GO given enough time. ... and space. I doubt that your current programs would be capable of storing a large enough game tree to actually converge to the alpha-beta value. So in practice, it really would level out somewhere below optimal play, unless you also increase the memory usage, right? I think that it is still valid to present your chart as representing the lower portions of curves that do not do this, because you would presumably scale up the amount of memory used as well as the time, if you could. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Simplified MC evaluator ¿explained?
On 4/7/07, Jacques Basaldúa [EMAIL PROTECTED] wrote: Assuming two simplifying hypotheses: 1. The playouts are uniformly random. 2. Both players have the same number of legal moves (or any unbalanced numbers compensate in the long term). I did not understand your post either. Is #2 the same as neither player passes until the end of the game? Or do you intend to assume that at any given level of the game tree, all nodes have the same number of children? The latter assumption seems to be much stronger than what you intend, since it would imply that p=W exactly. Plus, it is obviously (and dramatically) false under any normal go rules where two passes end the game. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Re: pseudoliberties
It appears to me that at least 91 is possible:

.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xxx.xxx.

Weston

On 3/29/07, Jason House [EMAIL PROTECTED] wrote: After some trial and error, I got 90 * * * * * * * * * * * * *** * * * *** * * * * * * * * * * * * On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: Is 88 the maximum number of pseudoliberties a string can have on 9x9? Make that 89 :-) -John ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
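Using the definition given elsewhere in the thread (a pseudo-liberty is a pairing of a stone with an adjacent empty intersection, so a single empty point can be counted up to four times), the count of 91 can be verified directly. A quick sketch (function and variable names are mine):

```python
def pseudoliberties(diagram):
    """Count stone/adjacent-empty pairs in an ASCII board diagram.

    Each stone contributes one pseudo-liberty per orthogonally
    adjacent empty point, so one empty intersection can be counted
    up to four times.
    """
    rows = diagram.split()
    height, width = len(rows), len(rows[0])
    count = 0
    for r in range(height):
        for c in range(width):
            if rows[r][c] != 'x':
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < height and 0 <= nc < width and rows[nr][nc] == '.':
                    count += 1
    return count

BOARD = """
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xx.x.xx.
xx.xxx.xx
.xxx.xxx.
"""
```

Running `pseudoliberties(BOARD)` on the diagram above yields 91.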
Re: [computer-go] Re: pseudoliberties
A pseudo-liberty is a pairing of a stone in the group and an adjacent, empty intersection.

On 3/29/07, Jim O'Flaherty, Jr. [EMAIL PROTECTED] wrote: What's a pseudo-liberty? And how can there be more of them than there are empty intersections (81) on the board? - Original Message From: Jason House [EMAIL PROTECTED] To: computer-go computer-go@computer-go.org Sent: Thursday, March 29, 2007 1:02:01 PM Subject: Re: [computer-go] Re: pseudoliberties After some trial and error, I got 90 * * * * * * * * * * * * *** * * * *** * * * * * * * * * * * * On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: Is 88 the maximum number of pseudoliberties a string can have on 9x9? Make that 89 :-) -John ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Re: pseudoliberties
On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote: It appears to me that at least 91 is possible: Nice! If you use O's instead like

.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OOO.OOO.

it looks pretty artistic, like a vase with a flower inside :-)

:) It appears that you can get 92 as well, by replacing the two empty intersections along the upper edge of my diagram with a single one in the center of that edge, and placing an empty intersection on one of the other intersections along the central line of symmetry.

Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Re: pseudoliberties
On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote: On 3/29/07, John Tromp [EMAIL PROTECTED] wrote: On 3/29/07, Weston Markham [EMAIL PROTECTED] wrote: It appears to me that at least 91 is possible: Nice! If you use O's instead like

.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OO.O.OO.
OO.OOO.OO
.OOO.OOO.

it looks pretty artistic, like a vase with a flower inside :-) :) It appears that you can get 92 as well, by replacing the two empty intersections along the upper edge of my diagram with a single one in the center of that edge, and placing an empty intersection on one of the other intersections along the central line of symmetry.

If it weren't for the fact that that central line is needed to keep everything connected :-)

Yes, I realized that later (as soon as I tried to draw it) but didn't bother to post anything, since Gunnar had already posted that 91 is the best possible. But thanks for pointing it out. :)

Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] GTPv3
On 3/7/07, Paul Pogonyshev [EMAIL PROTECTED] wrote: Don Dailey wrote: On Wed, 2007-03-07 at 22:53 +0100, Gunnar Farneback wrote: * To abort the asynchronous genmove, the controller should send the (synchronous) command abort_async_genmove. If the engine has not returned the asynchronous genmove response before responding to the abort command, it is no longer allowed to return a move. I'm open for discussion on the exact name and whether it should return an error if the move had already been sent (I don't think it should). But what about race conditions here? The engine may be responding to the async_genmove command an instant before it realizes an abort command just arrived. In this case it would be violating your rule but it wouldn't be anyone's fault. There is no race condition because commands _are_ read synchronously. But responses _may be_ sent asynchronously. Paul I'm not sure that I understand Paul's explanation, so I'll try out my own: Any race condition that occurs here is entirely the fault of the engine. The engine is already responsible for serializing all of the responses that it makes. Any failure to do so would allow interleaved responses, which could not possibly be understood by the controller. So, at the time that the async_genmove is permitted to write its asynchronous response, (it may have to acquire a lock in order for this to be the case) it is unambiguous whether or not the response to abort_async_genmove has been sent. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
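The serialization argument can be sketched with a single lock guarding the output stream (a toy illustration, not code from any actual engine; class and method names are mine): once the reply to abort_async_genmove has gone out, a late asynchronous move is simply suppressed, so no race is visible to the controller.

```python
import threading

class ResponseWriter:
    """Serializes an engine's GTP responses on a single output stream.

    The point from the discussion above: because every response is
    written under one lock, at the moment the asynchronous genmove
    result tries to go out it is unambiguous whether the abort reply
    has already been sent -- and if it has, the move is suppressed.
    """
    def __init__(self):
        self._lock = threading.Lock()
        self._aborted = False
        self.sent = []  # stand-in for the real output stream

    def send_async_move(self, move):
        with self._lock:
            if self._aborted:
                return False  # abort already acknowledged; stay silent
            self.sent.append("= " + move)
            return True

    def send_abort_ack(self):
        with self._lock:
            self._aborted = True
            self.sent.append("=")  # synchronous reply to the abort command
```

Whichever thread acquires the lock first wins, and the other observes a consistent state; the controller never sees a move after the abort acknowledgement.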
Re: [gtp] Re: [computer-go] GTPv3
On 3/7/07, Paul Pogonyshev [EMAIL PROTECTED] wrote: Gunnar Farneback wrote: Example 3: Like example 2, but abort command comes too late. ... Maybe it should then read ?2 not in progress It also makes sense to forbid an async_genmove (or simple genmove for that matter) until the previous genmove/async_genmove has finished. The engine should just fail with a predefined error string; I think it is really cumbersome for the engine to try synchronization here and serves little purpose. Paul Although Gunnar specifies async_genmove to return zero or one responses, I believe that it may be more useful to allow it to return any number. This allows the possibility for the controller to update a "current best" display while it has still not accepted a single definitive response. From the controller's point of view, the command could still be in effect until a response to the abort_async_genmove has been received. (Although perhaps end_async_genmove would be a more appropriate name in this case.) I don't feel strongly either way on this, but I thought that I should mention the possibility. Does it also make sense to forbid commands that modify the engine's state? If the board state changes before async_genmove generates a response, to which board does the response apply? Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] GTPv3
On 3/8/07, Don Dailey [EMAIL PROTECTED] wrote: On Thu, 2007-03-08 at 12:27 -0500, Weston Markham wrote: I think I am confused. I may be confused as well. I see that Paul has responded as well, but just in case this doesn't clear up on its own: So what you might get is this: 1. controller sends async_genmove 2. controller (after some period of time) sends abort. 3. engine responds to async_genmove 4. engine responds to the abort search ... Isn't this a race condition? I believe this should cause no problems because the controller can simply ignore the non-relevant response to genmove (it knows it sent abort.) But I don't think that is what you are talking about. I didn't think that this is what you were talking about, either, so now it is not clear to me what race condition you are worried about. What behavior in this example depends on the timing? (I see no constraint on the actual ordering of #2 and #3, but no matter what order they occur in, the engine's responses would be the same, wouldn't they? I assume, by the way, that #3 is the asynchronous response to #1, and not the initial synchronous one.) You are claiming that there is some case where this scenario violates some rule of Gunnar's proposed protocol, correct? (The proposal did not constrain the responses based on what additional commands had already been sent, but rather what other responses had already been sent.) What exactly is that violation? Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Big board
That board needs to have the inside edge be connected to its outside edge, in order to represent a torus. Weston On 2/20/07, Antonin Lucas [EMAIL PROTECTED] wrote: No need for those difficulties, you can play along this board : http://www.youdzone.com/go.html On 2/21/07, Weston Markham [EMAIL PROTECTED] wrote: Somewhere online, I played a game on a torus, against someone's Java applet that has this option. I seem to recall playing a normal game at either 9x9 or 13x13, and then a game on the same-sized torus. I recall the first game as being somewhat challenging to me, (a beginner) and the second game to be somewhat hard to visualize, but quite easy to beat the computer opponent. Of course, it is possible that my experience in the first game helped me significantly in the second one, so it doesn't mean much. However, I was puzzled at the time because I had expected my inability to visualize the interactions across the edge of the board to be a huge handicap, relative to a program that presumably has no such difficulty. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Big board
(oops. Other people have already pointed this out, in an appropriately re-named thread.) On 2/21/07, Weston Markham [EMAIL PROTECTED] wrote: That board needs to have the inside edge be connected to its outside edge, in order to represent a torus. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Big board
Somewhere online, I played a game on a torus, against someone's Java applet that has this option. I seem to recall playing a normal game at either 9x9 or 13x13, and then a game on the same-sized torus. I recall the first game as being somewhat challenging to me, (a beginner) and the second game to be somewhat hard to visualize, but quite easy to beat the computer opponent. Of course, it is possible that my experience in the first game helped me significantly in the second one, so it doesn't mean much. However, I was puzzled at the time because I had expected my inability to visualize the interactions across the edge of the board to be a huge handicap, relative to a program that presumably has no such difficulty. Weston On 2/20/07, steve uurtamo [EMAIL PROTECTED] wrote: here's my first guess at don's question about how this would affect the game. my intuition is weak here, but i'll take a stab at it just for fun. no edges, no corners and no center mean that you're effectively playing in the middle at all times. this should mean that life would be harder to make and that scores would be much lower. certainly komi would have to change. my gut is that it'd be a harder game and that there aren't any cheap clever tricks that would reduce it to a simplistic game. someone with a very powerful 9x9 MC player could run a few thousand games to see what would happen. oh, and keeping track of symmetry would really be annoying, since translations would have to be taken into account as well. oops, this changes my intuition -- this means that many, many more board positions are identical, which should vastly reduce the complexity of the game. :) s. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC approach
On 2/9/07, Weston Markham [EMAIL PROTECTED] wrote: I don't seem to have any numbers on this anymore, but I should be able to try some experiments this weekend. I do have some code that does what I describe below. It is also using an "all moves as first" heuristic. According to my notes, I made this change in an attempt to avoid severely conservative (in my non-expert opinion) moves near the beginning of the game, which seem to be preferred when using all-moves-as-first. It specifically aims for a 30-point lead at the beginning of the game, and reduces this by one point for each turn into the game. I should point out that I am not averaging scores, but simply changing which games I count as wins for the evaluation of a move. This is perhaps not quite what Steve Uurtamo had in mind when he was originally musing about being greedy at the beginning of the game. Nevertheless, it is a very similar sort of idea to what he described, so I thought that I would mention it. Weston On 2/8/07, Chris Fant [EMAIL PROTECTED] wrote: On 2/8/07, Weston Markham [EMAIL PROTECTED] wrote: I believe that I have had some success with an approach like this, actually. I believe that I initially only tally games that are won by a certain margin, and reduce that margin to zero as the game progresses. I am pretty sure that this improves upon straight Monte Carlo. I think that I can get some numbers on it, if anyone is interested. Weston Yes, please. I did run some tests over the weekend, but they did not complete as quickly as I had expected. Anyway, I have enough data at this point to say that although there is nothing spectacular to show one way or the other, I was probably wrong that the code was an improvement. Playing against gnugo 3.7.10, komi 7.5, playing black in 50% of the games:

86/732 games won, baseline (11.7%)
106/1000 games won after enabling the logic described above (10.6%, a slight degradation)

I also tried a version that dynamically adjusts an expected score, and tallies the number of games where the score exceeds this. It appears to give an improvement: 12.8% of the games were won, out of 1000. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
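The win-counting rule described in this message can be sketched as follows (a reconstruction from the description, with my own names; the exact decay schedule in the original code may have differed):

```python
def playout_is_win(score, move_number, initial_margin=30):
    """Count a playout as a win only if it clears a decaying margin.

    score: final score difference from the program's point of view.
    move_number: how many moves into the game the search is rooted;
    the required lead starts at 30 points and shrinks by one point
    per move, reaching an ordinary win test (margin 0) by move 30.
    """
    margin = max(0, initial_margin - move_number)
    return score > margin

def win_rate(scores, move_number):
    """Fraction of playouts counted as wins under the margin rule."""
    wins = sum(playout_is_win(s, move_number) for s in scores)
    return wins / len(scores)
```

Note this does not average scores: it only changes which playouts are tallied as wins, exactly as described above.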
Re: [computer-go] MC approach
I think that you are essentially correct. However, this is only going to affect a small number of games where two different moves are exactly tied for the best winning percentage, after many playouts. Even if the underlying probabilities are exactly the same, you can't really expect this to happen much. Weston On 2/11/07, Heikki Levanto [EMAIL PROTECTED] wrote: But there is a way. If we do N play-outs, the effect of any single of them is 1/N. If we make sure to scale the score to be less than half of this, it can not disturb anything in cases where the number of wins is different. Only in cases with exactly the same number of wins in the play-outs, would the score break the tie. In other words my large constant of 1000 was far too small. It would have to be something like 2NM, where M is the maximum score (say 361). Round it up to 1000N, and we should be safe. I still believe it would make endgames look more reasonable, and possibly even better, in case the winning program has overlooked a detail somewhere, having a large margin of points on the board should act as an insurance against small blunders. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
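Heikki's amended scheme is easy to state in code (a sketch with my own names): with playout scores bounded by 361 points, dividing the summed score by 1000*N keeps the tie-break term below 1/2, so it can only separate moves whose win counts are exactly equal.

```python
def move_value(wins, total_score, n_playouts):
    """Win count dominates; the summed playout score only breaks ties.

    Each playout's score is bounded by 361 points, so the tie-break
    term is at most 361/1000 in absolute value, and the gap between
    two moves' tie-break terms stays below 1: it can never overturn
    a difference of even a single win.
    """
    return wins + total_score / (1000.0 * n_playouts)
```

As noted above, the tie-break only fires when two moves have exactly the same number of wins after many playouts, which is rare in practice.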
Re: [computer-go] MC approach
I don't seem to have any numbers on this anymore, but I should be able to try some experiments this weekend. I do have some code that does what I describe below. It is also using an all moves as first heuristic. According to my notes, I made this change in an attempt to avoid severely conservative (in my non-expert opinion) moves near the beginning of the game, which seem to be preferred when using all-moves-as-first. It specifically aims for a 30-point lead at the beginning of the game, and reduces this by one point for each turn into the game. I should point out that I am not averaging scores, but simply changing which games I count as wins for the evaluation of a move. This is perhaps not quite what Steve Uurtamo had in mind when he was originally musing about being greedy at the beginning of the game. Nevertheless, it is a very similar sort of idea to what he described, so I thought that I would mention it. Weston On 2/8/07, Chris Fant [EMAIL PROTECTED] wrote: On 2/8/07, Weston Markham [EMAIL PROTECTED] wrote: I believe that I have had some success with an approach like this, actually. I believe that I initially only tally games that are won by a certain margin, and reduce that margin to zero as the game progresses. I am pretty sure that this improves upon straight Monte Carlo. I think that I can get some numbers on it, if anyone is interested. Weston Yes, please. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC approach
On 2/8/07, steve uurtamo [EMAIL PROTECTED] wrote: i wonder if this kind of greediness might, however, be useful for selecting, say, the first move or two in a 9x9 game. the thinking here is that since the endgame is essentially noise at this point, you might as well be greedy before tactics become an issue. that's probably faulty intuition, though. I believe that I have had some success with an approach like this, actually. I believe that I initially only tally games that are won by a certain margin, and reduce that margin to zero as the game progresses. I am pretty sure that this improves upon straight Monte Carlo. I think that I can get some numbers on it, if anyone is interested. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Monte-Carlo Go Misnomer?
Unfortunately, it's just not that simple, because it depends far more on _how_ the playout is improved, rather than, say, the ELO numbers that measure that improvement. For example, programs exist that are good (in part) because they entirely disregard some lines of play. They may be correct to disregard these lines in almost every case, which generally makes the playout program stronger. However, for the few cases where this heuristic pruning is not correct, the calling program will suffer greatly, because these lines of play become completely invisible to the random playouts, no matter how many playouts are performed. As an extreme example, consider programs that play completely deterministic strategies. These are obviously useless as random players, yet it is in principle possible to construct ones that play arbitrarily well. Weston On 2/7/07, Ephrim Khong [EMAIL PROTECTED] wrote: Is there any known (by theory or tests) function of how much a increase in the strength of the simulation policy increases the strength of the MC/UCT Program as a whole? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC Go Effectiveness
On 2/7/07, Unknown [EMAIL PROTECTED] wrote: to binary search the file on the parent position key, collect all of these records together (making sure there is a legal move which leads to the canonical response key) and then You are not clear here. Is there only a one-move-step between key and resp ? Why not store the move in the first place ? (instead of the resulting hash) 128 bits of hash vs 64 bits. From his first post: This makes collision very unlikely. Weston ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] MC approach
But of course, it's not the size of the win that counts, it is rather the confidence that it really is a win. In random playouts that continue from a position from a close game, the ones that result in a large victory are generally only ones where the opponent made a severe blunder. (Put another way, the score of the game is affected more by how bad the bad moves are, rather than how good the good ones are, or even how good most of the moves are. Others have commented on this effect in this list, in other contexts.) Since you can't count on that happening in the real game, these simulations have a lower value in the context of ensuring a win. Even when the program is losing (say by a little bit) it is more important to play moves that it thinks are the most likely to convert the game to a win by confusing the opponent, rather than to play moves that will make the losing outcome more apparent. These tend to be different moves, and Monte Carlo methods are good at uncovering these differences. I agree that this is a bit surprising, but I find it much less so when I think about it in these terms. Given that people have reported such a strong effect, I am actually wondering if these simulations (those that result in a large score difference) should be _penalized_, for not being properly representative of the likely outcome of the game. In other words: value = 1000 * win - score Weston On 2/7/07, Don Dailey [EMAIL PROTECTED] wrote: My intuition is also the same as yours - one would think 100 point win is better. It's not that it's better or it's a better move objectively, but it should be better against a fallible opponent who could easily get distracted by the little battles and forget the real point of the game. At least it's a way to fight when you are losing. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] February KGS computer Go tournament: results
The version in the Open division was the standard development version. The one in the Open division, MonteGNU Should one of those Opens be Formal? Weston On 2/5/07, Nick Wedd [EMAIL PROTECTED] wrote: I have put a short report on the event at http://www.weddslist.com/kgs/past/23/index.html ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Re: Scaling monte carlo up to 19x19
I have an idea in the back of my mind that is an extreme version of this: Divide the board into 361 separate local searches, then use information from these to guide a global search. The local searches would be done on the full board, but would only search for strategies that will capture or defend individual intersections. I suspect that this first phase could potentially benefit from parallelization for a significant portion of the game. Eventually this parallelism will break down because there will only be a limited number of local battles, and the eventual status of points that are in the same chain will almost always be the same. Any practical program would need to deal with this gracefully, of course, rather than duplicate its effort many times. Also, I only have a vague idea of how to take advantage of the information gained from the local searches, when performing the global search. Weston On 1/30/07, Dave Dyer [EMAIL PROTECTED] wrote: The idea isn't more than lightly toasted (less than half baked), but the kernel is to turn the full-board search into a set of searches on much smaller boards, using the overlapping strips as boundary conditions, then do some unifying final step to pick the move. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Can a computer beat a human?
On a slightly (but not much) more serious note: The proposal that elicited (for better or for worse) Alain's size-of-the-universe comment was not for a complete table of all possible board states, but rather a table of winning moves. I expect that most positions will have multiple winning moves, but only a single one would need to be stored. The table does not need to store anything for a position that cannot be reached by playing those selected moves. I believe that this greatly reduces the size needed. I certainly don't know how much smaller, although someone may be able to come up with actual numbers for 5x5 and 7x7 games, and use an educated guess on how these will scale up to 19x19. My offhand guess is that this size is still _far_ greater than the meagre memory resources that were suggested in the original post. For what it is worth, I expect that it is also larger than the number of quantum states of one of Alain's eyelashes. I would guess that it is smaller, however, than 10^100 bits, which I believe is a common back-of-the-envelope number for the total information in the visible universe. One also has the freedom to pick and choose which winning moves to store. This can be used to maximize the overlap between the different stored variations. If you also take advantage of patterns within the table, you will certainly be able to compress it. (As has been mentioned already.) At the far extreme, of course, you only need one entry in the table for each of 282 possible moves, and a hash function that simply ... er, figures out which move to play. Or, one can strike a balance between these two extremes that gives an appropriate tradeoff between computation time and memory. I would guess that kami no itte would still be impossible on 32GB, 300Mhz. However, beating a professional dan player seems reasonable to me. Of course, Alain will still have a large enough lookup table that he will be able to beat it ... in the blink of an eye. 
Weston

On 1/24/07, Don Dailey [EMAIL PROTECTED] wrote:

On Wed, 2007-01-24 at 21:11 +0100, alain Baeckeroot wrote: With 10^170 legal positions for 19x19, what would be the size of this table? I'm afraid we cannot build it with all the matter in the visible universe.

I think the computer science greats should have consulted you before writing their textbooks - I just looked at this crazy thing called a Turing machine in one of my textbooks. A universal Turing machine supposedly has an infinite tape attached to it. Maybe they are smart about computers, but they don't know anything about physics. I think all these textbooks need to be thrown out because they are obviously of no practical value.

- Don

___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Can a computer beat a human?
On 1/24/07, Weston Markham [EMAIL PROTECTED] wrote: 282 possible moves

Um. Dunno where I got that number from. (I meant 362, I think.)

Weston
Re: [computer-go] an idea for a new measure of a computer go program's rank.
Personally, I use the terminology in much the same way as Heikki. I use the word "mistake" to describe (for example) a move that loses a large group, but does not change the game from a win to a loss. It makes sense to me to apply "mistake" generally to any move that loses points relative to the best move on the board. The term "blunder" means, essentially, a move that lost the game. It can be quite difficult, of course, to determine unambiguously whether or not a particular move is a blunder. In an otherwise close match, a large mistake (i.e., one that loses many points) is probably a blunder. Toward the end of a close game, it may be possible to find unambiguous blunders, and some of these could be single-point mistakes.

Weston

On 1/23/07, Heikki Levanto [EMAIL PROTECTED] wrote:

On Sun, Jan 21, 2007 at 08:16:07PM -0800, Ray Tayek wrote: I don't know the percentage of blunders. It also depends on what you call a blunder. Is a 1 point mistake a blunder? no, maybe 10 or more points

My gut feeling is that a real blunder is enough to lose the game. Between equally strong players, a one point mistake can be a blunder, if it was late in the yose, and the game was won by half a point. On the other hand, throwing away a 20-stone group may not be a blunder if you were already going to lose by 100 points. It could even be a (mis?)calculated risk, ignoring a threatening move in order to get an attack on an even larger group, even if that attack later turns out not to work... Just my uninformed gut feeling, of course.

-H -- Heikki Levanto In Murphy We Turst heikki (at) lsd (dot) dk
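The mistake/blunder distinction above can be stated mechanically, if one assumes exact final-margin evaluations are available for every candidate move (which is precisely what is hard to obtain in practice). A minimal sketch with hypothetical numbers:

```python
def classify(move_scores, chosen):
    """Classify `chosen` against exact evaluations of all candidate moves.

    move_scores: dict mapping each move to its exact final margin in points
    for the player to move (komi already applied) under perfect play from
    there on.  These are hypothetical inputs for illustration only.
    """
    best = max(move_scores.values())
    played = move_scores[chosen]
    loss = best - played
    if loss == 0:
        return "best"
    # A blunder flips a won game into a lost one, regardless of its size.
    if best > 0 and played < 0:
        return "blunder"
    return f"mistake ({loss:g} points)"

# A one-point slip can be a blunder in a half-point game ...
print(classify({'a': 0.5, 'b': -0.5}, 'b'))       # blunder
# ... while throwing away 20 points in a lost game is only a mistake.
print(classify({'c': -80.5, 'd': -100.5}, 'd'))   # mistake (20 points)
```

This matches both intuitions in the thread: size of the point loss alone decides "mistake", while only the change of game result decides "blunder".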
Re: [computer-go] Fw: Compensation for handicap plays?
Okay. Don's later post does indicate that he intends to compensate for the stones. So, in the interest of being 100% clear: is this compensation included in the komi value that is sent to the client?

Weston

On 12/29/06, Weston Markham [EMAIL PROTECTED] wrote: Am I correct in inferring that this is also what Don Dailey has in mind, but with no compensation for the handicap stones?
Re: [computer-go] professional game libraries for pattern harvesting
Also, there was a recent thread on the mailing list, "50,576 pro/dan games without repetitions nor easily detectable problems", started by Jacques Basaldúa, who has put together a collection of games: http://www.dybot.com/masters/masters.zip

If I recall correctly, the format of this file is only documented in Jacques' message, so you may need to refer to that for details.

Weston