Re: [computer-go] (early) termination of random playouts?
>> Anything can exist in a random game :-) > Occur yes, repeat forever requires very special situations. That makes long repeats unlikely in any individual run, and infinite runs infinitely unlikely, unless we run into one of those very special situations. But how special do these actually need to be? One popular way of scaling up Monte-Carlo based engines seems to add more processing resources to explore more of the search space. Now, if you have a loop that requires n specific steps with at most m loop-breakers at each step, it isn't one of those very special cases that would force a bot into the loop. But if your engine explores more than (m+1)^n alternative runs, one of its simulations is going to run into the loop, even if that isn't likely for each individual run, right? And if the class of dangerous situations isn't quite as narrow as one might think at first, then random simulations overall are more likely to run into one of them, even if we don't use such a "beast" to start the simulation. But even short of infinite loops, very few of those finite repetitions are likely to occur in real games - some of them have and do, but most are a sign of unrealistic simulations (unsupported claim;-), which are likely to mess with the quality of evaluation results. Since finite bounds can take care of infinite loops, such degradations in quality are more worrying, at least to me. Especially if their origins are hidden somewhere inside statistics gathered from a huge invisible (unrecorded for efficiency reasons) set of random simulation runs (tuned via experiments). Perhaps one could record a histogram of simulation lengths at least, and stop worrying unless there happen to be lots of entries in the >361 area. Otherwise, results might be influenced by whether the artifical bound on simulation length happens to fall into a high or a low of overlength runs. Claus ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Thu, Oct 9, 2008 at 11:39 AM, Jason House <[EMAIL PROTECTED]> wrote: > On Oct 9, 2008, at 11:08 AM, "Eric Boesch" <[EMAIL PROTECTED]> wrote: > [...] > Does anyone allow passing at random in their playouts??? A game stopped from > two premature passes is tough to score, if not completely meaningless. I haben't done this, but I have a plan that includes passes "at random"(with non-uniform probabilities). With the rules I use, a game stopped by two premature passes is very easy to score, and it does have some meaning. I'll explain. For any move played during a playout we use some information (the 3x3 pattern around it, the location on the board, the number of liberties, the number of stones captured...) to estimate its score. The sum of the estimates of all the moves in a playout should come close to the actual score at the end. We use the difference between those two numbers to adjust some weights that were used in computing the estimates. Over time this system will learn which moves are good and which aren't. Now we can bias the playouts so moves with high estimated scores are played more often (in some cases, much more often). A system like that would learn that playing almost any move is generally better than passing. The hope is that this would recognize sekis because moves there would have scores that are much worse than passing, so both players would pass without moving there. It is even possible that we can remove the definition of what points look "eyeish", and have this algorithm figure out that moves with those 3x3 patterns are usually worse than passing. I don't know how workable that is, but it's about the only plan I have to get MC to handle sekis correctly. Álvaro. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Oct 9, 2008, at 11:46 AM, "Erik van der Werf" <[EMAIL PROTECTED] > wrote: Anything can exist in a random game :-) Occur yes, repeat forever requires very special situations. Sent-two-return-one may me the biggest practical concern This is naturally resolved in light random playouts since cycle breaking moves exist. This is far too academic to debate on this list, so I'm going to stop talking about this topic. , but I would not be surprised if some day a molasses ko would pop up as well, especially if your playouts are not too stupid. Erik On Thu, Oct 9, 2008 at 5:24 PM, Jason House <[EMAIL PROTECTED] > wrote: Which multi stone capture case still exists under random games? Sent from my iPhone On Oct 9, 2008, at 11:12 AM, "Erik van der Werf" <[EMAIL PROTECTED] > wrote: On Thu, Oct 9, 2008 at 5:03 PM, Jason House <[EMAIL PROTECTED] > wrote: On Oct 9, 2008, at 10:41 AM, "Erik van der Werf" <[EMAIL PROTECTED]> wrote: Sure, some long cycles have multi-stone captures. Can you provide an example? http://www.cs.cmu.edu/~wjh/go/rules/bestiary.html ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On 9-okt-08, at 10:15, Don Dailey wrote: If the play-outs are uniformly random and you have eye rule, it is guaranteed to terminate as long as you use simple ko. I don't think this is quite correct. With using just simple ko there's a chance you end the game in a never-ending triple-ko. The side to move always has exactly one legal move, so randomization doesn't help. I think I saw it happen once in a hundred games (not playouts) or so, so it's common enough to need addressing. For example, Black to move: XXX XOXOX.X O.O.OXO OOO Mark ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
Anything can exist in a random game :-) Sent-two-return-one may me the biggest practical concern, but I would not be surprised if some day a molasses ko would pop up as well, especially if your playouts are not too stupid. Erik On Thu, Oct 9, 2008 at 5:24 PM, Jason House <[EMAIL PROTECTED]> wrote: > Which multi stone capture case still exists under random games? > > Sent from my iPhone > > On Oct 9, 2008, at 11:12 AM, "Erik van der Werf" <[EMAIL PROTECTED]> > wrote: > >> On Thu, Oct 9, 2008 at 5:03 PM, Jason House <[EMAIL PROTECTED]> >> wrote: >>> >>> On Oct 9, 2008, at 10:41 AM, "Erik van der Werf" >>> <[EMAIL PROTECTED]> >>> wrote: Sure, some long cycles have multi-stone captures. >>> >>> Can you provide an example? >> >> http://www.cs.cmu.edu/~wjh/go/rules/bestiary.html >> ___ >> computer-go mailing list >> computer-go@computer-go.org >> http://www.computer-go.org/mailman/listinfo/computer-go/ > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Oct 9, 2008, at 11:08 AM, "Eric Boesch" <[EMAIL PROTECTED]> wrote: On Thu, Oct 9, 2008 at 10:25 AM, Jason House <[EMAIL PROTECTED]> wrote: You are incorrect that the following heuristics in random games lead to finite game length: * no eye filling * no suicide * no simple ko violations Consider two eyeless chains with 3 ko's connecting them... Two taken by black and it's white's move. Filling the one ko it has is suicide. It must take a ko. On black's turn, it can't fill a ko due to suicide and must take a ko. The cycle repeats infinitely. If passing is always legal and two passes terminate the game, then the expected game length is finite no matter how you limit the other options. If you forbid passing early then I suppose you have good reasons for that, but it could be the reason why you and Don reached different conclusions. Does anyone allow passing at random in their playouts??? A game stopped from two premature passes is tough to score, if not completely meaningless. Once upon a time, I thought the simple playout rules led to finite games. Infinite cycles are really rare and went unnoticed by me for a very long time. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
Which multi stone capture case still exists under random games? Sent from my iPhone On Oct 9, 2008, at 11:12 AM, "Erik van der Werf" <[EMAIL PROTECTED] > wrote: On Thu, Oct 9, 2008 at 5:03 PM, Jason House <[EMAIL PROTECTED] > wrote: On Oct 9, 2008, at 10:41 AM, "Erik van der Werf" <[EMAIL PROTECTED] > wrote: Sure, some long cycles have multi-stone captures. Can you provide an example? http://www.cs.cmu.edu/~wjh/go/rules/bestiary.html ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Thu, Oct 9, 2008 at 5:03 PM, Jason House <[EMAIL PROTECTED]> wrote: > On Oct 9, 2008, at 10:41 AM, "Erik van der Werf" <[EMAIL PROTECTED]> > wrote: >> Sure, some long cycles have multi-stone captures. > > Can you provide an example? http://www.cs.cmu.edu/~wjh/go/rules/bestiary.html ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Thu, Oct 9, 2008 at 10:25 AM, Jason House <[EMAIL PROTECTED]> wrote: > You are incorrect that the following heuristics in random games lead to > finite game length: > * no eye filling > * no suicide > * no simple ko violations > > Consider two eyeless chains with 3 ko's connecting them... Two taken by > black and it's white's move. Filling the one ko it has is suicide. It must > take a ko. On black's turn, it can't fill a ko due to suicide and must take > a ko. The cycle repeats infinitely. If passing is always legal and two passes terminate the game, then the expected game length is finite no matter how you limit the other options. If you forbid passing early then I suppose you have good reasons for that, but it could be the reason why you and Don reached different conclusions. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Oct 9, 2008, at 10:41 AM, "Erik van der Werf" <[EMAIL PROTECTED] > wrote: Sure, some long cycles have multi-stone captures. Can you provide an example? When I've thought about multi-stone captured before, I was unable to create a scenario that lead to infinite _random_ games. Many situations resolve themselves, even if it's a stupid result. Of course, a perfect heuristic for random games can be fragile. Heavier playouts may avoid certain moves that would allow things to work out on their own. Erik On Thu, Oct 9, 2008 at 4:39 PM, Don Dailey <[EMAIL PROTECTED]> wrote: You might be right. I have a liberal game length limit on my play- outs so I didn't notice this. Another game limiting rule could be something based on counting the number of consecutive 1 stone captures and terminating once this goes beyond some reasonable limit such as 10.Would infinite games still be possible with this rule? - Don On Thu, 2008-10-09 at 10:25 -0400, Jason House wrote: You are incorrect that the following heuristics in random games lead to finite game length: * no eye filling * no suicide * no simple ko violations Consider two eyeless chains with 3 ko's connecting them... Two taken by black and it's white's move. Filling the one ko it has is suicide. It must take a ko. On black's turn, it can't fill a ko due to suicide and must take a ko. The cycle repeats infinitely. Yes, this is a real bug found in a real CGOS game... I opted for the game length limit so my bot could remain online 24/7 Sent from my iPhone On Oct 9, 2008, at 9:15 AM, Don Dailey <[EMAIL PROTECTED]> wrote: Claus, I think you have summarized things better than I am going to, but here goes anyway: If the play-outs are uniformly random and you have eye rule, it is guaranteed to terminate as long as you use simple ko. It might even be guaranteed to terminate if you don't, I don't know. Although it's guaranteed to terminate, it could be arbitrarily long (millions or billions of moves) but that is not a practical consideration. The odds of it not terminating within a couple of hundred moves at 9x9 is probably extremely low as long as you have eye and simple KO and do not allow suicide. If it's not uniformly random, all bets are off. If you want to be paranoid, you can limit the number of moves but it's not necessary. Something like number of empty points times some constant such as 2 or 3 might be good here. There are several eye rules used, but I believe the vast majority of us use the same one. It is stated like this: 1. An empty point surrounded by friendly stones. 2. If edge, no diagonal enemies allowed. 3. If NOT edge, only 1 diagonal enemy allowed. Some programs do these things in addition to simple ko, no suicide allowed and eye rule: 1. Limit the number of moves. (not needed) 2. Stop game if one side has many more stones than the other. 3. Detect superko (a waste of time) 4. Let pass move be one of the random moves tried. (bad) You asked if people compared the eye rules. This has been discussed on this group before and the short answer seems to be that it probably does not make too much of a difference if it's reasonably sound. You might get different opinions on this. - Don On Thu, 2008-10-09 at 13:31 +0100, Claus Reinke wrote: Hi again, with yet another question:-) Could someone please summarize the state-of-the art wrt the various ways of limiting random playouts, and the their consequences? There are several variations on the "don't fill your own eyes" (dfyoe) approach, but the way this approach is often described tends to confuse me (eg, when someone seems to claim that there is exactly one correct way of implementing this, or when someone seems to claim that this guarantees termination, or when the resulting biases are not even mentioned, let alone discussed or compared against biases in alternative implementations, etc.). I'll try to summarize my current understanding, in the hope that others will fill in the missing pieces or point out misunderstandings: - the only thing that guarantees termination of random playouts (with or without dfyoe) is the positional superko rule: no whole-board repetitions allowed. Waiting for this to kick in without taking other measures is not an option: it takes too long and the results don't tell us much about the value of the initial position. - dfyoe variations share some motivations: - reduce the likelihood of unnaturally long playouts (but not to zero) - try to avoid disastrous endgame-situation mistakes that random playouts are otherwise prone to make (and that eliminate the value of playouts) - try to be simple (efficiently implementable), with low (but non-zero) probability of introducing errors (filling in when not filling in would be better, or vice versa) - aim to steer playouts to easily counted terminal positions that bear a quantifiable relationship to
Re: [computer-go] (early) termination of random playouts?
Sure, some long cycles have multi-stone captures. Erik On Thu, Oct 9, 2008 at 4:39 PM, Don Dailey <[EMAIL PROTECTED]> wrote: > You might be right. I have a liberal game length limit on my play-outs > so I didn't notice this. > > Another game limiting rule could be something based on counting the > number of consecutive 1 stone captures and terminating once this goes > beyond some reasonable limit such as 10.Would infinite games still > be possible with this rule? > > - Don > > > > On Thu, 2008-10-09 at 10:25 -0400, Jason House wrote: >> You are incorrect that the following heuristics in random games lead >> to finite game length: >> * no eye filling >> * no suicide >> * no simple ko violations >> >> Consider two eyeless chains with 3 ko's connecting them... Two taken >> by black and it's white's move. Filling the one ko it has is suicide. >> It must take a ko. On black's turn, it can't fill a ko due to suicide >> and must take a ko. The cycle repeats infinitely. >> >> Yes, this is a real bug found in a real CGOS game... I opted for the >> game length limit so my bot could remain online 24/7 >> >> Sent from my iPhone >> >> On Oct 9, 2008, at 9:15 AM, Don Dailey <[EMAIL PROTECTED]> wrote: >> >> > Claus, >> > >> > I think you have summarized things better than I am going to, but >> > here >> > goes anyway: >> > >> > If the play-outs are uniformly random and you have eye rule, it is >> > guaranteed to terminate as long as you use simple ko. It might even >> > be >> > guaranteed to terminate if you don't, I don't know. Although it's >> > guaranteed to terminate, it could be arbitrarily long (millions or >> > billions of moves) but that is not a practical consideration. The >> > odds >> > of it not terminating within a couple of hundred moves at 9x9 is >> > probably extremely low as long as you have eye and simple KO and do >> > not >> > allow suicide. >> > >> > If it's not uniformly random, all bets are off. >> > >> > If you want to be paranoid, you can limit the number of moves but it's >> > not necessary. Something like number of empty points times some >> > constant such as 2 or 3 might be good here. >> > >> > There are several eye rules used, but I believe the vast majority of >> > us >> > use the same one. It is stated like this: >> > >> > 1. An empty point surrounded by friendly stones. >> > 2. If edge, no diagonal enemies allowed. >> > 3. If NOT edge, only 1 diagonal enemy allowed. >> > >> > Some programs do these things in addition to simple ko, no suicide >> > allowed and eye rule: >> > >> > 1. Limit the number of moves. (not needed) >> > 2. Stop game if one side has many more stones than the other. >> > 3. Detect superko (a waste of time) >> > 4. Let pass move be one of the random moves tried. (bad) >> > >> > >> > You asked if people compared the eye rules. This has been discussed >> > on >> > this group before and the short answer seems to be that it probably >> > does >> > not make too much of a difference if it's reasonably sound. You >> > might >> > get different opinions on this. >> > >> > - Don >> > >> > >> > >> > >> > >> > On Thu, 2008-10-09 at 13:31 +0100, Claus Reinke wrote: >> >> Hi again, with yet another question:-) >> >> >> >> Could someone please summarize the state-of-the art wrt the various >> >> ways >> >> of limiting random playouts, and the their consequences? >> >> >> >> There are several variations on the "don't fill your own >> >> eyes" (dfyoe) approach, >> >> but the way this approach is often described tends to confuse me >> >> (eg, when >> >> someone seems to claim that there is exactly one correct way of >> >> implementing >> >> this, or when someone seems to claim that this guarantees >> >> termination, or when >> >> the resulting biases are not even mentioned, let alone discussed or >> >> compared >> >> against biases in alternative implementations, etc.). >> >> >> >> I'll try to summarize my current understanding, in the hope that >> >> others will >> >> fill in the missing pieces or point out misunderstandings: >> >> >> >> - the only thing that guarantees termination of random playouts >> >> (with or >> >>without dfyoe) is the positional superko rule: no whole-board >> >> repetitions >> >>allowed. Waiting for this to kick in without taking other >> >> measures is not >> >>an option: it takes too long and the results don't tell us much >> >> about the >> >>value of the initial position. >> >> >> >> - dfyoe variations share some motivations: >> >>- reduce the likelihood of unnaturally long playouts (but not to >> >> zero) >> >>- try to avoid disastrous endgame-situation mistakes that random >> >> playouts >> >>are otherwise prone to make (and that eliminate the value of >> >> playouts) >> >>- try to be simple (efficiently implementable), with low (but >> >> non-zero) >> >>probability of introducing errors (filling in when not >> >> filling in would be >> >>better, or vice versa) >> >>
Re: [computer-go] (early) termination of random playouts?
You might be right. I have a liberal game length limit on my play-outs so I didn't notice this. Another game limiting rule could be something based on counting the number of consecutive 1 stone captures and terminating once this goes beyond some reasonable limit such as 10.Would infinite games still be possible with this rule? - Don On Thu, 2008-10-09 at 10:25 -0400, Jason House wrote: > You are incorrect that the following heuristics in random games lead > to finite game length: > * no eye filling > * no suicide > * no simple ko violations > > Consider two eyeless chains with 3 ko's connecting them... Two taken > by black and it's white's move. Filling the one ko it has is suicide. > It must take a ko. On black's turn, it can't fill a ko due to suicide > and must take a ko. The cycle repeats infinitely. > > Yes, this is a real bug found in a real CGOS game... I opted for the > game length limit so my bot could remain online 24/7 > > Sent from my iPhone > > On Oct 9, 2008, at 9:15 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > > > Claus, > > > > I think you have summarized things better than I am going to, but > > here > > goes anyway: > > > > If the play-outs are uniformly random and you have eye rule, it is > > guaranteed to terminate as long as you use simple ko. It might even > > be > > guaranteed to terminate if you don't, I don't know. Although it's > > guaranteed to terminate, it could be arbitrarily long (millions or > > billions of moves) but that is not a practical consideration. The > > odds > > of it not terminating within a couple of hundred moves at 9x9 is > > probably extremely low as long as you have eye and simple KO and do > > not > > allow suicide. > > > > If it's not uniformly random, all bets are off. > > > > If you want to be paranoid, you can limit the number of moves but it's > > not necessary. Something like number of empty points times some > > constant such as 2 or 3 might be good here. > > > > There are several eye rules used, but I believe the vast majority of > > us > > use the same one. It is stated like this: > > > > 1. An empty point surrounded by friendly stones. > > 2. If edge, no diagonal enemies allowed. > > 3. If NOT edge, only 1 diagonal enemy allowed. > > > > Some programs do these things in addition to simple ko, no suicide > > allowed and eye rule: > > > > 1. Limit the number of moves. (not needed) > > 2. Stop game if one side has many more stones than the other. > > 3. Detect superko (a waste of time) > > 4. Let pass move be one of the random moves tried. (bad) > > > > > > You asked if people compared the eye rules. This has been discussed > > on > > this group before and the short answer seems to be that it probably > > does > > not make too much of a difference if it's reasonably sound. You > > might > > get different opinions on this. > > > > - Don > > > > > > > > > > > > On Thu, 2008-10-09 at 13:31 +0100, Claus Reinke wrote: > >> Hi again, with yet another question:-) > >> > >> Could someone please summarize the state-of-the art wrt the various > >> ways > >> of limiting random playouts, and the their consequences? > >> > >> There are several variations on the "don't fill your own > >> eyes" (dfyoe) approach, > >> but the way this approach is often described tends to confuse me > >> (eg, when > >> someone seems to claim that there is exactly one correct way of > >> implementing > >> this, or when someone seems to claim that this guarantees > >> termination, or when > >> the resulting biases are not even mentioned, let alone discussed or > >> compared > >> against biases in alternative implementations, etc.). > >> > >> I'll try to summarize my current understanding, in the hope that > >> others will > >> fill in the missing pieces or point out misunderstandings: > >> > >> - the only thing that guarantees termination of random playouts > >> (with or > >>without dfyoe) is the positional superko rule: no whole-board > >> repetitions > >>allowed. Waiting for this to kick in without taking other > >> measures is not > >>an option: it takes too long and the results don't tell us much > >> about the > >>value of the initial position. > >> > >> - dfyoe variations share some motivations: > >>- reduce the likelihood of unnaturally long playouts (but not to > >> zero) > >>- try to avoid disastrous endgame-situation mistakes that random > >> playouts > >>are otherwise prone to make (and that eliminate the value of > >> playouts) > >>- try to be simple (efficiently implementable), with low (but > >> non-zero) > >>probability of introducing errors (filling in when not > >> filling in would be > >>better, or vice versa) > >>- aim to steer playouts to easily counted terminal positions > >> that bear a > >>quantifiable relationship to the estimated value of the > >> initial position > >>(perhaps a more sug
Re: [computer-go] (early) termination of random playouts?
You are incorrect that the following heuristics in random games lead to finite game length: * no eye filling * no suicide * no simple ko violations Consider two eyeless chains with 3 ko's connecting them... Two taken by black and it's white's move. Filling the one ko it has is suicide. It must take a ko. On black's turn, it can't fill a ko due to suicide and must take a ko. The cycle repeats infinitely. Yes, this is a real bug found in a real CGOS game... I opted for the game length limit so my bot could remain online 24/7 Sent from my iPhone On Oct 9, 2008, at 9:15 AM, Don Dailey <[EMAIL PROTECTED]> wrote: Claus, I think you have summarized things better than I am going to, but here goes anyway: If the play-outs are uniformly random and you have eye rule, it is guaranteed to terminate as long as you use simple ko. It might even be guaranteed to terminate if you don't, I don't know. Although it's guaranteed to terminate, it could be arbitrarily long (millions or billions of moves) but that is not a practical consideration. The odds of it not terminating within a couple of hundred moves at 9x9 is probably extremely low as long as you have eye and simple KO and do not allow suicide. If it's not uniformly random, all bets are off. If you want to be paranoid, you can limit the number of moves but it's not necessary. Something like number of empty points times some constant such as 2 or 3 might be good here. There are several eye rules used, but I believe the vast majority of us use the same one. It is stated like this: 1. An empty point surrounded by friendly stones. 2. If edge, no diagonal enemies allowed. 3. If NOT edge, only 1 diagonal enemy allowed. Some programs do these things in addition to simple ko, no suicide allowed and eye rule: 1. Limit the number of moves. (not needed) 2. Stop game if one side has many more stones than the other. 3. Detect superko (a waste of time) 4. Let pass move be one of the random moves tried. (bad) You asked if people compared the eye rules. This has been discussed on this group before and the short answer seems to be that it probably does not make too much of a difference if it's reasonably sound. You might get different opinions on this. - Don On Thu, 2008-10-09 at 13:31 +0100, Claus Reinke wrote: Hi again, with yet another question:-) Could someone please summarize the state-of-the art wrt the various ways of limiting random playouts, and the their consequences? There are several variations on the "don't fill your own eyes" (dfyoe) approach, but the way this approach is often described tends to confuse me (eg, when someone seems to claim that there is exactly one correct way of implementing this, or when someone seems to claim that this guarantees termination, or when the resulting biases are not even mentioned, let alone discussed or compared against biases in alternative implementations, etc.). I'll try to summarize my current understanding, in the hope that others will fill in the missing pieces or point out misunderstandings: - the only thing that guarantees termination of random playouts (with or without dfyoe) is the positional superko rule: no whole-board repetitions allowed. Waiting for this to kick in without taking other measures is not an option: it takes too long and the results don't tell us much about the value of the initial position. - dfyoe variations share some motivations: - reduce the likelihood of unnaturally long playouts (but not to zero) - try to avoid disastrous endgame-situation mistakes that random playouts are otherwise prone to make (and that eliminate the value of playouts) - try to be simple (efficiently implementable), with low (but non-zero) probability of introducing errors (filling in when not filling in would be better, or vice versa) - aim to steer playouts to easily counted terminal positions that bear a quantifiable relationship to the estimated value of the initial position (perhaps a more suggestive name would be "random legal fill- ins"?) - some variations of dfyoe have been reported (are there others?): [Bruegmann; Monte Carlo Go] (Gobble): The computer only passes if either no legal move is available or all legal moves reduce the eye space of one of its groups from two to one. The (over-) simplification lies in the definition of an eye, which is defined to be a field whose direct neighbors are all of the same color and whose diagonal neighbors contain no more than 1 stone of the opposite color (0 for border and corner fields). The reader may convince himself that this definition of an eye is correct for most situations, excluding sekis. .. the rule is necessary for technical reasons, and even though (if passing was implemented) the program could fi
Re: [computer-go] (early) termination of random playouts?
I wrote a simple Bloom filter and I tried feeding it a sequence of random numbers (which is what Zobrist keys would look like if there are no repetitions), to see how many false positives I would get, with different values for k. I tried using 128 bytes, 256 bytes and 512 bytes. In every case. It looks like k=2 is certainly better than k=1, and overall I think 512 bytes with k=2 is probably what I would use (probability of a false positive is about .01 after 200 moves, and about .04 after 400 moves). If you are interested in the details, I have some code and some graphs. Álvaro. On Thu, Oct 9, 2008 at 9:51 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > Álvaro, > > I never tried it, but I once considered doing it. It's an intriguing > idea. Since speed is all important here I would probably try just a > single probe version (bloom filter with k = 1 where k is the number of > hash functions.) > > You have to clear the filter before each playout of course, but the > filter could be quite tiny, perhaps 256 - 1024 bytes for 9x9.One > would of course measure the trade-offs and also test k=2, etc.My > intuition is that k=1 is best for speed but it should be tested. > > There is a cost involved in just maintaining a Zobrist hash key and I > wonder if this is the greatest cost anyway? Even with the bloom filter > you have a lot of logic on top of an extremely simple move maker so I > never got motivated enough to try this. Plus, I didn't feel that it > would actually make the program stronger. > > In my programs, I don't maintain a key or do PSK in the playouts. I > have 2 versions of everything involving move generation. One make() > function tests for superko and builds a key, the other tries to be fast > and cheats. > > - Don > > > > On Thu, 2008-10-09 at 09:36 -0400, Álvaro Begué wrote: >> On Thu, Oct 9, 2008 at 9:26 AM, Don Dailey <[EMAIL PROTECTED]> wrote: >> > If you use zobrist hashing, it is probably not ridiculously slow to do >> > this. And if your play-outs are pretty heavy anyway, the cost will be >> > negligible as you say. >> >> Has anyone tried to use a Bloom filter ( >> http://en.wikipedia.org/wiki/Bloom_filter ) for this? It would very >> quickly tell you that most positions are not repetitions, and leave >> only a tiny fraction of positions to test in a deterministic way. >> >> Álvaro. >> ___ >> computer-go mailing list >> computer-go@computer-go.org >> http://www.computer-go.org/mailman/listinfo/computer-go/ > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ > ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
If you use zobrist hashing, it is probably not ridiculously slow to do this. And if your play-outs are pretty heavy anyway, the cost will be negligible as you say. I remember John Stanback, the Zarkov chess programmer used to say that he could add anything he wanted to the evaluation function because it was so slow already. He had basically committed to heavy evaluation and it worked for him, the program was quite strong at the time. - Don On Thu, 2008-10-09 at 15:12 +0200, Magnus Persson wrote: > Valkyria does, because is superheavy anyway! A lot of weird stuff can > happen near the end of the game against programs that play randomly. I > think I implemented it because I had to to make it play correctly in > some positions. But it was a long time so I do not remember the details. > > -Magnus > > Quoting Michael Williams <[EMAIL PROTECTED]>: > > > > > While this may be true, I don't think anyone uses the superko rule in > > the playouts. It would slow them down too much and there would > > probably not be much benefit beyond the theoretical. > > > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ signature.asc Description: This is a digitally signed message part ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
Álvaro, I never tried it, but I once considered doing it. It's an intriguing idea. Since speed is all important here I would probably try just a single probe version (bloom filter with k = 1 where k is the number of hash functions.) You have to clear the filter before each playout of course, but the filter could be quite tiny, perhaps 256 - 1024 bytes for 9x9.One would of course measure the trade-offs and also test k=2, etc.My intuition is that k=1 is best for speed but it should be tested. There is a cost involved in just maintaining a Zobrist hash key and I wonder if this is the greatest cost anyway? Even with the bloom filter you have a lot of logic on top of an extremely simple move maker so I never got motivated enough to try this. Plus, I didn't feel that it would actually make the program stronger. In my programs, I don't maintain a key or do PSK in the playouts. I have 2 versions of everything involving move generation. One make() function tests for superko and builds a key, the other tries to be fast and cheats. - Don On Thu, 2008-10-09 at 09:36 -0400, Álvaro Begué wrote: > On Thu, Oct 9, 2008 at 9:26 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > > If you use zobrist hashing, it is probably not ridiculously slow to do > > this. And if your play-outs are pretty heavy anyway, the cost will be > > negligible as you say. > > Has anyone tried to use a Bloom filter ( > http://en.wikipedia.org/wiki/Bloom_filter ) for this? It would very > quickly tell you that most positions are not repetitions, and leave > only a tiny fraction of positions to test in a deterministic way. > > Álvaro. > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ signature.asc Description: This is a digitally signed message part ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Thu, Oct 9, 2008 at 9:26 AM, Don Dailey <[EMAIL PROTECTED]> wrote: > If you use zobrist hashing, it is probably not ridiculously slow to do > this. And if your play-outs are pretty heavy anyway, the cost will be > negligible as you say. Has anyone tried to use a Bloom filter ( http://en.wikipedia.org/wiki/Bloom_filter ) for this? It would very quickly tell you that most positions are not repetitions, and leave only a tiny fraction of positions to test in a deterministic way. Álvaro. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
On Thu, 2008-10-09 at 09:15 -0400, Michael Williams wrote: > I stand corrected. Do you know if you were able to measure a strength > increase? I think you almost have to have either superko or some limiting device in a program with heavy play-outs. I don't know what Magnus does, but Lazarus with heavy play-outs would crash until I artificially limited the length of the play-outs. I probably could have also solved this with a superko rule. So whether it actually helps his program play better may not be very important although I'm interested to see what he says on this. Maybe Lazarus would be a little stronger if I did this too? - Don > > Magnus Persson wrote: > > Valkyria does, because is superheavy anyway! A lot of weird stuff can > > happen near the end of the game against programs that play randomly. I > > think I implemented it because I had to to make it play correctly in > > some positions. But it was a long time so I do not remember the details. > > > > -Magnus > > > > Quoting Michael Williams <[EMAIL PROTECTED]>: > > > >> > >> While this may be true, I don't think anyone uses the superko rule in > >> the playouts. It would slow them down too much and there would > >> probably not be much benefit beyond the theoretical. > > > > > > > > ___ > > computer-go mailing list > > computer-go@computer-go.org > > http://www.computer-go.org/mailman/listinfo/computer-go/ > > > > ___ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ signature.asc Description: This is a digitally signed message part ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
No, if there was a serious problem it would perhaps only happen for 1 in 1000 games. So it would be pointless trying to measure it. And some of these problems only happens against extremely weak programs. At least in my experience. -Magnus Quoting Michael Williams <[EMAIL PROTECTED]>: I stand corrected. Do you know if you were able to measure a strength increase? ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
I stand corrected. Do you know if you were able to measure a strength increase? Magnus Persson wrote: Valkyria does, because is superheavy anyway! A lot of weird stuff can happen near the end of the game against programs that play randomly. I think I implemented it because I had to to make it play correctly in some positions. But it was a long time so I do not remember the details. -Magnus Quoting Michael Williams <[EMAIL PROTECTED]>: While this may be true, I don't think anyone uses the superko rule in the playouts. It would slow them down too much and there would probably not be much benefit beyond the theoretical. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
Claus, I think you have summarized things better than I am going to, but here goes anyway: If the play-outs are uniformly random and you have eye rule, it is guaranteed to terminate as long as you use simple ko. It might even be guaranteed to terminate if you don't, I don't know. Although it's guaranteed to terminate, it could be arbitrarily long (millions or billions of moves) but that is not a practical consideration. The odds of it not terminating within a couple of hundred moves at 9x9 is probably extremely low as long as you have eye and simple KO and do not allow suicide. If it's not uniformly random, all bets are off. If you want to be paranoid, you can limit the number of moves but it's not necessary. Something like number of empty points times some constant such as 2 or 3 might be good here. There are several eye rules used, but I believe the vast majority of us use the same one. It is stated like this: 1. An empty point surrounded by friendly stones. 2. If edge, no diagonal enemies allowed. 3. If NOT edge, only 1 diagonal enemy allowed. Some programs do these things in addition to simple ko, no suicide allowed and eye rule: 1. Limit the number of moves. (not needed) 2. Stop game if one side has many more stones than the other. 3. Detect superko (a waste of time) 4. Let pass move be one of the random moves tried. (bad) You asked if people compared the eye rules. This has been discussed on this group before and the short answer seems to be that it probably does not make too much of a difference if it's reasonably sound. You might get different opinions on this. - Don On Thu, 2008-10-09 at 13:31 +0100, Claus Reinke wrote: > Hi again, with yet another question:-) > > Could someone please summarize the state-of-the art wrt the various ways > of limiting random playouts, and the their consequences? > > There are several variations on the "don't fill your own eyes" (dfyoe) > approach, > but the way this approach is often described tends to confuse me (eg, when > someone seems to claim that there is exactly one correct way of implementing > this, or when someone seems to claim that this guarantees termination, or when > the resulting biases are not even mentioned, let alone discussed or compared > against biases in alternative implementations, etc.). > > I'll try to summarize my current understanding, in the hope that others will > fill in the missing pieces or point out misunderstandings: > > - the only thing that guarantees termination of random playouts (with or > without dfyoe) is the positional superko rule: no whole-board repetitions > allowed. Waiting for this to kick in without taking other measures is not > an option: it takes too long and the results don't tell us much about the > value of the initial position. > > - dfyoe variations share some motivations: > - reduce the likelihood of unnaturally long playouts (but not to zero) > - try to avoid disastrous endgame-situation mistakes that random playouts > are otherwise prone to make (and that eliminate the value of playouts) > - try to be simple (efficiently implementable), with low (but non-zero) > probability of introducing errors (filling in when not filling in > would be > better, or vice versa) > - aim to steer playouts to easily counted terminal positions that bear a > quantifiable relationship to the estimated value of the initial > position > (perhaps a more suggestive name would be "random legal fill-ins"?) > > - some variations of dfyoe have been reported (are there others?): > > [Bruegmann; Monte Carlo Go] (Gobble): > The computer only passes if either no legal move is available or all > legal moves reduce the eye space of one of its groups from two to > one. The (over-) simplification lies in the definition of an eye, > which > is defined to be a field whose direct neighbors are all of the same > color and whose diagonal neighbors contain no more than 1 stone > of the opposite color (0 for border and corner fields). The reader > may convince himself that this definition of an eye is correct for > most situations, excluding sekis. > .. the rule is necessary for technical reasons, and even though (if > passing was implemented) the program could figure out the rule by > itself, that would be very inefficient .. > > [Bouzy,Helmstetter; Monte-Carlo Go Developments] (Olga/Oleg) > Without this rule, the random player would never make living groups > and the games would never end. There are different ways to define > "eyes" as precisely as possible with domain-dependent knowledge > such as Fotland (2002) and Chen and Chen (1999). Our definitions > are designed to be integrated into a random Go-playing program; > they are simple and fast but not correct in some
Re: [computer-go] (early) termination of random playouts?
Valkyria does, because is superheavy anyway! A lot of weird stuff can happen near the end of the game against programs that play randomly. I think I implemented it because I had to to make it play correctly in some positions. But it was a long time so I do not remember the details. -Magnus Quoting Michael Williams <[EMAIL PROTECTED]>: While this may be true, I don't think anyone uses the superko rule in the playouts. It would slow them down too much and there would probably not be much benefit beyond the theoretical. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] (early) termination of random playouts?
Claus Reinke wrote: - the only thing that guarantees termination of random playouts (with or without dfyoe) is the positional superko rule: no whole-board repetitions allowed. Waiting for this to kick in without taking other measures is not an option: it takes too long and the results don't tell us much about the value of the initial position. While this may be true, I don't think anyone uses the superko rule in the playouts. It would slow them down too much and there would probably not be much benefit beyond the theoretical. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] (early) termination of random playouts?
Hi again, with yet another question:-) Could someone please summarize the state-of-the art wrt the various ways of limiting random playouts, and the their consequences? There are several variations on the "don't fill your own eyes" (dfyoe) approach, but the way this approach is often described tends to confuse me (eg, when someone seems to claim that there is exactly one correct way of implementing this, or when someone seems to claim that this guarantees termination, or when the resulting biases are not even mentioned, let alone discussed or compared against biases in alternative implementations, etc.). I'll try to summarize my current understanding, in the hope that others will fill in the missing pieces or point out misunderstandings: - the only thing that guarantees termination of random playouts (with or without dfyoe) is the positional superko rule: no whole-board repetitions allowed. Waiting for this to kick in without taking other measures is not an option: it takes too long and the results don't tell us much about the value of the initial position. - dfyoe variations share some motivations: - reduce the likelihood of unnaturally long playouts (but not to zero) - try to avoid disastrous endgame-situation mistakes that random playouts are otherwise prone to make (and that eliminate the value of playouts) - try to be simple (efficiently implementable), with low (but non-zero) probability of introducing errors (filling in when not filling in would be better, or vice versa) - aim to steer playouts to easily counted terminal positions that bear a quantifiable relationship to the estimated value of the initial position (perhaps a more suggestive name would be "random legal fill-ins"?) - some variations of dfyoe have been reported (are there others?): [Bruegmann; Monte Carlo Go] (Gobble): The computer only passes if either no legal move is available or all legal moves reduce the eye space of one of its groups from two to one. The (over-) simplification lies in the definition of an eye, which is defined to be a field whose direct neighbors are all of the same color and whose diagonal neighbors contain no more than 1 stone of the opposite color (0 for border and corner fields). The reader may convince himself that this definition of an eye is correct for most situations, excluding sekis. .. the rule is necessary for technical reasons, and even though (if passing was implemented) the program could figure out the rule by itself, that would be very inefficient .. [Bouzy,Helmstetter; Monte-Carlo Go Developments] (Olga/Oleg) Without this rule, the random player would never make living groups and the games would never end. There are different ways to define "eyes" as precisely as possible with domain-dependent knowledge such as Fotland (2002) and Chen and Chen (1999). Our definitions are designed to be integrated into a random Go-playing program; they are simple and fast but not correct in some cases. In OLGA, an eye is an empty intersection surrounded by stones of one colour with two liberties or more. In OLEG, an eye is an empty intersection surrounded by stones belonging to the same string. The upside of both definitions is the speed of the programs. OLEG's definition is simpler and faster than OLGA's. Both approaches have the downside of being wrong in some cases. OLEG's definition is very restrictive: OLEG's eyes are actual true eyes but it may fill an actual eye surrounded by more than one string. Besides, OLGA has a fuzzy and optimistic definition: it never fills an actual eye but, to connect its stones surrounding an OLGA's eye, OLGA always expects one adjacent stone to be put into atari. Gobble's definition seems to be the one used in recent discussions here. Has anyone compared the different eye definitions, without other changes? Or looked for systematic (exploitable) errors (erroneous fill-ins, erroneous inability to fill in)? What about the remaining unrealistic and overlong games (filling large parts of the board without making eyes, then killing all and repeat with small variations)? What about that claim that "the program could figure out the rule by itself"? Comments/pointers appreciated (especially if there was a discussion that limited the options to Gobble's definition, as recent threads seemed to suggest). Claus ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/