Re: [computer-go] Dynamic komi at high handicaps
> Maybe I should ask first, for clarity's sake, is MCTS performance in
> handicap games currently a problem?
>
> Mark

Yes, it's a big problem. And that's not a matter of opinion. MC bots leading a game by a large margin will lightly give away their advantage, except for the last half point. Even on a 9x9 board, even if the bot wins more even games with 7.5 komi, that doesn't mean that it's impossible for the human to win while giving a 2 stone handicap. All it takes is a single misjudgement by the bot after the game has become close. Granted, bots are really excellent at defending the last half point of advantage tooth and claw. I'm just saying that it should be impossible for the human to win on 2 stones, and it isn't.

If they are behind by a large margin they will play either random or ko threat type moves. So there is a kind of symmetry here. Being too far ahead or too far behind ruins the bot's play.

The biggest practical problem right now is poor play against pros on a 19x19 board, taking a large handicap. Special fuseki patterns are only a patch. When, after a decent opening, the regular patterns take over, they usually immediately start to work against the bot's own previous moves.

Looking into the horse's mouth, instead of invoking Aristotle, is really the only way to find out. I had hoped that programmers would find the idea interesting enough to try it out. Instead, I found myself in a hand waving contest. Granted, I started it, so I can't complain. Thanks to Ingo for simulating dynamic komi by hand to give programmers something less speculative.

Btw, I played 2 games (as gogonuts) on KGS against goIngo (really ManyFaces). I won both on 5 stones. But in the first one, with komi adjusted by Ingo, I had to make a very critical invasion that should not really have worked. In the second game I won without problems. At the time, Ingo adjusted the komi to bring the win rate for W to 50%. Since then, with his limited trials, Ingo has found that adjusting the komi to give each side a 50% win rate isn't optimal. His current rule is to adjust to 42% for W. This is of course only a crude start, but sophistication can only be introduced by programmers.

Stefan

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
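The rule described above (pick the komi at which White's estimated win rate is about 42%) can be sketched in a few lines of Python. This is only an illustration, not Ingo's procedure or any engine's code. The estimator `toy_white_winrate` is a made-up logistic stand-in for a real playout-based estimate, and the sign convention (komi is credited to White, so White's win rate rises with komi) is an assumption. A real estimate is noisy, so plain bisection like this would need smoothing in practice.

```python
import math
from typing import Callable


def find_dynamic_komi(estimate_white_winrate: Callable[[float], float],
                      target: float = 0.42,
                      lo: float = -200.0,
                      hi: float = 200.0,
                      step: float = 1.0) -> float:
    """Bisect for the smallest komi (to within `step`) at which White's
    estimated win rate reaches `target`.

    Assumes the estimate is non-decreasing in komi and that the target
    lies between estimate(lo) and estimate(hi).
    """
    while hi - lo > step:
        mid = (lo + hi) / 2.0
        if estimate_white_winrate(mid) < target:
            lo = mid   # White still too pessimistic: needs more komi
        else:
            hi = mid   # target reached: try a smaller komi
    return hi


def toy_white_winrate(komi: float,
                      black_lead: float = 50.0,
                      scale: float = 8.0) -> float:
    """Hypothetical stand-in for a playout estimate: Black is `black_lead`
    points ahead on the board, and White's win rate follows a logistic
    curve in (komi - black_lead)."""
    return 1.0 / (1.0 + math.exp((black_lead - komi) / scale))


if __name__ == "__main__":
    komi = find_dynamic_komi(toy_white_winrate)
    print(f"komi = {komi:.2f}, White win rate = {toy_white_winrate(komi):.3f}")
```

With the toy estimator the chosen komi ends up a little below Black's 50-point lead, which is the intended effect: White is asked to catch up most of the way, but is still treated as slightly behind.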
Re: [computer-go] Dynamic komi at high handicaps
On Aug 12, 2009, at 10:31 PM, Petri Pitkanen wrote:
> Maybe they are a long way from giving handicaps to you. But the best bots
> on KGS are around 2k, and there are hundreds of 9k and weaker players
> present there at all times. So being able to play White is a worthwhile
> thing, at least for a commercial bot.

That's correct. I have a more "academic" point of view.

Christoph
Re: [computer-go] Dynamic komi at high handicaps
On Aug 12, 2009, at 3:43 PM, Don Dailey wrote:
> I believe the only thing wrong with the current MCTS strategy is that you
> cannot get a statistically meaningful number of samples when almost all
> games are won or lost. You can get a more meaningful NUMBER of samples by
> adjusting komi, but unfortunately you are sampling the wrong thing - an
> approximation of the actual goal. Since the approximation may be wrong or
> right, your algorithm is not scalable. You could run on a billion
> processors sampling billions of nodes per second and, with no flaw in the
> search or the playouts, still play a move that gives you no chance of
> winning.

I think you got it the wrong way round. Without dynamic komi (in high handicap games) even trillions of simulations will _not_ find a move that creates a winning line, because there is none if the opponent has the same strength as you. WHITE has to assume that BLACK will make mistakes, otherwise there would be no handicap.

Christoph
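A rough numerical illustration of the sampling point being argued here, not a measurement from any engine. It assumes a toy logistic model mapping a point lead to a win rate (the `scale` value is arbitrary) and uses the usual normal approximation for how many playouts per move are needed to separate two win rates. The model and every name in it are assumptions made for illustration.

```python
import math


def winrate(lead: float, scale: float = 8.0) -> float:
    """Toy model (an assumption): win rate as a logistic function of the
    expected point lead from the bot's point of view."""
    return 1.0 / (1.0 + math.exp(-lead / scale))


def playouts_needed(p1: float, p2: float, z: float = 2.0) -> int:
    """Approximate playouts per move needed so that the difference between
    two win rates is about `z` standard errors (normal approximation to
    the binomial)."""
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil(z * z * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Two candidate moves that differ by 5 points, once in an even game
    # and once when the bot is about 50 points behind.
    even = playouts_needed(winrate(0.0), winrate(5.0))
    behind = playouts_needed(winrate(-50.0), winrate(-45.0))
    print("even game:", even, "playouts")
    print("50 points behind:", behind, "playouts")
```

Under these assumptions the same 5-point difference between two moves takes far more playouts to detect when the bot is hopelessly behind than when the game is even. That is the flat-landscape effect both sides of this exchange agree on. The disagreement is over whether shifting komi is a legitimate cure for it.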
Re: [computer-go] Dynamic komi at high handicaps
Maybe they are a long way from giving handicaps to you. But the best bots on KGS are around 2k, and there are hundreds of 9k and weaker players present there at all times. So being able to play White is a worthwhile thing, at least for a commercial bot.

Petri

2009/8/13 Christoph Birk :
>
> On Aug 12, 2009, at 2:51 PM, Don Dailey wrote:
>>
>> I disagree. I think strong players have a sense of what kind of mistakes
>> to expect, and try to provoke those mistakes. Dynamic komi does not model
>> that.
>>
>> It also does the opposite of making the program play provocatively, which
>> I believe is necessary to beat a weaker player with a large handicap against
>> you. Instead of making it fight, it encourages the program to be content
>> with less. How does this model strong handicap players?
>
> Maybe dynamic komi works better for BLACK? Computers are still
> a looong way from actually _giving_ a handicap.
>
> Christoph

--
Petri Pitkänen
e-mail: petri.t.pitka...@gmail.com
Re: [computer-go] Dynamic komi at high handicaps
On Aug 12, 2009, at 3:10 PM, Don Dailey wrote:
> If the handicap is fair, their chance is about 50/50. However, rigging
> komi to give the same chance is NOT what humans do. The only thing you
> said that I consider correct is that humans estimate their chances to be
> about 50/50. One thing humans do is to set short term goals and I think
> dynamic komi is an attempt to do that - but it's a misguided attempt
> because you are setting the WRONG short term goal.

Setting the komi so that the game is 50/50 creates the (correct) short term goal of gaining a few points, then again, and again ...

Christoph
Re: [computer-go] Dynamic komi at high handicaps
On Aug 12, 2009, at 2:51 PM, Don Dailey wrote:
> I disagree. I think strong players have a sense of what kind of mistakes
> to expect, and try to provoke those mistakes. Dynamic komi does not model
> that.
>
> It also does the opposite of making the program play provocatively, which
> I believe is necessary to beat a weaker player with a large handicap
> against you. Instead of making it fight, it encourages the program to be
> content with less. How does this model strong handicap players?

Maybe dynamic komi works better for BLACK? Computers are still a looong way from actually _giving_ a handicap.

Christoph
Re: [computer-go] Monte-Carlo Simulation Balancing
After about the 5th reading, I'm concluding that this is an excellent paper. Is anyone (besides the authors) doing research based on this? There is a lot to do.

David Silver wrote:
> Hi everyone,
>
> Please find attached my ICML paper with Gerry Tesauro on automatically learning a simulation policy for Monte-Carlo Go. Our preliminary results show a 200+ Elo improvement over previous approaches, although our experiments were restricted to simple Monte-Carlo search with no tree on small boards.
>
> Abstract
>
> In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation policy, so that an accurate spread of simulation outcomes is maintained, rather than optimising the direct strength of the simulation policy. We develop two algorithms for balancing a simulation policy by gradient descent. The first algorithm optimises the balance of complete simulations, using a policy gradient algorithm; whereas the second algorithm optimises the balance over every two steps of simulation. We compare our algorithms to reinforcement learning and supervised learning algorithms for maximising the strength of the simulation policy. We test each algorithm in the domain of 5x5 and 6x6 Computer Go, using a softmax policy that is parameterised by weights for a hundred simple patterns. When used in a simple Monte-Carlo search, the policies learnt by simulation balancing achieved significantly better performance, with half the mean squared error of a uniform random policy, and equal overall performance to a sophisticated Go engine.
>
> -Dave
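For readers who have not seen one, here is a minimal sketch of what "a softmax policy that is parameterised by weights for ... simple patterns" means. It is not the paper's code. The move names, the feature indices and the three-weight table are invented for illustration, and the learning step (how simulation balancing adjusts the weights) is not shown.

```python
import math
from typing import Dict, List, Sequence


def softmax_policy(move_features: Dict[str, Sequence[int]],
                   weights: List[float]) -> Dict[str, float]:
    """Return a probability for each legal move.

    `move_features` maps a move to the indices of the patterns that match
    at that move; a move's score is the sum of the matching weights, and
    its probability is proportional to exp(score).
    """
    scores = {move: sum(weights[i] for i in feats)
              for move, feats in move_features.items()}
    # Subtract the maximum score before exponentiating, for numerical stability.
    top = max(scores.values())
    exps = {move: math.exp(s - top) for move, s in scores.items()}
    total = sum(exps.values())
    return {move: e / total for move, e in exps.items()}


if __name__ == "__main__":
    # Hypothetical example: three pattern weights, three candidate moves.
    weights = [0.0, 1.0, -1.0]
    moves = {"C3": [1], "D4": [0], "A1": [2]}
    for move, p in softmax_policy(moves, weights).items():
        print(move, round(p, 3))
```

A playout would sample each move from this distribution. The abstract's point is that the weights are tuned so that playout outcomes stay balanced, rather than so that the policy itself plays as strongly as possible.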
[computer-go] Dynamic komi at high handicaps
No thought experiments are going to convince me on this subject. Someone will have to do an actual test. Ingo's work is the best to date on the subject.

Anyone who is overly committed to thought experiments should consider that we are talking about applying MCTS to Go, that most deterministic of all games. The whole idea is absurd from a logical perspective. Despite logic, some things just seem to work.

Maybe dynamic komi will work. Or maybe we need to maximize point differential. Or maybe we just need to get stronger. Only actual experiments and testing will tell.
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 Don Dailey :
>
> If the program makes decisions about the best way to win N points, there
> is no guarantee that this is ALSO the best way to win N+1 points.

Although this is obviously true, that doesn't automatically mean it's not the best approach, because there's a hidden assumption in there: that it's not the best way to win by N+1, given proper play by the opponent thereafter. If not perfect play, then at least play as strong as the stronger player's.

Whatever your strategy, even when you catch up a lot there's no guarantee the opponent will keep making enough mistakes for you to win. Human players generally do keep track of whether they seem to be catching up 'enough' and will take more risk when their progress is not in line with the progress of the game.

I don't think anyone is trying to argue that adjusting komi is the perfect answer. But what apparently is observed (I never tried myself) is that currently MCTS does poorly in handicap games. So the question is whether adjusting the handicap would improve performance.

The positions seem to be entrenched. But I have yet to see conclusive evidence or persuasive arguments one way or the other.

Maybe I should ask first, for clarity's sake, is MCTS performance in handicap games currently a problem?

Mark
Re: [computer-go] Dynamic komi at high handicaps
Don Dailey wrote:
> Matthew Woodcraft wrote:
> > Don Dailey wrote:
> > > The problem with MCTS programs is that they like to consolidate. You
> > > set the komi and thereby give them a goal and they very quickly make
> > > moves which commit to that specific goal.
> >
> > How did you form this opinion? Can you show an example game record
> > (on 19x19) showing this behaviour?

> You're kidding, right? Does anyone honestly dispute this?

I believe it to be false, yes.

There are plenty of records of 19x19 MCTS computer games available. I haven't seen one in which the computer very quickly committed to anything, and I don't believe you have either.

I suggest your view of computer go may be distorted by too much concentration on 9x9.

-M-
Re: [computer-go] Dynamic komi at high handicaps
"For instance I am sure he will not sit merrily by and watch his opponent consolidate a won game just so that he can have a "respectable" but losing score. Dynamic komi of course does not address that at all."

This seems self-evident, but it may actually be a treacherous conclusion. Dynamic komi really only has one legitimate use: to avoid flatlining or, when taking a handicap, saturated win rates. This doesn't mean that the program needs to be "satisfied" with losing. (A komi that put the win rate at 50% would model that.) A win rate range between 35 and 45 percent might make the program locally ambitious, without attempting the impossible with ko threat type moves.

Stefan

----- Original Message -----
From: Don Dailey
To: computer-go
Sent: Thursday, August 13, 2009 12:10 AM
Subject: Re: [computer-go] Dynamic komi at high handicaps

On Wed, Aug 12, 2009 at 5:58 PM, Mark Boon wrote:
> 2009/8/12 Don Dailey :
> >
> > I disagree about this being what humans do. They do not set a fake komi
> > and then try to win only by that much.
>
> I didn't say that humans do that. I said they consider their chance
> 50-50. For an MC program to consider its chances to be 50-50 you'd
> have to up the komi. There's a difference.

If the handicap is fair, their chance is about 50/50. However, rigging komi to give the same chance is NOT what humans do. The only thing you said that I consider correct is that humans estimate their chances to be about 50/50. One thing humans do is to set short term goals and I think dynamic komi is an attempt to do that - but it's a misguided attempt because you are setting the WRONG short term goal. The human will have a much more specific goal that is going to be compatible with his hope of winning the game. For instance I am sure he will not sit merrily by and watch his opponent consolidate a won game just so that he can have a "respectable" but losing score. Dynamic komi of course does not address that at all.

> > I think their model is somewhat incremental, trying to win a bit at a time
> > but I'm quite convinced that they won't just let the opponent consolidate
> > like MCTS does. With dynamic komi the program will STILL just try to
> > consolidate and not care about what his opponent does. But strong players
> > will know that letting your opponent consolidate is not going to work. So
> > they will keep things complicated and challenge their weaker opponents
> > everywhere that is important.
>
> It's difficult to make hard claims about this. I don't agree at all
> that the stronger player constantly needs to keep things complicated.
> Personally I tend to play solidly when giving a handicap. Because most
> damage is self-inflicted. You can either make a guess what the weaker
> player doesn't know, or you can give him the initiative and he'll show
> you. I prefer the latter approach.
>
> When done properly, I don't see how an MCTS program would consolidate
> all the time. Doing so would keep the position stable while the komi
> declines. As soon as he gets behind the komi degradation curve play
> will automatically get more dynamic in an attempt to catch up.
>
> The problem is: we're speculating. The proof is in the pudding.

Agreed.

- Don

> Mark
Re: [computer-go] Dynamic komi at high handicaps
As for how to beat weaker players ... the strong players whom I have observed make strong, stable positions; they wait for the weaker player to make mistakes. The stronger player will leave things unresolved for longer, knowing that there will be time to extend in one direction or another later in the game. They'll include some of the more interesting, obscure joseki, which the weaker player will not grok appropriately. I have seen players who make unsound "trick plays", but these are the kyu players; dan-level players know that "trick plays" can become costly mistakes. They'll use them for teaching purposes, not to win - but that's a different kind of game entirely. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Dynamic komi at high handicaps
"What seems difficult to me however is to devise a reasonable way to decrease this komi as the game progresses"

Certainly that is the main problem. But the main considerations are not so hard to find:

1. Win rate of the best move.
2. How far the game has progressed.
3. Deviation between the win rates of all possible moves (with higher deviation, dynamic komi is less called for).

Stefan

----- Original Message -----
From: "Mark Boon"
To: "computer-go"
Sent: Wednesday, August 12, 2009 11:36 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

I started to write something on this subject a while ago but it got caught up in other things I had to do.

When humans play a (high) handicap game, they don't estimate a high winning percentage for the weaker player. They'll consider it to be more or less 50-50. So to adjust the komi at the beginning of the game such that the winning percentage becomes 50% seems a very reasonable idea to me. This is what humans do too, they'll assume the stronger player will be able to catch up a certain number of points to overcome the handicap.

What seems difficult to me however is to devise a reasonable way to decrease this komi as the game progresses. In an actual game the stronger player catches up in leaps and bounds, not smoothly. In MC things are not always intuitive though.

Mark
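One way the three considerations listed above could be combined, written as a small Python function. Everything here is an assumption made for illustration: the function name, the linear cap that forces the komi to zero by the end of the game, the step size and the spread cutoff. Only the 35-45% win rate band is taken from the thread. Komi is expressed as points credited to the bot, so a positive value helps it.

```python
def adjust_komi(komi: float,
                best_winrate: float,
                progress: float,
                spread: float,
                initial_komi: float,
                low: float = 0.35,
                high: float = 0.45,
                step: float = 2.0,
                spread_cutoff: float = 0.10) -> float:
    """Return the komi to use for the next search.

    komi          current komi credited to the bot (positive helps the bot)
    best_winrate  win rate of the best move from the last search
    progress      fraction of the game played, 0.0 (start) to 1.0 (end)
    spread        deviation between the win rates of the candidate moves
    initial_komi  komi granted at move 1
    """
    # Consideration 3: if the candidate moves already differ clearly,
    # the search can discriminate by itself, so leave the komi alone.
    if spread < spread_cutoff:
        # Consideration 1: keep the best move's win rate inside the band.
        if best_winrate < low:
            komi += step      # too pessimistic: ask for less
        elif best_winrate > high:
            komi -= step      # too comfortable: ask for more
    # Consideration 2: the allowed komi shrinks to zero as the game ends,
    # so the bot is eventually playing for the real result.
    cap = initial_komi * max(0.0, 1.0 - progress)
    return max(0.0, min(komi, cap))


if __name__ == "__main__":
    print(adjust_komi(40, best_winrate=0.20, progress=0.1, spread=0.02, initial_komi=60))
    print(adjust_komi(40, best_winrate=0.60, progress=0.5, spread=0.02, initial_komi=60))
```

The linear cap is the simplest possible schedule. It does not capture the point that the stronger player catches up in leaps and bounds rather than smoothly, so a real implementation would likely need something less rigid.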
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 Stefan Kaitschick

> What a bot does with its playouts in a handicap situation is to
> essentially try to beat itself, despite the handicap.
>
> And in this situation the bot reacts in a very human way, it becomes
> despondent.
>
> Adjusting the komi dynamically shifts the goal from winning to catching up
> quickly enough.

I think that is the problem though. You have only 1 thing you can control, how to set the komi before doing the search. But how the program deals with your artificial (and crude) setting is unpredictable. What you really need is some kind of way to tell it to try to win some territory, but not spoil your chances of winning a bit more later. It's easier to win N points if you know in advance that you will not be asked later to win N more points. And I'm afraid that is what will happen all too often - the program will maximize its chances of winning N, but this does not always translate directly into winning N plus MORE.

> I feel that that is the natural handicap strategy, not a band-aid.

It's a scalability issue, which is why I call it a band-aid. It's not natural because it's an artificial goal, not a natural one and certainly not the ACTUAL goal, which is to win the game. Do you want to win N points, or do you want to win the game? And we all KNOW that it will try to maximize its chances of winning N points, regardless of the consequences beyond that.

You would never ask a runner to stop 50 feet short of the finish line, then ask him to go 10 feet more, and so on. The runner plans his strategy based on the actual distance run and anything else would change his pacing strategy in a bad way.

If the program makes decisions about the best way to win N points, there is no guarantee that this is ALSO the best way to win N+1 points. This is the implicit assumption in this strategy, that the best way to win with ANY komi is the same and that the same moves are just as good no matter what. In fact the more you must win by, the more chances you must take.

I believe the only thing wrong with the current MCTS strategy is that you cannot get a statistically meaningful number of samples when almost all games are won or lost. You can get a more meaningful NUMBER of samples by adjusting komi, but unfortunately you are sampling the wrong thing - an approximation of the actual goal. Since the approximation may be wrong or right, your algorithm is not scalable. You could run on a billion processors sampling billions of nodes per second and, with no flaw in the search or the playouts, still play a move that gives you no chance of winning.

- Don

> Of course, the dynamic komi must be adjusted down to zero in good time.
>
> I think there are 2 main reasons that this hasn't been fully explored so far.
>
> 1. Trying to maximize the score turned out to be a huge mistake, compared
> to trying to maximize the winrate.
> This makes dynamic komi a kind of blind spot.
>
> 2. Handicap go wasn't given special attention so far.
>
> Stefan
>
> ----- Original Message -----
> *From:* Don Dailey
> *To:* computer-go
> *Sent:* Wednesday, August 12, 2009 11:24 PM
> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>
> Terry,
>
> I understand the reasoning behind this, your thought experiment did not add
> anything to my understanding. And I agree that if the program is strong
> enough and the handicap is high enough this is probably better than doing
> nothing at all.
>
> However, I think there must be something that is more along the lines of
> treating the disease, not the symptoms. You might be able to put a band
> aid on the problem but you have not addressed the real issue in a systematic
> way.
>
> Besides, I have not yet seen anyone demonstrate that this works - it's
> always talked about but never implemented. It is made to sound so simple
> that you have to wonder where the implementation is and why the strong
> programs do not have it.
>
> - Don
>
> 2009/8/12 terry mcintyre
>
>> Consider this thought experiment.
>>
>> You sit down at a board and your opponent has a 9-stone handicap.
>>
>> By any objective measure of the game, you should resign immediately.
>>
>> All your win-rate calculations report this hopeless state of affairs.
>>
>> Winrate gives you no objective basis to prefer one move or another.
>>
>> But, you think, what if I can make a small group? What if I try for a
>> lesser goal, such as "don't lose by more than 90 points?"
>>
>> Your opponent has a 9 stone handicap because he makes more mistakes than
>> you do.
>>
>> As the game progresses, those mistakes add up. You set your goal higher -
>> losing by only 50 points; losing by only 10 points.
>>
>> The changing goal permits you to discriminate in a field which would
>> otherwise look like a dark, desolate, win-less landscape.
>>
>> Terry McIntyre
>>
>> “We hang the petty thieves and appoint the great ones to public office.”
>> -- Aesop
Re: [computer-go] Dynamic komi at high handicaps
"You are giving the program an arbitrary short term goal which may, or may not be compatible with the long term goal of winning the game."

Don, this is a very important consideration. How can an illusory goal be better than the real goal? But I would argue that in the handicap situation, catching up quickly enough is actually the real goal.

You write: "And as the base program gets stronger this aspect of the program becomes more and more of a wart."

This I disagree with. No matter how strong the program becomes, it will never find a way to defeat itself against a large handicap, and this is effectively what a program tries to do with its playouts. The only reasonable alternative to trying to catch up quickly enough is to model the weaker player's errors straight into the playouts, and try to find a direct win. But this seems more speculative to me than dynamic komi. Surely, it is also harder to implement well.

Stefan

----- Original Message -----
From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 11:11 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

The problem with MCTS programs is that they like to consolidate. You set the komi and thereby give them a goal and they very quickly make moves which commit to that specific goal. Committing to less than you need to actually win will often involve sacrificing chances to win. Sometimes it won't, but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a loss and it plays randomly. That's why we even consider doing this. Dynamically changing komi could be of some benefit in that situation if there is no alternative reasonable strategy, but it does not address the real problem - which is what I call the "committal consolidation" problem. You are giving the program an arbitrary short term goal which may, or may not be compatible with the long term goal of winning the game. Whether it's compatible or not is based on your own credulity - not anything predictable or that you can scale. And as the base program gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term, it should be considered a temporary hack which should be fixed as soon as possible. We have to think about this anyway sooner or later because if programs continue to develop and the predictive ability of the playouts and tree search gets several hundred Elo better, these programs may start to see more and more positions as either dead won or dead lost. I'm sure we will want some kind of robust mechanism for dealing with this which is better at estimating chances that the opponent will go wrong as opposed to doing something that is a random benefit or hindrance.

- Don

2009/8/12 terry mcintyre

Ingo suggested something interesting - instead of changing the komi according to the move number, or some other fixed schedule, it varies according to the estimated winrate. It also, implicitly, depends on one's guess of the ability of the opponent.

An interesting test would be to take an opponent known to be weaker, offer it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap does the ratio balance at 50:50? Can the number of handicap stones be increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach high-dan levels on 19x19 boards - yet.

Terry McIntyre

“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Brian Sheppard
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps

>The small samples is probably the least of the problems with this. Do you
>actually believe that you can play games against it and not be subjective in
>your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic komi will help programs play high handicap games. Due to the small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
Re: [computer-go] Dynamic komi at high handicaps
On Wed, Aug 12, 2009 at 6:03 PM, Matthew Woodcraft wrote:
> Don Dailey wrote:
> > The problem with MCTS programs is that they like to consolidate. You
> > set the komi and thereby give them a goal and they very quickly make
> > moves which commit to that specific goal.
>
> How did you form this opinion? Can you show an example game record
> (on 19x19) showing this behaviour?

You're kidding, right? Does anyone honestly dispute this? I'm certainly not going to entertain this with examples - if you don't understand this I'm sure we would waste a dozen emails arguing about it regardless of what I could show you.

- Don

> -M-
Re: [computer-go] Dynamic komi at high handicaps
What a bot does with its playouts in a handicap situation is essentially to try to beat itself, despite the handicap. And in this situation the bot reacts in a very human way: it becomes despondent. Adjusting the komi dynamically shifts the goal from winning to catching up quickly enough. I feel that this is the natural handicap strategy, not a band-aid. Of course, the dynamic komi must be adjusted down to zero in good time.

I think there are two main reasons that this hasn't been fully explored so far.

1. Trying to maximize the score turned out to be a huge mistake, compared to trying to maximize the win rate. This makes dynamic komi a kind of blind spot.
2. Handicap go hasn't been given special attention so far.

Stefan

----- Original Message -----
From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 11:24 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

Terry,

I understand the reasoning behind this; your thought experiment did not add anything to my understanding. And I agree that if the program is strong enough and the handicap is high enough this is probably better than doing nothing at all.

However, I think there must be something that is more along the lines of treating the disease, not the symptoms. You might be able to put a band aid on the problem but you have not addressed the real issue in a systematic way.

Besides, I have not yet seen anyone demonstrate that this works - it's always talked about but never implemented. It is made to sound so simple that you have to wonder where the implementation is and why the strong programs do not have it.

- Don

2009/8/12 terry mcintyre

Consider this thought experiment.

You sit down at a board and your opponent has a 9-stone handicap.

By any objective measure of the game, you should resign immediately.

All your win-rate calculations report this hopeless state of affairs.

Winrate gives you no objective basis to prefer one move or another.

But, you think, what if I can make a small group? What if I try for a lesser goal, such as "don't lose by more than 90 points?"

Your opponent has a 9 stone handicap because he makes more mistakes than you do.

As the game progresses, those mistakes add up. You set your goal higher - losing by only 50 points; losing by only 10 points.

The changing goal permits you to discriminate in a field which would otherwise look like a dark, desolate, win-less landscape.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 1:05:36 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

Ok, I misunderstood his testing procedure. What he is doing is far more scientific than what I thought he was doing.

There has got to be something better than this. What we need is a way to make the playouts more meaningful but not by artificially reducing our actual objective which is to win.

For the high handicap games, shouldn't the goal be to maximize the score? Instead of adjusting komi why not just change the goal to win as much of the board as possible? This would be far more honest and reliable I would think and the program would not be forced to constantly waste effort on constantly changing goals.

- Don

On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard wrote:

> The small samples is probably the least of the problems with this. Do you
> actually believe that you can play games against it and not be subjective in
> your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs play high handicap games. Due to small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
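Editorial illustration of the point above that a win-rate objective goes blind when every playout has the same result, and that a komi offset restores the contrast. This is only a minimal sketch: the function name `win_rate`, the parameter `required_margin`, and the playout margins are all hypothetical and are not taken from any program discussed in the thread.

```python
def win_rate(margins, required_margin):
    """Fraction of playouts in which the bot's final margin exceeds the target.

    margins: final score differences (bot minus opponent), one per playout.
    required_margin: the margin the bot must exceed for a playout to count
    as a win. 0 is the real game; a negative value models a dynamic komi
    that grants the bot extra points inside the playouts.
    """
    wins = sum(1 for m in margins if m > required_margin)
    return wins / len(margins)


# Hypothetical playout results for two candidate moves while far behind.
move_a = [-40, -35, -45, -38]   # loses, but by less
move_b = [-70, -75, -66, -72]   # loses badly

# Real objective: both moves look identical (all losses).
print(win_rate(move_a, 0), win_rate(move_b, 0))

# Shifted objective ("don't lose by more than 50"): the moves separate.
print(win_rate(move_a, -50), win_rate(move_b, -50))
```

With the real objective both candidate moves score 0.0, so the search has nothing to prefer; with the shifted target the less bad move stands out.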
Re: [computer-go] Dynamic komi at high handicaps
On Wed, Aug 12, 2009 at 5:58 PM, Mark Boon wrote:

> 2009/8/12 Don Dailey :
> >
> > I disagree about this being what humans do. They do not set a fake komi
> > and then try to win only by that much.
>
> I didn't say that humans do that. I said they consider their chance
> 50-50. For an MC program to consider its chances to be 50-50 you'd
> have to up the komi. There's a difference.

If the handicap is fair, their chance is about 50/50. However, rigging komi to give the same chance is NOT what humans do. The only thing you said that I consider correct is that humans estimate their chances to be about 50/50.

One thing humans do is to set short term goals and I think dynamic komi is an attempt to do that - but it's a misguided attempt because you are setting the WRONG short term goal. The human will have a much more specific goal that is going to be compatible with his hope of winning the game. For instance I am sure he will not sit merrily by and watch his opponent consolidate a won game just so that he can have a "respectable" but losing score. Dynamic komi of course does not address that at all.

> > I think their model is somewhat incremental, trying to win a bit at a time
> > but I'm quite convinced that they won't just let the opponent consolidate
> > like MCTS does. With dynamic komi the program will STILL just try to
> > consolidate and not care about what his opponent does. But strong players
> > will know that letting your opponent consolidate is not going to work. So
> > they will keep things complicated and challenge their weaker opponents
> > everywhere that is important.
>
> It's difficult to make hard claims about this. I don't agree at all
> that the stronger player constantly needs to keep things complicated.
> Personally I tend to play solidly when giving a handicap. Because most
> damage is self-inflicted. You can either make a guess what the weaker
> player doesn't know, or you can give him the initiative and he'll show
> you. I prefer the latter approach.
>
> When done properly, I don't see how an MCTS program would consolidate
> all the time. Doing so would keep the position stable while the komi
> declines. As soon as he gets behind the komi degradation curve play
> will automatically get more dynamic in an attempt to catch up.
>
> The problem is: we're speculating. The proof is in the pudding.

Agreed.

- Don

> Mark
Re: [computer-go] Dynamic komi at high handicaps
Don Dailey wrote:

> The problem with MCTS programs is that they like to consolidate. You
> set the komi and thereby give them a goal and they very quickly make
> moves which commit to that specific goal.

How did you form this opinion? Can you show an example game record (on 19x19) showing this behaviour?

-M-
Re: [computer-go] Dynamic komi at high handicaps
In practical terms, the problem to solve is the reverse: how do we encourage weak programs to hang on to as much of their advantage as possible, against stronger players?

In 2020, we can worry about how to beat pro players who take large handicaps against computer programs.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 2:51:09 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

2009/8/12 terry mcintyre

> Most experiments are done on even games; this dynamic algorithm applies
> particularly to handicap games. In that context, it is not an ungainly
> kludge, but actually reflects the assessment of evenly matched pro players -
> they look at the board, and see a victory of n times 10 handicap stones (or
> something roughly comparable) for black.
>
> This matters because today's programs are not even close to playing at the
> pro level; to win respect, they'll have to master handicap games - and to do
> that, they'll need to do two things. First, they'll need to model the
> expectation that black with a handicap _should_ win big. Second, they'll
> need to behave gracefully as that initial advantage is whittled down.

I disagree. I think strong players have a sense of what kind of mistakes to expect, and try to provoke those mistakes. Dynamic komi does not model that. It also does the opposite of making the program play provocatively, which I believe is necessary to beat a weaker player with a large handicap against you. Instead of making it fight, it encourages the program to be content with less. How does this model strong handicap players?

- Don

> Existing programs don't do either of those two things well. They're tuned
> toward even-game strategy.
>
> Terry McIntyre
>
> “We hang the petty thieves and appoint the great ones to public office.” -- Aesop
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 Don Dailey :

> I disagree about this being what humans do. They do not set a fake komi
> and then try to win only by that much.

I didn't say that humans do that. I said they consider their chance 50-50. For an MC program to consider its chances to be 50-50 you'd have to up the komi. There's a difference.

> I think their model is somewhat incremental, trying to win a bit at a time
> but I'm quite convinced that they won't just let the opponent consolidate
> like MCTS does. With dynamic komi the program will STILL just try to
> consolidate and not care about what his opponent does. But strong players
> will know that letting your opponent consolidate is not going to work. So
> they will keep things complicated and challenge their weaker opponents
> everywhere that is important.

It's difficult to make hard claims about this. I don't agree at all that the stronger player constantly needs to keep things complicated. Personally I tend to play solidly when giving a handicap. Because most damage is self-inflicted. You can either make a guess what the weaker player doesn't know, or you can give him the initiative and he'll show you. I prefer the latter approach.

When done properly, I don't see how an MCTS program would consolidate all the time. Doing so would keep the position stable while the komi declines. As soon as he gets behind the komi degradation curve play will automatically get more dynamic in an attempt to catch up.

The problem is: we're speculating. The proof is in the pudding.

Mark
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 terry mcintyre

> Most experiments are done on even games; this dynamic algorithm applies
> particularly to handicap games. In that context, it is not an ungainly
> kludge, but actually reflects the assessment of evenly matched pro players -
> they look at the board, and see a victory of n times 10 handicap stones (or
> something roughly comparable) for black.
>
> This matters because today's programs are not even close to playing at the
> pro level; to win respect, they'll have to master handicap games - and to do
> that, they'll need to do two things. First, they'll need to model the
> expectation that black with a handicap _should_ win big. Second, they'll
> need to behave gracefully as that initial advantage is whittled down.

I disagree. I think strong players have a sense of what kind of mistakes to expect, and try to provoke those mistakes. Dynamic komi does not model that. It also does the opposite of making the program play provocatively, which I believe is necessary to beat a weaker player with a large handicap against you. Instead of making it fight, it encourages the program to be content with less. How does this model strong handicap players?

- Don

> Existing programs don't do either of those two things well. They're tuned
> toward even-game strategy.
>
> Terry McIntyre
>
> “We hang the petty thieves and appoint the great ones to public office.” -- Aesop
Re: [computer-go] Dynamic komi at high handicaps
On Wed, Aug 12, 2009 at 5:36 PM, Mark Boon wrote:

> I started to write something on this subject a while ago but it got
> caught up in other things I had to do.
>
> When humans play a (high) handicap game, they don't estimate a high
> winning percentage for the weaker player. They'll consider it to be
> more or less 50-50. So to adjust the komi at the beginning of the game
> such that the winning percentage becomes 50% seems a very reasonable
> idea to me. This is what humans do too, they'll assume the stronger
> player will be able to catch up a certain number of points to overcome
> the handicap.

I disagree about this being what humans do. They do not set a fake komi and then try to win only by that much.

I think their model is somewhat incremental, trying to win a bit at a time but I'm quite convinced that they won't just let the opponent consolidate like MCTS does. With dynamic komi the program will STILL just try to consolidate and not care about what his opponent does. But strong players will know that letting your opponent consolidate is not going to work. So they will keep things complicated and challenge their weaker opponents everywhere that is important.

- Don

> What seems difficult to me however is to devise a reasonable way to
> decrease this komi as the game progresses. In an actual game the
> stronger player catches up in leaps and bounds, not smoothly.
>
> In MC things are not always intuitive though.
>
> Mark
Re: [computer-go] Dynamic komi at high handicaps
Most experiments are done on even games; this dynamic algorithm applies particularly to handicap games. In that context, it is not an ungainly kludge, but actually reflects the assessment of evenly matched pro players - they look at the board, and see a victory of n times 10 handicap stones (or something roughly comparable) for black.

This matters because today's programs are not even close to playing at the pro level; to win respect, they'll have to master handicap games - and to do that, they'll need to do two things. First, they'll need to model the expectation that black with a handicap _should_ win big. Second, they'll need to behave gracefully as that initial advantage is whittled down.

Existing programs don't do either of those two things well. They're tuned toward even-game strategy.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop
Re: [computer-go] Dynamic komi at high handicaps
I started to write something on this subject a while ago but it got caught up in other things I had to do.

When humans play a (high) handicap game, they don't estimate a high winning percentage for the weaker player. They'll consider it to be more or less 50-50. So to adjust the komi at the beginning of the game such that the winning percentage becomes 50% seems a very reasonable idea to me. This is what humans do too, they'll assume the stronger player will be able to catch up a certain number of points to overcome the handicap.

What seems difficult to me however is to devise a reasonable way to decrease this komi as the game progresses. In an actual game the stronger player catches up in leaps and bounds, not smoothly.

In MC things are not always intuitive though.

Mark
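Editorial illustration of the open question in this message, namely how to decrease the komi as the game progresses. The simplest schedule one could try is a linear decay to zero. This is only a sketch under stated assumptions: `komi_offset`, `initial_offset` and `zero_by_move` are hypothetical names, the linear shape is an arbitrary choice, and, as the message itself points out, real catch-up happens in leaps rather than smoothly.

```python
def komi_offset(move_number, initial_offset, zero_by_move):
    """Linearly decaying komi offset (hypothetical schedule).

    initial_offset: points granted at move 0, e.g. chosen so that the
    estimated win rate starts near 50%.
    zero_by_move: move number from which the real komi is used again.
    """
    if move_number >= zero_by_move:
        return 0.0
    return initial_offset * (1 - move_number / zero_by_move)


# Example: 70 points at the start, fading out by move 200.
for move in (0, 100, 200):
    print(move, komi_offset(move, 70, 200))
```

A schedule driven by the estimated win rate instead of the move number is the alternative raised elsewhere in the thread.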
Re: [computer-go] Dynamic komi at high handicaps
Some algorithms are special-purpose by nature.

What I sketched is an approximation of my understanding of how strong players defeat weaker players with large handicaps.

When Myungwan Kim faced off against MFG a few days ago, with a 7 stone handicap, he had to come up with a strategy which would ultimately win a theoretically unwinnable game.

When two pros face off, and one has a 7 stone handicap, the expectation is that the one with the handicap will not merely win, but win by 70 points. A pro with such a large handicap would be satisfied with nothing less. The Nihon Ki'in has published a book of pro-pro handicap games, where black exploits the full power of the handicap stones to give white a very thorough drubbing.

David Fotland might have some insights into how MFG viewed that game. Perhaps MFG thought it was so far ahead that it was indifferent about the various opening moves. Who cares if the win rate is 99.9998 or 99.7? But there are differences, which weaker players use to hang on to as much of their advantage as possible, and stronger players use to wear down that advantage. It becomes a war of attrition - whoever runs out of troops or ammo first loses the war.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 2:11:58 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

The problem with MCTS programs is that they like to consolidate. You set the komi and thereby give them a goal and they very quickly make moves which commit to that specific goal. Committing to less than you need to actually win will often involve sacrificing chances to win. Sometimes it won't, but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a loss and it plays randomly. That's why we even consider doing this. Dynamically changing komi could be of some benefit in that situation if there is no alternative reasonable strategy, but it does not address the real problem - which is what I call the "committal consolidation" problem. You are giving the program an arbitrary short term goal which may, or may not be compatible with the long term goal of winning the game. Whether it's compatible or not is based on your own credulity - not anything predictable or that you can scale. And as the base program gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term, it should be considered a temporary hack which should be fixed as soon as possible.

We have to think about this anyway sooner or later because if programs continue to develop and the predictive ability of the playouts and tree search gets several hundred Elo better, these programs may start to see more and more positions as either dead won or dead lost. I'm sure we will want some kind of robust mechanism for dealing with this which is better at estimating chances that the opponent will go wrong as opposed to doing something that is a random benefit or hindrance.

- Don

2009/8/12 terry mcintyre

> Ingo suggested something interesting - instead of changing the komi
> according to the move number, or some other fixed schedule, it varies
> according to the estimated winrate.
>
> It also, implicitly, depends on one's guess of the ability of the opponent.
>
> An interesting test would be to take an opponent known to be weaker, offer
> it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what
> handicap does the ratio balance at 50:50? Can the number of handicap stones
> be increased with such an adaptive algorithm?
>
> Even better, play against a stronger opponent; can one increase the win
> rate versus strong opponents?
>
> The usual range of computer opponents is fairly narrow. None approach
> high-dan levels on 19x19 boards - yet.
>
> Terry McIntyre
>
> “We hang the petty thieves and appoint the great ones to public office.” -- Aesop
>
> From: Brian Sheppard
> To: computer-go@computer-go.org
> Sent: Wednesday, August 12, 2009 12:33:13 PM
> Subject: [computer-go] Dynamic komi at high handicaps
>
> > The small samples is probably the least of the problems with this. Do you
> > actually believe that you can play games against it and not be subjective in
> > your observations or how you play against it?
>
> These are computer-vs-computer games. Ingo is manually transferring moves
> between two computer opponents.
>
> The result does support Ingo's belief that dynamic Komi will help programs
> play high handicap games. Due to small sample size it isn't very strong
> evidence. But maybe it is enough to induce a programmer who actually plays
> in such games to create a more exhaustive test.
Re: [computer-go] Dynamic komi at high handicaps
I 100% agree with Don: dynamic komi just can't be the right approach, in my opinion.

One idea I just had is this: in the tree search part, instead of using a rule which converges to MAX, use a rule which converges to alpha*MAX + beta*AVERAGE. Do this only for plies where it is the weaker player's turn (the player who benefits from the handicap stones).

When beta is high, it may simulate the fact that the weak player can't actually read out sequences reliably, thus increasing the chances of success of the stronger player.

Just a wild guess anyway...

----- Original Message -----
From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 11:11 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

The problem with MCTS programs is that they like to consolidate. You set the komi and thereby give them a goal and they very quickly make moves which commit to that specific goal. Committing to less than you need to actually win will often involve sacrificing chances to win. Sometimes it won't, but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a loss and it plays randomly. That's why we even consider doing this. Dynamically changing komi could be of some benefit in that situation if there is no alternative reasonable strategy, but it does not address the real problem - which is what I call the "committal consolidation" problem. You are giving the program an arbitrary short term goal which may, or may not be compatible with the long term goal of winning the game. Whether it's compatible or not is based on your own credulity - not anything predictable or that you can scale. And as the base program gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term, it should be considered a temporary hack which should be fixed as soon as possible.

We have to think about this anyway sooner or later because if programs continue to develop and the predictive ability of the playouts and tree search gets several hundred Elo better, these programs may start to see more and more positions as either dead won or dead lost. I'm sure we will want some kind of robust mechanism for dealing with this which is better at estimating chances that the opponent will go wrong as opposed to doing something that is a random benefit or hindrance.

- Don

2009/8/12 terry mcintyre

Ingo suggested something interesting - instead of changing the komi according to the move number, or some other fixed schedule, it varies according to the estimated winrate.

It also, implicitly, depends on one's guess of the ability of the opponent.

An interesting test would be to take an opponent known to be weaker, offer it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap does the ratio balance at 50:50? Can the number of handicap stones be increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach high-dan levels on 19x19 boards - yet.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Brian Sheppard
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps

> The small samples is probably the least of the problems with this. Do you
> actually believe that you can play games against it and not be subjective in
> your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs play high handicap games. Due to small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
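Editorial illustration of the alpha*MAX + beta*AVERAGE backup rule proposed in this message. The message does not say how AVERAGE is computed or how alpha and beta relate, so this sketch makes two assumptions: AVERAGE is the plain mean over the children's values, and alpha + beta = 1. The function name `backed_up_value` and its parameters are hypothetical.

```python
def backed_up_value(child_values, weak_player_to_move, alpha=0.5, beta=0.5):
    """Back up a node value from its children.

    child_values: value of each child from the point of view of the player
    to move at this node (higher is better for that player).
    weak_player_to_move: True on plies where the handicap-taking player moves.

    Normal plies use the usual MAX. On the weaker player's plies the value
    is blended with the plain average, modelling a player who does not
    reliably find the best reply. Assumes alpha + beta == 1.
    """
    best = max(child_values)
    if not weak_player_to_move:
        return best
    average = sum(child_values) / len(child_values)
    return alpha * best + beta * average


values = [0.25, 0.5, 0.75]
print(backed_up_value(values, weak_player_to_move=False))  # plain MAX
print(backed_up_value(values, weak_player_to_move=True))   # blended value
```

With beta = 0 the rule reduces to ordinary MAX backup; raising beta makes the search assume the weaker side errs more often.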
Re: [computer-go] Dynamic komi at high handicaps
Terry,

I understand the reasoning behind this; your thought experiment did not add anything to my understanding. And I agree that if the program is strong enough and the handicap is high enough this is probably better than doing nothing at all.

However, I think there must be something that is more along the lines of treating the disease, not the symptoms. You might be able to put a band aid on the problem but you have not addressed the real issue in a systematic way.

Besides, I have not yet seen anyone demonstrate that this works - it's always talked about but never implemented. It is made to sound so simple that you have to wonder where the implementation is and why the strong programs do not have it.

- Don

2009/8/12 terry mcintyre

> Consider this thought experiment.
>
> You sit down at a board and your opponent has a 9-stone handicap.
>
> By any objective measure of the game, you should resign immediately.
>
> All your win-rate calculations report this hopeless state of affairs.
>
> Winrate gives you no objective basis to prefer one move or another.
>
> But, you think, what if I can make a small group? What if I try for a
> lesser goal, such as "don't lose by more than 90 points?"
>
> Your opponent has a 9 stone handicap because he makes more mistakes than
> you do.
>
> As the game progresses, those mistakes add up. You set your goal higher -
> losing by only 50 points; losing by only 10 points.
>
> The changing goal permits you to discriminate in a field which would
> otherwise look like a dark, desolate, win-less landscape.
>
> Terry McIntyre
>
> “We hang the petty thieves and appoint the great ones to public office.” -- Aesop
>
> From: Don Dailey
> To: computer-go
> Sent: Wednesday, August 12, 2009 1:05:36 PM
> Subject: Re: [computer-go] Dynamic komi at high handicaps
>
> Ok, I misunderstood his testing procedure. What he is doing is far more
> scientific than what I thought he was doing.
>
> There has got to be something better than this. What we need is a way to
> make the playouts more meaningful but not by artificially reducing our
> actual objective which is to win.
>
> For the high handicap games, shouldn't the goal be to maximize the
> score? Instead of adjusting komi why not just change the goal to win as
> much of the board as possible? This would be far more honest and reliable
> I would think and the program would not be forced to constantly waste effort
> on constantly changing goals.
>
> - Don
>
> On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard wrote:
>
>> > The small samples is probably the least of the problems with this. Do you
>> > actually believe that you can play games against it and not be subjective in
>> > your observations or how you play against it?
>>
>> These are computer-vs-computer games. Ingo is manually transferring moves
>> between two computer opponents.
>>
>> The result does support Ingo's belief that dynamic Komi will help programs
>> play high handicap games. Due to small sample size it isn't very strong
>> evidence. But maybe it is enough to induce a programmer who actually plays
>> in such games to create a more exhaustive test.
Re: [computer-go] Dynamic komi at high handicaps
The problem with MCTS programs is that they like to consolidate. You set the komi and thereby give them a goal and they very quickly make moves which commit to that specific goal. Committing to less than you need to actually win will often involve sacrificing chances to win. Sometimes it won't, but you cannot have a scalable algorithm which is this arbitrary.

However, if the handicap is too high, the program thinks every line is a loss and it plays randomly. That's why we even consider doing this. Dynamically changing komi could be of some benefit in that situation if there is no alternative reasonable strategy, but it does not address the real problem - which is what I call the "committal consolidation" problem. You are giving the program an arbitrary short term goal which may, or may not be compatible with the long term goal of winning the game. Whether it's compatible or not is based on your own credulity - not anything predictable or that you can scale. And as the base program gets stronger this aspect of the program becomes more and more of a wart.

If this can be made to work in the short term, it should be considered a temporary hack which should be fixed as soon as possible.

We have to think about this anyway sooner or later because if programs continue to develop and the predictive ability of the playouts and tree search gets several hundred Elo better, these programs may start to see more and more positions as either dead won or dead lost. I'm sure we will want some kind of robust mechanism for dealing with this which is better at estimating chances that the opponent will go wrong as opposed to doing something that is a random benefit or hindrance.

- Don

2009/8/12 terry mcintyre

> Ingo suggested something interesting - instead of changing the komi
> according to the move number, or some other fixed schedule, it varies
> according to the estimated winrate.
>
> It also, implicitly, depends on one's guess of the ability of the opponent.
>
> An interesting test would be to take an opponent known to be weaker, offer
> it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what
> handicap does the ratio balance at 50:50? Can the number of handicap stones
> be increased with such an adaptive algorithm?
>
> Even better, play against a stronger opponent; can one increase the win
> rate versus strong opponents?
>
> The usual range of computer opponents is fairly narrow. None approach
> high-dan levels on 19x19 boards - yet.
>
> Terry McIntyre
>
> “We hang the petty thieves and appoint the great ones to public office.” -- Aesop
>
> From: Brian Sheppard
> To: computer-go@computer-go.org
> Sent: Wednesday, August 12, 2009 12:33:13 PM
> Subject: [computer-go] Dynamic komi at high handicaps
>
> > The small samples is probably the least of the problems with this. Do you
> > actually believe that you can play games against it and not be subjective in
> > your observations or how you play against it?
>
> These are computer-vs-computer games. Ingo is manually transferring moves
> between two computer opponents.
>
> The result does support Ingo's belief that dynamic Komi will help programs
> play high handicap games. Due to small sample size it isn't very strong
> evidence. But maybe it is enough to induce a programmer who actually plays
> in such games to create a more exhaustive test.
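Editorial illustration of the suggestion quoted above, that the komi should follow the estimated win rate instead of a fixed schedule. The thread gives no concrete rule, so everything here is an assumption: the function name `adjust_komi_offset`, the band limits `low` and `high`, and the one-point `step` are hypothetical placeholders, not values anyone in the thread reported.

```python
def adjust_komi_offset(offset, win_rate, low=0.4, high=0.6, step=1.0):
    """One feedback step for a win-rate-driven komi offset.

    offset: points currently granted to the bot inside its playouts.
    win_rate: the bot's estimated win rate under the current offset.

    If the estimate falls below the band, grant more points so the search
    can discriminate between moves again; if it rises above the band, take
    points away so the goal moves back toward the real game.
    """
    if win_rate < low:
        return offset + step
    if win_rate > high:
        return offset - step
    return offset


# Example: one adjustment per search, starting from 10 granted points.
offset = 10.0
for estimate in (0.2, 0.3, 0.5, 0.9):
    offset = adjust_komi_offset(offset, estimate)
print(offset)
```

Whether such a loop helps or merely encourages the "committal consolidation" described in this message is exactly what the thread says remains untested.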
Re: [computer-go] Dynamic komi at high handicaps
I think Terry's suggestion is the best way to test these ideas:

1) Take 2 severely mismatched engines (perhaps 2 versions of the same engine but with different numbers of playouts).
2) Find the fair handicap by playing a sequence of games and adjusting the number of handicap stones whenever one side loses N out of M games.
3) Plot the handicap over time - it should converge, more or less.
4) Keeping one engine fixed, adjust the other engine, using dynamic komi, or whatever you think is the best way, and see how much you can improve on the handicap.

- Dave Hillis

-----Original Message-----
From: terry mcintyre
To: computer-go
Sent: Wed, Aug 12, 2009 3:42 pm
Subject: Re: [computer-go] Dynamic komi at high handicaps

Ingo suggested something interesting - instead of changing the komi according to the move number, or some other fixed schedule, it varies according to the estimated winrate.

It also, implicitly, depends on one's guess of the ability of the opponent.

An interesting test would be to take an opponent known to be weaker, offer it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap does the ratio balance at 50:50? Can the number of handicap stones be increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach high-dan levels on 19x19 boards - yet.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Brian Sheppard
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps

> The small samples is probably the least of the problems with this. Do you
> actually believe that you can play games against it and not be subjective in
> your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic Komi will help programs play high handicap games. Due to small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
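Editorial illustration of step 2 of the procedure above (adjust the handicap whenever one side loses N out of M games). This is a minimal sketch, not anyone's actual test harness: `find_fair_handicap` and the `play_game` callback are hypothetical, the games are played in blocks of M, and N is assumed to be more than half of M so that only one side can trigger an adjustment per block.

```python
def find_fair_handicap(play_game, start_handicap, n_losses, m_games, rounds):
    """Staircase search for the handicap at which two engines are balanced.

    play_game(handicap) -> True if the stronger engine (giving the handicap)
    wins. Returns the final handicap and the handicap used in each round,
    which can be plotted to check convergence.
    """
    handicap = start_handicap
    history = []
    for _ in range(rounds):
        history.append(handicap)
        strong_wins = sum(1 for _ in range(m_games) if play_game(handicap))
        weak_wins = m_games - strong_wins
        if weak_wins >= n_losses:
            # The stronger engine lost N of M: the handicap is too large.
            handicap = max(0, handicap - 1)
        elif strong_wins >= n_losses:
            # The weaker engine lost N of M: give it another stone.
            handicap += 1
    return handicap, history


# Deterministic stand-in for real games: the strong engine wins below 5 stones.
final, history = find_fair_handicap(lambda h: h < 5, 2, 3, 4, 6)
print(final, history)
```

Running the same search twice, once with and once without dynamic komi in the engine under test, gives the comparison described in step 4.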
Re: [computer-go] Dynamic komi at high handicaps
Consider this thought experiment.

You sit down at a board and your opponent has a 9-stone handicap. By any objective measure of the game, you should resign immediately. All your win-rate calculations report this hopeless state of affairs. Winrate gives you no objective basis to prefer one move or another.

But, you think, what if I can make a small group? What if I try for a lesser goal, such as "don't lose by more than 90 points"?

Your opponent has a 9-stone handicap because he makes more mistakes than you do. As the game progresses, those mistakes add up. You set your goal higher: losing by only 50 points; losing by only 10 points.

The changing goal permits you to discriminate in a field which would otherwise look like a dark, desolate, win-less landscape.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Don Dailey
To: computer-go
Sent: Wednesday, August 12, 2009 1:05:36 PM
Subject: Re: [computer-go] Dynamic komi at high handicaps

Ok, I misunderstood his testing procedure. What he is doing is far more scientific than what I thought he was doing.

There has got to be something better than this. What we need is a way to make the playouts more meaningful, but not by artificially reducing our actual objective, which is to win.

For the high handicap games, shouldn't the goal be to maximize the score? Instead of adjusting komi, why not just change the goal to win as much of the board as possible? This would be far more honest and reliable, I would think, and the program would not be forced to waste effort on constantly changing goals.

- Don

On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard wrote:
>> The small sample size is probably the least of the problems with this. Do you
>> actually believe that you can play games against it and not be subjective in
>> your observations or how you play against it?
>
> These are computer-vs-computer games. Ingo is manually transferring moves
> between two computer opponents.
>
> The result does support Ingo's belief that dynamic komi will help programs
> play high handicap games. Due to the small sample size it isn't very strong
> evidence. But maybe it is enough to induce a programmer who actually plays
> in such games to create a more exhaustive test.
Re: [computer-go] Dynamic komi at high handicaps
Ok, I misunderstood his testing procedure. What he is doing is far more scientific than what I thought he was doing.

There has got to be something better than this. What we need is a way to make the playouts more meaningful, but not by artificially reducing our actual objective, which is to win.

For the high handicap games, shouldn't the goal be to maximize the score? Instead of adjusting komi, why not just change the goal to win as much of the board as possible? This would be far more honest and reliable, I would think, and the program would not be forced to waste effort on constantly changing goals.

- Don

On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard wrote:
>> The small sample size is probably the least of the problems with this. Do you
>> actually believe that you can play games against it and not be subjective in
>> your observations or how you play against it?
>
> These are computer-vs-computer games. Ingo is manually transferring moves
> between two computer opponents.
>
> The result does support Ingo's belief that dynamic komi will help programs
> play high handicap games. Due to the small sample size it isn't very strong
> evidence. But maybe it is enough to induce a programmer who actually plays
> in such games to create a more exhaustive test.
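To make the two objectives discussed in this thread concrete, here is a minimal sketch of how a single playout might be scored under each. The function names, the sign convention for `margin`, and the normalisation by 361 board points are all assumptions made for illustration; no program mentioned in the thread is known to use exactly this.

```python
def win_reward(margin, internal_komi=0.5):
    """Binary playout reward with an adjustable internal komi.

    margin is the program's points minus the opponent's points at the end
    of the playout, before komi (negative when the program is behind).
    internal_komi is credited to the program; raising it lowers the bar
    for what counts as a "win" (the dynamic-komi idea).
    """
    return 1.0 if margin + internal_komi > 0 else 0.0


def score_reward(margin, board_points=361):
    """Score-maximising reward: maps the margin linearly into [0, 1]."""
    return 0.5 + margin / (2.0 * board_points)
```

With the true komi of 0.5, playouts that lose by 280 and by 250 points both score 0, so the search cannot tell them apart. With an internal komi of 260.5, the first is a loss and the second a win; with the score-based reward, the second is simply slightly better.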
Re: [computer-go] Dynamic komi at high handicaps
Ingo suggested something interesting: instead of changing the komi according to the move number, or some other fixed schedule, it varies according to the estimated winrate. It also, implicitly, depends on one's guess of the ability of the opponent.

An interesting test would be to take an opponent known to be weaker, offer it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what handicap does the ratio balance at 50:50? Can the number of handicap stones be increased with such an adaptive algorithm?

Even better, play against a stronger opponent; can one increase the win rate versus strong opponents?

The usual range of computer opponents is fairly narrow. None approach high-dan levels on 19x19 boards - yet.

Terry McIntyre
“We hang the petty thieves and appoint the great ones to public office.” -- Aesop

From: Brian Sheppard
To: computer-go@computer-go.org
Sent: Wednesday, August 12, 2009 12:33:13 PM
Subject: [computer-go] Dynamic komi at high handicaps

> The small sample size is probably the least of the problems with this. Do you
> actually believe that you can play games against it and not be subjective in
> your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic komi will help programs play high handicap games. Due to the small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
[computer-go] Dynamic komi at high handicaps
> The small sample size is probably the least of the problems with this. Do you
> actually believe that you can play games against it and not be subjective in
> your observations or how you play against it?

These are computer-vs-computer games. Ingo is manually transferring moves between two computer opponents.

The result does support Ingo's belief that dynamic komi will help programs play high handicap games. Due to the small sample size it isn't very strong evidence. But maybe it is enough to induce a programmer who actually plays in such games to create a more exhaustive test.
Re: [computer-go] Dynamic komi at high handicaps
2009/8/12 "Ingo Althöfer" <3-hirn-ver...@gmx.de>

> In the last few weeks I have experimented a lot with dynamic
> komi in games with high handicap. In particular, I used the
> really nice commercial program Many Faces of Go (version 12.013)
> with its Monte Carlo level (about 2 kyu on 19x19 board) and
> its traditional 18-kyu level as the opponent.
>
> At handicap 21 I played (manually!) 8 games with these opponents:
> 4 games with static komi (0.5) - here MFoG (2-kyu) won 1 of the 4 games.
> 4 games with dynamic komi - here MFoG (2-kyu) won 3 of the 4 games.
>
> I used "dynamic komi" in the following "Rule 42" way. Starting point for
> this internal artificial komi was a very high value (to compensate for
> the handicap stones), typically 300.5 or 320.5.
> Then, whenever the evaluation had climbed up to 42 % or higher,
> dynamic komi was reduced by 50 or 30 or 20 (or 10 near the end),
> until finally the true value of 0.5 was reached.
>
> After this little sample I also tried a few games with dynamic komi
> at handicap 25. After some unsuccessful games (the Monte Carlo side
> died of starvation at komi=40.5 or 30.5) today one win came out:
> In best Monte Carlo fashion, the MC-level won by half a point.
>
> I have included sgf of this game.
>
> I am aware that small samples are not enough to prove something.
> Therefore, I hope that programmers may implement automatic versions
> of something like "Rule 42" to find out how their programs behave
> with dynamic komi.

The small sample size is probably the least of the problems with this. Do you actually believe that you can play games against it and not be subjective in your observations or how you play against it?

- Don

> Ingo.
[computer-go] Dynamic komi at high handicaps
In the last few weeks I have experimented a lot with dynamic komi in games with high handicap. In particular, I used the really nice commercial program Many Faces of Go (version 12.013) with its Monte Carlo level (about 2 kyu on 19x19 board) and its traditional 18-kyu level as the opponent.

At handicap 21 I played (manually!) 8 games with these opponents:
4 games with static komi (0.5) - here MFoG (2-kyu) won 1 of the 4 games.
4 games with dynamic komi - here MFoG (2-kyu) won 3 of the 4 games.

I used "dynamic komi" in the following "Rule 42" way. Starting point for this internal artificial komi was a very high value (to compensate for the handicap stones), typically 300.5 or 320.5. Then, whenever the evaluation had climbed up to 42 % or higher, dynamic komi was reduced by 50 or 30 or 20 (or 10 near the end), until finally the true value of 0.5 was reached.

After this little sample I also tried a few games with dynamic komi at handicap 25. After some unsuccessful games (the Monte Carlo side died of starvation at komi=40.5 or 30.5) today one win came out: In best Monte Carlo fashion, the MC-level won by half a point.

I have included sgf of this game.

I am aware that small samples are not enough to prove something. Therefore, I hope that programmers may implement automatic versions of something like "Rule 42" to find out how their programs behave with dynamic komi.

Ingo.

Attachment: handicap25-dynamicKomi.sgf (application/go-sgf)
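An automatic version of "Rule 42" might look something like the sketch below. Ingo chose the step size (50, 30, 20 or 10) by hand, so the thresholds that pick a step from the remaining gap are purely an assumption here, as are the function and parameter names.

```python
def rule42_update(komi, winrate, true_komi=0.5, threshold=0.42):
    """Return the internal komi to use for the next move.

    komi      current internal (artificial) komi in the program's favour
    winrate   the program's current winrate estimate, in [0, 1]
    The komi is lowered only once the estimate has climbed to the
    threshold (42 %), and never below the true komi.
    """
    if winrate < threshold or komi <= true_komi:
        return komi
    gap = komi - true_komi
    # ASSUMPTION: step sizes follow Ingo's 50/30/20/10 values, but the
    # rule for choosing between them is invented for this sketch.
    if gap > 150:
        step = 50
    elif gap > 60:
        step = 30
    elif gap > 20:
        step = 20
    else:
        step = 10
    return max(true_komi, komi - step)
```

A search loop would call this once per move, starting from something like 300.5 at high handicap, and then count a playout as a win only if the program's margin plus the current internal komi is positive.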
Re: [computer-go] new kid on the block
> Yes, known problem :-( I'm still trying to find a method to see if a
> point is in an eye. Should not be too difficult in theory but in
> practice I have not found a method yet.

Are you talking about 1 point eyes? For this I think most programs use the same definition, which is quite good and safe. As far as I know there is no perfect rule but this is close to perfect. The definition of an eye we use is this:

An empty point whose direct neighbors are all of the same color AND whose diagonal neighbors contain no more than 1 stone of the opposite color.

This definition must be modified slightly if the point in question is on the edge of the board - in which case there must be NO diagonal enemy stones.

To know if a point is inside a bigger eye - that's much more speculative I think.

- Don
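The one-point eye definition above translates fairly directly into code. The following is a small sketch, assuming a square board stored as a list of equal-length strings with "." for empty, "X" for Black and "O" for White; that representation and the function name are illustrative choices only.

```python
EMPTY, BLACK, WHITE = ".", "X", "O"


def is_eye(board, row, col, color):
    """Return True if (row, col) is a one-point eye for `color`.

    Rule: the point is empty, every on-board direct neighbor is a stone of
    `color`, and the diagonal neighbors hold at most one enemy stone, or
    none at all if the point lies on the edge (or corner) of the board.
    """
    size = len(board)

    def on_board(r, c):
        return 0 <= r < size and 0 <= c < size

    if board[row][col] != EMPTY:
        return False
    for r, c in ((row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)):
        if on_board(r, c) and board[r][c] != color:
            return False
    enemy = WHITE if color == BLACK else BLACK
    diagonals = [(row + dr, col + dc) for dr in (-1, 1) for dc in (-1, 1)]
    # A point is on the edge exactly when some diagonal falls off the board.
    on_edge = any(not on_board(r, c) for r, c in diagonals)
    enemies = sum(1 for r, c in diagonals if on_board(r, c) and board[r][c] == enemy)
    return enemies == 0 if on_edge else enemies <= 1
```

A playout policy would typically skip any move for which `is_eye(board, row, col, own_color)` is true, which is what stops the program from filling its own eyes.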
Re: [computer-go] new kid on the block
> Congrats for breaking the 1000 elo mark on cgos. ;)

Thanks! Version 0.5 made quite a difference compared to version 0.4. I'm graphing the elo ratings of the versions running at cgos here:
http://keetweej.vanheusden.com/stats/stop-all-elo-cgos.png

> Some things I noticed when watching 2 games:
> -stop plays on the first line/corner in the beginning. maybe this helps:
> http://computer-go.org/pipermail/computer-go/2008-December/017340.html
> or this:
> http://computer-go.org/pipermail/computer-go/2008-December/017457.html

Ok, will read that.

> -stop fills its own eyes, killing alive groups. you should prevent moves
> that fill own eyes. look here:
> http://computer-go.org/pipermail/computer-go/2008-May/014929.html

Yes, known problem :-( I'm still trying to find a method to see if a point is in an eye. Should not be too difficult in theory but in practice I have not found a method yet.

After a side puts a stone on a cross, I collect all stones and put those in separate arrays; each chain in a separate array. Now what I should do is find out if such a chain makes a circle and then, with a simple is-a-point-in-a-polygon algorithm, check if the point is in an eye. Still failing on that. Any tips are welcome!

Folkert van Heusden
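The chain-collection step described above (one array per chain) is usually done with a flood fill over same-colored neighbors. Here is a minimal sketch, assuming the board is a list of equal-length strings with "." for empty points; it illustrates the general technique and is not the code the program under discussion actually uses.

```python
def collect_chains(board):
    """Group stones into chains of orthogonally connected same-color stones.

    Returns a list of (color, points) pairs, where points is a sorted list
    of (row, col) tuples. Chains appear in board scan order.
    """
    size = len(board)
    seen = set()
    chains = []
    for r in range(size):
        for c in range(size):
            color = board[r][c]
            if color == "." or (r, c) in seen:
                continue
            # Flood fill from this stone across same-colored neighbors.
            chain, stack = [], [(r, c)]
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                chain.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < size and 0 <= nc < size
                            and (nr, nc) not in seen and board[nr][nc] == color):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            chains.append((color, sorted(chain)))
    return chains
```

Note that an eye is often bordered by more than one chain, so checking a single chain for a closed loop can miss real eyes; a purely local neighbor test on the empty point avoids that problem.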
Re: [computer-go] new kid on the block
Congrats for breaking the 1000 elo mark on cgos. ;)

Some things I noticed when watching 2 games:

-stop plays on the first line/corner in the beginning. maybe this helps:
http://computer-go.org/pipermail/computer-go/2008-December/017340.html
or this:
http://computer-go.org/pipermail/computer-go/2008-December/017457.html

-stop fills its own eyes, killing alive groups. you should prevent moves that fill own eyes. look here:
http://computer-go.org/pipermail/computer-go/2008-May/014929.html

regards,
ibd