Re: [computer-go] Floating komi
Christoph Birk wrote:
> On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:
>> not assuming that MC plays the best move. The problem isn't the assumptions I am making, but the assumptions others are making, that it's NOT playing the best move. You want to apply a fix to all positions without really knowing which positions are a problem.

> One last time: Nobody suggested one fix for all positions/problems. The floating komi was suggested to guide the UCT search along certain lines of play during specific (close!) endgame positions.

When I said all positions I meant all games. You expect to apply this to all winning and losing positions in every game, not just specific ones.

- Don

> Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Floating komi
Why doesn't someone simply try this and post the results, if they think that it would help?

s.

On Thu, Mar 6, 2008 at 8:30 AM, Don Dailey [EMAIL PROTECTED] wrote:
> Christoph Birk wrote:
>> On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:
>>> not assuming that MC plays the best move. The problem isn't the assumptions I am making, but the assumptions others are making, that it's NOT playing the best move. You want to apply a fix to all positions without really knowing which positions are a problem.

>> One last time: Nobody suggested one fix for all positions/problems. The floating komi was suggested to guide the UCT search along certain lines of play during specific (close!) endgame positions.

> When I said all positions I meant all games. You expect to apply this to all winning and losing positions in every game, not just specific ones.
>
> - Don

>> Christoph
Re: [computer-go] Floating komi
On Thu, 6 Mar 2008, Don Dailey wrote:
>> One last time: Nobody suggested one fix for all positions/problems. The floating komi was suggested to guide the UCT search along certain lines of play during specific (close!) endgame positions.

> When I said all positions I meant all games. You expect to apply this to all winning and losing positions in every game, not just specific ones.

No.

Christoph
Re: [computer-go] Floating komi
Don Dailey wrote:
> I would be satisfied if someone implemented it, reported a 500-game self-test sample, concluded that it didn't hurt the program measurably, and showed a few examples of how it improved the moves cosmetically, perhaps even comparing both versions on specific positions.

Remember that there's good reason to believe that this technique will be less helpful in self-play.

-M-
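To put a rough number on what "didn't hurt the program measurably" can mean for a 500-game sample, here is a minimal sketch. It assumes independent games with no draws and uses the normal approximation to the binomial; the function name `win_rate_interval` and the 250-win example are made up for illustration and are not results from the thread.

```python
import math


def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a win rate.

    Uses the normal approximation to the binomial distribution,
    which is reasonable for a few hundred games and a win rate
    that is not close to 0 or 1.
    """
    p = wins / games
    standard_error = math.sqrt(p * (1 - p) / games)
    return p - z * standard_error, p + z * standard_error


# Hypothetical outcome: the modified program wins exactly half of 500 games.
low, high = win_rate_interval(250, 500)
print(f"{low:.3f} .. {high:.3f}")  # prints "0.456 .. 0.544"
```

Under these assumptions, 500 games pin the win rate down to roughly plus or minus 4.4 percentage points, so a change that costs only a point or two of win rate would not show up in a sample of that size.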
Re: [computer-go] Floating komi
Hideki Kato wrote:
> I'd like to give an example here to make things clear. The conditions are:
> 1) The program uses a digitizing scheme that maps the real score to [0,1] (or [-1,1]), so that it cannot distinguish losing/winning by 0.5 pt from losing/winning by 10.5 pt at all.
> 2) Playouts include some foolish moves (usually with low but nonzero probability), for example failing to connect large groups in atari, in order to keep their randomness.
> 3) The position is in the early endgame, where no move gains more than, for example, 2 pt under perfect play.
> 4) Black is behind by 0.5 pt.
> Under the above conditions the playouts may return a winning but gambling move (perhaps with low probability), especially when the number of playouts is small, which is usually the case on 19x19, and UCT will choose it. The question is: which is better, to stay 0.5 pt behind or to play gambling moves (by which I mean moves where B will lose many points if W answers correctly) in the hope of W's (stupid) mistakes?

The assumption is that you suddenly cannot trust MC to do what it does best even though you did for the entire game up until this point. MC of course will choose the gambling move. The whole concept of MC is to do what is most likely to produce a win. We should think twice before asking it to choose the moves that produce the more certain loss. We are the ones that have a bias about this, not the MC programs.

> In addition to the above, there is one more issue to consider. If the playout has a systematic error, nakade for example, it's not good to stay just 0.5 pt ahead. Having more margin is clearly better.

I believe nakade is a strawman. There are lots of things MC does better and lots of things it does less well. You can always find positions that are hard or easy for your program to solve, but it isn't intrinsic to this issue. I don't think you should weaken this concept of playing for the best winning chances for the very few positions where MC programs take longer to resolve the endgame and there is a slight chance that it will win if it just happens to be enough to cover the exact situation. Because this is no solution - it is at best a patch and would only work in some cases.

If I could do something that didn't hurt the program in other ways, but might help certain positions once in a while, I would go for it.

I've been in game programming a long time - if you have a problem with certain types of positions you really want a pointed solution that has little or no impact on other positions. You don't want to be going back and forth fixing things up but you want to solve the problem as correctly as possible the first time. I'll call this principle "every solution has a side effect", but this is a pretty bad side effect. (I can't tell you how many times I fixed something in my chess program with some evaluation change only to find that I broke many other things at the same time.)

- Don

> The idea of floating komi helps with the above two. I'd like to emphasize that I know it's not a universal solution. However, as the nakade problem seems very hard to solve, it could be a practical solution.
>
> -Hideki
> --
> [EMAIL PROTECTED] (Kato)
Re: [computer-go] Floating komi
Don Dailey wrote:
> Hideki Kato wrote:
>> I'd like to give an example here to make things clear. The conditions are:
>> 1) The program uses a digitizing scheme that maps the real score to [0,1] (or [-1,1]), so that it cannot distinguish losing/winning by 0.5 pt from losing/winning by 10.5 pt at all.
>> 2) Playouts include some foolish moves (usually with low but nonzero probability), for example failing to connect large groups in atari, in order to keep their randomness.
>> 3) The position is in the early endgame, where no move gains more than, for example, 2 pt under perfect play.
>> 4) Black is behind by 0.5 pt.
>> Under the above conditions the playouts may return a winning but gambling move (perhaps with low probability), especially when the number of playouts is small, which is usually the case on 19x19, and UCT will choose it. The question is: which is better, to stay 0.5 pt behind or to play gambling moves (by which I mean moves where B will lose many points if W answers correctly) in the hope of W's (stupid) mistakes?

> The assumption is that you suddenly cannot trust MC to do what it does best even though you did for the entire game up until this point. MC of course will choose the gambling move. The whole concept of MC is to do what is most likely to produce a win.

Not entirely, no. The concept of MC is to do what has the most lines leading to a win, which is slightly different. There's obviously a strong correlation, or MC wouldn't work at all, but I think it's dangerous to assume that MC by definition plays the best move. For one thing it makes it very hard to argue about how to improve MC programs; it creates lots of noise of "don't do that, it will only make your program weaker".

> We should think twice before asking it to choose the moves that produce the more certain loss. We are the ones that have a bias about this, not the MC programs.

>> In addition to the above, there is one more issue to consider. If the playout has a systematic error, nakade for example, it's not good to stay just 0.5 pt ahead. Having more margin is clearly better.

> I believe nakade is a strawman. There are lots of things MC does better and lots of things it does less well. You can always find positions that are hard or easy for your program to solve, but it isn't intrinsic to this issue. I don't think you should weaken this concept of playing for the best winning chances for the very few positions where MC programs take longer to resolve the endgame and there is a slight chance that it will win if it just happens to be enough to cover the exact situation. Because this is no solution - it is at best a patch and would only work in some cases.

I don't think patching one thing at a time is such a bad way to write a go program. Small steps, one at a time, and you suddenly have a much stronger program. And again you're making the assumption that to deviate from accurate MC means lower winning chances. It might mean fewer winning LINES, but the probability of a loss or win is entirely dependent on how the opponent plays, which is (hopefully) never random. And this does not mean you're doing opponent modeling, or - if you define opponent modeling very loosely so it includes this - that what you're doing is bad.

> If I could do something that didn't hurt the program in other ways, but might help certain positions once in a while, I would go for it.

I don't think you'll find ANY improvement to ANY non-trivial program that doesn't, in some cases, make it play worse. What matters is how it does in the average case.

> I've been in game programming a long time - if you have a problem with certain types of positions you really want a pointed solution that has little or no impact on other positions. You don't want to be going back and forth fixing things up but you want to solve the problem as correctly as possible the first time. I'll call this principle "every solution has a side effect", but this is a pretty bad side effect. (I can't tell you how many times I fixed something in my chess program with some evaluation change only to find that I broke many other things at the same time.)

A good understanding of _why_ your program works can help this a lot, ensuring that you know how to fix a problem without causing bigger problems elsewhere. I think part of the problem here is that few (if any) people know _why_ MC works. Why does the cumulative result of random play-outs correlate so strongly with the strength of the position? In what ways does it NOT correlate, that can be fixed?
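The distinction drawn above between "most winning lines" and "best winning chances" can be made concrete with a toy calculation. This is only a sketch under loud assumptions: the two candidate moves, their leaf values, and the opponent model (the opponent finds its best reply with probability `best_reply_prob` and otherwise picks a reply uniformly at random) are all invented for illustration and are not taken from any program discussed in the thread.

```python
from statistics import mean

# Hypothetical position: for each of our candidate moves, the list holds our
# winning chance after each possible opponent reply.
CANDIDATES = {
    # Three of four replies are blunders that hand us the game,
    # but the single correct reply refutes the move outright.
    "gamble": [1.0, 1.0, 1.0, 0.0],
    # The solid move leaves us slightly behind whatever the opponent does.
    "solid": [0.2, 0.2],
}


def win_chance(leaf_values, best_reply_prob):
    """Expected result against the assumed opponent model.

    With probability best_reply_prob the opponent plays the reply that is
    worst for us; otherwise it picks uniformly among all replies.
    """
    worst_case = min(leaf_values)
    uniform_average = mean(leaf_values)
    return best_reply_prob * worst_case + (1 - best_reply_prob) * uniform_average


def preferred_move(best_reply_prob):
    """Return the candidate with the highest expected result."""
    return max(CANDIDATES, key=lambda m: win_chance(CANDIDATES[m], best_reply_prob))


print(preferred_move(0.0))  # uniformly random opponent: prints "gamble"
print(preferred_move(0.9))  # opponent usually finds the refutation: prints "solid"
```

With a uniformly random opponent the gambling move scores 0.75 against 0.2 for the solid move. Once the opponent finds the refutation nine times out of ten, the gamble drops to about 0.075 and the solid move comes out ahead. Which of the two models is closer to a real UCT search with enough playouts is exactly what the thread is arguing about, and this sketch does not settle it.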
Re: [computer-go] Floating komi
Raymond Wold wrote:
> Don Dailey wrote:
>> Hideki Kato wrote:
>>> I'd like to give an example here to make things clear. The conditions are:
>>> 1) The program uses a digitizing scheme that maps the real score to [0,1] (or [-1,1]), so that it cannot distinguish losing/winning by 0.5 pt from losing/winning by 10.5 pt at all.
>>> 2) Playouts include some foolish moves (usually with low but nonzero probability), for example failing to connect large groups in atari, in order to keep their randomness.
>>> 3) The position is in the early endgame, where no move gains more than, for example, 2 pt under perfect play.
>>> 4) Black is behind by 0.5 pt.
>>> Under the above conditions the playouts may return a winning but gambling move (perhaps with low probability), especially when the number of playouts is small, which is usually the case on 19x19, and UCT will choose it. The question is: which is better, to stay 0.5 pt behind or to play gambling moves (by which I mean moves where B will lose many points if W answers correctly) in the hope of W's (stupid) mistakes?

>> The assumption is that you suddenly cannot trust MC to do what it does best even though you did for the entire game up until this point. MC of course will choose the gambling move. The whole concept of MC is to do what is most likely to produce a win.

> Not entirely, no. The concept of MC is to do what has the most lines leading to a win, which is slightly different. There's obviously a strong correlation, or MC wouldn't work at all, but I think it's dangerous to assume that MC by definition plays the best move. For one thing it makes it very hard to argue about how to improve MC programs; it creates lots of noise of "don't do that, it will only make your program weaker".

I'm not assuming that MC plays the best move. The problem isn't the assumptions I am making, but the assumptions others are making, that it's NOT playing the best move. You want to apply a fix to all positions without really knowing which positions are a problem. Now if you want to talk about noise, you have just added more noise if you do that.

I can give you an example of why you must be extremely careful about making unfounded assumptions: My naive simple MC program would often fail to make a needed but small capture move because of noise in the play-outs. The move would be second or third choice but not quite make it to the top - instead it would play a speculative attack or something I considered rather stupid. My fix was to assume that this was a general problem, that it didn't know what it was doing - after all, I could see with my own eyes that this was a problem. So I applied a general incentive to encourage it to make capture moves.

Guess what? That was a mistake. The program was actually pretty good at not being greedy about captures - most of the time the capture can wait and you essentially lose a stone every time you play a greedy capture. But now when it needed to do something useful it preferred to waste time making a capture that could wait indefinitely (the group was dead, the capture could be made later when there were no important moves left).

I backed the change out in a hurry and realized that I would never get it perfect unless I applied a very specific fix. I already knew that from computer chess. In one early program I had a bonus for a rook on the 7th rank, a good general principle. We played one of the top US grandmasters in a casual game once and the computer put the rook on the 7th rank in a pawn-less endgame where the kings were already centralized. The rook incentive got in the way of it finding the right plan. Although the rook-on-the-7th heuristic made the program stronger in general, it really wasn't implemented as well as it could be. Every good general principle in chess, and I'm sure in Go too, has a side effect if taken too seriously and will cause problems, and the solution is to try to address things as specifically as you can within reason.

In my opinion, your proposed fix for this imagined problem is equivalent to my deciding to reverse the rook-on-the-7th bonus and turn it into a penalty because I saw it go wrong once. In this case, my working assumption would be that Monte Carlo programs know what they are doing (much more often than not) when they play risky moves in losing positions. In other words, are you trying to fix something that is not broken?

I think there is a disconnect too between what appeals to the eye and what actually works. Most reasonable go players won't take risks when the score is close because they think the game is about equal when it probably isn't. Near the end of the game with Chinese scoring, the chances are rarely close to even - you just think they are because you think that because the territory is about even the game could go either way. When I look at the logs of Lazarus, the games are virtually over well before the game actually ends.
Re: [computer-go] Floating komi
Don Dailey: [EMAIL PROTECTED]:
> Hideki Kato wrote:
>> I'd like to give an example here to make things clear. The conditions are:
>> 1) The program uses a digitizing scheme that maps the real score to [0,1] (or [-1,1]), so that it cannot distinguish losing/winning by 0.5 pt from losing/winning by 10.5 pt at all.
>> 2) Playouts include some foolish moves (usually with low but nonzero probability), for example failing to connect large groups in atari, in order to keep their randomness.
>> 3) The position is in the early endgame, where no move gains more than, for example, 2 pt under perfect play.
>> 4) Black is behind by 0.5 pt.
>> Under the above conditions the playouts may return a winning but gambling move (perhaps with low probability), especially when the number of playouts is small, which is usually the case on 19x19, and UCT will choose it. The question is: which is better, to stay 0.5 pt behind or to play gambling moves (by which I mean moves where B will lose many points if W answers correctly) in the hope of W's (stupid) mistakes?

> The assumption is that you suddenly cannot trust MC to do what it does best even though you did for the entire game up until this point. MC of course will choose the gambling move. The whole concept of MC is to do what is most likely to produce a win. We should think twice before asking it to choose the moves that produce the more certain loss. We are the ones that have a bias about this, not the MC programs.

Whatever the 'concept' of MC is, I will just fix its weaknesses by any means. MC is just a method, IMHO, to evaluate a position better than a static function does.
# I'm just an engineer :).

>> In addition to the above, there is one more issue to consider. If the playout has a systematic error, nakade for example, it's not good to stay just 0.5 pt ahead. Having more margin is clearly better.

> I believe nakade is a strawman. There are lots of things MC does better and lots of things it does less well. You can always find positions that are hard or easy for your program to solve, but it isn't intrinsic to this issue. I don't think you should weaken this concept of playing for the best winning chances for the very few positions where MC programs take longer to resolve the endgame and there is a slight chance that it will win if it just happens to be enough to cover the exact situation. Because this is no solution - it is at best a patch and would only work in some cases.
>
> If I could do something that didn't hurt the program in other ways, but might help certain positions once in a while, I would go for it.
>
> I've been in game programming a long time - if you have a problem with certain types of positions you really want a pointed solution that has little or no impact on other positions. You don't want to be going back and forth fixing things up but you want to solve the problem as correctly as possible the first time. I'll call this principle "every solution has a side effect", but this is a pretty bad side effect. (I can't tell you how many times I fixed something in my chess program with some evaluation change only to find that I broke many other things at the same time.)

Although I'm afraid my English skill is not enough to understand such a long paragraph correctly (shorter is better :), I believe it's just a choice. Some are good at developing an essential solution but not all. As I know I'm worse at that, I just do what I can. It was developed for the First UEC Cup, which features Japanese rules that count seki differently, so that all MC programs need to have some margin. Rémi introduced a fixed margin by changing komi for Crazy Stone and I've developed a dynamic one. Do you say we have to wait until we develop an essential solution?

-Hideki

> - Don

>> The idea of floating komi helps with the above two. I'd like to emphasize that I know it's not a universal solution. However, as the nakade problem seems very hard to solve, it could be a practical solution.
>>
>> -Hideki
>> --
>> [EMAIL PROTECTED] (Kato)

--
[EMAIL PROTECTED] (Kato)
Re: [computer-go] Floating komi
On Mar 5, 2008, at 11:58 AM, Don Dailey wrote:
> not assuming that MC plays the best move. The problem isn't the assumptions I am making, but the assumptions others are making, that it's NOT playing the best move. You want to apply a fix to all positions without really knowing which positions are a problem.

One last time: Nobody suggested one fix for all positions/problems. The floating komi was suggested to guide the UCT search along certain lines of play during specific (close!) endgame positions.

Christoph
[computer-go] Floating komi
I'd like to give an example here to make things clear. The conditions are:
1) The program uses a digitizing scheme that maps the real score to [0,1] (or [-1,1]), so that it cannot distinguish losing/winning by 0.5 pt from losing/winning by 10.5 pt at all.
2) Playouts include some foolish moves (usually with low but nonzero probability), for example failing to connect large groups in atari, in order to keep their randomness.
3) The position is in the early endgame, where no move gains more than, for example, 2 pt under perfect play.
4) Black is behind by 0.5 pt.

Under the above conditions the playouts may return a winning but gambling move (perhaps with low probability), especially when the number of playouts is small, which is usually the case on 19x19, and UCT will choose it. The question is: which is better, to stay 0.5 pt behind or to play gambling moves (by which I mean moves where B will lose many points if W answers correctly) in the hope of W's (stupid) mistakes?

In addition to the above, there is one more issue to consider. If the playout has a systematic error, nakade for example, it's not good to stay just 0.5 pt ahead. Having more margin is clearly better.

The idea of floating komi helps with the above two. I'd like to emphasize that I know it's not a universal solution. However, as the nakade problem seems very hard to solve, it could be a practical solution.

-Hideki
--
[EMAIL PROTECTED] (Kato)
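Condition 1 and the komi shift can be sketched in a few lines. This is an illustration only, not code from any program mentioned in the thread: the komi value of 7.5, the area counts, the offsets, and the names `playout_result` and `result_with_offset` are all assumptions chosen to keep the arithmetic easy to follow.

```python
REAL_KOMI = 7.5  # assumed value, for illustration only


def playout_result(black_area, white_area, komi):
    """Digitized playout result from Black's view: 1 for a win, 0 for a loss."""
    return 1 if black_area - white_area > komi else 0


def result_with_offset(black_area, white_area, offset):
    """Score a playout against a shifted ("floating") komi.

    A negative offset relaxes Black's target, a positive offset demands
    a larger margin than the real komi requires.
    """
    return playout_result(black_area, white_area, REAL_KOMI + offset)


# With the real komi, losing by 0.5 and losing by 10.5 look identical.
close_loss = playout_result(184, 177, REAL_KOMI)  # Black +7 on the board
big_loss = playout_result(179, 182, REAL_KOMI)    # Black -3 on the board
print(close_loss, big_loss)  # prints "0 0"

# Relaxing komi by 1 point lets the search tell the two lines apart.
print(result_with_offset(184, 177, -1.0), result_with_offset(179, 182, -1.0))  # prints "1 0"

# Raising komi by 2 points rewards keeping a margin when narrowly ahead.
print(result_with_offset(185, 176, 2.0), result_with_offset(186, 175, 2.0))  # prints "0 1"
```

In this reading, a negative offset makes the line that stays 0.5 pt behind score as a win during search, so it is no longer tied with a collapse, while a positive offset asks for the extra margin that would absorb a systematic playout error such as nakade. How the offset should be chosen or updated during a game is not specified in the post.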