Re: [computer-go] Scoring - step function or sigmoid function?
Thinking about why... In a given board position moves can be grouped into sets: the set of correct moves, the set of 1pt mistakes, 2pt mistakes, etc. Let's assume each side has roughly the same number of moves each in each of these groupings. If black is winning by 0.5pt with perfect play, then mistakes by each side balance out and we get a winning percentage of just over 50%. If he is winning by 1.5pt then he has breathing space and can make an extra mistake. Or in other words, at a certain move he can play any of the moves in the correct moves set, or any of the moves in the 1pt mistakes set, and still win. So he wins more of the playouts. Say 55%. If he is winning by 2.5pts then he can make one 2pt mistakes or two 1pt mistakes (more than the opponent) and still win, so he wins more playouts, 60% perhaps. And so on. My conclusion was that the winning percentage is more than just an estimate of how likely the player is to win. It is in fact a crude estimator of the final score. Going back to your original comment, when choosing between move A that leads to a 0.5pt win, and move B that leads to a 100pt win, you should be seeing move B has a higher winning percentage. Darren Point well taken.Winning positions tend to cluster and critical swing moves are rare, statistically speaking. If the position is more or less evenly balanced, the step function might allready be very close to optimal because of this. But I would like to bring up a well known mc quirk: In handicap positions, or after one side scored a big success in an even game, bots play badly with both sides, until the position becomes closer again. The problem here is that every move is a win (or every move is a loss). On 9*9, its possible to beat a bot, giving it 2 stones, even when it's a close contest on even with komi. All it needs is a single bot missread at the moment the position becomes close(which it will, because the bot will be lazy until that point). So it would be desirable for the bot to make keeping the score advantage large an auxiliary goal. This has been tried ofcourse, but without much success sofar. And it seems that the main reason is that tinkering with the scoring function to achive this, tends to worsen play in competitive situations. I have an alternative suggestion: In handicap games, introduce a virtual komi, that gets reduced to 0 as the game progresses. This would work for the bot on both sides: If the bot has b it will make less lazy plays, if it has w, it will be less maniacal. For example, in a 4 stone 19*19 game, if the real starting advantage is about 45 points, the bot could introduce an internal komi of about 30-35. The bot should be optimistic with b and pessimistic with w, but not to the point that every move evaluates to the same value, and move selection becomes a toss-up. Another way to look at this, is that humans that give a handicap know that they can't usually catch up in one piece. And humans that take a handicap know that they can't give up their advantage too quickly. Virtual komi encodes this simple knowledge. During the course of the game this internal komi would ofcourse have to be reduced to 0. The proper criteria can only be found by experimentation, but the important factors will be how far the game has progressed, and what the win rate is for the best move. If the bot becomes pessimistic with b it should lower the internal komi more quickly. One advantage of this approach is that it doesn't mess up even game play. A more elaborate scheme would be to make a komi search before the real search - to find the best ratio of win rate to internal komi before making the normal move search with this komi. This could also be useful in even play after one side pulled ahead. Stefan ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Scoring - step function or sigmoid function?
On Mon, Jun 8, 2009 at 7:35 AM, Stefan Kaitschick stefan.kaitsch...@hamburg.de wrote: Thinking about why... In a given board position moves can be grouped into sets: the set of correct moves, the set of 1pt mistakes, 2pt mistakes, etc. Let's assume each side has roughly the same number of moves each in each of these groupings. If black is winning by 0.5pt with perfect play, then mistakes by each side balance out and we get a winning percentage of just over 50%. If he is winning by 1.5pt then he has breathing space and can make an extra mistake. Or in other words, at a certain move he can play any of the moves in the correct moves set, or any of the moves in the 1pt mistakes set, and still win. So he wins more of the playouts. Say 55%. If he is winning by 2.5pts then he can make one 2pt mistakes or two 1pt mistakes (more than the opponent) and still win, so he wins more playouts, 60% perhaps. And so on. My conclusion was that the winning percentage is more than just an estimate of how likely the player is to win. It is in fact a crude estimator of the final score. Going back to your original comment, when choosing between move A that leads to a 0.5pt win, and move B that leads to a 100pt win, you should be seeing move B has a higher winning percentage. Darren Point well taken.Winning positions tend to cluster and critical swing moves are rare, statistically speaking. If the position is more or less evenly balanced, the step function might allready be very close to optimal because of this. But I would like to bring up a well known mc quirk: In handicap positions, or after one side scored a big success in an even game, bots play badly with both sides, until the position becomes closer again. The problem here is that every move is a win (or every move is a loss). On 9*9, its possible to beat a bot, giving it 2 stones, even when it's a close contest on even with komi. All it needs is a single bot missread at the moment the position becomes close(which it will, because the bot will be lazy until that point). I would say it's foolish to purposely give the bot 2 stones in order to hope for a misread unless you are expert on that particular behaviour and can predict just where and why it will go wrong. So it would be desirable for the bot to make keeping the score advantage large an auxiliary goal. This has been tried ofcourse, but without much success sofar. And it seems that the main reason is that tinkering with the scoring function to achive this, tends to worsen play in competitive situations. This is easy to understand - it's because maximizing your winning chances is a better strategy than maximizing how many points you take. One is the actual goal of the game (to win) and the other is a different goal which is not as highly corelated as we like to think it is. I have an alternative suggestion: In handicap games, introduce a virtual komi, that gets reduced to 0 as the game progresses. This would work for the bot on both sides: If the bot has b it will make less lazy plays, if it has w, it will be less maniacal. For example, in a 4 stone 19*19 game, if the real starting advantage is about 45 points, the bot could introduce an internal komi of about 30-35. The bot should be optimistic with b and pessimistic with w, but not to the point that every move evaluates to the same value, and move selection becomes a toss-up. Another way to look at this, is that humans that give a handicap know that they can't usually catch up in one piece. And humans that take a handicap know that they can't give up their advantage too quickly. Virtual komi encodes this simple knowledge. During the course of the game this internal komi would ofcourse have to be reduced to 0. The proper criteria can only be found by experimentation, but the important factors will be how far the game has progressed, and what the win rate is for the best move. If the bot becomes pessimistic with b it should lower the internal komi more quickly. In principle this is no different from the usual schemes applied when there is no handicap. In practice, there is one thing different that could make it at least worth a look.When you play WITH a handicap it's because your opponent is weaker than you are.When the opponent has the handicap it's because YOU are the weaker player.So you can use the fact that you are playing in a handicap game to tell you something about your opponent. Now if you are playing against a weaker opponent, your winning chances actually do increase, so by manipulation of the komi you can represent that fact.It's certainly wrong to do this with an equal opponent but perhaps not so bad with a weaker opponent. My guess is that this still won't work, but at least there is something different about these kind of games that could make this worth another look. The reason komi schemes like this probably dont' work well in general is because they
Re: [computer-go] Scoring - step function or sigmoid function?
Since this topic has resurfaced, I'll mention again the alternative strategy of using unbalanced playout rules to compensate for high handicaps. As Don pointed out, the existence of a high handicap *should* indicate that black is more likely to make mistakes. This is simple to model, assuming heavy playouts, by adding a bit more randomness to black moves (only outside the tree). ?Personally, I'm inclined to believe that unbalancing the playouts should be superior to adjusting the Komi. When black's external playout moves are more random, white will be more likely to win unsettled areas, but settled areas will tend to stay settled. Inside the tree, we want white searching for complicated board positions and black searching for simpler ones. In my program, it was easy to find an appropriate adjustment for different handicaps through offline testing. From a purely subjective viewpoint, I found that the resulting opening moves looked much more reasonable. People who have tried dynamically adjusting the Komi, report similar, subjective, success. There might not be any practical difference. It's not obvious to me what a fair test would be. I'm convinced that either method is worth doing in the opening for very high handicaps. Just looking at some examples is pretty persuasive. I'm more doubtful about trying them late in an even game when one side has pulled ahead. - Dave Hillis -Original Message- From: Don Dailey dailey@gmail.com To: computer-go computer-go@computer-go.org Sent: Wed, Jul 8, 2009 8:49 am Subject: Re: [computer-go] Scoring - step function or sigmoid function? On Mon, Jun 8, 2009 at 7:35 AM, Stefan Kaitschick stefan.kaitsch...@hamburg.de wrote: Thinking about why... In a given board position moves can be grouped into sets: the set of correct moves, the set of 1pt mistakes, 2pt mistakes, etc. Let's assume each side has roughly the same number of moves each in each of these groupings. If black is winning by 0.5pt with perfect play, then mistakes by each side balance out and we get a winning percentage of just over 50%. If he is winning by 1.5pt then he has breathing space and can make an extra mistake. Or in other words, at a certain move he can play any of the moves in the correct moves set, or any of the moves in the 1pt mistakes set, and still win. So he wins more of the playouts. Say 55%. If he is winning by 2.5pts then he can make one 2pt mistakes or two 1pt mistakes (more than the opponent) and still win, so he wins more playouts, 60% perhaps. And so on. My conclusion was that the winning percentage is more than just an estimate of how likely the player is to win. It is in fact a crude estimator of the final score. Going back to your original comment, when choosing between move A that leads to a 0.5pt win, and move B that leads to a 100pt win, you should be seeing move B has a higher winning percentage. Darren Point well taken.Winning positions tend to cluster and critical swing moves are rare, statistically speaking. If the position is more or less evenly balanced, the step function might allready be very close to optimal because of this. But I would like to bring up a well known mc quirk: In handicap positions, or after one side scored a big success in an even game, bots play badly with both sides, until the position becomes closer again. The problem here is that every move is a win (or every move is a loss). On 9*9, its possible to beat a bot, giving it 2 stones, even when it's a close contest on even with komi. All it needs is a single bot missread at the moment the position becomes close(which it will, because the bot will be lazy until that point). I would say it's foolish to purposely give the bot 2 stones in order to hope for a misread unless you are expert on that particular behaviour and can predict just where and why it will go wrong.? ? So it would be desirable for the bot to make keeping the score advantage large an auxiliary goal. This has been tried ofcourse, but without much success sofar. And it seems that the main reason is that tinkering with the scoring function to achive this, tends to worsen play in competitive situations. This is easy to understand - it's because maximizing your winning chances is a better strategy than maximizing how many points you take.?? One is the actual goal of the game (to win) and the other is a different goal which is not as highly corelated as we like to think it is.?? ? I have an alternative suggestion: In handicap games, introduce a virtual komi, that gets reduced to 0 as the game progresses. This would work for the bot on both sides: If the bot has b it will make less lazy plays, if it has w, it will be less maniacal. For example, in a 4 stone 19*19 game, if the real starting advantage is about 45 points, the bot could introduce an internal komi of about 30-35. The bot should be optimistic with b and pessimistic with w, but not to the point that every
Re: [computer-go] Scoring - step function or sigmoid function?
To properly test any method of playing with a handicap, today's programs will need to play against much stronger opponents. Self-play, or play against other roughly-equal programs, won't test the ability to eke out a win against a professional go player. Terry McIntyre terrymcint...@yahoo.com “We hang the petty thieves and appoint the great ones to public office.” -- Aesop ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Scoring - step function or sigmoid function?
You can also test against weaker programs with compensating handicap. It might not be quite as interesting, but it's easier to test. 2009/7/8 terry mcintyre terrymcint...@yahoo.com: To properly test any method of playing with a handicap, today's programs will need to play against much stronger opponents. Self-play, or play against other roughly-equal programs, won't test the ability to eke out a win against a professional go player. Terry McIntyre terrymcint...@yahoo.com “We hang the petty thieves and appoint the great ones to public office.” -- Aesop ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Prediction
--- On Wed, 7/8/09, Michael Williams michaelwilliam...@gmail.com wrote: From: Michael Williams michaelwilliam...@gmail.com Prediction: First, developers and algorithm gurus will expend huge amounts of effort to parallelize code to take advantage of multi-core chips. Then, hardware engineers and physics gurus will give us light-based CPUs and orders of magnitude improvement in single-core performance. Quite all right! light-based CPUs will come in multi-core versions too! ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] Prediction
Prediction: First, developers and algorithm gurus will expend huge amounts of effort to parallelize code to take advantage of multi-core chips. Then, hardware engineers and physics gurus will give us light-based CPUs and orders of magnitude improvement in single-core performance. ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/