Lets say there is a semeai and Black wins 6.5 points if it wins the semeai. Otherwise White wins 3.5 points.
Playout 1 B+6.5 Playout 2 B+6.5 Playout 3 B+6.5 Playout 4 B+6.5 Playout 5 B+6.5 Playout 6 B+6.5 Playout 7 B+6.5 Playout 8 B+6.5 Playout 9 B+6.5 Playout 10 W+3.5 Average score = (6.5 * 9 + (- 3.5)) / 10 = 5.5 In this case, shifting komi to something like “Black 0.5 point win” (namely increasing komi by 5) does not work. Black simply has to win this semeai to win. So I think Zach’s idea quite makes sense. Actually I had a similar idea in the past but still have no time to try it. Aja From: Stefan Kaitschick Sent: Monday, January 09, 2012 2:28 PM To: Aja Huang ; computer-go@dvandva.org Subject: Re: [Computer-go] replacing dynamic komi with a scoring function The correct answer is 0, because shifting by just 1 point drops the rate to 10% But is this really a dynamic komi problem? I mean, is there really a graceful way to misevaluate semeais? apropos semeais: would it be feasible to expand RAVE to include the information of the previous move? ( like the "killer heuristic" in chess) It could be limited to the 5 by 5 surroundings, so that the previous move is one of 24 points or "other". But that's still not cheap computationally. I just have no idea how much effort it takes to maintain the current RAVE. Stefan On Mon, Jan 9, 2012 at 8:14 PM, Aja Huang <ajahu...@gmail.com> wrote: Interesting. I have observed a problem in the current dynamic komi scheme especially when there is a big semeai/life-and-death: Playout 1 B+0.5 Playout 2 B+0.5 Playout 3 B+0.5 Playout 4 B+0.5 Playout 5 B+0.5 Playout 6 B+0.5 Playout 7 B+0.5 Playout 8 B+0.5 Playout 9 B+0.5 Playout 10 W+8.5 B has 90% winning rate. How many points should we shift for dynamic komi? Aja -----原始郵件----- From: Zach Wegner Sent: Monday, January 09, 2012 11:53 AM To: computer-go@dvandva.org Subject: Re: [Computer-go] replacing dynamic komi with a scoring function On Mon, Jan 9, 2012 at 7:26 AM, Vlad Dumitrescu <vladd...@gmail.com> wrote: Hi, On Mon, Jan 9, 2012 at 13:17, Don Dailey <dailey....@gmail.com> wrote: Summary: I believe a more correct scoring function won't be based on how much you win by OR how often you win but will incorporate some other more relevant concept and it will be dynamic. And it will not matter if the game is a handicap game or otherwise because the scoring function will always be relevant. The goal will be to maximize your winning chances but it will incorporate something more sophisticated that just counting how often you win or how much you win by. I hope I may interfere with something that Don's nice description revealed to me. It feels rather obvious, but since nobody stated it explicitly, maybe it's news for at least some people here. MCTS is maximizing the chances of winning. These chances are largest for a minimal score difference because this allows for making some errors. Winning by the largest possible score has rather small chances to happen because every move has to be perfect. The curve describing the probability of ending the game with a certain score is bell-shaped and MCTS explores the area beneath it, looking for winning moves. With handicap, the disadvantaged side is getting less samples explored, making it less likely to discover the really good moves. Dynamic komi shifts the bell left or right in order to equalize the sampling on both sides, but as mentioned it isn't dynamic enough (the curve changes after each move) and also is actually using a different shape for the curve than the real "handicap curve". In theory, I think that the solution for keeping the same level of play with handicap as without would be to make sure that the the disadvantaged side gets just as many samples with or without handicap. That is, use more playouts when playing with handicap. In practice, this is probably prohibitive... I wonder if it might be possible to estimate the shape of this curve after each move and use that estimate to dynamically adjust the number of playouts. One might have to use higher precision calculations, too, so that the noise doesn't get too loud. Does this make any sense? Has anyone tried something like this? best regards, Vlad _______________________________________________ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go This is similar to ideas I've had to solve this. Of course, I'm not a go programmer, so it's really just idle curiosity for me when reading this list. Perhaps an ideal scoring function is "highest median score". You can do this in a slow/memory intensive way by storing a histogram of final score in each node instead of wins/losses. Wins/losses can then be computed by summing the histogram buckets above/below the komi level. Computing the median is simply finding the bucket where the sum crosses 50%. Other percentages could be used here as well, though I'm not quite sure what effect this might have. This solves the problem of getting far less than one bit of information per playout when the win rate is very high or low, and also supersedes dynamic komi--if you have the full histogram at any node, you can compute the exact win rate for any arbitrary komi. Using this in practice might be fairly difficult, and would probably involve making the histogram "lossy" in some sense. I would be interested if somebody here could run tests with this as a limit study though--keep the number of nodes allocated and playouts per move constant, and see the effect on strength. Zach _______________________________________________ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go _______________________________________________ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
_______________________________________________ Computer-go mailing list Computer-go@dvandva.org http://dvandva.org/cgi-bin/mailman/listinfo/computer-go