2009/8/12 Stefan Kaitschick <[email protected]>

> What a bot does with its playouts in a handicap situation is to
> essentially try to beat itself, despite the handicap.
>
> And in this situation the bot reacts in a very human way, it becomes
> despondent.
>
> Adjusting the komi dynamically shifts the goal from winning to catching up
> quickly enough.
>

I think that is the problem though. You have only one thing you can
control: how to set the komi before doing the search. But how the program
deals with your artificial (and crude) setting is unpredictable. What you
really need is some kind of way to tell it to try to win some territory,
but not spoil its chances of winning a bit more later. It's easier to win
N points if you know in advance that you will not be asked later to win N
more points. And I'm afraid that is what will happen all too often - the
program will maximize its chances of winning N, but this does not always
translate directly into winning N plus MORE.

>
> I feel that that is the natural handicap strategy, not a band-aid.
>

It's a scalability issue, which is why I call it a band-aid. It's not
natural because it's an artificial goal, and certainly not the ACTUAL
goal, which is to win the game. Do you want to win N points, or do you
want to win the game? And we all KNOW that it will try to maximize its
chances of winning N points, regardless of the consequences beyond that.

You would never ask a runner to stop 50 feet short of the finish line,
then ask him to go 10 feet more, and so on. The runner plans his strategy
based on the actual distance to be run, and anything else would change his
pacing strategy in a bad way.

If the program makes decisions about the best way to win N points, there
is no guarantee that this is ALSO the best way to win N+1 points. This is
the implicit assumption in this strategy: that the best way to win with
ANY komi is the same and that the same moves are just as good no matter
what. In fact, the more you must win by, the more chances you must take.

I believe the only thing wrong with the current MCTS strategy is that you
cannot get a statistically meaningful number of samples when almost all
games are won or lost. You can get a more meaningful NUMBER of samples by
adjusting komi, but unfortunately you are sampling the wrong thing - an
approximation of the actual goal.

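A minimal sketch of the argument above, under loudly stated assumptions: the final margin after each candidate move is modelled as a normal distribution, and the two moves and all their numbers are made up purely for illustration. Nothing here comes from a real engine. It only shows how the move that maximizes the chance of reaching an adjusted target N can differ from the move that maximizes the chance of actually winning.

```python
import math


def win_probability(mean, sd, target):
    """P(final margin > target), assuming a normally distributed margin."""
    return 0.5 * math.erfc((target - mean) / (sd * math.sqrt(2)))


# Hypothetical candidate moves for the side that is far behind:
# (expected final margin in points, standard deviation in points).
MOVES = {
    "solid": (-30.0, 10.0),  # loses by about 30, with little variance
    "risky": (-35.0, 25.0),  # slightly worse on average, much wider spread
}


def best_move(target):
    """Return the move with the highest chance of beating `target`."""
    return max(MOVES, key=lambda name: win_probability(*MOVES[name], target))


# target = 0 is the real goal (win the game); negative targets stand in
# for a dynamic komi such as "don't lose by more than 30 points".
for target in (-30, -15, 0):
    rates = {name: win_probability(m, s, target) for name, (m, s) in MOVES.items()}
    summary = ", ".join(f"{name}={rate:.4f}" for name, rate in rates.items())
    print(f"target {target:+d}: {summary} -> {best_move(target)}")
```

With these invented numbers, the solid move is preferred when the target is -30, while the risky move is preferred for the real goal of 0, which is the sense in which winning by more requires taking more chances. The win rates at target 0 are also tiny for both moves, which is the sampling problem that motivates adjusting komi in the first place.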
Since the approximation may be wrong or right, your algorithm is not
scalable. You could run on a billion processors, sampling billions of
nodes per second, with no flaw in the search or the playouts, and still
play a move that gives you no chance of winning.

- Don


> Of course, the dynamic komi must be adjusted down to zero in good time.
>
> I think there are 2 main reasons that this hasn't been fully explored so far.
>
> 1. Trying to maximize the score turned out to be a huge mistake, compared
> to trying to maximize the winrate.
> This makes dynamic komi a kind of blind spot.
>
> 2. Handicap go wasn't given special attention so far.
>
>
> Stefan
>
>
> ----- Original Message -----
> *From:* Don Dailey <[email protected]>
> *To:* computer-go <[email protected]>
> *Sent:* Wednesday, August 12, 2009 11:24 PM
> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>
> Terry,
>
> I understand the reasoning behind this, your thought experiment did not add
> anything to my understanding. And I agree that if the program is strong
> enough and the handicap is high enough this is probably better than doing
> nothing at all.
>
> However, I think there must be something that is more along the lines of
> treating the disease, not the symptoms. You might be able to put a band
> aid on the problem but you have not addressed the real issue in a systematic
> way.
>
> Besides, I have not yet seen anyone demonstrate that this works - it's
> always talked about but never implemented. It is made to sound so simple
> that you have to wonder where the implementation is and why the strong
> programs do not have it.
>
> - Don
>
>
> 2009/8/12 terry mcintyre <[email protected]>
>
>> Consider this thought experiment.
>>
>> You sit down at a board and your opponent has a 9-stone handicap.
>>
>> By any objective measure of the game, you should resign immediately.
>>
>> All your win-rate calculations report this hopeless state of affairs.
>>
>> Winrate gives you no objective basis to prefer one move or another.
>>
>> But, you think, what if I can make a small group? What if I try for a
>> lesser goal, such as "don't lose by more than 90 points?"
>>
>> Your opponent has a 9 stone handicap because he makes more mistakes than
>> you do.
>>
>> As the game progresses, those mistakes add up. You set your goal higher -
>> losing by only 50 points; losing by only 10 points.
>>
>> The changing goal permits you to discriminate in a field which would
>> otherwise look like a dark, desolate, win-less landscape.
>>
>> Terry McIntyre <[email protected]>
>>
>> “We hang the petty thieves and appoint the great ones to public office.”
>> -- Aesop
>>
>> ------------------------------
>> *From:* Don Dailey <[email protected]>
>> *To:* computer-go <[email protected]>
>> *Sent:* Wednesday, August 12, 2009 1:05:36 PM
>> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>>
>> Ok, I misunderstood his testing procedure. What he is doing is far more
>> scientific than what I thought he was doing.
>>
>> There has got to be something better than this. What we need is a way to
>> make the playouts more meaningful but not by artificially reducing our
>> actual objective which is to win.
>>
>> For the high handicap games, shouldn't the goal be to maximize the
>> score? Instead of adjusting komi why not just change the goal to win as
>> much of the board as possible? This would be far more honest and reliable
>> I would think and the program would not be forced to constantly waste effort
>> on constantly changing goals.
>>
>>
>> - Don
>>
>>
>> On Wed, Aug 12, 2009 at 3:33 PM, Brian Sheppard <[email protected]> wrote:
>>
>>> >The small samples is probably the least of the problems with this. Do you
>>> >actually believe that you can play games against it and not be subjective in
>>> >your observations or how you play against it?
>>>
>>> These are computer-vs-computer games.
>>> Ingo is manually transferring moves between two computer opponents.
>>>
>>> The result does support Ingo's belief that dynamic Komi will help programs
>>> play high handicap games. Due to small sample size it isn't very strong
>>> evidence. But maybe it is enough to induce a programmer who actually plays
>>> in such games to create a more exhaustive test.
>>>
>>> _______________________________________________
>>> computer-go mailing list
>>> [email protected]
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
