Re: [computer-go] Dynamic komi at high handicaps

Don Dailey Wed, 19 Aug 2009 10:03:08 -0700

2009/8/19 terry mcintyre <terrymcint...@yahoo.com>

> Consider the game when computer is black, with 7 stones against a very
> strong human opponent.
>
> Computer thinks every move is a winning move; it plays randomly; a
> half-point win is as good as a 70-point win.
>
> Pro gains ground as computer makes "slack moves", taking slightly less than
> its full due.
>
> At some point, the computer, being the weaker player, makes one "slack
> move" too many and loses the game.
>
> Rinse and repeat.
>


All you are doing here is repeating the idealized scenario that illustrates
the rationale for this idea.

That is a fine way to describe how this has potential to be a solution to
the problem,  but it doesn't explain why it has not been made to work and it
does not address the fact that your scenario is highly idealized - not all
positions work that way with everything so cut and dried.

I can do exactly the same thing and I have been,  constructing scenarios
that will fail with any heuristic you propose.   But that does not prove or
disprove anything.

So I propose that we need to start thinking about why this has been so
resistant to success and consider the possibility that it doesn't work, or
that we are not properly addressing the reason why it doesn't work.    And
to do this YOU have to be the one that constructs counter-examples.


>
> At some point, it dawns on the programmer: must attack to win handicap
> games. Must be a little bit greedy, to slow down the process of attrition.
>

The attack part I agree with, the greed I do not.


>
>
> Dynamic komi models something real: the significant advantage of the
> computer in a handicap game. It tries to preserve as much of that advantage
> as possible.
>

I think the problem, and what I consider your misconception, is revealed
here.   You use the word "advantage" incorrectly.   Grabbing up points on
the board is a different concept from having or not having an "advantage".

Your solution is to suddenly switch to an inferior definition of
"advantage",  one that is clearly broken, otherwise we would be using point
count instead of win count in our programs.

So I don't see this at all as "trying to preserve the advantage",  I see it
as giving it away.

I really believe the secret has to be in the playouts - we use those to
estimate our chances.   If you change komi the playouts try to estimate the
chances that you will win at that NEW KOMI,  not at some other komi.
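
To make that concrete, here is a minimal sketch (not taken from any actual
program) of how a win rate computed from playout results depends on the komi
used for scoring. The helper `win_rate` and the two lists of final margins are
hypothetical, made-up numbers chosen only for illustration; a real engine
would get its margins from its own playouts.

```python
def win_rate(margins, komi):
    """Fraction of playouts won when `komi` points are charged against us.

    `margins` are final score margins from our point of view, before komi.
    """
    wins = sum(1 for m in margins if m - komi > 0)
    return wins / len(margins)


# Hypothetical playout margins for two candidate moves (illustrative only).
safe_move = [3, 4, 5, 2, 6, 4, 3, 5, 1, 4]                # small, reliable lead
greedy_move = [30, 35, -5, 28, -10, 32, 40, -8, 31, 29]   # big lead, sometimes collapses

for komi in (0.5, 20.5):
    print(komi, win_rate(safe_move, komi), win_rate(greedy_move, komi))
```

With these made-up numbers the safe move wins every playout at the real komi
of 0.5, while the greedy move wins only 7 of 10. Score the same playouts at a
komi of 20.5 and the preference flips, because the estimate now answers a
different question: the chance of winning at 20.5, not at 0.5.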




>
> I don't know if it will work for computer vs human games.
>
> I do know that a similar idea helped me defeat a human player and reduce my
> handicap by 3 stones. Not having the patience for thousands of 30-minute
> games to achieve statistically valid results, I settled for trouncing my
> opponent three games in a row by a large margin, then doing it again with a
> smaller handicap for three more games. I can't win by 70 stones in a 7 stone
> game, but 20 or 30 was enough to prove my point. If I made random plays
> under the assumption that I still had a half-point win, my opponent's
> predictive powers would be superior to mine - that's why he gives me a
> handicap and not vice versa.
>

That trick helped you due to human psychology.  Humans have a tendency to
rise to the occasion, and computers do not know how to "try harder" like we
do.     In computer chess one of the strengths of the old programs was that
when they were losing they did not become disheartened like human players
often do.     Once I win a pawn or two or a piece you can sometimes feel
your opponent "resign" even if he doesn't say the words right away.

You also assume the computer program is being sloppy, which could not be
farther from the truth.  If the computer plays a random-looking move, it's
only because the move has no effect on the winning chances that are within
the computer's ability to estimate.    And the move is only "random" because
you define it to be so.
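
One way to picture "within the computer's ability to estimate" is a rough
significance check on two win rates. This is only a sketch under a simple
binomial assumption (independent playouts); the function name, the
two-standard-error threshold and the playout counts are illustrative choices,
not something any particular program is known to use.

```python
import math


def indistinguishable(wins_a, n_a, wins_b, n_b, z=2.0):
    """True if two win rates differ by no more than z standard errors."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    # Standard error of the difference of two independent binomial proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return abs(p_a - p_b) <= z * se


# Far ahead: 98.0% vs 97.9% over 10000 playouts each is inside the noise.
print(indistinguishable(9800, 10000, 9790, 10000))
# Even game: 55% vs 50% over 10000 playouts each is a real difference.
print(indistinguishable(5500, 10000, 5000, 10000))
```

In the first case the program has no measurable reason to prefer one move over
the other, so whichever it picks will look arbitrary to an observer.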

If you try the same thing,  you are just being cocky and you are letting
your guard down.   We humans must always be vigilant and keep our interest
in the game as high as possible and the best way to do this is to continue
to fight - what is called the "follow through" in baseball.       In tennis
doubles, after breaking the opponent's serve we used to say, "now that we have
them down, let's kick them."     You almost have to play like you are losing
in order to win if you are human.   But computers always play at full
throttle.



>
>
> I can't really be sure that my prediction of a 22.5 point win is exact to
> the last decimal point - but if it should be within 5 or 10 or even 20, I'm
> perfectly happy.
>
> It's nice that statistics of a series of one-bit values are so useful, but
> when a significant fraction of those one-bit values are 100% wrong, that
> introduces a bit of noise to one's estimates. One hopes that they balance
> evenly, but perhaps they do not.
>
> Terry McIntyre <terrymcint...@yahoo.com>
>
> “We hang the petty thieves and appoint the great ones to public office.” --
> Aesop
>
> ------------------------------
> *From:* Don Dailey <dailey....@gmail.com>
> *To:* computer-go <computer-go@computer-go.org>
> *Sent:* Wednesday, August 19, 2009 6:03:50 AM
> *Subject:* Re: [computer-go] Dynamic komi at high handicaps
>
> One must decide if the goal is to improve the program or to improve its
> playing behavior when it's in dead won or dead lost positions.
>
> It's my belief that you probably cannot improve the playing strength
> solely with komi manipulation,  but at a slight decrease in playing strength
> you can probably improve the behavior, as measured by a willingness to fight
> for space that is technically not relevant to the goal of winning the
> game.    And only then if this is done carefully.      However I believe
> there are better ways,  such as pre-ordering the moves.
>
> Even if this can prove to be a gain,  you are really working very hard to
> find something that only kicks in when the game is already decided - how to
> play when the game is already won or already lost.    But only in the case
> where the game is lost is this very interesting from the standpoint of making
> the program stronger.
>
> And even this case is not THAT interesting, because if you are losing, on
> average you are losing to stronger players.   So you are working hard on an
> algorithm to beat stronger players when you are in a dead lost game?   How
> much sense does that make?
>
> So the only realistic pay-off here is how to salvage lost games against
> players that are relatively close in strength since you can expect not to be
> in this situation very often against really weak players.    So you are
> hoping to bamboozle players who are not weaker than you - in situations
> where you have been bamboozled (since you are losing,  you are the one being
> outplayed.)
>
> That is why I believe that at best you are looking at only a very minor
> improvement.    If I were working on this problem I would be focused only on
> the playing style,  not the playing strength.
>
> If you want more than the most minor playing strength improvement out of
> this algorithm, you have to start using it BEFORE the loss is clear,  but
> then you are no longer playing for the win when you lower your goals,  you
> are playing for the loss.
>
> - Don
>
>
>
>
> 2009/8/19 Stefan Kaitschick <stefan.kaitsch...@hamburg.de>
>
>>  One last rumination on dynamic komi:
>>
>> The main objection against introducing dynamic komi is that it ignores the
>> true goal
>> of winning by half a point. The power of the win/loss step function as
>> scoring function underscores
>> the validity of this critique. And yet, the current behaviour of mc bots,
>> when either leading or trailing by a large margin, resembles random play.
>> The simple reason for this is that either every move is a win or every
>> move is a loss.
>> So from the perspective of securing a win, every move is as good as any
>> other move.
>> Humans know how to handle these situations. They try to catch up from
>> behind, or try to play safely while defending enough of a winning margin.
>> For a bot, there are some numerical clues when it is misbehaving.
>> When the calculated win rate is either very high or low and many move
>> candidates have almost identical win rates, the bot is in coin toss country.
>> A simple rule would be this: define a minimum value that has to separate
>> the win rate of the 2 best move candidates.
>> Do a "normal" search without komi.
>> If the minimum difference is not reached, do a new search with some
>> komi, but only allow the moves within the minimum value range from the best
>> candidate.
>> Repeat this with progressively higher komi until the two best candidates
>> are sufficiently separated. (Or until the win rate is in a defined middle
>> region)
>> There can be some traps here, a group of moves can all accomplish the same
>> critical goal.
>> But I'm sure this can be handled. The main idea is to look for a less
>> ambitious goal when the true goal cannot be reached.
>> (Or a more ambitious goal when it is already satisfied). By only allowing
>> moves that are in a statistical tie in the 0 komi search,
>> it can be assured that short term gain doesn't compromise the long term
>> goal.
>>
>> Stefan
>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/
>>
>
>
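
For reference, the rule in Stefan's quoted message (search at zero komi, and
if the two best candidates are not separated by a minimum win-rate gap, search
again with progressively higher komi among the tied moves only) can be
sketched roughly as follows. Everything here is an assumption made for
illustration: the `search` callback, the parameter names and defaults, the
choice to move komi against the side that is ahead, and the toy margin table
standing in for real playouts.

```python
def pick_move(search, moves, min_gap=0.05, komi_step=5.0, max_komi=60.0,
              middle=(0.3, 0.7)):
    """Return (move, komi_used). `search(moves, komi)` -> {move: win_rate}."""
    candidates = list(moves)
    komi = 0.0
    direction = 0.0
    while True:
        rates = search(candidates, komi)
        ranked = sorted(candidates, key=lambda m: rates[m], reverse=True)
        best = ranked[0]
        separated = len(ranked) < 2 or rates[best] - rates[ranked[1]] >= min_gap
        # The "defined middle region" stop applies only to komi-adjusted searches.
        in_middle = komi != 0.0 and middle[0] <= rates[best] <= middle[1]
        if separated or in_middle or abs(komi) >= max_komi:
            return best, komi
        if direction == 0.0:
            # Assumption: raise the bar when ahead, lower it when behind.
            direction = 1.0 if rates[best] > 0.5 else -1.0
        # Only moves that tie with the best candidate stay eligible.
        candidates = [m for m in ranked if rates[best] - rates[m] < min_gap]
        komi += direction * komi_step


# Toy stand-in for a real search: hypothetical final margins per move.
MARGINS = {
    "a": [40, 42, 38, 41, 39, 40, 43, 37, 40, 41],
    "b": [12, 15, 9, 14, 11, 13, 10, 16, 12, 14],
    "c": [-5, 30, 35, -8, 33, 31, -2, 34, 36, 32],
}


def toy_search(moves, komi):
    return {m: sum(1 for x in MARGINS[m] if x > komi) / len(MARGINS[m])
            for m in moves}


print(pick_move(toy_search, ["a", "b", "c"]))
```

In the toy data, moves "a" and "b" both win every playout at zero komi, so "c"
is dropped and the komi is raised until "a" pulls ahead of "b". Whether the
move chosen this way is actually stronger in real games is exactly the
question under dispute in this thread; the sketch only shows the mechanics.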
