"You are giving the program an arbitrary short term goal which may,  or may not 
be compatible with the long term goal of winning the game."

Don,

this is a very important consideration. How can an illusionary goal be better 
than the real goal?
But I would argue that in the handicap situation, catching up quickly enough is 
actually the real goal.
You write:
"And as the base program gets stronger this aspect of the program becomes more 
and more of a wart."

This I disagree with. No matter how strong the program becomes, it 
will never find a way to defeat itself against a large handicap, and defeating 
itself is effectively what a program tries to do with its playouts.
The only reasonable alternative to trying to catch up quickly enough is to 
model the weaker player's errors directly in the playouts and try to find a 
direct win. But this seems more speculative to me than dynamic komi. Surely it 
is also harder to implement well.
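
To make "catching up quickly enough" concrete, here is a minimal sketch of a 
fixed catch-up schedule. Everything in it is an assumption for illustration: 
the function name, the 10 points per handicap stone, and the 150-move horizon 
are hypothetical values, not taken from any existing program.

```python
def scheduled_komi(move_number, handicap_stones,
                   points_per_stone=10.0, catch_up_moves=150):
    """Virtual komi granted to the handicap giver inside its own playouts.

    Starts at the assumed point value of the handicap and decays linearly
    to zero, so the playouts are asked to catch up a little more each move.
    Both default parameters are illustrative guesses, not tuned values.
    """
    initial = handicap_stones * points_per_stone
    # Fraction of the schedule still ahead of us, clamped at zero.
    remaining = max(0.0, 1.0 - move_number / catch_up_moves)
    return initial * remaining


if __name__ == "__main__":
    for move in (0, 75, 150):
        print(move, scheduled_komi(move, handicap_stones=9))
```

Once the schedule reaches zero, the playouts are scored with the true komi 
again, so the short term goal and the real goal coincide from then on.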

Stefan

  ----- Original Message ----- 
  From: Don Dailey 
  To: computer-go 
  Sent: Wednesday, August 12, 2009 11:11 PM
  Subject: Re: [computer-go] Dynamic komi at high handicaps


  The problem with MCTS programs is that they like to consolidate. You set 
the komi and thereby give them a goal, and they very quickly make moves which 
commit to that specific goal. Committing to less than you need to actually win 
will often involve sacrificing chances to win. Sometimes it won't, but you 
cannot have a scalable algorithm which is this arbitrary.

  However, if the handicap is too high, the program thinks every line is a loss 
and it plays randomly.   That's why we even consider doing this.

  Dynamically changing komi could be of some benefit in that situation if there 
is no reasonable alternative strategy, but it does not address the real 
problem, which is what I call the "committal consolidation" problem. You 
are giving the program an arbitrary short term goal which may, or may not be 
compatible with the long term goal of winning the game. Whether it's 
compatible or not is based on your own credulity, not on anything predictable or 
that you can scale. And as the base program gets stronger this aspect of the 
program becomes more and more of a wart.

  If this can be made to work in the short term,  it should be considered a 
temporary hack which should be fixed as soon as possible.   

  We have to think about this anyway sooner or later, because if programs 
continue to develop and the predictive ability of the playouts and tree search 
gets several hundred Elo better, these programs may start to see more and more 
positions as either dead won or dead lost. I'm sure we will want some kind 
of robust mechanism for dealing with this, one which is better at estimating 
the chances that the opponent will go wrong, as opposed to doing something 
that is a random benefit or hindrance.

  - Don


   





  2009/8/12 terry mcintyre <[email protected]>

    Ingo suggested something interesting - instead of changing the komi 
according to the move number, or some other fixed schedule, it varies according 
to the estimated winrate. 
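
A minimal sketch of such a winrate-driven adjustment, under stated 
assumptions: the function name, the 40%/60% thresholds, the one-point step, 
and the clamp at zero are hypothetical choices for illustration, not Ingo's 
actual rule or any program's implementation.

```python
def adjust_komi(komi, winrate, low=0.40, high=0.60, step=1.0):
    """Return the virtual komi bonus to use for the next search.

    `komi` is the bonus (in points) currently granted to the program in its
    own playouts; `winrate` is the search's estimated win rate under that
    bonus. Thresholds and step size are illustrative assumptions.
    """
    if winrate < low:
        # Nearly every line looks lost: ease the target so the playouts
        # can still distinguish good moves from bad ones.
        return komi + step
    if winrate > high:
        # The eased target is being met comfortably: move back toward the
        # real goal, but never past it (bonus of zero means true komi).
        return max(0.0, komi - step)
    return komi


if __name__ == "__main__":
    komi = 20.0
    for estimate in (0.10, 0.50, 0.90):
        komi = adjust_komi(komi, estimate)
        print(estimate, komi)
```

Unlike a fixed schedule, nothing here forces the bonus to reach zero by the 
end of the game; that depends entirely on how the estimated winrate evolves.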

    It also, implicitly, depends on one's guess of the ability of the opponent. 

    An interesting test would be to take an opponent known to be weaker, offer 
it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what 
handicap does the ratio balance at 50:50? Can the number of handicap stones be 
increased with such an adaptive algorithm?

    Even better, play against a stronger opponent; can one increase the win 
rate versus strong opponents?

    The usual range of computer opponents is fairly narrow. None approach 
high-dan levels on 19x19 boards - yet.


    Terry McIntyre <[email protected]>


    “We hang the petty thieves and appoint the great ones to public office.” -- 
Aesop



----------------------------------------------------------------------------
    From: Brian Sheppard <[email protected]>
    To: [email protected]
    Sent: Wednesday, August 12, 2009 12:33:13 PM
    Subject: [computer-go] Dynamic komi at high handicaps


    >The small samples is probably the least of the problems with this.  Do you
    >actually believe that you can play games against it and not be subjective
    >in your observations or how you play against it?

    These are computer-vs-computer games. Ingo is manually transferring moves
    between two computer opponents.

    The result does support Ingo's belief that dynamic Komi will help programs
    play high handicap games. Due to small sample size it isn't very strong
    evidence. But maybe it is enough to induce a programmer who actually plays
    in such games to create a more exhaustive test.

    _______________________________________________
    computer-go mailing list
    [email protected]
    http://www.computer-go.org/mailman/listinfo/computer-go/









------------------------------------------------------------------------------


