Re: [Computer-go] replacing dynamic komi with a scoring function

Aja Huang Mon, 09 Jan 2012 13:45:40 -0800

Lets say there is a semeai and Black wins 6.5 points if it wins the semeai. 
Otherwise White wins 3.5 points.


Playout 1    B+6.5
Playout 2    B+6.5
Playout 3    B+6.5
Playout 4    B+6.5
Playout 5    B+6.5
Playout 6    B+6.5
Playout 7    B+6.5
Playout 8    B+6.5
Playout 9    B+6.5
Playout 10 W+3.5


Average score = (6.5 * 9 + (- 3.5)) / 10 = 5.5

In this case, shifting komi to something like “Black 0.5 point win” (namely 
increasing komi by 5) does not work. Black simply has to win this semeai to 
win. So I think Zach’s idea quite makes sense. Actually I had a similar idea in 
the past but still have no time to try it.

Aja


From: Stefan Kaitschick 
Sent: Monday, January 09, 2012 2:28 PM
To: Aja Huang ; computer-go@dvandva.org 
Subject: Re: [Computer-go] replacing dynamic komi with a scoring function

The correct answer is 0, because shifting by just 1 point drops the rate to 10%
But is this really a dynamic komi problem?
I mean, is there really a graceful way to misevaluate semeais?
apropos semeais: would it be feasible to expand RAVE to include the information 
of the previous move?
( like the "killer heuristic" in chess)
It could be limited to the 5 by 5 surroundings, so that the previous move is 
one of 24 points or "other".
But that's still not cheap computationally. I just have no idea how much effort 
it takes to maintain the current RAVE.

Stefan


On Mon, Jan 9, 2012 at 8:14 PM, Aja Huang <ajahu...@gmail.com> wrote:

  Interesting. I have observed a problem in the current dynamic komi scheme 
especially when there is a big semeai/life-and-death:

  Playout 1   B+0.5
  Playout 2   B+0.5
  Playout 3   B+0.5
  Playout 4   B+0.5
  Playout 5   B+0.5
  Playout 6   B+0.5
  Playout 7   B+0.5
  Playout 8   B+0.5
  Playout 9   B+0.5
  Playout 10 W+8.5

  B has 90% winning rate. How many points should we shift for dynamic komi?

  Aja

  -----原始郵件----- From: Zach Wegner
  Sent: Monday, January 09, 2012 11:53 AM 

  To: computer-go@dvandva.org
  Subject: Re: [Computer-go] replacing dynamic komi with a scoring function


  On Mon, Jan 9, 2012 at 7:26 AM, Vlad Dumitrescu <vladd...@gmail.com> wrote:

    Hi,

    On Mon, Jan 9, 2012 at 13:17, Don Dailey <dailey....@gmail.com> wrote:

      Summary:

      I believe a more correct scoring function won't be based on how much you 
win
      by OR how often you win but will incorporate some other more relevant
      concept and it will be dynamic.    And it will not matter if the game is
      a handicap game or otherwise because the scoring function will always be
      relevant.   The goal will be to maximize your winning chances but it
      will incorporate something more sophisticated that just counting how often
      you win or how much you win by.


    I hope I may interfere with something that Don's nice description
    revealed to me. It feels rather obvious, but since nobody stated it
    explicitly, maybe it's news for at least some people here.

    MCTS is maximizing the chances of winning. These chances are largest
    for a minimal score difference because this allows for making some
    errors. Winning by the largest possible score has rather small chances
    to happen because every move has to be perfect.

    The curve describing the probability of ending the game with a certain
    score is bell-shaped and MCTS explores the area beneath it, looking
    for winning moves. With handicap, the disadvantaged side is getting
    less samples explored, making it less likely to discover the really
    good moves. Dynamic komi shifts the bell left or right in order to
    equalize the sampling on both sides, but as mentioned it isn't dynamic
    enough (the curve changes after each move) and also is actually using
    a different shape for the curve than the real "handicap curve".

    In theory, I think that the solution for keeping the same level of
    play with handicap as without would be to make sure that the the
    disadvantaged side gets just as many samples with or without handicap.
    That is, use more playouts when playing with handicap. In practice,
    this is probably prohibitive...

    I wonder if it might be possible to estimate the shape of this curve
    after each move and use that estimate to dynamically adjust the number
    of playouts. One might have to use higher precision calculations, too,
    so that the noise doesn't get too loud.

    Does this make any sense? Has anyone tried something like this?

    best regards,
    Vlad
    _______________________________________________
    Computer-go mailing list
    Computer-go@dvandva.org
    http://dvandva.org/cgi-bin/mailman/listinfo/computer-go


  This is similar to ideas I've had to solve this. Of course, I'm not a
  go programmer, so it's really just idle curiosity for me when reading
  this list. Perhaps an ideal scoring function is "highest median
  score". You can do this in a slow/memory intensive way by storing a
  histogram of final score in each node instead of wins/losses.
  Wins/losses can then be computed by summing the histogram buckets
  above/below the komi level. Computing the median is simply finding the
  bucket where the sum crosses 50%. Other percentages could be used here
  as well, though I'm not quite sure what effect this might have.

  This solves the problem of getting far less than one bit of
  information per playout when the win rate is very high or low, and
  also supersedes dynamic komi--if you have the full histogram at any
  node, you can compute the exact win rate for any arbitrary komi.

  Using this in practice might be fairly difficult, and would probably
  involve making the histogram "lossy" in some sense. I would be
  interested if somebody here could run tests with this as a limit study
  though--keep the number of nodes allocated and playouts per move
  constant, and see the effect on strength.

  Zach
  _______________________________________________
  Computer-go mailing list
  Computer-go@dvandva.org
  http://dvandva.org/cgi-bin/mailman/listinfo/computer-go 
  _______________________________________________
  Computer-go mailing list
  Computer-go@dvandva.org
  http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

_______________________________________________
Computer-go mailing list
Computer-go@dvandva.org
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

Re: [Computer-go] replacing dynamic komi with a scoring function

Reply via email to