The error bars of all bots overlap. I'm not familiar enough with
BayesELO to compute p-values. I'd bet that only the 0.1 version has a
statistically significant strength difference.
On Oct 30, 2008, at 7:00 PM, Don Dailey <[EMAIL PROTECTED]> wrote:
The basic idea seems to be a modest improvement after 752 games. Note
that ALL versions with the incentive play stronger.
I'm going to try more aggressive values now - when I find a reasonable
value I'll try tanh() stuff.
Rank  Name          Elo    +   -  games  score  oppo.  draws
   1  inc-0.1       2033  19  19    752    54%   2004     0%
   2  inc-0.025     2008  19  19    750    49%   2012     0%
   3  inc-0.01      2003  19  19    752    48%   2014     0%
   4  mwNoDup-2000  2000  19  19    750    48%   2015     0%
On Thu, 2008-10-30 at 14:59 -0200, Mark Boon wrote:
Funny, I have been playing with something very similar, although I
got side-tracked by something else for the moment. Intuitively I felt
tanh() was more appropriate than a linear function. You may want the
inverse of that, though: I was trying to calculate the territory
certainty, whereas you want the territory uncertainty.
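Mark's tanh() intuition might be sketched like this; the steepness
constant k is purely hypothetical, and the tally/playout-count inputs
follow Don's futures-table description below:

```python
import math

def territory_certainty(tally, num_playouts, k=3.0):
    """How settled a point looks: ~1.0 when one side always owns it,
    0.0 when ownership splits 50/50. tally/num_playouts lies in [-1, 1];
    k is a hypothetical steepness constant."""
    return abs(math.tanh(k * tally / num_playouts))

def territory_uncertainty(tally, num_playouts, k=3.0):
    """The inverse view: 1.0 for fully contested points, near 0.0 for
    points that were always owned by the same side."""
    return 1.0 - territory_certainty(tally, num_playouts, k)
```

The tanh() shape flattens out as |tally| grows, so most of the
variation in the bonus sits near the contested end of the scale.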
Mark
On 30-okt-08, at 14:21, Don Dailey wrote:
Reference bot enhancement
=========================
Here is another possible enhancement to the reference bot which I am
currently testing. I do not yet have anything conclusive enough to
report, but it looks good so far with a small number of games.
Even if this idea doesn't improve playing strength, it produces a much
more natural playing style without weakening the bot.
Here is how it works. We will use 1000 playouts for our example:
1. Modify the bot to keep a "futures" table. At the end of each
   playout, tally the wins for white and black for each point on the
   board. (I tally -1 for a white win and +1 for a black win, giving a
   final score from -1000 to 1000 for each point.)

2. When the 1000 playouts are complete, compute an "uncertainty value"
   for each point, where 1.0 is completely uncertain and 0.0 is
   completely certain. A point is completely certain if at the end of
   each playout it was ALWAYS owned by one player or the other. It's
   completely uncertain if each side owned it 50% of the time.

3. When determining which move to play, apply an uncertainty delta to
   the computed score of each move. This is simply some fraction of
   the "uncertainty value"; the best value I've tested so far is
   0.025, so you get a bonus that ranges from 0.0 to 0.025.

4. Choose the move with the best (score + uncertainty_delta).

5. The incentive must be small; large incentives will destroy the
   playing strength. For instance, 0.1 is too high and weakens it.
   The value that is testing best for me (of the ones I've tried so
   far) is 0.025.

6. This may test better at some playout levels than others. I'm
   testing at 2000 playouts.
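The steps above might be sketched as follows. The board size, the
random ownership data, and the function names are all hypothetical;
only the tally scheme, the uncertainty formula, and the 0.025 bonus
come from the description above:

```python
import random

BOARD_POINTS = 81          # hypothetical 9x9 board
NUM_PLAYOUTS = 1000
INCENTIVE = 0.025          # best value reported so far

# Step 1: "futures" table, one tally per point.
# +1 when Black owns the point at playout end, -1 when White does.
futures = [0] * BOARD_POINTS

def record_playout(ownership):
    """ownership[p] is +1 (Black) or -1 (White) at the end of one playout."""
    for p in range(BOARD_POINTS):
        futures[p] += ownership[p]

# Fake playout results, for illustration only.
for _ in range(NUM_PLAYOUTS):
    record_playout([random.choice((1, -1)) for _ in range(BOARD_POINTS)])

# Step 2: uncertainty per point -- 1.0 for a 50/50 ownership split,
# 0.0 when one side owned the point in every playout.
uncertainty = [1.0 - abs(t) / NUM_PLAYOUTS for t in futures]

# Steps 3-4: add a small fraction of the uncertainty to each move's
# score and pick the best adjusted move.
def best_move(move_scores):
    """move_scores maps point index -> win-rate estimate in [0, 1]."""
    return max(move_scores,
               key=lambda p: move_scores[p] + INCENTIVE * uncertainty[p])
```

Because the bonus is capped at 0.025, it only breaks near-ties; a move
that is clearly better on raw score still wins (step 5's point about
keeping the incentive small).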
The idea is to gently encourage the bot to avoid playing on points
whose outcome is clearly a foregone conclusion (or, conversely, to
encourage it to play where the "action" is).

This should make the bot play much less artificially. Near the end of
the game it will prefer moves on unresolved points. Earlier in the
game it will avoid moving into areas that are "probably" already won
or lost.
My feeling is that these "incentives" should probably be calculated in
a non-linear way, but what I described is a good starting point. From
past experiments it seems more important to put the focus and most of
the weight on avoiding play on highly certain points, so I will try
some non-linear formulas next.
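One non-linear shaping along those lines: raise the uncertainty to a
power below 1 so the bonus climbs steeply away from zero, making
nearly-certain points stand out most. The exponent and scale are
hypothetical, not values Don tested:

```python
def shaped_incentive(u, power=0.5, scale=0.025):
    """u is the uncertainty in [0, 1]. With power < 1, the bonus per
    unit of uncertainty is largest near u == 0, so the formula mainly
    discourages play on highly certain points. Both power and scale
    are hypothetical tuning knobs."""
    return scale * u ** power
```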
- Don
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/