Re: [computer-go] A new reference bot enhancement to try

Don Dailey Thu, 30 Oct 2008 10:44:58 -0700

I have 210 games now and the lead has increased to 82 ELO.


On Thu, 2008-10-30 at 13:12 -0400, Don Dailey wrote:
> I may have been wrong when I said 0.1 was too high. It is the value that
> is now testing best, and it is the highest value I am testing. It is
> showing a 61 ELO improvement over not using the idea at all. I have
> only played about 160 games, so there is still a lot of statistical
> noise here and anything can happen.
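
To put a rough number on the "statistical noise" mentioned above, here is a minimal sketch of an error bar for an Elo estimate. It assumes independent games, no draws, and a normal approximation to the binomial; none of this is stated in the thread, and the function names are hypothetical.

```python
import math

def elo_to_score(elo):
    """Expected score for a given Elo advantage (standard logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def score_to_elo(p):
    """Inverse of elo_to_score; p must lie strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_interval(elo, games, z=1.96):
    """Approximate confidence interval (95% for z=1.96) on an Elo difference.

    Assumption: each game is an independent win/loss trial, so the
    observed score has standard error sqrt(p * (1 - p) / games).
    """
    p = elo_to_score(elo)
    se = math.sqrt(p * (1.0 - p) / games)
    return score_to_elo(p - z * se), score_to_elo(p + z * se)

low, high = elo_interval(61, 160)
print(f"61 ELO after 160 games: roughly {low:.0f} to {high:.0f}")
```

Under these assumptions the interval is wide at 160 games, which is consistent with the caution expressed above.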
> 
> Once I have checked this out thoroughly, I'll experiment with tanh().
> 
> 
> - Don
> 
> 
> 
> 
> 
> On Thu, 2008-10-30 at 14:59 -0200, Mark Boon wrote:
> > Funny, I have been playing with something very similar. Although I  
> > got side-tracked to something else for the moment. Intuitively I felt  
> > tanh() was more appropriate than a linear function. Although you may  
> > want to have the inverse of that, as I was trying to calculate the  
> > territory certainty whereas you want the territory uncertainty.
> > 
> >     Mark
> > 
> > On 30-okt-08, at 14:21, Don Dailey wrote:
> > 
> > > Reference bot enhancement
> > > =========================
> > >
> > > Here is another possible enhancement to the reference bot which I am
> > > currently testing.  I do not yet have anything conclusive enough to
> > > report, but it looks good so far with a small number of games.
> > >
> > > But even if this idea doesn't pan out, it will produce a much more
> > > natural playing style without weakening the bot.
> > >
> > > Here is how it works.  We will use 1000 playouts for our example:
> > >
> > > 1. Modify the bot to keep a "futures" table.  At the end of each
> > >    playout, tally the wins for white and black for each point on the
> > >    board.  (I tally -1 for a white win, 1 for a black win to get a
> > >    final score from -1000 to 1000 for each point.)
> > >
> > > 2. When the 1000 playouts are complete, compute an "uncertainty value"
> > >    for each point, where 1.0 is completely uncertain, and 0.0 is
> > >    completely certain.  A point is completely certain if at the end of
> > >    each playout it was ALWAYS owned by one player or the other.  It's
> > >    completely uncertain if it won 50% of the time for either side.
> > >
> > > 3. When determining which move to play, apply an uncertainty delta to
> > >    the computed score of each move.  This is simply some fraction of
> > >    the "uncertainty value" and the best value I've tested so far is
> > >    0.025.  So you get a bonus that ranges from 0.0 to 0.025.
> > >
> > > 4. Choose the move with the best (sc + uncertainty_delta.)
> > >
> > > 5. The incentive must be small; large incentives will destroy the
> > >    playing strength.  For instance 0.1 is too high and weakens it.
> > >    The value that is testing the best for me (of the ones I've tried
> > >    so far) is 0.025.
> > >
> > > 6. This may test at some levels better than others.  I'm testing
> > >    at 2000 playouts.
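
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the reference bot's code: the function names are hypothetical, the move score `sc` is assumed to be a win rate in [0, 1], and the linear mapping from tally to uncertainty is one reading of step 2.

```python
def update_tallies(tallies, owners):
    """Add one finished playout to the futures table (step 1).

    owners maps each point to +1 (black owns it), -1 (white owns it),
    or 0 (neither), as judged at the end of the playout.
    """
    for point, owner in owners.items():
        tallies[point] = tallies.get(point, 0) + owner

def uncertainty(tally, playouts):
    """Step 2: 1.0 for a 50/50 point, 0.0 for a point always owned by one side.

    Assumption: linear in |tally|; the post only fixes the two endpoints.
    """
    return 1.0 - abs(tally) / playouts

def choose_move(scores, tallies, playouts, weight=0.025):
    """Steps 3 and 4: pick the move with the best sc + uncertainty_delta.

    scores maps each candidate move (a point) to its playout score.
    weight is the maximum bonus; 0.025 is the value quoted in the post.
    """
    best_move, best_value = None, float("-inf")
    for move, sc in scores.items():
        value = sc + weight * uncertainty(tallies.get(move, 0), playouts)
        if value > best_value:
            best_move, best_value = move, value
    return best_move
```

With a weight this small, the bonus only changes the choice between moves whose scores are already close, which matches the intent of a gentle incentive.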
> > >
> > > The idea is to gently encourage the bot to avoid playing to points
> > > that are clearly a foregone conclusion (or conversely, encourage it to
> > > play where the "action" is.)
> > >
> > > This should make the bot play much less artificially.  Near the end of
> > > the game it will prefer moves to unresolved points.  Earlier in the
> > > game it will avoid moving to areas that are "probably" already won or
> > > lost.
> > >
> > > My feeling is that these "incentives" should probably be calculated in
> > > a non-linear way, but what I described is a good starting point.  From
> > > experiments in the past it seems more important to put the focus and
> > > most of the weight on avoiding play to highly certain points.   So I
> > > will try some non-linear formula next.
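
One possible shape for such a non-linear incentive, using the tanh() mentioned elsewhere in the thread, is sketched below. No formula is given in the thread, so this is purely an assumption: the function name and the steepness parameter `k` are hypothetical.

```python
import math

def shaped_bonus(u, weight=0.025, k=3.0):
    """Map an uncertainty u in [0, 1] to a bonus in [0, weight].

    Assumption: tanh(k * u) / tanh(k) rises steeply near u = 0 and then
    flattens, so most of the bonus difference separates highly certain
    points from everything else. k controls how quickly it saturates.
    """
    return weight * math.tanh(k * u) / math.tanh(k)

for u in (0.0, 0.1, 0.25, 0.5, 1.0):
    print(f"uncertainty {u:.2f} -> bonus {shaped_bonus(u):.4f}")
```

Compared with a linear bonus, this curve treats moderately and fully uncertain points almost alike while still withholding the bonus from points that are nearly decided.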
> > >
> > >
> > > - Don
> > >
> > > _______________________________________________
> > > computer-go mailing list
> > > [email protected]
> > > http://www.computer-go.org/mailman/listinfo/computer-go/
