I have 210 games now, and the lead has increased to 82 ELO.

On Thu, 2008-10-30 at 13:12 -0400, Don Dailey wrote:
> I may have been wrong when I said 0.1 was too high; it's the value that
> is now testing best, and it's the highest value I am testing.  It is
> showing a 61 ELO improvement over not using the idea at all.  I have
> only played about 160 games, so there is still a lot of statistical
> noise here and anything can happen.
>
> Once I have checked this out thoroughly, I'll experiment with tanh().
>
> - Don
>
>
> On Thu, 2008-10-30 at 14:59 -0200, Mark Boon wrote:
> > Funny, I have been playing with something very similar, although I
> > got side-tracked to something else for the moment.  Intuitively I felt
> > tanh() was more appropriate than a linear function, although you may
> > want to have the inverse of that, as I was trying to calculate the
> > territory certainty whereas you want the territory uncertainty.
> >
> > Mark
> >
> > On 30 Oct 2008, at 14:21, Don Dailey wrote:
> >
> > > Reference bot enhancement
> > > =========================
> > >
> > > Here is another possible enhancement to the reference bot which I am
> > > currently testing.  I do not yet have anything conclusive enough to
> > > report, but it looks good so far with a small number of games.
> > >
> > > But even if this idea doesn't pan out, it will produce a much more
> > > natural playing style without weakening the bot.
> > >
> > > Here is how it works.  We will use 1000 playouts for our example:
> > >
> > >   1. Modify the bot to keep a "futures" table.  At the end of each
> > >      playout, tally the wins for white and black for each point on
> > >      the board.  (I tally -1 for a white win, 1 for a black win to
> > >      get a final score from -1000 to 1000 for each point.)
> > >
> > >   2. When the 1000 playouts are complete, compute an "uncertainty
> > >      value" for each point, where 1.0 is completely uncertain and
> > >      0.0 is completely certain.  A point is completely certain if at
> > >      the end of each playout it was ALWAYS owned by one player or
> > >      the other.  It's completely uncertain if each side owned it in
> > >      50% of the playouts.
> > >
> > >   3. When determining which move to play, apply an uncertainty delta
> > >      to the computed score of each move.  This is simply some
> > >      fraction of the "uncertainty value", and the best value I've
> > >      tested so far is 0.025.  So you get a bonus that ranges from
> > >      0.0 to 0.025.
> > >
> > >   4. Choose the move with the best (sc + uncertainty_delta).
> > >
> > >   5. The incentive must be small; large incentives will destroy the
> > >      playing strength.  For instance, 0.1 is too high and weakens
> > >      it.  The value that is testing the best for me (of the ones
> > >      I've tried so far) is 0.025.
> > >
> > >   6. This may test better at some levels than others.  I'm testing
> > >      at 2000 playouts.
> > >
> > > The idea is to gently encourage the bot to avoid playing to points
> > > that are clearly a foregone conclusion (or conversely, encourage it
> > > to play where the "action" is).
> > >
> > > This should make the bot play much less artificially.  Near the end
> > > of the game it will prefer moves to unresolved points.  Earlier in
> > > the game it will avoid moving to areas that are "probably" already
> > > won or lost.
> > >
> > > My feeling is that these "incentives" should probably be calculated
> > > in a non-linear way, but what I described is a good starting point.
> > > From experiments in the past it seems more important to put the
> > > focus and most of the weight on avoiding play to highly certain
> > > points.  So I will try some non-linear formula next.
> > >
> > > - Don
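
For anyone who wants to try this, here is a minimal Python sketch of
steps 1 to 4 as I read them.  It is not the reference bot's code.  The
function names, the dict-based tables and the linear formula
1 - |tally| / playouts are my own assumptions; the description only
fixes the endpoints (0.0 when a point is always owned by one side, 1.0
at a 50/50 split).  The default weight of 0.025 is the value from the
original post, and sc is taken here to be a win rate between 0 and 1.

```python
def update_futures(futures, owners):
    """Step 1: add one playout's final ownership to the futures table.

    owners maps each point to +1 (black owns it) or -1 (white owns it).
    Using 0 for a neutral point is an assumption, not from the post.
    """
    for point, owner in owners.items():
        futures[point] = futures.get(point, 0) + owner


def uncertainty(tally, playouts):
    """Step 2: 1.0 for a 50/50 split, 0.0 if one side always owned the point.

    The linear mapping is an assumed formula; only the endpoints are given.
    """
    return 1.0 - abs(tally) / playouts


def choose_move(scores, futures, playouts, weight=0.025):
    """Steps 3 and 4: pick the move with the best sc + uncertainty_delta.

    scores maps each candidate move (a point) to its playout score.
    """
    def adjusted(move):
        delta = weight * uncertainty(futures.get(move, 0), playouts)
        return scores[move] + delta

    return max(scores, key=adjusted)
```

With a weight of 0.025 the bonus can only overturn moves whose scores
are within 2.5 percentage points of each other, which fits the point
that the incentive has to stay small.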
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/
