I know I'm only wading in the kiddie pool of computer go with my 1-ply bots,
but I think I may have found a useful enhancement to monte carlo.

HouseBot supports three 1-ply search modes:
  1plyMC - Uniform sampling
  1plyShuffle - Uniform sampling with monte carlo transposition reuse
  1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)

Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably expects.
What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
time.  I'm basic this on self-play data from CGOS.  Currently,
http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html shows 10 matches
between housebot-617-UCB has played housebot-618-shuff.  housebot-617-UCB
(1plyUCT) lost every time.

While tricky, it should be possible to combine UCT and MCTR for an even
stronger bot.  MCTR can be thought of as a low bias alternative to the AMAF
heuristic.  Rather than using all moves, MCTR takes only the top N moves,
where N is computed based on which moves were played in the random game.
>From an open board position MCTR uses about 1/3 of the moves that AMAF
would.  Computation of the resulting winning percentage must also be
weighted based on the probabilities of duplicating results (roughly
speaking, it's 1/N).

As a result of using MCTR, winning rates are no longer integers as one would
expect.  Here's the estimated winning rates for all three algorithms when
asked for a white response to black G3:

1plyMC:       781     / 1272
1plyShuffle:  140.15 /  231.75
1plyUCT:     936     / 1515

1plyShuffle is slower because of the extra work information tracking, but
the variance in estimates should be far lower than the numbers would
indicate.  I have yet to do the computations, but a sample size of
231.75has an estimation error of around 6000 normal MC runs for that
position.
That is why my implementation of MCTR is defeating my (1ply) implementation
of UCT.
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to