Re: [computer-go] monte carlo transposition reuse (MCTR)

Jason House Fri, 14 Sep 2007 07:39:29 -0700

Time management is identical.  Based on quick profiling, the MCTR version
does about 1/6 of the simulations.  Actually, since the MCTR does extra
tracking of info, it can reuse some old simulations, so it may be more like
1/4 or 1/5 of the simulations.  It's just using the results more
efficiently.


On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
>
> Does it defeat it based on number of samples taken or time allotted per
> turn?
>
> On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > I know I'm only wading in the kiddie pool of computer go with my 1-ply
> bots,
> > but I think I may have found a useful enhancement to monte carlo.
> >
> > HouseBot supports three 1-ply search modes:
> >   1plyMC - Uniform sampling
> >   1plyShuffle - Uniform sampling with monte carlo transposition reuse
> >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
> >
> > Obviously, 1plyMC is far inferior to 1plyUCT as everyone probably
> expects.
> > What may surprise many is that 1plyShuffle defeats 1plyUCT nearly every
> > time.  I'm basic this on self-play data from CGOS.  Currently,
> > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> > shows 10 matches between housebot-617-UCB has played housebot-618-shuff.
> > housebot-617-UCB (1plyUCT) lost every time.
> >
> > While tricky, it should be possible to combine UCT and MCTR for an even
> > stronger bot.  MCTR can be thought of as a low bias alternative to the
> AMAF
> > heuristic.  Rather than using all moves, MCTR takes only the top N
> moves,
> > where N is computed based on which moves were played in the random game.
> > From an open board position MCTR uses about 1/3 of the moves that AMAF
> > would.  Computation of the resulting winning percentage must also be
> > weighted based on the probabilities of duplicating results (roughly
> > speaking, it's 1/N).
> >
> > As a result of using MCTR, winning rates are no longer integers as one
> would
> > expect.  Here's the estimated winning rates for all three algorithms
> when
> > asked for a white response to black G3:
> >
> > 1plyMC:       781     / 1272
> > 1plyShuffle:  140.15 /  231.75
> > 1plyUCT:     936     / 1515
> >
> > 1plyShuffle is slower because of the extra work information tracking,
> but
> > the variance in estimates should be far lower than the numbers would
> > indicate.  I have yet to do the computations, but a sample size of
> 231.75
> > has an estimation error of around 6000 normal MC runs for that position.
> > That is why my implementation of MCTR is defeating my (1ply)
> implementation
> > of UCT.
>

_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] monte carlo transposition reuse (MCTR)

Reply via email to