Can you explain a bit more about how 1plyShuffle works?

On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> Time management is identical.  Based on quick profiling, the MCTR version
> does about 1/6 of the simulations.  Actually, since the MCTR does extra
> tracking of info, it can reuse some old simulations, so it may be more like
> 1/4 or 1/5 of the simulations.  It's just using the results more
> efficiently.
>
>
> On 9/14/07, Chris Fant <[EMAIL PROTECTED]> wrote:
> > Does it defeat it based on number of samples taken or time allotted per
> > turn?
> >
> > On 9/14/07, Jason House <[EMAIL PROTECTED]> wrote:
> > > I know I'm only wading in the kiddie pool of computer go with my 1-ply
> > > bots, but I think I may have found a useful enhancement to monte carlo.
> > >
> > > HouseBot supports three 1-ply search modes:
> > >   1plyMC - Uniform sampling
> > >   1plyShuffle - Uniform sampling with monte carlo transposition reuse
> > >   1plyUCT - Non-uniform sampling based on the UCT algorithm (AKA UCB)
> > >
> > > Obviously, 1plyMC is far inferior to 1plyUCT, as everyone probably
> > > expects.  What may surprise many is that 1plyShuffle defeats 1plyUCT
> > > nearly every time.  I'm basing this on self-play data from CGOS.
> > > Currently,
> > > http://cgos.boardspace.net/9x9/cross/housebot-617-UCB.html
> > > shows 10 matches that housebot-617-UCB has played against
> > > housebot-618-shuff.  housebot-617-UCB (1plyUCT) lost every time.
> > >
> > > While tricky, it should be possible to combine UCT and MCTR for an even
> > > stronger bot.  MCTR can be thought of as a low-bias alternative to the
> > > AMAF heuristic.  Rather than using all moves, MCTR takes only the top N
> > > moves, where N is computed based on which moves were played in the random
> > > game.  From an open board position MCTR uses about 1/3 of the moves that
> > > AMAF would.  Computation of the resulting winning percentage must also be
> > > weighted based on the probabilities of duplicating results (roughly
> > > speaking, it's 1/N).
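
A minimal sketch of the 1/N weighting described in the paragraph above, in Python. This is not HouseBot's code: the message does not say how N is computed, so the list of moves a playout is reused for (`reusable_moves`) is simply passed in, and the names `credit_playout` and `stats` are hypothetical. Each playout spreads a total weight of 1 across its N moves, which would explain why the tallies come out fractional.

```python
from collections import defaultdict


def credit_playout(stats, reusable_moves, won):
    """Share one playout's result across the N moves it is reused for.

    stats maps move -> [weighted wins, weighted samples].
    reusable_moves is assumed to be supplied by the caller; how the
    real bot chooses these N moves is not specified in the message.
    """
    n = len(reusable_moves)
    if n == 0:
        return
    weight = 1.0 / n  # the "roughly 1/N" weighting
    for move in reusable_moves:
        if won:
            stats[move][0] += weight
        stats[move][1] += weight


# Hypothetical data: two playouts that both count toward G3.
stats = defaultdict(lambda: [0.0, 0.0])
credit_playout(stats, ["G3", "C7", "E5"], won=True)   # N = 3
credit_playout(stats, ["G3", "D4"], won=False)        # N = 2

wins, samples = stats["G3"]
print(f"G3: {wins:.3f} / {samples:.3f}")
```

Here G3 ends up with 1/3 of a win out of 1/3 + 1/2 samples, so the counts for a move are sums of fractions rather than integers.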
> > >
> > > As a result of using MCTR, the win and sample counts are no longer the
> > > integers one would expect.  Here are the estimated winning rates for all
> > > three algorithms when asked for a white response to black G3:
> > >
> > > 1plyMC:       781    / 1272
> > > 1plyShuffle:  140.15 /  231.75
> > > 1plyUCT:      936    / 1515
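
For reference, the three tallies above can be turned into winning rates with a few lines of Python. The sketch also prints the naive binomial standard error sqrt(p(1-p)/n), which treats the sample count as n independent playouts; that is an assumption made only for illustration and does not strictly hold for the weighted 1plyShuffle counts or the non-uniform 1plyUCT sampling. The function names are hypothetical.

```python
import math

# (wins, samples) as quoted in the message for a white response to black G3.
tallies = {
    "1plyMC": (781.0, 1272.0),
    "1plyShuffle": (140.15, 231.75),
    "1plyUCT": (936.0, 1515.0),
}


def win_rate(wins, samples):
    """Estimated winning rate as a plain ratio."""
    return wins / samples


def naive_std_error(wins, samples):
    """Binomial standard error, assuming independent equal-weight samples."""
    p = wins / samples
    return math.sqrt(p * (1.0 - p) / samples)


for name, (wins, samples) in tallies.items():
    rate = win_rate(wins, samples)
    err = naive_std_error(wins, samples)
    print(f"{name}: {rate:.3f} +/- {err:.3f}")
```

The ratios come out at roughly 0.614, 0.605 and 0.618, so all three modes agree on the estimate to within a couple of percent; they differ in how many playouts stand behind each number.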
> > >
> > > 1plyShuffle is slower because of the extra work of tracking information,
> > > but the variance in its estimates should be far lower than the numbers
> > > would indicate.  I have yet to do the computations, but a sample size of
> > > 231.75 has an estimation error comparable to that of around 6000 normal
> > > MC runs for that position.  That is why my implementation of MCTR is
> > > defeating my (1ply) implementation of UCT.
> >
>
>
> _______________________________________________
> computer-go mailing list
> [email protected]
> http://www.computer-go.org/mailman/listinfo/computer-go/
>