Hi Dan, hi friends,
> There is actually no randomness in the algorithm, just like AlphaZero's...
but then it does not make sense to call that algorithm "rollout".
In general: when introducing a new name, care should
be taken that the name de
> but then it does not make sense to call that algorithm "rollout".
>
> In general: when introducing a new name, care should
> be taken that the name describes properly what is going on.
Speaking of which, why did people start calling them rollouts instead of
playouts?
Darren
P.S. And don't get
Hi Darren,
> > but then it does not make sense to call that algorithm "rollout".
> > ...
>
> Speaking of which, why did people start calling them rollouts
> instead of playouts?
It comes from the backgammon scene, where for instance
running games in the endgame were estimated by dozens or
hundreds o
The technique originated with backgammon players in the late 1970s, who would
roll out positions manually. Ron Tiekert (Scrabble champion) also applied the
technique to Scrabble, and I took that idea for Maven. It seemed like people
were using the terms interchangeably.
-----Original Message-----
In the AGZ paper, there is a formula for what they call “a variant of the PUCT
algorithm”, and they cite a paper from Christopher Rosin:
http://gauss.ececs.uc.edu/Workshops/isaim2010/papers/rosin.pdf
But that paper has a formula that he calls the PUCB formula, which incorporates
the priors i