Hi Ingo,

There is actually no randomness in the algorithm, just as there is none in AlphaZero's. It is the same algorithm as a recursive alpha-beta searcher; the only difference is that the rollouts version examines one leaf per episode (one path from root to leaf). This opens the door to mixing alpha-beta with MCTS; some have already used an MCTS-ab algorithm that does exactly this. At the leaves, I call qsearch() for evaluation, whereas AlphaZero uses a value network (no playouts).
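As an illustration of the "one leaf per episode" idea, here is a minimal Python sketch. It is not the engine's actual code. It assumes a toy game tree given as nested lists, a MAX root with alternating levels, and every internal node having at least one child. The names alpha_beta_rollouts and evaluate are hypothetical; evaluate stands in for whatever is called at the leaves (qsearch() or a value network). Each node keeps a lower and an upper bound on its value, and every episode walks one root-to-leaf path through the leftmost child whose window is still open.

```python
import math


def alpha_beta_rollouts(tree, evaluate=lambda leaf: leaf):
    """Return (value, episodes) for a nested-list game tree.

    Internal nodes are non-empty lists of children; anything else is a leaf.
    The root is a MAX node and levels alternate. No randomness is involved.
    """
    lo = {}  # path (tuple of child indices) -> lower bound on the node value
    hi = {}  # path -> upper bound on the node value

    def bounds(path):
        # Unvisited nodes start with the widest possible bounds.
        return lo.get(path, -math.inf), hi.get(path, math.inf)

    def rollout(node, path, alpha, beta, maximizing):
        if not isinstance(node, list):
            # Leaf: evaluate once; both bounds collapse to the exact value.
            lo[path] = hi[path] = evaluate(node)
            return
        # Descend into the leftmost child whose window is still open.
        for i, child in enumerate(node):
            c_lo, c_hi = bounds(path + (i,))
            a, b = max(alpha, c_lo), min(beta, c_hi)
            if a < b:
                rollout(child, path + (i,), a, b, not maximizing)
                break
        # Back up: recompute this node's bounds from all of its children.
        pick = max if maximizing else min
        child_bounds = [bounds(path + (i,)) for i in range(len(node))]
        lo[path] = pick(cb[0] for cb in child_bounds)
        hi[path] = pick(cb[1] for cb in child_bounds)

    episodes = 0
    while True:
        root_lo, root_hi = bounds(())
        if root_lo >= root_hi:  # bounds have met: the root value is known
            return root_lo, episodes
        rollout(tree, (), root_lo, root_hi, True)  # one root-to-leaf path
        episodes += 1


value, episodes = alpha_beta_rollouts([[3, 5], [2, 9]])
print(value, episodes)
```

On this small tree the leaf 9 is never evaluated: once 2 has been seen, the second subtree can no longer beat the 3 already guaranteed by the first, which is the usual alpha-beta cutoff. Choosing the leftmost open child is only one possible selection rule; picking among the open children with an MCTS-style policy is where the mixing with MCTS would come in.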
Daniel

On Wed, Mar 7, 2018 at 2:20 AM, "Ingo Althöfer" <3-hirn-ver...@gmx.de> wrote:
> Hi Dan,
>
> I find your definition of "Alpha-Beta rollouts" somewhat puzzling.
>
> > Alpha-beta rollouts is like MCTS without playouts (as in
> > AlphaZero), and something that can also do alpha-beta pruning.
>
> I would instead define "Alpha-Beta rollout" in the following way:
> You have a fast alpha-beta program (with some randomness) for
> that game, and a rollout means a quick selfplay game of this
> program.
>
> Ingo.
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go