Hi,

Thank you for the info and the interesting idea. I wonder, however, whether a DCNN can replace handcrafted rollouts...

In the case of Zen, there are many hand-written lines for special cases, such as snapbacks, approach moves, and nakade, so that the "correct" reply move is played with 100% probability. ReLU is not so good at approximating binary functions and, as a result, "dead" stones, for example, may live at a few tens of percent in the averaged outcome of the rollouts.

Hideki

patrick.bardou via Computer-go: <[email protected]>:
>Hi Hideki,
>
>I think they could have used a rollout policy network (RPN), as described
>in "Convolutional Monte Carlo Rollouts in Go":
>https://arxiv.org/abs/1512.03375
>and have it trained on the MCTS outcome, at the same time and in the
>same way as the policy head is trained. This RPN would start by playing
>random rollouts, then benefit from the policy-head training.
>
>This would leave as "human knowledge" the mixing factor between rollout
>and value-net evaluations. But there is such a mixing factor in the Zero
>training pipeline anyway, in the loss function mixing the policy and
>value heads.
>
>Regards,
>Patrick
>
>
>Message: 1
>Date: Fri, 17 Nov 2017 02:32:29 +0900
>From: Hideki Kato <[email protected]>
>To: [email protected]
>Subject: Re: [Computer-go] Is MCTS needed?
>Message-ID: <5a0dcba9.8060%[email protected]>
>Content-Type: text/plain; charset=US-ASCII
>
>Hi,
>
>I strongly believe adding rollouts makes Zero stronger.
>They removed rollouts just to be able to say "no human knowledge".
>#Though the number of past moves (16) has been tuned by humans :).
>
>_______________________________________________
>Computer-go mailing list
>[email protected]
>http://computer-go.org/mailman/listinfo/computer-go
--
Hideki Kato <mailto:[email protected]>
