Hi,

Thank you for the info and the interesting idea. I wonder, however, whether a DCNN can replace handcrafted rollouts...

In the case of Zen, there are many hand-written lines for special cases, such as snapbacks, approach moves, and nakade, so that the "correct" reply move is played with 100% probability. ReLU is not so good at approximating binary functions and, as a result, "dead" stones, for example, may live at a few tens of percent in the averaged outcome of the rollouts.

Hideki

patrick.bardou via Computer-go: <[email protected]>:
>Hi Hideki,
>
>I think they could have used a rollout policy network (RPN), as described
>in "Convolutional Monte Carlo Rollouts in Go":
>https://arxiv.org/abs/1512.03375
>and have it trained on the MCTS outcome, at the same time and in the
>same way as the policy head is trained. This RPN would start by playing
>random rollouts, then benefit from the policy-head training.
>
>This would leave as "human knowledge" the mixing factor between rollout
>and value-net evaluations. But there is such a mixing factor in the Zero
>training pipeline anyway, in the loss function mixing the policy and
>value heads.
>
>Regards,
>Patrick
>
>
>Message: 1
>Date: Fri, 17 Nov 2017 02:32:29 +0900
>From: Hideki Kato <[email protected]>
>To: [email protected]
>Subject: Re: [Computer-go] Is MCTS needed?
>Message-ID: <5a0dcba9.8060%[email protected]>
>Content-Type: text/plain; charset=US-ASCII
>
>Hi,
>
>I strongly believe adding rollouts makes Zero stronger.
>They removed rollouts just to be able to say "no human knowledge".
>#Though the number of past moves (16) has been tuned by humans :).
>
>_______________________________________________
>Computer-go mailing list
>[email protected]
>http://computer-go.org/mailman/listinfo/computer-go
--
Hideki Kato <mailto:[email protected]>
