Hi Daniel,
AGZ paper: a greedy player based on the policy network alone (i.e. zero look-ahead) has an
estimated Elo of about 3000, roughly Fan Hui 2p.
Professional player level with zero look-ahead. For me, that is the other
striking aspect of 'Zero'! ;-)
IMO, this implies that the NN has indeed captured lots of tactics.
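To make "zero look-ahead" concrete: the greedy player just runs one forward pass of the
policy network and plays the most probable legal move. A minimal sketch, with hypothetical
policy_net and board objects (not the actual AGZ code):

    import numpy as np

    def greedy_move(policy_net, board):
        """Pick the policy network's top-rated legal move; no search at all."""
        probs = policy_net.predict(board.features())   # single forward pass
        legal = board.legal_moves()                    # indices of legal moves
        masked = np.full_like(probs, -np.inf)          # rule out illegal moves
        masked[legal] = probs[legal]
        return int(np.argmax(masked))

The ~3000 Elo figure is for exactly this kind of player: no rollouts, no tree, just the raw
network's first choice.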
Hi Hideki,
I think they could have used a rollout policy network (RPN), as described in
"Convolutional Monte Carlo Rollouts in Go" (https://arxiv.org/abs/1512.03375),
and trained it on the MCTS outcome, at the same time and in the same way as the
policy head is trained. This RPN would
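A rough sketch of what that joint training step could look like, assuming a PyTorch-style
setup and purely hypothetical names (rollout_net for the RPN, pi for the MCTS visit-count
distribution the policy head is trained towards); this is not anything from the AGZ
pipeline itself:

    import torch
    import torch.nn.functional as F

    def rpn_loss(rollout_net, features, pi):
        """Cross-entropy of the rollout policy against the MCTS search probabilities."""
        logits = rollout_net(features)            # (batch, num_moves)
        log_p = F.log_softmax(logits, dim=1)
        return -(pi * log_p).sum(dim=1).mean()    # same form of target as the policy head

The loss has the same form as the policy-head loss; the only difference is that rollout_net
would be a much smaller network, so it stays cheap enough to call inside playouts.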
@Gian-Carlo,
Indeed, a multi-labelled value net/head sounds like a good way, according to that
paper, to inject more signal into the network, and thus more reinforcement
learning signal when learning from scratch.
I was wondering if it could also be beneficial for bootstrapping the policy
net/
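For concreteness, one reading of a multi-labelled value head is a head that predicts
several related targets (for instance the win probability under a range of komi settings,
or per-point ownership) instead of a single scalar, so every extra label contributes extra
gradient. A toy sketch, with hypothetical names and a PyTorch-style module, not the
architecture from the paper:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiLabelValueHead(nn.Module):
        """Toy value head with one win-probability output per label (e.g. per komi)."""
        def __init__(self, in_features, num_labels=7):
            super().__init__()
            self.fc = nn.Linear(in_features, num_labels)

        def forward(self, x):
            return torch.sigmoid(self.fc(x))      # (batch, num_labels), each in [0, 1]

    def multilabel_value_loss(pred, targets):
        # every label is a separate win/loss target, so each adds training signal
        return F.binary_cross_entropy(pred, targets)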
Hi Pierce from Caltech,
Would an Asperger's typically try to lift himself up by pulling on his shoelaces?
I think you just mistook me for my fellow countryman Xavier Combelle and were not
replying to my post:
http://computer-go.org/pipermail/computer-go/2017-October/010338.html
I posted it
Hi! A question for Aja:
It has been a while since the January 2016 paper describing the AlphaGo
architecture and training pipeline, which corresponded to v13 (the Fan Hui
match). The AlphaGo version that played Lee Sedol and then the Master version
have been recognized as increasingly stronger. The Wuzhen version might
This video gives a good overview of some differences between AG and pro styles.
https://www.reddit.com/r/baduk/comments/5q58ji/yunguseng_dojang_video_alphagos_four_specialties/
Patrick
Original message
From: computer-go-requ...@computer-go.org
Date: 10/02/2017 04:00