Late-move reductions could be used to eliminate the hard pruning. Given the
accuracy rate of the policy network, I would guess that even the second-ranked
move should be reduced.
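The idea above can be sketched as follows. This is an illustrative example, not anyone's actual engine code: instead of discarding all but the top-k policy moves, every move is searched, but moves the policy ranks lower get a shallower search. All names and parameters here are made up for illustration.

```python
# Sketch of late-move reductions (LMR) as an alternative to hard
# top-k pruning: every policy-ranked move is searched, but moves
# ranked later are searched at reduced depth. Hypothetical helper.

def lmr_depth(move_rank, depth, full_width=2, reduction=1):
    """Depth at which to search a move of the given policy rank
    (0 = the policy's top pick). The first `full_width` moves get
    the normal depth - 1; later moves are searched `reduction`
    plies shallower, never below depth 1."""
    if move_rank < full_width or depth <= 1:
        return depth - 1
    return max(1, depth - 1 - reduction)
```

A reduced move that still raises the best score would normally be re-searched at full depth, as in chess engines; with a strong policy network, the reduction could plausibly start as early as the second-ranked move, as suggested above.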
-Original Message-
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of
Hiroshi Yamashita
Sent: Saturday,
Hi,
The HiraBot author reported a mini-max search using Policy and Value networks.
It does not use Monte Carlo.
Only the top 8 moves from the Policy network are searched at the root node; at
other depths, the top 4 moves are searched.
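The search described above can be sketched roughly as below. This is a minimal illustration of the stated scheme (top 8 policy moves at the root, top 4 elsewhere, value network at the leaves), not HiraBot's actual code; the function names, the game interface, and the toy game are all hypothetical.

```python
import math

def negamax_topk(pos, depth, color, policy, value, moves_fn, play_fn,
                 k_root=8, k_other=4, is_root=True):
    """Depth-limited negamax that expands only the k moves the policy
    ranks highest; the value function scores the leaves.
    `policy`, `value`, `moves_fn`, `play_fn` are stand-ins for the
    networks and game rules."""
    moves = moves_fn(pos)
    if depth == 0 or not moves:
        return color * value(pos)        # value network as static evaluator
    k = k_root if is_root else k_other   # 8 at the root, 4 elsewhere
    ranked = sorted(moves, key=lambda m: policy(pos, m), reverse=True)[:k]
    best = -math.inf
    for m in ranked:
        best = max(best,
                   -negamax_topk(play_fn(pos, m), depth - 1, -color,
                                 policy, value, moves_fn, play_fn,
                                 k_root, k_other, is_root=False))
    return best

# Toy game for illustration only: a position is a running total, a
# move adds its value, and the "value network" is just the total.
moves_fn = lambda pos: [1, 2, 3]
play_fn = lambda pos, m: pos + m
policy = lambda pos, m: m          # pretend the policy prefers larger moves
value = lambda pos: pos

best = negamax_topk(0, 2, 1, policy, value, moves_fn, play_fn)
```

In the toy game the maximizer picks 3 and the minimizer replies with 1, so the search returns 4; the point is only to show where the top-k cut and the value-network evaluation sit in the recursion.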
Game results against the Policy network's best move (no search):
Win  Loss  winrate
M
Hi! A question for Aja:
It has been a while since the January 2016 paper uncovering the AlphaGo
architecture and training pipeline, which described v13 (the Fan Hui match).
The AlphaGo version that played Lee Sedol and then the Master version have been
recognized as increasingly stronger. The Wuzhen version might be