I think this new paper by Yuandong Tian and Yan Zhu
is interesting for us:
Better Computer Go Player with Neural Network and Long-term Prediction
http://arxiv.org/abs/1511.06410
Cheers, Ingo.
___
Computer-go mailing list
Computer-go@computer-go.org
Hi,
Thanks a lot for sharing! I am trying a slightly different approach at
the moment:
I use a combined policy/value network (adding 3-5 layers with about
16 filters at the end of the policy network for the value network, to
avoid overfitting) and I use
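The combined network described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the poster's actual code: all layer sizes, weight initializations, and names here are assumptions, and the small value head (~16 filters) branching off the shared policy trunk is the only detail taken from the message.

```python
import math
import random

random.seed(0)

def dense(x, w, b):
    # Fully connected layer: y = Wx + b.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def rand_layer(n_out, n_in):
    # Random weights and zero biases (hypothetical initialization).
    return ([[random.uniform(-0.1, 0.1) for _ in range(n_in)]
             for _ in range(n_out)],
            [0.0] * n_out)

# Hypothetical sizes: 9x9 board flattened to 81 inputs/moves,
# a shared trunk, and a small ~16-unit value head as in the email.
MOVES, TRUNK, HEAD = 81, 32, 16

trunk_w = rand_layer(TRUNK, MOVES)
policy_w = rand_layer(MOVES, TRUNK)
value_hidden_w = rand_layer(HEAD, TRUNK)  # small extra head
value_out_w = rand_layer(1, HEAD)

def forward(board):
    h = relu(dense(board, *trunk_w))       # shared policy-network features
    policy = softmax(dense(h, *policy_w))  # move probabilities
    v = relu(dense(h, *value_hidden_w))    # small value head on top
    value = math.tanh(dense(v, *value_out_w)[0])  # win estimate in [-1, 1]
    return policy, value

board = [random.random() for _ in range(MOVES)]
policy, value = forward(board)
```

The idea is that the value head reuses the policy trunk's features, so the few extra layers add little capacity of their own, which is what keeps overfitting in check.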
Hi,
I tried to make a Value network.
"Policy network + Value network" vs "Policy network"
Winrate   Wins / Games   Playouts/move
70.7%     322 / 455      1000
76.6%     141 / 184      1
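As a rough check on whether these sample sizes are sufficient, one can attach normal-approximation 95% confidence intervals to the winrates; a small sketch (the helper name is mine):

```python
import math

def winrate_ci(wins, games, z=1.96):
    """Normal-approximation 95% confidence interval for a winrate."""
    p = wins / games
    half = z * math.sqrt(p * (1 - p) / games)
    return p, p - half, p + half

for wins, games in [(322, 455), (141, 184)]:
    p, lo, hi = winrate_ci(wins, games)
    print(f"{wins}/{games}: {p:.1%}  (95% CI {lo:.1%} - {hi:.1%})")
```

With these sample sizes the two intervals overlap (roughly 66.6-74.9% vs 70.5-82.7%), so the gap between the two settings is not yet statistically conclusive, consistent with the remark below that the number of games is not enough.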
It seems that with more playouts, the Value network is more effective. The
number of games is not enough, though. Search is