On 17/11/2016 22:38, Hiroshi Yamashita wrote:
> Features are 49 channels.
> http://computer-go.org/pipermail/computer-go/2016-February/008606.html
...
> Value Net is 32 Filters, 14 Layers.
> 32 5x5 x1, 32 3x3 x11, 32 1x1 x1, fully connect 256, fully connect tanh 1
> Features are 50 channels.
> http://computer-go.org/pipermail/computer-go/2016-March/008768.htm
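
For concreteness, a minimal sketch of the quoted layout, as it might be
written in PyTorch. The padding, the ReLU activations between layers, and
the flatten before the dense layers are my assumptions, not details given
in the post:

    import torch
    import torch.nn as nn

    class ValueNet(nn.Module):
        # Quoted layout: 50 input planes, 32 filters,
        # one 5x5 conv, eleven 3x3 convs, one 1x1 conv,
        # dense 256, dense 1 with tanh output.
        def __init__(self, channels=50, filters=32, board=19):
            super().__init__()
            layers = [nn.Conv2d(channels, filters, 5, padding=2), nn.ReLU()]
            for _ in range(11):
                layers += [nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU()]
            layers += [nn.Conv2d(filters, filters, 1), nn.ReLU()]
            self.conv = nn.Sequential(*layers)
            self.fc1 = nn.Linear(filters * board * board, 256)
            self.fc2 = nn.Linear(256, 1)

        def forward(self, x):
            h = self.conv(x)
            h = torch.relu(self.fc1(h.flatten(1)))
            return torch.tanh(self.fc2(h))  # predicted outcome in [-1, 1]

With a (1, 50, 19, 19) input tensor this returns one scalar per position
in [-1, 1], matching the single tanh output unit described above.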
Thank you for this information. It takes a long time to train these networks, so knowing which experiments have not worked is very valuable.

Did you not find a benefit from a larger value network? Too little data and too much overfitting? Or more benefit from more frequent evaluation?

> Policy + Value vs Policy, 1000 playouts/move, 1000 games. 9x9, komi 7.0
> 0.634 using game result. 0 or 1

I presume the 0.634 is a winrate, but over what baseline? The policy network alone?

> I also made 19x19 Value net. 19x19 learning positions are from KGS 4d over,
> GoGoD, Tygem and 500 playouts/move selfplay. 990255 games. 32 positions
> are selected from a game. Like Detlef's idea, I also use game result.
> I trust B+R and W+R games with komi 5.5, 6.5 and 7.5. In other games,
> If B+ and 1000 playouts at final position is over +0.60, I use it.

How do you handle handicap games? I see you excluded them from the KGS dataset. Can your value network deal with handicap? At least under the KGS ruleset, handicap stones are added into the score calculation, so the network needs to know the exact handicap.
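
For readers following along, the selection rule quoted above could be
sketched roughly as below. Only the trusted komi values and the 0.60
threshold come from the post; the record format, the function name, and
the handling of results the post does not mention are my assumptions:

    from dataclasses import dataclass

    @dataclass
    class GameRecord:
        result: str   # e.g. "B+R", "W+R", "B+2.5" (assumed format)
        komi: float

    TRUSTED_KOMI = (5.5, 6.5, 7.5)

    def trust_as_label(game: GameRecord, black_winrate_final: float) -> bool:
        # black_winrate_final: Black winrate from 1000 playouts
        # at the game's final position.
        # Resignation games are trusted only at the usual komi values.
        if game.result in ("B+R", "W+R"):
            return game.komi in TRUSTED_KOMI
        # Other games: a Black win is used only if the 1000-playout check
        # at the final position also gives Black more than 0.60.
        if game.result.startswith("B+"):
            return black_winrate_final > 0.60
        # The post does not say how remaining results are treated;
        # drop them here.
        return False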
--
GCP
_______________________________________________
Computer-go mailing list
[email protected]
http://computer-go.org/mailman/listinfo/computer-go