I also wonder about this. A purely convolutional approach would save a lot of 
weights. The output for pass could be a single bias parameter, connected to 
nothing. Setting pass to a constant might work, too. I don't understand the 
reason for such a complication.
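
To make the saving concrete, here is a rough sketch of what I mean. The names 
below are made up for illustration, and the conv-only head (one 1x1 filter 
over the two policy planes, plus a free bias for pass) is my assumption of how 
it could look, not anything described in the paper. Biases of the fully 
connected layer are ignored.

```python
BOARD_POINTS = 19 * 19         # 361 intersections
NUM_MOVES = BOARD_POINTS + 1   # 361 board moves plus pass


def fc_policy_weights(num_filters=2):
    """Weights of the fully connected policy layer from the thread (biases ignored)."""
    return num_filters * BOARD_POINTS * NUM_MOVES


def conv_policy_weights(in_planes=2):
    """Weights of a hypothetical conv-only head: one 1x1 filter plus a pass bias."""
    return in_planes + 1


def conv_policy_logits(planes, filter_weights, pass_bias):
    """Apply one 1x1 filter across the planes, then append the pass logit.

    planes: list of in_planes lists, each holding BOARD_POINTS floats.
    """
    board_logits = [
        sum(w * plane[i] for w, plane in zip(filter_weights, planes))
        for i in range(BOARD_POINTS)
    ]
    return board_logits + [pass_bias]  # pass depends on no input at all


if __name__ == "__main__":
    print(fc_policy_weights())    # 261364
    print(conv_policy_weights())  # 3
```

The pass logit is simply appended after the 361 board logits, so it is the 
same for every position unless something else feeds into it.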

----- Original Message -----
From: "Andy" <andy.olsen...@gmail.com>
To: "computer-go" <computer-go@computer-go.org>
Sent: Friday, December 29, 2017 06:47:06
Subject: [Computer-go] AGZ Policy Head

Is there some particular reason AGZ uses two 1x1 filters for the policy head 
instead of one? 


They could also have allowed more, but I guess that would be expensive? I 
calculate that the fully connected layer has 2*361*362 weights, where 2 is the 
number of filters. 


By comparison the value head has only a single 1x1 filter, but it goes to a 
hidden layer of 256. That gives 1*361*256 weights. Why not use two 1x1 filters 
here? Maybe since the final output is only a single scalar it's not needed? 
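
The two counts above can be checked with a few lines of arithmetic. This is 
only a sketch: the helper name is invented for illustration, biases are left 
out, and the last line is the hypothetical two-filter value head being asked 
about, not something AGZ uses.

```python
BOARD_POINTS = 19 * 19  # 361 intersections


def head_fc_weights(num_filters, num_outputs):
    """Weights of the fully connected layer that follows the 1x1 filters."""
    return num_filters * BOARD_POINTS * num_outputs


policy_weights = head_fc_weights(2, BOARD_POINTS + 1)  # 361 moves plus pass
value_weights = head_fc_weights(1, 256)                # hidden layer of 256
value_weights_two = head_fc_weights(2, 256)            # hypothetical variant

print(policy_weights)     # 261364
print(value_weights)      # 92416
print(value_weights_two)  # 184832
```

So a second filter in the value head would double that layer, and it would 
still be smaller than the policy head's fully connected layer.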

_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go