If I understood it right, the playout NN in AlphaGo was trained on the same
dataset as the large NN that is used in the tree. There is an alternative,
though. I don't know if this is the best source, but here is one example:
https://arxiv.org/pdf/1312.6184.pdf
The idea is to teach a shallow NN to mimic the outputs of a deeper net. For
one thing, this seems to give better results than training the shallow net
directly on the same dataset. But more importantly, it could be done after
the large NN has been improved with self-play.
After that, the self-play could be restarted with the new playout NN.
So it seems to me there is real room for improvement here.
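To make the mimicry idea a bit more concrete, here is a rough sketch of what
such a step could look like. It assumes PyTorch, random stand-in inputs, and
made-up layer sizes and temperature; it is not meant to reflect AlphaGo's
actual networks, just the general "shallow net learns the deep net's soft
outputs" recipe from the paper above.

    # Minimal distillation sketch (assumptions: PyTorch, dummy flattened 19x19
    # "positions", illustrative layer sizes; not AlphaGo's real architectures).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    BOARD = 19 * 19  # one value per board point, flattened for simplicity

    # Stand-in "large" policy net (teacher) -- in practice, the deep net
    # that has already been improved by self-play.
    teacher = nn.Sequential(
        nn.Linear(BOARD, 1024), nn.ReLU(),
        nn.Linear(1024, 1024), nn.ReLU(),
        nn.Linear(1024, BOARD),
    )

    # Stand-in shallow playout net (student) -- small enough to be fast
    # inside rollouts.
    student = nn.Sequential(
        nn.Linear(BOARD, 128), nn.ReLU(),
        nn.Linear(128, BOARD),
    )

    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    T = 2.0  # softmax temperature; softened targets carry more information

    teacher.eval()
    for step in range(1000):
        # Dummy batch of positions; replace with real encoded Go positions.
        x = torch.randn(64, BOARD)

        with torch.no_grad():
            teacher_logits = teacher(x)

        student_logits = student(x)

        # KL divergence between the softened teacher and student
        # distributions: the student mimics the teacher's full move
        # probabilities rather than just the moves in the training set.
        loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)

        opt.zero_grad()
        loss.backward()
        opt.step()

The point of the sketch is only that the training targets come from the
improved large net, not from the original game records, so the playout net
can be refreshed each time the large net gets stronger.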

Stefan
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go