date:20170623

Re: [Computer-go] Value network that doesn't want to learn.

2017-06-23 Thread Brian Sheppard via Computer-go

>... my value network was trained to tell me the game is balanced at the
>beginning... :-)

The best training policy is to select positions that correct errors. I used the policies below to train a backgammon NN. Together, they reduced the expected loss of the network by 50% (cut the error

Re: [Computer-go] Value network that doesn't want to learn.

2017-06-23 Thread Vincent Richard

Finally found the problem. In the end, it was as stupid as expected: when I pick a game for batch creation, I randomly select a limited number of moves from that game. For the value network I use about 8-16 moves per game to avoid overfitting the data (I can't take just 1, or then the I/O operations
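For readers following the thread, here is a minimal sketch of the kind of per-game sampling described above: pick a random handful of positions (8-16) from each game rather than every move. It only illustrates the sampling step, not the actual code or the bug from the post. The names `sample_positions` and `build_batch`, and the `(num_moves, outcome)` game representation, are hypothetical.

```python
import random


def sample_positions(num_moves, k, rng):
    """Return up to k distinct move indices drawn uniformly from one game."""
    k = min(k, num_moves)  # short games cannot supply more positions than they have
    return sorted(rng.sample(range(num_moves), k))


def build_batch(games, k_min=8, k_max=16, seed=None):
    """Build a shuffled list of (game_id, move_index, outcome) training samples.

    `games` is assumed to be a list of (num_moves, outcome) pairs; a real
    pipeline would load board positions from game records instead.
    """
    rng = random.Random(seed)
    batch = []
    for game_id, (num_moves, outcome) in enumerate(games):
        k = rng.randint(k_min, k_max)  # 8-16 positions per game, as in the post
        for move_index in sample_positions(num_moves, k, rng):
            batch.append((game_id, move_index, outcome))
    rng.shuffle(batch)  # mix games so consecutive samples are not correlated
    return batch


if __name__ == "__main__":
    demo_games = [(200, 1), (150, -1), (5, 1)]
    for sample in build_batch(demo_games, seed=0)[:5]:
        print(sample)
```

Taking several positions per game amortizes the cost of reading each game record, while keeping the count small limits how many strongly correlated positions share the same outcome label.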