The new Iteration 4 network has 25000 games added to the game pool, based on the first Iteration 3 network. Note that there is a delay in learning, since the games are generated and added in the background.

From visual inspection, the prior distribution looks more finished. Really bad moves have close to zero probability, but what matters most is whether the high-probability moves are strong or not.

On CGOS it seems both _I4 versions (full search and 5000 simulations) have a 55% winrate against their respective previous versions. But against other programs, I4 seems to be equal or even slightly worse.
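
To put the 55% figure in perspective, here is a rough sketch that converts a winrate into an Elo difference using the standard logistic Elo formula, and estimates the standard error of a measured winrate. The function names and the game count of 100 are my own assumptions for illustration; the number of CGOS games actually played is not stated above.

```python
import math

def elo_diff(winrate):
    """Elo difference implied by a winrate under the logistic Elo model."""
    return 400.0 * math.log10(winrate / (1.0 - winrate))

def winrate_std_error(winrate, games):
    """Binomial standard error of a winrate measured over `games` games."""
    return math.sqrt(winrate * (1.0 - winrate) / games)

# Hypothetical game count, only to illustrate the size of the uncertainty.
games = 100
print(f"55% winrate is about {elo_diff(0.55):.1f} Elo")
print(f"standard error over {games} games: {winrate_std_error(0.55, games):.3f}")
```

With a small number of games the standard error is of the same order as the 5 percentage point advantage itself, so the difference between I3 and I4 should be read with some caution.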

It could be that the I3 versions won games by playing very unconventional moves, and when I4 moves into more normal games, it gets beaten by programs that are tuned to play those variations better. Pure speculation!

I also had to restart network training for 9x9 komi 7.0 and 5.5 because I got a "NaN" as the reported loss value. From testing, it looked like the networks might have exploded for some reason. My suspicion is that the cause is my very small batch size, which I use to avoid overfitting (a lower batch size increases randomness and avoids sharp local minima that can lead to overfitting), but which might also cause instability. So I reduced the learning rate by a factor of 2.
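
The restart described above was done by hand, but the same idea can be sketched as an automatic guard: check each reported loss, and halve the learning rate when it is NaN or infinite. This is only a minimal illustration and not the actual training code; the function name, the `min_lr` floor, and the loss values are hypothetical.

```python
import math

def next_learning_rate(loss, lr, factor=0.5, min_lr=1e-6):
    """Return (new_lr, diverged); reduce lr when the loss is NaN or infinite."""
    if math.isfinite(loss):
        return lr, False
    # Non-finite loss: shrink the learning rate, but never below min_lr.
    return max(lr * factor, min_lr), True

# Hypothetical loss trace in which the third report is NaN.
lr = 0.01
for step, loss in enumerate([2.31, 1.87, float("nan"), 2.05]):
    lr, diverged = next_learning_rate(loss, lr)
    if diverged:
        # A real run would also reload the last good network weights here,
        # since weights that have already exploded cannot be recovered.
        print(f"step {step}: non-finite loss, learning rate reduced to {lr}")
```

Halving the learning rate only treats the symptom. If the instability really comes from the small batch size, a larger batch or gradient clipping might be alternatives worth trying.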

Yesterday I also made the 13x13 run play fewer games per iteration and faster games in selfplay, because it will take an eternity otherwise. It still has not generated the I3 network.

On 19x19, which is already running fast training with only 10000 games from the first iteration, it seems that it has currently learned to predict the forced moves generated by Odin. Perhaps a little too much for my taste. In the extreme, the network will just copy the Monte Carlo playouts of Odin...

There is a risk in my experiment that the networks will just become strongly biased to the original MCTS evaluation of Odin without learning any deeper knowledge about go.

Best
Magnus Persson