The new Iteration 4 network has 25000 games added to the game pool based
on the first Iteration 3 network, but note there is a delay in learning
since the games are generated and added in the background.
From visual inspection, the prior distribution looks more finished.
Really bad moves have close to zero probability, but what matters most
is whether the high-probability moves are strong or not.
On CGOS it seems both _I4 versions (full search and 5000 simulations)
have a 55% winrate over their respective previous versions. But against
other programs it seems I4 is equal or even slightly worse.
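
To put the 55% figure in perspective, here is a small sketch that
converts a winrate into an Elo gap using the standard logistic Elo
model, plus the binomial standard error of a measured winrate. The
function names are made up for illustration, and the game count in the
last line is only a placeholder, since the number of CGOS games played
is not stated here.

```python
import math

def elo_difference(winrate):
    """Elo gap implied by a winrate under the standard logistic Elo model."""
    if not 0.0 < winrate < 1.0:
        raise ValueError("winrate must be strictly between 0 and 1")
    return 400.0 * math.log10(winrate / (1.0 - winrate))

def winrate_standard_error(winrate, games):
    """Binomial standard error of a winrate measured over `games` games."""
    if games <= 0:
        raise ValueError("games must be positive")
    return math.sqrt(winrate * (1.0 - winrate) / games)

print(round(elo_difference(0.55)))        # about 35 Elo
print(winrate_standard_error(0.55, 100))  # 100 games is a placeholder count
```

A 55% winrate is a modest gap, so with few games it can be hard to
separate from noise.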
It could be that the I3 versions won games because of playing very
unconventional moves, and when I4 moves into more normal games, it gets
beaten by programs that are tuned to play those variations better. Pure
speculation!
I also had to restart network training for 9x9 komi 7.0 and 5.5 because
I got a "NaN" as the reported loss value. From testing it looked like
the networks might have exploded for some reason. My suspicion is the
very small batchsize I use to avoid overfitting (lower batchsize
increases randomness and avoids sharp local minima that can lead to
overfitting), which might also cause instability. So I reduced the
learning rate by a factor of 2.
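
The restart-and-halve procedure could be automated roughly as follows.
This is only a sketch under assumptions, not the actual training code:
train_step is a hypothetical callable that runs one training step at a
given learning rate and returns the loss, and max_restarts is an
invented safety limit. A real version would also reinitialize the
network or reload the last good checkpoint before restarting.

```python
import math

def train_with_nan_guard(train_step, initial_lr, max_steps, max_restarts=5):
    """Run training; on a NaN/inf loss, halve the learning rate and restart.

    train_step(lr) is a hypothetical callable returning the loss of one step.
    Returns (final_learning_rate, number_of_restarts).
    """
    lr = initial_lr
    restarts = 0
    step = 0
    while step < max_steps:
        loss = train_step(lr)
        if math.isfinite(loss):
            step += 1
            continue
        # Loss is NaN or infinite: the network has probably exploded.
        if restarts == max_restarts:
            raise RuntimeError("loss still not finite after %d restarts" % restarts)
        restarts += 1
        lr *= 0.5   # reduce the learning rate by a factor of 2
        step = 0    # restart training from the beginning
    return lr, restarts
```

Checking with math.isfinite catches both NaN and infinite losses, which
usually show up together when training diverges.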
Yesterday I also made the 13x13 run play fewer games per iteration and
faster games in selfplay, because it would take an eternity otherwise.
It still has not generated the I3 network.
On 19x19, which is already running fast training with only 10000 games
from the first iteration, it seems that it has currently learned to
predict the forced moves generated by Odin. Perhaps a little too much
for my taste. In the extreme the network will just copy the Monte Carlo
playouts of Odin...
There is a risk in my experiment that the networks will just become
strongly biased to the original MCTS evaluation of Odin without learning
any deeper knowledge about go.
Best
Magnus Persson
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go