AlphaGo Zero was trained on thousands of TPUs, according to this source: https://www.reddit.com/r/baduk/comments/777ym4/alphago_zero_learning_from_scratch_deepmind/dokj1uz/?context=3
Maybe that explains the orders-of-magnitude difference you noticed?

On Fri, Oct 20, 2017 at 10:44 AM, Gian-Carlo Pascutto <g...@sjeng.org> wrote:
> I reconstructed the full AlphaGo Zero network in Caffe:
> https://sjeng.org/dl/zero.prototxt
>
> I did some performance measurements, with what should be
> state-of-the-art on consumer hardware:
>
> GTX 1080 Ti
> NVIDIA-Caffe + CUDA 9 + cuDNN 7
> batch size = 8
>
> Memory use is about ~2G. (It's much more for learning, the original
> minibatch size of 32 wouldn't fit on this card!)
>
> Running 2000 iterations takes 93 seconds.
>
> In the AlphaGo paper, they claim 0.4 seconds to do 1600 MCTS
> simulations, and they expand 1 node per visit (if I got it right) so
> that would be 1600 network evaluations as well, or 200 of my iterations.
>
> So it would take me ~9.3s to produce a self-play move, compared to 0.4s
> for them.
>
> I would like to extrapolate how long it will take to reproduce the
> research, but I think I'm missing how many GPUs are in each self-play
> worker (4 TPU or 64 GPU or ?), or perhaps the average length of the games.
>
> Let's say the latter is around 200 moves. They generated 29 million
> games for the final result, which means it's going to take me about 1700
> years to replicate this. I initially estimated 7 years based on the
> reported 64 GPU vs 1 GPU, but this seems far worse. Did I miss anything
> in the calculations above, or was it really a *pile* of those 64 GPU
> machines?
>
> Because the performance on playing seems reasonable (you would be able
> to actually run the MCTS on a consumer machine, and hence end up with a
> strong program), I would be interested in setting up a distributed
> effort for this. But realistically there will be maybe 10 people
> joining, 80 if we're very lucky (looking at Stockfish numbers). That
> means it'd still take 20 to 170 years.
>
> Someone please tell me I missed a factor of 100 or more somewhere. I'd
> love to be wrong here.
>
> --
> GCP
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
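For what it's worth, the arithmetic in the quoted message can be rechecked with a short Python sketch. Every input is a figure from the quoted message; the 200 moves per game is GCP's own guess, and the variable and function names are only illustrative. It also assumes that self-play throughput scales linearly with the number of identical machines, which is a simplification.

```python
# Back-of-the-envelope check of the estimate in the quoted message.
# Inputs come from that message; names here are illustrative only.

SECONDS_PER_YEAR = 365.25 * 24 * 3600

iterations = 2000           # benchmark iterations on the GTX 1080 Ti
benchmark_seconds = 93.0    # wall-clock time for those iterations
batch_size = 8              # network evaluations per iteration
evals_per_move = 1600       # MCTS simulations per move, one evaluation each
moves_per_game = 200        # assumed average game length (a guess)
games = 29_000_000          # self-play games reported for the final result
reported_move_seconds = 0.4 # time per move claimed in the paper

seconds_per_iteration = benchmark_seconds / iterations
iterations_per_move = evals_per_move / batch_size
seconds_per_move = iterations_per_move * seconds_per_iteration
slowdown = seconds_per_move / reported_move_seconds


def years_to_replicate(machines: int) -> float:
    """Years of self-play needed, assuming perfect linear scaling."""
    total_seconds = games * moves_per_game * seconds_per_move
    return total_seconds / machines / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"seconds per self-play move: {seconds_per_move:.1f}")
    print(f"slowdown vs. reported 0.4 s: {slowdown:.1f}x")
    for machines in (1, 10, 80):
        print(f"{machines:>2} machine(s): {years_to_replicate(machines):,.0f} years")
```

With these inputs the numbers agree with the quoted ones: about 9.3 s per move, roughly 1700 years on a single card, and somewhere between about 20 and 170 years with 80 or 10 contributors.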