AlphaGo Zero was trained on thousands of TPUs, according to this source:
https://www.reddit.com/r/baduk/comments/777ym4/alphago_zero_learning_from_scratch_deepmind/dokj1uz/?context=3

Maybe that explains the orders-of-magnitude difference you noticed?


On Fri, Oct 20, 2017 at 10:44 AM, Gian-Carlo Pascutto <g...@sjeng.org> wrote:

> I reconstructed the full AlphaGo Zero network in Caffe:
> https://sjeng.org/dl/zero.prototxt
>
> I did some performance measurements, with what should be
> state-of-the-art on consumer hardware:
>
> GTX 1080 Ti
> NVIDIA-Caffe + CUDA 9 + cuDNN 7
> batch size = 8
>
> Memory use is about 2 GB. (It's much more for training; the original
> minibatch size of 32 wouldn't fit on this card!)
>
> Running 2000 iterations takes 93 seconds.
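
For what it's worth, a timing loop along these lines can be sketched with
pycaffe; the prototxt filename, GPU index, and the use of pycaffe at all are
assumptions on my part, not necessarily how the numbers above were measured:

    import time
    import caffe

    # Assumptions: GPU 0, inference only, the net defined by the linked
    # zero.prototxt (batch size 8 baked into the input layer), random weights.
    caffe.set_mode_gpu()
    caffe.set_device(0)
    net = caffe.Net('zero.prototxt', caffe.TEST)

    iterations = 2000
    start = time.time()
    for _ in range(iterations):
        net.forward()  # one forward pass = one batch of 8 positions
    elapsed = time.time() - start
    print('%d iterations in %.1f s (%.2f ms per batch)'
          % (iterations, elapsed, 1000.0 * elapsed / iterations))
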
>
> In the AlphaGo Zero paper, they claim 0.4 seconds to do 1600 MCTS
> simulations, and they expand 1 node per visit (if I got that right), so
> that would also be 1600 network evaluations, or 200 of my batch-of-8
> iterations.
>
> So it would take me ~9.3s to produce a self-play move, compared to 0.4s
> for them.
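
Spelling that arithmetic out, using the 93 s / 2000 iterations and batch size
8 from above, and assuming one network evaluation per MCTS simulation:

    # Measured: 2000 forward passes of batch 8 in 93 s.
    secs_per_batch = 93.0 / 2000            # ~0.0465 s per batch of 8
    evals_per_move = 1600                   # 1 network evaluation per simulation
    batches_per_move = evals_per_move / 8   # = 200 iterations
    secs_per_move = batches_per_move * secs_per_batch
    print(secs_per_move)        # ~9.3 s per self-play move
    print(secs_per_move / 0.4)  # ~23x slower than the reported 0.4 s
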
>
> I would like to extrapolate how long it will take to reproduce the
> research, but I think I'm missing how many GPUs are in each self-play
> worker (4 TPU or 64 GPU or ?), or perhaps the average length of the games.
>
> Let's say the latter is around 200 moves. They generated 29 million
> games for the final result, which means it's going to take me about 1700
> years to replicate this. I initially estimated 7 years based on the
> reported 64 GPU vs 1 GPU, but this seems far worse. Did I miss anything
> in the calculations above, or was it really a *pile* of those 64 GPU
> machines?
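
The 1700-year figure follows directly, assuming ~200 moves per game as above:

    games = 29e6            # self-play games reported for the final run
    moves_per_game = 200    # assumed average game length
    secs_per_move = 9.3     # single GTX 1080 Ti, from the estimate above
    total_secs = games * moves_per_game * secs_per_move
    print(total_secs / (365.25 * 24 * 3600))  # ~1700 years on one GPU
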
>
> Because the performance on playing seems reasonable (you would be able
> to actually run the MCTS on a consumer machine, and hence end up with a
> strong program), I would be interested in setting up a distributed
> effort for this. But realistically there will be maybe 10 people
> joining, 80 if we're very lucky (looking at Stockfish numbers). That
> means it'd still take 20 to 170 years.
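
And the distributed range is just the single-GPU estimate divided by the
number of contributors:

    single_gpu_years = 1700.0
    for contributors in (10, 80):
        print(contributors, round(single_gpu_years / contributors))
    # -> ~170 years with 10 contributors, ~21 years with 80
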
>
> Someone please tell me I missed a factor of 100 or more somewhere. I'd
> love to be wrong here.
>
> --
> GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
