The paper describes 20- and 40-block networks, but the comparison section 
says AlphaGo Zero uses 20 blocks. I think your protobuf describes a 40-block 
network. That's a factor of two 😊

If you only want pro strength rather than superhuman, you can train for half 
their time.

Your estimate looks reasonable for generating the 29M games at about 10 seconds 
per move. That covers only the time to generate the input data, though. Do you 
have an estimate of the additional time needed for the training itself? It's 
probably small in comparison, but it might not be.

My plan is to start out with a little supervised learning, since I'm not trying 
to prove a breakthrough. I experimented last year for a few months with 
res-nets for a policy network and there are some things I discovered there that 
probably apply to this network. They should get perhaps a factor of 5 to 10 
speedup. For a commercial program I'll be happy with 7-dan amateur with about 6 
months of training using my two GPUs and sixteen i7 cores. 


-----Original Message-----
From: Computer-go On Behalf Of Gian-Carlo Pascutto
Sent: Friday, October 20, 2017 10:45 AM
Subject: [Computer-go] Zero performance

I reconstructed the full AlphaGo Zero network in Caffe:

I did some performance measurements, with what should be state-of-the-art on 
consumer hardware:

GTX 1080 Ti
NVIDIA-Caffe + CUDA 9 + cuDNN 7
batch size = 8

Memory use is about 2 GB. (It's much more for learning; the original minibatch 
size of 32 wouldn't fit on this card!)

Running 2000 iterations takes 93 seconds.

In the AlphaGo Zero paper, they claim 0.4 seconds to do 1600 MCTS simulations, 
and they expand 1 node per visit (if I got it right), so that would be 1600 
network evaluations as well, or 200 of my iterations at batch size 8.

So it would take me ~9.3s to produce a self-play move, compared to 0.4s for 

I would like to extrapolate how long it will take to reproduce the research, 
but I think I'm missing how many GPUs are in each self-play worker (4 TPU or 64 
GPU or ?), or perhaps the average length of the games.

Let's say the latter is around 200 moves. They generated 29 million games for 
the final result, which means it's going to take me about 1700 years to 
replicate this. I initially estimated 7 years based on the reported 64 GPU vs 1 
GPU, but this seems far worse. Did I miss anything in the calculations above, 
or was it really a *pile* of those 64 GPU machines?
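
For anyone who wants to check the arithmetic, here is a rough sketch of the 
estimate in Python. The benchmark numbers are the ones quoted above; the 200 
moves per game is the guess stated above, and the variable names are only 
illustrative.

```python
# Back-of-the-envelope self-play cost on a single GTX 1080 Ti.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

benchmark_iterations = 2000   # iterations in the benchmark run
benchmark_seconds = 93.0      # measured wall-clock time for that run
batch_size = 8                # network evaluations per iteration
evals_per_move = 1600         # one evaluation per MCTS simulation
moves_per_game = 200          # assumption: average game length
games = 29_000_000            # self-play games reported for the final result

seconds_per_iteration = benchmark_seconds / benchmark_iterations
iterations_per_move = evals_per_move / batch_size
seconds_per_move = iterations_per_move * seconds_per_iteration

total_seconds = games * moves_per_game * seconds_per_move
years_single_gpu = total_seconds / SECONDS_PER_YEAR

print(f"{seconds_per_move:.1f} s per self-play move")
print(f"{years_single_gpu:.0f} years on one GPU")
```

This counts self-play generation only; the time spent on training itself is 
not included.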

Because the performance on playing seems reasonable (you would be able to 
actually run the MCTS on a consumer machine, and hence end up with a strong 
program), I would be interested in setting up a distributed effort for this. 
But realistically there will be maybe 10 people joining, 80 if we're very lucky 
(looking at Stockfish numbers). That means it'd still take 20 to 170 years.
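
The distributed figures follow from dividing the single-GPU estimate by the 
number of contributors. A minimal sketch, assuming perfect linear scaling and 
one GTX 1080 Ti-class GPU per contributor (both optimistic); the function name 
is hypothetical.

```python
def years_with_contributors(single_gpu_years: float, contributors: int) -> float:
    """Wall-clock years if the self-play work splits evenly across contributors.

    Assumes perfect linear scaling and identical hardware per contributor.
    """
    if contributors < 1:
        raise ValueError("need at least one contributor")
    return single_gpu_years / contributors


single_gpu_years = 1700.0  # rounded estimate from the message above

for contributors in (10, 80):
    years = years_with_contributors(single_gpu_years, contributors)
    print(f"{contributors} contributors: about {years:.0f} years")
```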

Someone please tell me I missed a factor of 100 or more somewhere. I'd love to 
be wrong here.

Computer-go mailing list
