I agree. Even on 19x19 you can use smaller searches. A 400-iteration MCTS is
probably already a lot stronger than the raw network, especially if you are
expanding every node (very different from a normal program at 400
playouts!). Some tuning of these mini searches is important. Surely you
don't want the first-play urgency to make the search explore every child
node... I remember this little algorithmic detail was missing from the
first paper as well.
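
To make the first-play urgency point concrete, here is a minimal sketch of
PUCT-style child selection in which unvisited children get a configurable
default value (the FPU). The names Node, select_child, c_puct and fpu are my
own, and using sqrt(total + 1) in the exploration term is an assumption made
so that priors still matter before any child has been visited; none of this
is taken from the paper.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy-network probability for this move
    visits: int = 0
    value_sum: float = 0.0       # sum of backed-up values
    children: dict = field(default_factory=dict)

    def q(self, fpu):
        # Unvisited children have no average value yet, so fall back to the FPU.
        return self.value_sum / self.visits if self.visits else fpu


def select_child(node, c_puct=1.5, fpu=0.0):
    """Return the move whose child maximises Q + U (hypothetical PUCT variant)."""
    total = sum(child.visits for child in node.children.values())
    sqrt_total = math.sqrt(total + 1)  # assumption: +1 keeps U non-zero at the start

    def score(item):
        _, child = item
        u = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return child.q(fpu) + u

    return max(node.children.items(), key=score)[0]
```

With a low fpu, unvisited children look bad until their prior term outweighs
the visited ones, so a small search concentrates on a few moves instead of
trying every child once.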

So that's a factor 32 gain: 8 from the smaller network, times 4 from using
400 instead of 1600 evaluations per move. Because the network is smaller, it
should learn much faster too. Someone on Reddit posted a comparison of 20
blocks vs 40 blocks.
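
One way to arrive at the factor of 32, as a rough sketch. The split of the
network saving into blocks and features is my assumption (convolution cost
taken as proportional to input channels times output channels); the quoted
message only gives the combined factor of about 8.

```python
# Network saving: half the blocks, half the features per layer.
block_factor = 40 / 20               # 2x fewer residual blocks
feature_factor = (256 / 128) ** 2    # 4x, assuming cost ~ in_channels * out_channels
network_factor = block_factor * feature_factor   # about 8

# Search saving: 400 network evaluations per move instead of 1600.
search_factor = 1600 / 400           # 4x

total_gain = network_factor * search_factor
print(total_gain)
```

This ignores the larger mini-batches mentioned in the quoted message and any
saving from shorter games on a smaller board.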

With 10 people you can probably get some results in a few months. The
question is, how much Elo have we lost on the way...

Another advantage would be that, as long as you keep all the SGF files, you
can bootstrap a bigger network from the data! So nothing is lost by starting
small. You can "upgrade" if the improvements start to plateau.
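
For reference, the back-of-envelope estimate from the original message
quoted below can be reproduced like this. All inputs are the figures given
there (93 seconds per 2000 iterations at batch size 8, 1600 evaluations per
move, about 200 moves per game, 29 million games); the function name and the
speedup parameter are hypothetical additions of mine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

seconds_per_iteration = 93 / 2000          # measured on one GTX 1080 Ti
iterations_per_move = 1600 / 8             # 1600 evaluations, batch size 8
seconds_per_move = seconds_per_iteration * iterations_per_move  # about 9.3 s

moves_per_game = 200                       # assumed average game length
games = 29_000_000                         # self-play games for the final result


def years_to_replicate(contributors=1, speedup=1.0):
    """Wall-clock years, assuming perfect scaling across identical GPUs."""
    total_seconds = games * moves_per_game * seconds_per_move
    return total_seconds / SECONDS_PER_YEAR / contributors / speedup


print(round(years_to_replicate(1)))    # single GPU
print(round(years_to_replicate(10)))   # 10 contributors
print(round(years_to_replicate(80)))   # 80 contributors
```

Passing speedup=32 models the scaled-down setup. Even then the full 29
million games would take years with 10 contributors, so "some results in a
few months" presumably means far fewer games than the full run.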

On Fri, Oct 20, 2017, 23:32 Álvaro Begué <alvaro.be...@gmail.com> wrote:

> I suggest scaling down the problem until some experience is gained.
>
> You don't need the full-fledged 40-block network to get started. You can
> probably get away with using only 20 blocks and maybe 128 features (from
> 256). That should save you about a factor of 8, plus you can use larger
> mini-batches.
>
> You can also start with 9x9 go. That way games are shorter, and you
> probably don't need 1600 network evaluations per move to do well.
>
> Álvaro.
>
>
> On Fri, Oct 20, 2017 at 1:44 PM, Gian-Carlo Pascutto <g...@sjeng.org>
> wrote:
>
>> I reconstructed the full AlphaGo Zero network in Caffe:
>> https://sjeng.org/dl/zero.prototxt
>>
>> I did some performance measurements, with what should be
>> state-of-the-art on consumer hardware:
>>
>> GTX 1080 Ti
>> NVIDIA-Caffe + CUDA 9 + cuDNN 7
>> batch size = 8
>>
>> Memory use is about 2 GB. (It's much more for learning; the original
>> minibatch size of 32 wouldn't fit on this card!)
>>
>> Running 2000 iterations takes 93 seconds.
>>
>> In the AlphaGo Zero paper, they claim 0.4 seconds to do 1600 MCTS
>> simulations, and they expand 1 node per visit (if I got it right), so
>> that would be 1600 network evaluations as well, or 200 of my iterations
>> at batch size 8.
>>
>> So it would take me ~9.3s to produce a self-play move, compared to 0.4s
>> for them.
>>
>> I would like to extrapolate how long it will take to reproduce the
>> research, but I think I'm missing how many GPUs are in each self-play
>> worker (4 TPU or 64 GPU or ?), or perhaps the average length of the games.
>>
>> Let's say the latter is around 200 moves. They generated 29 million
>> games for the final result, which means it's going to take me about 1700
>> years to replicate this. I initially estimated 7 years based on the
>> reported 64 GPU vs 1 GPU, but this seems far worse. Did I miss anything
>> in the calculations above, or was it really a *pile* of those 64 GPU
>> machines?
>>
>> Because the performance on playing seems reasonable (you would be able
>> to actually run the MCTS on a consumer machine, and hence end up with a
>> strong program), I would be interested in setting up a distributed
>> effort for this. But realistically there will be maybe 10 people
>> joining, 80 if we're very lucky (looking at Stockfish numbers). That
>> means it'd still take 20 to 170 years.
>>
>> Someone please tell me I missed a factor of 100 or more somewhere. I'd
>> love to be wrong here.
>>
>
>> --
>> GCP
>
>
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>

-- 

GCP