threeleafzerg commented on issue #10696: [MXNET-366]Extend MXNet Distributed Training by MPI AllReduce
URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-386487375

@rahul003 A local batch size of 64 means each node's batch size is 64, so the global batch size is 64 * 8 = 512. The current results are CPU-only. For resnet50, the scaling efficiency in our tests was close to 99%. Our current implementation covers CPU and leaves a placeholder for GPU.
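
As a rough illustration of the arithmetic above, here is a minimal Python sketch. The function names are hypothetical, and the definition of scaling efficiency used here (multi-node throughput divided by ideal linear throughput) is an assumption, since the comment does not state how it was measured. The throughput figures in the usage lines are made-up placeholders, not results from this PR.

```python
def global_batch_size(local_batch_size: int, num_nodes: int) -> int:
    """Each node processes local_batch_size samples per step."""
    return local_batch_size * num_nodes


def scaling_efficiency(throughput_n: float, throughput_1: float, num_nodes: int) -> float:
    """Assumed definition: measured N-node throughput over ideal linear scaling."""
    return throughput_n / (num_nodes * throughput_1)


# 8 nodes with a local batch size of 64, as in the comment.
print(global_batch_size(64, 8))

# Placeholder throughputs (samples/sec), purely illustrative.
print(scaling_efficiency(throughput_n=792.0, throughput_1=100.0, num_nodes=8))
```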