rohith14 opened a new issue #7813: Performance doesn't improve (scalability 
issue) with # GPUs with running train_imagenet.py
URL: https://github.com/apache/incubator-mxnet/issues/7813
 
 
   While training the AlexNet CNN on ImageNet data, I don't see any performance 
improvement (in fact, I see a slight performance degradation) as the number of 
GPUs increases.
   
   python train_imagenet.py \
       --data-train /local/ImageNet/MXNet_data/MXNet_data.rec \
       --data-val /local/ImageNet/MXNet_data/MXNet_data_test.rec \
       --gpus 0,1,2,3 --network alexnet \
       --batch-size 256 --num-epochs 1 --kv-store device
   
   Time cost per epoch (batch size per GPU: 64):
   With 1 GPU:  910 sec
   With 2 GPUs: 924 sec
   With 4 GPUs: 964 sec
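
   To put numbers on the scaling problem, here is a minimal sketch that turns 
the epoch times above into speedup and parallel efficiency relative to the 
1-GPU run. It assumes every run covers the same number of images per epoch; 
the names epoch_seconds and scaling_report are only illustrative and are not 
part of train_imagenet.py.

```python
# Epoch times in seconds, copied from the measurements above (GPUs -> seconds).
epoch_seconds = {1: 910, 2: 924, 4: 964}


def scaling_report(times):
    """Return {gpus: (speedup, efficiency)} relative to the 1-GPU run.

    Assumes each run processes the same number of images per epoch, so a
    shorter epoch time directly means higher throughput.
    """
    baseline = times[1]
    report = {}
    for gpus, seconds in sorted(times.items()):
        speedup = baseline / seconds   # ideal value would be `gpus`
        efficiency = speedup / gpus    # ideal value would be 1.0
        report[gpus] = (speedup, efficiency)
    return report


for gpus, (speedup, efficiency) in scaling_report(epoch_seconds).items():
    print(f"{gpus} GPU(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

   With ideal scaling the speedup would equal the GPU count; here it stays 
just below 1x for both 2 and 4 GPUs.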
   
   I have 4 Titan Xps
   
   However, with synthetic data (as shown in the demo at 
https://github.com/apache/incubator-mxnet/blob/master/example/image-classification/README.md),
 I see good scalability.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
