chrishkchris edited a comment on issue #535: SINGA-490 Optimize performance of 
stochastic gradient descent (SGD)
   Since I have made some optimizations to the framework, I need to retest the 
distributed performance: 
   I tested distributed training on an AWS p2.8xlarge instance after adding the 
Sync() in the SGD loop of and
   The speedup with 8 GPUs is now 7.21x, although this comparison was made 
without real data feeding. 
   See the following throughput comparison in and
   ubuntu@ip-172-31-28-231:~/incubator-singa/examples/autograd$ python3
   Start intialization............
 100/100 [01:23<00:00,  1.19it/s]
   Throughput = 38.13589358185999 per second
   Total=0.8391045022010803, forward=0.26401839971542357, 
softmax=0.0020227289199829103, backward=0.5730633735656739, 
/home/ubuntu/mpich-3.3/build/bin/mpiexec --hostfile host_file python3
   Start intialization...........
   100%|██████████| 100/100 [01:33<00:00,  1.08it/s]
   Throughput = 274.9947180123401 per second
   Total=0.9309269714355469, forward=0.2690380573272705, 
softmax=0.0021610450744628906, backward=0.6597278690338135, 
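
As a rough check of the 7.21x figure, the numbers in the two logs above can be compared directly. This is only a sketch: the values are copied from the logs, the dictionary and variable names are made up for illustration, and the timings are assumed to be seconds per iteration (which matches the roughly 1.19 it/s and 1.08 it/s shown by the progress bars).

```python
# Values copied from the two log excerpts above (1 GPU vs. 8 GPUs via mpiexec).
# Assumption: the Total/forward/softmax/backward figures are seconds per iteration.
single_gpu = {
    "throughput": 38.13589358185999,
    "total": 0.8391045022010803,
    "forward": 0.26401839971542357,
    "softmax": 0.0020227289199829103,
    "backward": 0.5730633735656739,
}
eight_gpus = {
    "throughput": 274.9947180123401,
    "total": 0.9309269714355469,
    "forward": 0.2690380573272705,
    "softmax": 0.0021610450744628906,
    "backward": 0.6597278690338135,
}
num_gpus = 8

# Speedup is the ratio of throughputs; efficiency is speedup per GPU.
speedup = eight_gpus["throughput"] / single_gpu["throughput"]
efficiency = speedup / num_gpus

# Extra time per iteration in the 8-GPU run, broken down by phase.
phases = ("total", "forward", "softmax", "backward")
overhead = {p: eight_gpus[p] - single_gpu[p] for p in phases}

print(f"speedup    = {speedup:.2f}x")
print(f"efficiency = {efficiency:.1%}")
for p in phases:
    print(f"{p:>8} overhead = {overhead[p]:.4f} s/iteration")
```

The ratio comes out at about 7.21x, or roughly 90% scaling efficiency on 8 GPUs. Almost all of the extra time per iteration shows up in the backward phase, which would be consistent with gradient synchronization cost, although the logs alone do not confirm the cause.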

