chrishkchris opened a new pull request #564: SINGA-487 Asynchronous training algorithm with partial parameters synchronization
URL: https://github.com/apache/singa/pull/564

This PR adds two things:

1. An experimental distributed-training feature in opt.py, referred to as "Asynchronous training algorithm with partial parameters synchronization".
2. Example code that trains ResNet-50 on the CIFAR-10 dataset to test the above algorithm (cifar10_multiprocess.py), plus a single-GPU training example (resnet_cifar10.py).

```
ubuntu@ip-172-31-18-205:~/singa/examples/autograd$ python3 cifar10_multiprocess.py
Starting Epoch 0:
Training loss = 3731.784668, training accuracy = 0.222937
Evaluation accuracy = 0.100761, Elapsed Time = 162.025833s
Starting Epoch 1:
Training loss = 3086.915039, training accuracy = 0.274960
Evaluation accuracy = 0.199519, Elapsed Time = 162.342712s
Starting Epoch 2:
Training loss = 2764.083984, training accuracy = 0.341326
Evaluation accuracy = 0.247997, Elapsed Time = 162.180337s
```
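
For readers unfamiliar with the idea, here is a rough sketch of what "partial parameters synchronization" can look like. This is not the code in opt.py. It is a pure-Python simulation built on an assumption: each worker updates its parameters locally, and at every step only one rotating chunk of the parameters is averaged across workers. The name `partial_sync` and the arguments `step` and `chunk_size` are hypothetical and chosen only for illustration.

```python
from typing import List


def partial_sync(workers: List[List[float]], step: int, chunk_size: int) -> range:
    """Average one rotating chunk of parameters across all workers, in place.

    workers: one flat parameter list per simulated worker (equal lengths).
    Returns the index range that was synchronized at this step.
    Hypothetical helper for illustration; not part of SINGA.
    """
    n_params = len(workers[0])
    n_chunks = (n_params + chunk_size - 1) // chunk_size  # ceiling division
    start = (step % n_chunks) * chunk_size
    chunk = range(start, min(start + chunk_size, n_params))
    for i in chunk:
        # Stand-in for an all-reduce over this slice of the parameters.
        mean = sum(w[i] for w in workers) / len(workers)
        for w in workers:
            w[i] = mean
    return chunk


# Two simulated workers whose parameters have drifted apart.
workers = [[0.0, 0.0, 0.0, 0.0, 0.0], [2.0, 4.0, 6.0, 8.0, 10.0]]
synced = partial_sync(workers, step=0, chunk_size=2)
```

In this sketch only indices 0 and 1 are averaged at step 0. The remaining parameters keep their local values until a later step rotates to their chunk, so per-step communication stays bounded by `chunk_size`.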