chrishkchris edited a comment on pull request #697:
URL: https://github.com/apache/singa/pull/697#issuecomment-637980576


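   Concerning the error in evaluation accuracy: when I train on a single GPU, the PR branch has a bug, while the dev branch works correctly. On the PR branch the evaluation accuracy stays at about 0.10 even though the training accuracy keeps improving.
   Note that cnn is fine while resnet and xceptionnet have the problem, so I suspect batchnorm, but I am not sure.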
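
To illustrate why batchnorm is a plausible suspect, here is a minimal sketch in plain Python (it does not use the SINGA API). It assumes a batchnorm layer that normalizes with batch statistics during training and with running statistics during evaluation. The initial running values (mean 0, variance 1), the momentum of 0.9, and the synthetic activations are all assumptions made for illustration. If the running statistics are never updated, the training-mode output is normalized correctly while the evaluation-mode output is not, which would be consistent with training accuracy improving while evaluation accuracy stays at chance level. Whether this is the actual cause here is not confirmed.

```python
import math
import random


def batch_stats(x):
    """Return the mean and (biased) variance of a list of scalars."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, var


def batchnorm(x, mean, var, eps=1e-5):
    """Normalize x with the given statistics (no scale or shift)."""
    return [(v - mean) / math.sqrt(var + eps) for v in x]


random.seed(0)
# Hypothetical activations with mean around 5 and std around 3.
batch = [random.gauss(5.0, 3.0) for _ in range(1000)]

# Training mode: normalize with the statistics of the current batch.
train_out = batchnorm(batch, *batch_stats(batch))

# Evaluation mode with running statistics that were never updated
# (assumed initial values: mean 0, variance 1).
stale_eval_out = batchnorm(batch, 0.0, 1.0)

# Evaluation mode with running statistics updated during training
# (assumed momentum 0.9; the same batch is reused for simplicity).
momentum = 0.9
running_mean, running_var = 0.0, 1.0
for _ in range(200):
    m, v = batch_stats(batch)
    running_mean = momentum * running_mean + (1 - momentum) * m
    running_var = momentum * running_var + (1 - momentum) * v
good_eval_out = batchnorm(batch, running_mean, running_var)

train_mean, train_var = batch_stats(train_out)
stale_mean, stale_var = batch_stats(stale_eval_out)
good_mean, good_var = batch_stats(good_eval_out)

print("train      mean/var:", train_mean, train_var)
print("stale eval mean/var:", stale_mean, stale_var)
print("good eval  mean/var:", good_mean, good_var)
```

In this sketch only the evaluation path with stale statistics produces outputs far from zero mean and unit variance, so the layers that follow see a different input distribution at evaluation time than they saw during training.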
   
   ```
   1. this PR branch
   
   root@33804dcbc1c1:~/dcsysh/singa/examples/cnn# python3 train.py xceptionnet cifar10 --bs 16
   Starting Epoch 0:
   Training loss = 11979.167969, training accuracy = 0.159080
   Evaluation accuracy = 0.100000, Elapsed Time = 633.876963s
   Starting Epoch 1:
   Training loss = 7296.525879, training accuracy = 0.311760
   Evaluation accuracy = 0.100000, Elapsed Time = 634.936632s
   Starting Epoch 2:
   Training loss = 5394.903320, training accuracy = 0.453420
   Evaluation accuracy = 0.100000, Elapsed Time = 635.466069s
   
   root@33804dcbc1c1:~/dcsysh/singa/examples/cnn# python3 train.py resnet cifar10 --id 1 --bs 32
   Starting Epoch 0:
   Training loss = 2914.102539, training accuracy = 0.344330
   Evaluation accuracy = 0.100160, Elapsed Time = 305.759969s
   Starting Epoch 1:
   Training loss = 2065.130371, training accuracy = 0.523668
   Evaluation accuracy = 0.099860, Elapsed Time = 310.018232s
   Starting Epoch 2:
   Training loss = 1643.553833, training accuracy = 0.628781
   Evaluation accuracy = 0.100160, Elapsed Time = 310.691379s
   
   2. dev branch
   root@c414bea0e577:~/dcsysh/singa2/examples/cnn# python3 train.py resnet cifar10 --id 2 --bs 32
   Starting Epoch 0:
   Training loss = 2259.674561, training accuracy = 0.479713
   Evaluation accuracy = 0.635517, Elapsed Time = 168.079483s
   Starting Epoch 1:
   Training loss = 1466.185791, training accuracy = 0.669894
   Evaluation accuracy = 0.730068, Elapsed Time = 167.397288s
   Starting Epoch 2:
   Training loss = 1147.042358, training accuracy = 0.745018
   Evaluation accuracy = 0.775240, Elapsed Time = 166.614314s
   ```
   
   This problem has already been fixed by @dcslin in commit ab0cb13.

