PistonY opened a new issue #13709: Why FP16 training speed is too slow on Tesla T4 in Gluon?
URL: https://github.com/apache/incubator-mxnet/issues/13709
 
 
   Hi, I tried to train with FP16 on a Tesla T4, but it is slower than a GTX 1070 running FP32.
   Could you please give me some suggestions on how to fix this?
   The T4 runs mxnet-cu100mkl and the GTX 1070 runs mxnet-cu90mkl.
   Here are my script and logs:
   code: https://gist.github.com/PistonY/8dfcefdc46b747afd4d18b37f9a18665
   logs:
   T4 log:
   ```
   INFO:root:Iter 390. Loss: 2.14372, Train RMSE 0.23653.Time 00:05:47.lr 0.019948717948717953
   INFO:root:Test Loss: 1.935017, Test acc 0.327200.
   INFO:root:Iter 780. Loss: 1.89404, Train RMSE 0.22111.Time 00:05:52.lr 0.03994871794871795
   INFO:root:Test Loss: 1.460350, Test acc 0.473100.
   INFO:root:Iter 1170. Loss: 1.72982, Train RMSE 0.20837.Time 00:05:49.lr 0.05994871794871795
   INFO:root:Test Loss: 1.288763, Test acc 0.559500.
   INFO:root:Iter 1560. Loss: 1.57620, Train RMSE 0.19388.Time 00:05:48.lr 0.07994871794871795
   INFO:root:Test Loss: 1.856537, Test acc 0.530100.
   ```
   GTX 1070 log:
   ```
   INFO:root:Epoch 0, Iter 390. Loss: 2.12699, Train RMSE 0.23722.Time 00:03:00.lr 0.019948717948717953
   INFO:root:Test Loss: 1.746372, Test acc 0.361800.
   ```
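   For reference, a minimal sketch of the usual Gluon mixed-precision recipe, since a missing step here is a common cause of slow FP16 runs. This is not the code from the gist above; the network, shapes, and hyperparameters are illustrative placeholders. The key pieces are `net.cast('float16')`, FP16 input data, and `multi_precision=True` so SGD keeps an FP32 master copy of the weights:
   ```python
   import mxnet as mx
   from mxnet import gluon, nd, autograd

   ctx = mx.gpu(0)  # a T4 in this issue's setup

   # Illustrative toy network, not the model from the gist.
   net = gluon.nn.HybridSequential()
   net.add(gluon.nn.Dense(128, activation='relu'))
   net.add(gluon.nn.Dense(10))
   net.cast('float16')                      # run weights and compute in FP16
   net.initialize(mx.init.Xavier(), ctx=ctx)
   net.hybridize(static_alloc=True)

   # multi_precision keeps an FP32 copy of the weights for the update,
   # the standard recipe for numerically stable FP16 training.
   trainer = gluon.Trainer(net.collect_params(), 'sgd',
                           {'learning_rate': 0.02, 'momentum': 0.9,
                            'multi_precision': True})

   loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

   # Inputs must be cast to FP16 to match the network; feature and batch
   # dimensions that are multiples of 8 help Tensor Core kernels engage.
   data = nd.random.uniform(shape=(64, 512), ctx=ctx).astype('float16')
   label = nd.floor(nd.random.uniform(0, 10, shape=(64,), ctx=ctx))

   with autograd.record():
       out = net(data)
       loss = loss_fn(out, label)
   loss.backward()
   trainer.step(64)
   ```
   If all of the above is already in place, other things worth checking are whether the model's layer sizes are multiples of 8 (otherwise Tensor Cores may not be used on the T4) and whether the model is simply too small for FP16 kernel launch overhead to pay off.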

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
