PistonY commented on issue #13709: Why FP16 training speed is too slow on Tesla T4 in Gluon? URL: https://github.com/apache/incubator-mxnet/issues/13709#issuecomment-458847487

I also tried running only the forward pass:

```python
import time
from mxnet import autograd

# net, data, dtype, and test_num are defined earlier in the training script.
sta = time.time()
for _ in range(test_num):
    with autograd.record():
        output = net(data.astype(dtype, copy=False))
        # loss = Loss(output, label.astype(dtype, copy=False))
        # loss.backward()
        # trainer.step(128)
end = time.time()
```

FP32 costs 7.83, while FP16 costs 18.9.
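Since MXNet executes operators asynchronously, a wall-clock measurement taken right after the loop may not reflect the actual forward time unless the engine is synchronized first. The sketch below is a minimal variant of the snippet above, assuming the same `net`, `data`, `dtype`, and `test_num` from the original script; it adds `mx.nd.waitall()` calls so the timed interval covers only completed forward passes.

```python
import time
import mxnet as mx
from mxnet import autograd

# net, data, dtype, and test_num are assumed to come from the original script.
mx.nd.waitall()   # drain any previously queued asynchronous work before timing
sta = time.time()
for _ in range(test_num):
    with autograd.record():
        output = net(data.astype(dtype, copy=False))
mx.nd.waitall()   # block until every queued forward pass has actually finished
end = time.time()
print('forward-only time: %.2f' % (end - sta))
```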
