oleg-trott commented on issue #17665: No speedup from using FP16 (4 times slower than PyTorch)
URL: https://github.com/apache/incubator-mxnet/issues/17665#issuecomment-592698779
 
 
   @ptrendx 
   
   Without `multi_precision`, the `mxnet.optimizer.SGD` documentation says it just uses the same precision as the weights.
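   
   For reference, a minimal sketch (assuming MXNet 1.x) of the two optimizer configurations being compared; with `multi_precision=True`, SGD keeps an FP32 master copy of the FP16 weights and applies the update in FP32:
   
   ```
   import mxnet as mx
   
   # update performed directly on the FP16 weights (multi_precision off, the default)
   opt_fp16 = mx.optimizer.SGD(learning_rate=1e-9)
   
   # FP16 weights, but the update is applied to an FP32 master copy
   opt_fp16_mp = mx.optimizer.SGD(learning_rate=1e-9, multi_precision=True)
   ```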
   
   However, the per-iteration times (in seconds) I see here are:
   
   - FP16 + `multi_precision`: 0.14
   - FP16: 0.21
   - FP32: 0.24-0.34
   
   So with FP16, not using `multi_precision` is actually slower.
   
   ```
   import os
   os.environ['MXNET_SAFE_ACCUMULATION'] = '1'  # accumulate FP16 reductions in higher precision
   
   import mxnet as mx
   from mxnet import gluon, nd, autograd
   from mxnet.gluon.model_zoo import vision
   import numpy as np
   from time import time
   
   ctx = mx.gpu(0)
   
   m = vision.resnet50_v2(pretrained=True, ctx=ctx)
   
   bs = 32*2  # batch size
   n = 224    # input height/width
   
   with ctx:
       x = nd.random.randn(bs, 3, n, n)
       target = nd.zeros(bs, dtype=np.int32)
   
   if 1:  # change this: 1 = cast model and input to FP16, 0 = keep FP32
       x = x.astype('float16')
       m.cast('float16')
   
   loss = gluon.loss.SoftmaxCrossEntropyLoss()
   
   if 1:  # change this: 1 = plain FP16 SGD, 0 = SGD with multi_precision
       args = {'learning_rate': 1e-9}
   else:
       args = {'learning_rate': 1e-9, 'multi_precision': True}
   
   opt = gluon.Trainer(m.collect_params(), 'sgd', args)
   
   for i in range(100):
       tic = time()
       with autograd.record():
           y = m(x)
           out = loss(y, target)
       out.backward()
       opt.step(batch_size=bs)
       nd.waitall()  # wait for all asynchronous GPU work so the timing is accurate
       print(time() - tic)
   ```
   
