PistonY commented on issue #13709: Why FP16 training speed is too slow on Tesla T4 in Gluon? URL: https://github.com/apache/incubator-mxnet/issues/13709#issuecomment-458792488

I tried a fixed input size: FP32 works well, but FP16 runs out of memory. This is my script:

```python
import time

import mxnet as mx
from mxnet import nd, autograd, gluon
from mxnet.gluon import loss as gloss
from gluoncv.model_zoo import resnet101_v2

ctx = mx.gpu(0)
data = nd.random.normal(shape=(64, 3, 224, 224), ctx=ctx)
label = nd.random.randint(low=0, high=1, shape=(64, 1), ctx=ctx)

net = resnet101_v2()
net.hybridize()
net.initialize(ctx=ctx)
net(data)  # warm-up forward pass to trigger shape inference / allocation

test_num = 500
dtype = 'float16'  # 'float32' or 'float16'
if dtype != 'float32':
    net.cast(dtype)

Loss = gloss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'nag',
                        {'learning_rate': 0.1,
                         'momentum': 0.9,
                         'multi_precision': True})  # keep FP32 master weights when fp16 is enabled

sta = time.time()
for _ in range(test_num):
    with autograd.record():
        output = net(data.astype(dtype, copy=False))
        loss = Loss(output, label.astype(dtype, copy=False))
    loss.backward()
    trainer.step(64)  # batch size is 64
nd.waitall()  # MXNet is asynchronous; wait for all pending work before stopping the timer
end = time.time()
print(end - sta)
```

mxnet version is 1.5.0 (--pre).

When training with FP32, it costs 9921 MB of memory and takes 75 s. But when I tested with FP16, memory usage grew continuously from 7000 MB until out of memory. I don't know why; it looks like memory is not being freed.
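As an aside on the `multi_precision: True` flag used above: the usual rationale for a mixed-precision trainer is that small weight updates can round away entirely in FP16, so an FP32 "master copy" of the weights is kept for the optimizer step. The sketch below is a NumPy-only illustration of that effect, not MXNet's actual trainer code; the values (`lr`, `grad`, step count) are made up for demonstration.

```python
import numpy as np

lr = np.float32(0.01)
grad = np.float32(0.01)  # per-step update of lr * grad = 1e-4

# Pure FP16 update: near 1.0, FP16 spacing is about 9.8e-4, so a 1e-4 step
# rounds back to the same value and the weight never moves.
w16 = np.float16(1.0)
for _ in range(100):
    w16 = np.float16(w16 - np.float16(lr) * np.float16(grad))

# Mixed precision: accumulate the update in an FP32 master weight,
# then cast down to FP16 only for compute.
w32 = np.float32(1.0)
for _ in range(100):
    w32 = w32 - lr * grad
w_mixed = np.float16(w32)

print(w16)      # still 1.0 -- every update was lost to FP16 rounding
print(w_mixed)  # ~0.99  -- the accumulated updates survive in FP32
```

This is why disabling `multi_precision` with FP16 typically stalls training even when speed and memory look fine.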
