andrei5055 commented on issue #17335: Excessive GPU memory usage with dynamic 
shape input using Gluon interface
URL: 
https://github.com/apache/incubator-mxnet/issues/17335#issuecomment-613179251
 
 
   @Jerryzcn, @szha, thanks so much for your suggestions! I tried them, and 
here is what I saw in my experiments.
   1. I did not notice any benefit from calling `mx.gpu(0).empty_cache()`. On 
the contrary, I sometimes saw new crashes whose error message pointed to this 
call.
   2. Yes, I saw a slowdown of about 25-30% when using `mx.nd.waitall()`. 
Unfortunately, it does not solve the original GPU memory problem. 
   3. What makes this script approximately 30% faster is adding `del 
data_loader` after the inner loop in:
   ```
   for epoch in range(10):
       logger.info('Current batch size is: %d' % batchsize)
       os.system('free >> freeMemory.txt')
       # A new loader is built every epoch because the batch size changes
       data_loader = DataLoader(dataset, batch_size=batchsize,
                                batchify_fn=Tuple([Pad(pad_val=0), Pad(pad_val=0)]),
                                num_workers=8)
       mx.gpu(0).empty_cache()
       btic = time.time()
       etic = time.time()
       for i, (src_data, dst_data) in enumerate(data_loader):
           . . .
           # Begin to update the parameter
           trainer.step(batchsize)
           . . .
       # Drop the loader before the next epoch creates a new one
       del data_loader
       batchsize *= 2
   ```
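
One possible reason `del data_loader` helps, offered as an assumption and not confirmed anywhere in this thread: when a name is rebound in CPython, the new object is fully constructed before the old one is released, so without the `del` two loaders (each with its own workers) can briefly exist at the same time. The sketch below illustrates only that lifetime effect. `FakeLoader` and `run` are hypothetical stand-ins, not MXNet classes.

```python
class FakeLoader:
    """Hypothetical stand-in for an object that owns expensive resources."""
    live = 0   # instances currently alive
    peak = 0   # largest number of instances alive at the same time

    def __init__(self):
        FakeLoader.live += 1
        FakeLoader.peak = max(FakeLoader.peak, FakeLoader.live)

    def __del__(self):
        FakeLoader.live -= 1


def run(epochs, delete_after_epoch):
    """Recreate the loader every epoch; return the peak number alive at once."""
    FakeLoader.live = 0
    FakeLoader.peak = 0
    for _ in range(epochs):
        loader = FakeLoader()   # the old object, if any, is released only after this returns
        # ... the inner loop over batches would go here ...
        if delete_after_epoch:
            del loader          # release the loader before the next one is built
    return FakeLoader.peak


print("peak without del:", run(3, delete_after_epoch=False))
print("peak with del:   ", run(3, delete_after_epoch=True))
```

This relies on CPython's reference counting, where `del` on the last reference frees the object immediately. Whether the same overlap explains the timings reported here is a guess.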
   
   Here is some evidence for this claim.
   Initial version, run with datasetSize = 1000:
   ```
   [Epoch 0], Speed: 95.212 samples/sec
   [Epoch 1], Speed: 186.922 samples/sec
   [Epoch 2], Speed: 312.916 samples/sec
   [Epoch 3], Speed: 434.118 samples/sec
   [Epoch 4], Speed: 405.835 samples/sec
   [Epoch 5], Speed: 222.373 samples/sec
   [Epoch 6], Speed: 18.799 samples/sec
   [Epoch 7], Speed: 88.688 samples/sec
   [Epoch 8], Speed: 371.118 samples/sec
   [Epoch 9], Speed: 75.738 samples/sec
    1.993E+02 GPU stress test elapsed time
   ```
   With `del data_loader` on the same dataset:
   ```
   [Epoch 0], Speed: 93.743 samples/sec
   [Epoch 1], Speed: 180.014 samples/sec
   [Epoch 2], Speed: 298.007 samples/sec
   [Epoch 3], Speed: 405.286 samples/sec
   [Epoch 4], Speed: 527.154 samples/sec
   [Epoch 5], Speed: 633.825 samples/sec
   [Epoch 6], Speed: 30.708 samples/sec
   [Epoch 7], Speed: 64.030 samples/sec
   [Epoch 8], Speed: 207.408 samples/sec
   [Epoch 9], Speed: 140.913 samples/sec
    1.368E+02 GPU stress test elapsed time
   ```
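
The "approximately 30% faster" figure can be checked against the two elapsed times reported above. This is a small arithmetic sketch; the variable names are illustrative, and the unit of the elapsed time is not stated in the logs, so only the ratio is computed.

```python
baseline_elapsed = 1.993e2   # elapsed time reported for the initial version
with_del_elapsed = 1.368e2   # elapsed time reported with `del data_loader`

# Fraction of the original run time that was saved
reduction = (baseline_elapsed - with_del_elapsed) / baseline_elapsed
# How many times faster the whole run finished
speedup = baseline_elapsed / with_del_elapsed

print(f"elapsed time reduced by {reduction:.1%}, speedup {speedup:.2f}x")
```

The gain is in total elapsed time. Individual epochs do not all improve: epochs 7 and 8, for example, report lower samples/sec in the run with `del data_loader`.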
   

