mifan0208 opened a new issue #17636: After model parameter iteration, printing the loss causes an OOM error
URL: https://github.com/apache/incubator-mxnet/issues/17636
 
 
   ## Description
   After the model parameters are updated in a training iteration, printing the loss raises an out-of-memory (OOM) error. What is the problem?
   
   ## Occurrences
   Traceback (most recent call last):
     File "train.py", line 124, in <module>
       train(train_loader)
     File "train.py", line 105, in train
       print('loss',loss_.asnumpy())
     File "/usr/local/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 1996, in asnumpy
       ctypes.c_size_t(data.size)))
     File "/usr/local/lib/python3.6/site-packages/mxnet/base.py", line 253, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [14:08:41] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory
   ## What have you tried to solve it?
   
   1. Without the print statement, the OOM error does not appear.
   2. Reducing the batch size from 10 to 1 while still printing the loss: the OOM error always appears.
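
One possible reading of observation 1, offered as an assumption rather than a confirmed diagnosis: MXNet queues operations asynchronously, and `asnumpy()` blocks until the value is ready, so a failure in an earlier operation may only be reported at the `print` line. The sketch below is a standard-library analogy in Python, not MXNet code; `allocate_big_buffer` and `run` are hypothetical names that stand in for a failing GPU operation and the training step.

```python
from concurrent.futures import ThreadPoolExecutor


def allocate_big_buffer():
    # Hypothetical stand-in for an asynchronous operation that runs out of memory.
    raise MemoryError("simulated: out of memory")


def run(sync):
    """Queue the failing work; only wait for its result when sync is True."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(allocate_big_buffer)  # returns immediately
        if not sync:
            # Nothing reads the result, so the failure is never raised here.
            return "no error seen"
        try:
            future.result()  # synchronization point, analogous to asnumpy()
        except MemoryError as exc:
            return "error surfaced at sync: {}".format(exc)
    return "completed"


if __name__ == "__main__":
    print(run(sync=False))
    print(run(sync=True))
```

If the same thing is happening in `train.py`, calling `mx.nd.waitall()` right after the forward and backward passes should make the error show up there instead of at `print`, which would point to the training step rather than the printing as the source of the allocation failure.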
   

