ChaiBapchya opened a new issue #16447: Large tensor nightly test memory failures
URL: https://github.com/apache/incubator-mxnet/issues/16447
 
 
   Currently, running the large tensor tests (CPU-specific) produces a memory 
footprint that exceeds the memory available on the machine where the nightly 
tests run (a C5 instance with <150G).
   
   Attempts made:
   Despite calling nd.waitall() and empty_cache(), the memory footprint keeps 
growing gradually as nosetests runs one test after another. For reference, see 
this commit:
   
https://github.com/apache/incubator-mxnet/pull/16328/commits/c48f70f2e71c7d4b5b6cc3bf4ff6a37710d74d9a
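
   One way to confirm the gradual growth is to record the process's resident 
memory around each test. The following is only a minimal sketch, not code from 
the linked commit: `rss_mb` and `run_with_memory_report` are hypothetical helper 
names, and the current-RSS reading assumes a Linux host that exposes 
/proc/self/statm (elsewhere it falls back to peak RSS, which can only rise).

```python
import gc
import os
import resource
import sys


def rss_mb():
    """Resident set size of this process in MiB.

    Current RSS on Linux (via /proc/self/statm); peak RSS elsewhere.
    """
    try:
        with open("/proc/self/statm") as f:
            # Second field of statm is the number of resident pages.
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20
    except OSError:
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
        return peak / (2**20 if sys.platform == "darwin" else 2**10)


def run_with_memory_report(test_fn, cleanup=None):
    """Run test_fn, then cleanup, and return (result, RSS change in MiB)."""
    before = rss_mb()
    try:
        result = test_fn()
    finally:
        if cleanup is not None:
            # Hypothetical hook, e.g. a function that calls mx.nd.waitall()
            # followed by mx.cpu().empty_cache().
            cleanup()
        gc.collect()
    return result, rss_mb() - before
```

   If the reported change stays positive test after test even with the cleanup 
hook in place, the memory is not being returned to the OS between tests.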
   
   TODO:
   - Add a C API to release CPU memory (similar to ReleaseAll).
   @anirudh2290 pointed out that empty_cache() is a no-op for the CPU context, 
so a correct implementation is needed for CPU-specific use cases.
   
   The large tensor tests are maintained here:
   
   
https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/test_large_array.py
   
https://github.com/apache/incubator-mxnet/blob/master/tests/nightly/test_large_vector.py
   
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
