ptrendx commented on issue #14329: [Flaky] flaky test in 
test_operator_gpu.test_convolution_multiple_streams
URL: 
https://github.com/apache/incubator-mxnet/issues/14329#issuecomment-581507245
 
 
   Unfortunately, I don't think you will be able to reproduce this failure in 
isolation. I believe the problem with this test is that it is affected by 
other tests. What I believe happens is that the other tests use a lot of GPU 
memory, which stays cached in the caching memory allocator. This test then 
launches a new MXNet process (to test the effect of environment variables that 
are read only during MXNet initialization). That process tries to allocate 
memory and hits an out-of-memory (OOM) condition. This is not a problem for the 
other tests, because they would just free the memory stored inside the caching 
allocator. The new process does not have that option, so it simply fails, and 
that is then reported as a test failure.
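
   To illustrate the mechanism, here is a minimal sketch of how a test can 
launch a fresh interpreter so that settings read only at startup take effect. 
The helper name `run_in_fresh_process` is hypothetical, and the use of 
`MXNET_GPU_WORKER_NSTREAMS` in the usage note is my assumption about which 
variable is involved; the actual test code may look different. The child here 
does not import MXNet, it only shows that the new process starts from scratch 
and that its failure surfaces as a non-zero exit code in the parent.

```python
import os
import subprocess
import sys


def run_in_fresh_process(code, env_overrides):
    """Run `code` in a new Python interpreter with extra environment variables.

    Hypothetical helper for illustration only. The child process shares no
    in-process state with the parent, so it cannot reuse or release anything
    the parent holds in its own caches.
    """
    env = dict(os.environ)       # start from the parent's environment
    env.update(env_overrides)    # add the variables that are read at init
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # A non-zero return code is what the parent test would report as a failure.
    return result.returncode, result.stdout, result.stderr
```

For example, `run_in_fresh_process("import os; print(os.environ['MXNET_GPU_WORKER_NSTREAMS'])", {"MXNET_GPU_WORKER_NSTREAMS": "2"})` 
runs the snippet in a separate process that sees the overridden variable. If 
the child raised an allocation error instead, the parent would only see the 
non-zero return code and the text on stderr.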
   
   I believe the proper way to fix this would be to run this test in its own 
command during testing (so that no memory is held by other tests when it 
runs).
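
   A rough sketch of what that split could look like, assuming the suite is 
run with `nosetests` and that the test lives in 
`tests/python/gpu/test_operator_gpu.py` (both are assumptions on my part, and 
`build_test_commands` is a hypothetical helper, not something in the CI 
scripts). The first command runs everything except this test, and the second 
runs only this test in a separate invocation.

```python
# Assumed location of the GPU operator tests; adjust to the real CI layout.
TEST_FILE = "tests/python/gpu/test_operator_gpu.py"
ISOLATED_TEST = "test_convolution_multiple_streams"


def build_test_commands(test_file=TEST_FILE, isolated_test=ISOLATED_TEST):
    """Return two nosetests command lines (hypothetical helper).

    The first runs the file with the isolated test excluded by name; the
    second runs only the isolated test, in its own process, so no GPU memory
    is cached by other tests when it starts.
    """
    main_run = [
        "nosetests",
        "--verbose",
        "--exclude={}".format(isolated_test),
        test_file,
    ]
    isolated_run = [
        "nosetests",
        "--verbose",
        "{}:{}".format(test_file, isolated_test),
    ]
    return [main_run, isolated_run]
```

Each list could be passed to `subprocess.run` one after the other, which means 
the first process (and its caching allocator) has exited before the isolated 
test starts.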
