Vikas89 commented on issue #10992: unlink memory shared file immediately on linux
URL: https://github.com/apache/incubator-mxnet/pull/10992#issuecomment-457348280

@piiswrong can you explain more about this change and the intent behind it? The details section does not have enough information, and it would be useful if you could describe the motivation for doing this.

For this particular case: https://github.com/apache/incubator-mxnet/pull/10992#issuecomment-451534918
I have a possible explanation and a recommended way to fix it:

We saw that if the process doesn't die, we are able to access the ndarray without any trouble. I don't know how the multiprocessing queue in Python works, but the SO link told me that put doesn't guarantee that objects are pickled and in the queue. Looking at the error stack, the failure occurs in the consumer, which is trying to access a file that was created by the producer process. This means that pickling is happening when I am getting from the queue. Since the process that created that memory has already died, the memory is gone, and the user gets an error because the consumer is trying to pickle from memory that is gone.

Recommended fix:
1) Fix the code as recommended: don't let your producer die until the consumers have consumed.
2) Explicitly pickle the ndarray into the queue (see the SO link I shared). This ensures that objects are indeed placed in the queue from the mxnet memory of the process, and it removes any uncertainty about when objects are getting pickled (before the process dies or after the process dies).
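To make the two recommendations concrete, here is a minimal sketch using only the Python standard library. It is not the code from the linked report: the helper names (`put_pickled`, `get_unpickled`, `run_demo`) are hypothetical, a plain list stands in for the ndarray, and it assumes that calling `pickle.dumps` on the real object copies its contents into the byte string rather than passing along a shared-memory handle.

```python
import multiprocessing as mp
import pickle


def put_pickled(queue, obj):
    """Serialize obj in the calling process, then enqueue the bytes."""
    # pickle.dumps runs synchronously here, so serialization has finished
    # before put() is called; the queue only has to transport plain bytes.
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    queue.put(payload)


def get_unpickled(queue, timeout=None):
    """Dequeue bytes written by put_pickled and rebuild the object."""
    return pickle.loads(queue.get(timeout=timeout))


def producer(queue, consumed):
    # Stand-in payload; in the reported case this would be an ndarray.
    data = list(range(5))
    put_pickled(queue, data)      # fix 2: pickle explicitly
    consumed.wait(timeout=10)     # fix 1: stay alive until the consumer is done


def consumer(queue, consumed, results):
    results.put(sum(get_unpickled(queue, timeout=10)))
    consumed.set()                # tell the producer it may exit now


def run_demo():
    queue, results, consumed = mp.Queue(), mp.Queue(), mp.Event()
    procs = [
        mp.Process(target=producer, args=(queue, consumed)),
        mp.Process(target=consumer, args=(queue, consumed, results)),
    ]
    for p in procs:
        p.start()
    total = results.get(timeout=10)  # read the result before joining
    for p in procs:
        p.join()
    return total
```

To try it, call `run_demo()` from under an `if __name__ == "__main__":` guard so that it also works with the spawn start method. Either fix alone should be enough in principle; the sketch applies both only to show where each one goes.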
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
