Vikas89 commented on issue #10992: unlink memory shared file immediately on linux
URL: https://github.com/apache/incubator-mxnet/pull/10992#issuecomment-451330286
 
 
   @piiswrong this PR introduces a bug in gluon.  
   
   Code to repro is:
   
   ```
   from multiprocessing import Queue, Event, Process
   import mxnet as mx
   import time
   
   class Producer(object):
       
       def __init__(self, input_queue):
           self.input_queue = input_queue
           
       def fill(self, length=100):
           # Each NDArray is pickled when it goes through the queue and
           # rebuilt on the receiving side (rebuild_ndarray in the traceback).
           for i in range(length):
               self.input_queue.put(mx.nd.ones((1,1)))
   
   input_queue = Queue()
   producer = Producer(input_queue)
   # producer.fill()
   preprocessor_process = Process(target=producer.fill)
   preprocessor_process.daemon = True
   preprocessor_process.start()
   # preprocessor_process2 = Process(target=producer.fill)
   # preprocessor_process2.daemon = True
   # preprocessor_process2.start()
   
   # Read without new thread: the FileNotFoundError is raised from this loop.
   while True:
       print(input_queue.get())
   
   # Not reached in this run: the loop above either blocks or raises.
   class Consumer(object):
       
       def __init__(self, input_queue):
           self.input_queue = input_queue
           
       def read(self):
           while True:
               print(self.input_queue.get().shape)
   
   consumer = Consumer(input_queue)
   consumer_process = Process(target=consumer.read)
   consumer_process.daemon = True
   consumer_process.start()
   
   time.sleep(1000000)
   ```
   
   The error output looks like this:
   ```
   3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
   [GCC 7.3.0]
   1.3.1
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   
   [[1.]]
   <NDArray 1x1 @cpu_shared(0)>
   Traceback (most recent call last):
     File "my_test.py", line 29, in <module>
       print(input_queue.get())
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/multiprocessing/queues.py", line 113, in get
       return _ForkingPickler.loads(res)
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 57, in rebuild_ndarray
       fd = fd.detach()
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
       with _resource_sharer.get_connection(self._id) as conn:
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
       c = Client(address, authkey=process.current_process().authkey)
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/multiprocessing/connection.py", line 487, in Client
       c = SocketClient(address)
     File "/home/ubuntu/anaconda3/envs/mxnet-13-mxnet/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
       s.connect(address)
   FileNotFoundError: [Errno 2] No such file or directory
   ```
   
   This happens with Python 3.6 on Ubuntu and MXNet 1.3.1.  
   The error comes from the `fd = fd.detach()` line in `rebuild_ndarray` (`mxnet/gluon/data/dataloader.py`).  
   I believe this is a regression introduced in 1.3.
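
One possible reading of the traceback, not confirmed in this thread: on Python 3 the pickled file descriptor is fetched back from the sending process over a Unix socket (`resource_sharer.get_connection`), so `detach()` can only succeed while the sender is still alive. In the repro, the producer process exits once its 100 `put` calls are done. The sketch below uses only the standard library (no MXNet) to show that behaviour in isolation. `send_fd` and `try_detach` are hypothetical names, and the `fork` start method is assumed, so it is Unix-only.

```python
import multiprocessing as mp
import os
from multiprocessing.reduction import DupFd


def send_fd(queue, keep_alive):
    """Put a wrapped file descriptor on the queue, then optionally wait."""
    read_end, write_end = os.pipe()
    os.write(write_end, b"x")
    # DupFd registers the fd with this process's resource sharer, which
    # hands it out later over a socket owned by this process.
    queue.put(DupFd(read_end))
    if keep_alive is not None:
        keep_alive.wait()  # stay alive until the receiver has detached


def try_detach(sender_alive):
    """Return None if detach() works, else the OSError it raised."""
    ctx = mp.get_context("fork")  # assumption: Unix, as in the report
    queue = ctx.Queue()
    keep_alive = ctx.Event() if sender_alive else None
    proc = ctx.Process(target=send_fd, args=(queue, keep_alive))
    proc.start()
    handle = queue.get()
    if not sender_alive:
        proc.join()  # sender is gone before we ask for the fd
    try:
        os.close(handle.detach())
        return None
    except OSError as exc:
        return exc
    finally:
        if keep_alive is not None:
            keep_alive.set()
        proc.join()


if __name__ == "__main__":
    print("sender alive: ", try_detach(sender_alive=True))
    print("sender exited:", repr(try_detach(sender_alive=False)))
```

If this matches what happens inside MXNet, keeping the producer process alive until the consumer has read everything should avoid the error, though that has not been verified here against MXNet itself.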
   
   @piiswrong 

