YutingZhang opened a new issue #13521: Gluon DataLoader cannot release the processes in the pool
URL: https://github.com/apache/incubator-mxnet/issues/13521

https://github.com/apache/incubator-mxnet/blob/f2dcd7c7b8676b55d912997fc3f9c62c55915307/python/mxnet/gluon/data/dataloader.py#L532-L533

Logically, when a `DataLoader` is recycled, its `_worker_pool` should be recycled too, and the pool's `terminate()` function should be called immediately. However, it does not ... Each time I kill a `DataLoader`, it leaves the worker processes dangling.

I guess it is a bug in Python's `multiprocessing.Pool`. Anyway, I think we can patch it by explicitly calling `_worker_pool.terminate()`.

Minimum code to reproduce the error:

```python
import mxnet as mx
import numpy as np

A = np.random.rand(999, 2000)
# num_workers=2 makes the DataLoader create a worker pool
D = mx.gluon.data.DataLoader(A, batch_size=8, num_workers=2)
the_iter = iter(D)
next(the_iter)
# explicitly terminate the worker pool
D._worker_pool.terminate()
del the_iter
del D
```

I recorded a video demo of this bug: https://drive.google.com/open?id=1q4CmU_F1vAtxoZ_KUmrIEfVRk3RsQfv8

Environment: today's mxnet from pip, python3.6 on p3
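
To illustrate the explicit-termination idea without MXNet, here is a minimal sketch that uses only the standard library. `PooledLoader`, its `fn` and `context` parameters, and its `close()` method are hypothetical names made up for this example; this is not MXNet's `DataLoader` and not necessarily how the actual fix should look. It only shows an object that owns a `multiprocessing.Pool` and shuts it down deterministically instead of waiting for garbage collection.

```python
import multiprocessing as mp


class PooledLoader:
    """Hypothetical stand-in for a loader that owns a worker pool."""

    def __init__(self, data, fn, num_workers=2, context=None):
        # `fn` must be picklable, since it is sent to the worker processes.
        self._data = list(data)
        self._fn = fn
        # Use the given multiprocessing context, or the platform default.
        ctx = context if context is not None else mp.get_context()
        self._worker_pool = ctx.Pool(num_workers)

    def __iter__(self):
        # imap yields results in input order, computed by the workers.
        return iter(self._worker_pool.imap(self._fn, self._data))

    def close(self):
        """Terminate the workers now; calling this twice is harmless."""
        pool = getattr(self, "_worker_pool", None)
        self._worker_pool = None
        if pool is not None:
            pool.terminate()  # stop the worker processes immediately
            pool.join()       # wait until they have actually exited

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def __del__(self):
        # Fallback only: garbage collection timing is not guaranteed.
        self.close()
```

With this shape, a caller would either use the object in a `with` block or call `close()` before dropping the last reference, so the worker processes are gone by the time the object is deleted rather than left dangling.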
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
