[issue10037] multiprocessing.pool processes started by worker handler stops working
Changes by Richard Oudkerk shibt...@gmail.com:

resolution:  -> later
stage: patch review -> committed/rejected
status: open -> closed

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10037>
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
Ask Solem a...@celeryproject.org added the comment:

Well, I still don't know exactly why restarting the socket read made it work, but the patch solved an issue where newly started pool processes would be stuck in the socket read forever (happening to maybe 1 in 500 new processes).

This and a dozen other pool-related fixes are in my billiard fork of multiprocessing. For example, what you describe in your comment:

    # trying res.get() would block forever

works in billiard, where res.get() will raise WorkerLostError in that case.

https://github.com/celery/billiard/

Earlier commit history for the pool can be found in Celery:
https://github.com/ask/celery/commits/2.5/celery/concurrency/processes/pool.py

My eventual goal is to merge these fixes back into Python, but except for people using Python 3.x, they would have to use billiard for quite some time anyway, so I don't feel in a hurry.

I think this issue can be closed; the worker handler is simply borked, and we could open a new issue to decide how to fix it (merging billiard.Pool or something else).

(Btw, Richard, are you sbt? I was trying to find your real name to give you credit for the no_execv patch in billiard.)
Richard Oudkerk shibt...@gmail.com added the comment:

> I think this issue can be closed, the worker handler is simply borked
> and we could open up a new issue deciding how to fix it (merging
> billiard.Pool or something else).

OK. I am not sure which option under Resolution should be chosen. Later?

> (btw, Richard, you're sbt?

Yes.

> I was trying to find your real name to give you credit for the
> no_execv patch in billiard)

The execv stuff certainly won't go in by Py3.3. There has not been consensus that adding it is a good idea.

(I also have the unit tests passing with a fork server: the server process is forked at the beginning of the program, and then forked children of the server process are started on request. It is about 10 times faster than using execv, and almost as fast as simple forking.)
Ask Solem a...@celeryproject.org added the comment:

Later works, or just close it. I can open a new issue to merge the improvements in billiard later.

> The execv stuff certainly won't go in by Py3.3. There has not been
> consensus that adding it is a good idea.
>
> (I also have the unit tests passing with a fork server: the server
> process is forked at the beginning of the program and then forked
> children of the server process are started on request. It is about 10
> times faster than using execv, and almost as fast as simple forking.)

Ah, a working 'fork server' would be just as good.

Btw, billiard now supports running the Pool without threads, using epoll/kqueue/select instead. So Celery uses that when it can be non-blocking, and execv when it can't. It performs way better without threads, and in addition shutdown and replacing worker processes are much more responsive.

Changing the default Pool is not going to happen, but including a simple select()-based Pool would be possible, and then it could also easily work with Twisted, Eventlet, Gevent, etc. (especially now that Connection is rewritten in pure Python).
Richard Oudkerk shibt...@gmail.com added the comment:

> Ah, a working 'fork server' would be just as good.

Only problem is that it depends on fd passing, which is apparently broken on Mac OS X.

> Btw, billiard now supports running the Pool without threads, using
> epoll/kqueue/select instead. So Celery uses that when it can be
> non-blocking, and execv when it can't. It performs way better without
> threads, and in addition shutdown and replacing worker processes are
> much more responsive.

If it were not for Windows, I would have tried to avoid using threads.
Changes by Richard Oudkerk shibt...@gmail.com:

nosy: +sbt
Richard Oudkerk shibt...@gmail.com added the comment:

It is not clear to me how to reproduce the bug. When you say "letting the workers terminate themselves", do you mean calling sys.exit() or os._exit() in the submitted task? Are you trying to get the result of a task which caused the worker to exit?

I'm not sure how the patch would change the current behaviour. The following seems to work for me:

    import sys, os
    import multiprocessing as mp

    if __name__ == '__main__':
        p = mp.Pool(4, maxtasksperchild=5)
        results = []
        for i in range(100):
            if i % 10 == 0:
                results.append(p.apply_async(sys.exit))
            else:
                results.append(p.apply_async(os.getpid))
        for i, res in enumerate(results):
            if i % 10 != 0:
                print(res.get())
            else:
                pass  # trying res.get() would block forever
Sean Reifschneider j...@tummy.com added the comment:

The attached patch does change the semantics somewhat, but I don't fully understand how much. In particular:

- It changes the get() call to be turned into get(timeout=1.0) if inqueue doesn't have a _reader attribute.
- In the case that inqueue does have a _reader attribute and inqueue._reader.poll(timeout) is false, get() isn't called at all: it introduces a continue.

I'd want Jesse to pronounce on this.

assignee: -> jnoller
nosy: +jafo, jnoller
Changes by Terry J. Reedy tjre...@udel.edu:

versions: -Python 3.1
Changes by Nir Aides n...@winpdb.org:

nosy: +nirai
Ray.Allen ysj@gmail.com added the comment:

Could you give example code which can reproduce this issue?

nosy: +ysj.ray
New submission from Ask Solem a...@opera.com:

While working on an autoscaling (yes, people call it that...) feature for Celery, I noticed that the processes created by the _handle_workers thread don't always work. I have reproduced this in general by just using the maxtasksperchild feature and letting the workers terminate themselves, so this seems to have always been an issue (just not easy to reproduce unless workers are created with some frequency).

I'm not quite sure of the reason yet, but I finally managed to track it down to the workers being stuck in a receive from the queue. The attached patch seems to resolve the issue by polling the queue before trying to receive.

I know this is short; I may have more data later.

components: Library (Lib)
files: multiprocessing-worker-poll.patch
keywords: needs review, patch
messages: 118062
nosy: asksol
priority: critical
severity: normal
stage: patch review
status: open
title: multiprocessing.pool processes started by worker handler stops working
type: behavior
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3
Added file: http://bugs.python.org/file19139/multiprocessing-worker-poll.patch