Ask Solem <a...@celeryproject.org> added the comment:

Well, I still don't know exactly why restarting the socket read made it work, but the patch solved an issue where newly started pool processes would get stuck in a socket read forever (this happened to maybe 1 in 500 new processes).

This and a dozen other pool-related fixes are in billiard, my fork of multiprocessing. For example, the case you describe in your comment:

    # trying res.get() would block forever

works in billiard, where res.get() raises WorkerLostError instead.

https://github.com/celery/billiard/

Earlier commit history for the pool can be found in Celery:

https://github.com/ask/celery/commits/2.5/celery/concurrency/processes/pool.py

My eventual goal is to merge these fixes back into Python, but everyone except people using Python 3.x would have to use billiard for quite some time anyway, so I don't feel in a hurry.

I think this issue can be closed. The worker handler is simply borked, and we could open a new issue to decide how to fix it (merging billiard.Pool or something else).

(By the way, Richard, you're sbt? I was trying to find your real name to give you credit for the no_execv patch in billiard.)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10037>
_______________________________________