[issue38084] multiprocessing cannot recover from crashed worker

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment:
> Agreed with @ppperry that this is a duplicate of issue22393.
Ok, in that case I close this issue as a duplicate of bpo-22393. There is no need to duplicate the discussion here :-)
--
resolution: -> duplicate
stage: -> resolved
status: open ->

2019-09-10 Thread Davin Potts
Davin Potts added the comment: Agreed with @ppperry that this is a duplicate of issue22393. The proposed patch in issue22393 is, for the moment, out of sync with more recent changes. That patch's approach would result in the loss of all partial results from a Pool.map, but it may be faster

2019-09-10 Thread ppperry
ppperry added the comment: Is this not a duplicate of issue22393? -- nosy: +ppperry

2019-09-10 Thread Davin Potts
Davin Potts added the comment: Thanks to Pablo's good work with implementing the use of multiprocessing's Process.sentinel, the logic for handling PoolWorkers that die has been centralized into Pool._maintain_pool(). If _maintain_pool() can also identify which job died with the dead
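As a rough illustration of the Process.sentinel mechanism mentioned above, here is a minimal sketch. It is not the code in Pool._maintain_pool(); the helper names `_crash` and `watch_child` are hypothetical, and the `start_method` parameter is only an assumption added so the start method can be chosen explicitly. It shows that a sentinel becomes ready even when the child exits through os._exit() without reporting anything itself.

```python
import multiprocessing
import os
from multiprocessing.connection import wait


def _crash(code):
    # Hypothetical stand-in for a worker that dies abruptly:
    # os._exit() skips all cleanup, so nothing is reported back.
    os._exit(code)


def watch_child(code, timeout=10.0, start_method=None):
    """Start a child that exits abruptly and return its exit code.

    start_method=None uses the platform default start method.
    """
    ctx = multiprocessing.get_context(start_method)
    proc = ctx.Process(target=_crash, args=(code,))
    proc.start()
    # The sentinel becomes ready when the process ends, however it ended.
    ready = wait([proc.sentinel], timeout=timeout)
    if not ready:
        # The child did not exit in time; stop it so join() cannot block.
        proc.terminate()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    print(watch_child(3))
```

The parent learns only that the process ended and what its exit code was. Which job the process was running at the time is separate bookkeeping, which is the open question raised in the comment above.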

2019-09-10 Thread Davin Potts
Davin Potts added the comment: Sharing for the sake of documenting a few things going on in this particular example:
* When a PoolWorker process exits in this way (os._exit(anything)), the PoolWorker never gets the chance to send a signal of failure (normally sent via the outqueue) to the
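Because no failure signal ever arrives, a caller that waits without a deadline blocks indefinitely. The sketch below is one possible caller-side workaround, not something proposed in this thread: it submits tasks with apply_async() and bounds each wait with get(timeout=...). The helper name `map_with_deadline`, the choice to return None on timeout, and the `start_method` parameter are all assumptions made for illustration.

```python
import multiprocessing
import os


def map_with_deadline(func, items, timeout=2.0, start_method=None):
    """Like Pool.map with one worker, but give up instead of hanging.

    Returns the list of results, or None if any result fails to
    arrive within `timeout` seconds (for example because the worker
    died before it could report back).
    """
    ctx = multiprocessing.get_context(start_method)
    # Leaving the with-block terminates the pool and its workers.
    with ctx.Pool(1) as pool:
        pending = [pool.apply_async(func, (item,)) for item in items]
        try:
            return [p.get(timeout=timeout) for p in pending]
        except multiprocessing.TimeoutError:
            # A lost task and a slow task look the same from here.
            return None


if __name__ == "__main__":
    # os._exit kills the worker mid-task, as in the original repro.
    print(map_with_deadline(os._exit, [0]))
```

A timeout cannot tell a crashed worker from a slow one, and any partial results are discarded here, so this only avoids the hang; it does not recover the lost job.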

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment:
> Windows is definitely affected, and you can run the repro in my first post to check other platforms.
Oh right, I can also reproduce the issue on Linux. But I don't understand why test_multiprocessing_spawn works on all platforms, but only fails on

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment: I converted the example into attached file mp_exit.py and I added a call to faulthandler to see what is going on. Output with the master branch of Python:

    vstinner@apu$ ~/python/master/python ~/mp_exit.py
    Timeout (0:00:05)!
    Thread 0x7ff40139a700 (most

2019-09-10 Thread Steve Dower
Steve Dower added the comment: Windows is definitely affected, and you can run the repro in my first post to check other platforms.

2019-09-10 Thread STINNER Victor
STINNER Victor added the comment:
> multiprocessing cannot recover from crashed worker
This issue has been seen on the macOS job of the Azure Pipeline: bpo-37245. I don't know if other platforms are affected.

2019-09-10 Thread STINNER Victor
Change by STINNER Victor: -- nosy: +pablogsal, vstinner

2019-09-10 Thread Steve Dower
New submission from Steve Dower:

Imitation repro:

    import os
    from multiprocessing import Pool

    def f(x):
        os._exit(0)
        return "success"

    if __name__ == '__main__':
        with Pool(1) as p:
            print(p.map(f, [1]))

Obviously a process may crash for various other reasons besides