[ https://issues.apache.org/jira/browse/BEAM-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hannah Jiang updated BEAM-7873:
-------------------------------
    Description: 
Pipeline hangs at [#subprocess.Popen()] when it is shut down. I looked into the source code of the subprocess lib. [py27|https://github.com/enthought/Python-2.7.3/blob/master/Lib/subprocess.py#L1286] does not take any lock, while [py3|https://github.com/python/cpython/blob/3.7/Lib/subprocess.py#L1592] takes a lock when waiting. Py3 also added locks at other places in Popen(); any of the places left unlocked in py2 may contribute to the problem.
I think this is the root cause of the hang.
-A workaround is sleeping 0.1 or, even better, 0.5 second between each call of Popen() so it does not deadlock. I ran wordcount.py 1000 times with 2 workers, and sleeping 0.1 second worked fine.-
We can add a lock when calling Popen() to prevent the deadlock.

  was:
Pipeline hangs at [subprocess.Popen()|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/local_job_service.py#L203]] when it is shut down. I looked into the source code of the subprocess lib. [py27|https://github.com/enthought/Python-2.7.3/blob/master/Lib/subprocess.py#L1286] does not take any lock, while [py3|https://github.com/python/cpython/blob/3.7/Lib/subprocess.py#L1592] takes a lock when waiting. Py3 also added locks at other places in Popen(); any of the places left unlocked in py2 may contribute to the problem.
I think this is the root cause of the hang.
-A workaround is sleeping 0.1 or, even better, 0.5 second between each call of Popen() so it does not deadlock. I ran wordcount.py 1000 times with 2 workers, and sleeping 0.1 second worked fine.-
We can add a lock when calling Popen() to prevent the deadlock.

> FnApi with Subprocess runner hangs frequently when running with multi workers with py2
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-7873
>                 URL: https://issues.apache.org/jira/browse/BEAM-7873
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Hannah Jiang
>            Assignee: Hannah Jiang
>            Priority: Major
>             Fix For: 2.15.0
>
>
> Pipeline hangs at [#subprocess.Popen()] when it is shut down. I looked into
> the source code of the subprocess lib.
> [py27|https://github.com/enthought/Python-2.7.3/blob/master/Lib/subprocess.py#L1286]
> does not take any lock, while
> [py3|https://github.com/python/cpython/blob/3.7/Lib/subprocess.py#L1592]
> takes a lock when waiting. Py3 also added locks at other places in Popen();
> any of the places left unlocked in py2 may contribute to the problem.
> I think this is the root cause of the hang.
> -A workaround is sleeping 0.1 or, even better, 0.5 second between each call
> of Popen() so it does not deadlock. I ran wordcount.py 1000 times with 2
> workers, and sleeping 0.1 second worked fine.-
> We can add a lock when calling Popen() to prevent the deadlock.
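
The lock proposed in the description could look roughly like the sketch below. This is an illustration only, not the change that was merged for this issue: the names _popen_lock, locked_popen, run_worker and launch_workers are hypothetical, and the child command is a trivial stand-in for a real SDK worker. The idea is simply that every thread goes through one shared lock before calling subprocess.Popen(), so two workers never run the process-creation code at the same time.

```python
import subprocess
import sys
import threading

# Hypothetical module-level lock shared by every thread that spawns a worker.
# The name is illustrative and is not taken from the Beam code base.
_popen_lock = threading.Lock()


def locked_popen(*args, **kwargs):
    """Call subprocess.Popen() while holding the shared lock.

    Only process creation is serialized; the lock is released as soon as
    Popen() returns, so the children themselves still run concurrently.
    """
    with _popen_lock:
        return subprocess.Popen(*args, **kwargs)


def run_worker(worker_id, results):
    # Stand-in for starting an SDK worker: a child that prints its id.
    proc = locked_popen(
        [sys.executable, "-c", "print('worker %d')" % worker_id],
        stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    results[worker_id] = (proc.returncode, out.decode().strip())


def launch_workers(num_workers):
    """Start num_workers threads that each spawn one child process."""
    results = {}
    threads = [
        threading.Thread(target=run_worker, args=(i, results))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling launch_workers(2) mirrors the two-worker setup mentioned in the description. Unlike the struck-out sleep workaround, the lock does not depend on picking a delay that happens to be long enough.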