[ https://issues.apache.org/jira/browse/BEAM-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550027#comment-17550027 ]
Danny McCormick commented on BEAM-14080:
----------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/21597

> Portable runner does not return job exit status to client after long-running job
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-14080
>                 URL: https://issues.apache.org/jira/browse/BEAM-14080
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink, sdk-py-core
>    Affects Versions: 2.36.0
>            Reporter: Janek Bevendorff
>            Priority: P2
>
> I submit Python Beam jobs to our Flink cluster with the PortableRunner
> through a remote job server. If a job finishes within a few seconds or
> minutes, its exit status (including a dump of any Python exceptions if
> there was an error) is returned to the client upon completion.
> If, however, the job runs for longer (say, hours), the client and the job
> server seem to lose their connection. The client then hangs forever, even
> long after the actual job has completed, until I press Ctrl+C to terminate
> it (which has no effect whatsoever on the actual job).
> Example pseudo job:
> {code:python}
> import apache_beam as beam
>
> print('Job started')
> with beam.Pipeline() as pipeline:
>     # DoSomething() is a placeholder for the actual transforms
>     pipeline | DoSomething()
> print('Job finished'){code}
> If the pipeline finishes quickly, it looks like this from the client's
> perspective:
> {code:java}
> $ python3 myjob.py
> Job started
> Job finished
> $ _{code}
> If the job runs for longer, the {{with}} statement never finishes and I
> have to abort the Python script with Ctrl+C:
> {code:java}
> $ python3 myjob.py
> Job started
> ^C
> $ _{code}

--
This message was sent by Atlassian Jira (v8.20.7#820007)
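
A possible client-side mitigation for the hang described in the report, sketched under stated assumptions: rather than relying on one long-lived blocking call, the client could poll for a terminal job state under its own deadline and fail loudly if the deadline passes. Everything below is hypothetical and is not Beam API: `wait_for_terminal_state`, the `get_state` callable, and the state names are illustrative stand-ins for whatever mechanism the client uses to query the job server or the Flink cluster.

```python
import time
from typing import Callable

# Assumed state names, for illustration only; a real client would use the
# states reported by its own job service.
TERMINAL_STATES = {"DONE", "FAILED", "CANCELLED"}


def wait_for_terminal_state(
    get_state: Callable[[], str],
    timeout_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it reports a terminal state or the deadline passes.

    Returns the terminal state, or raises TimeoutError so the caller does not
    block indefinitely if the status is never delivered.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                f"job still in state {state!r} after {timeout_s} seconds"
            )
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, max(0.0, deadline - clock())))
```

A caller would pass a function that opens a fresh, short-lived request for each poll, so that a dropped long-lived connection cannot stall the client. This only works around the symptom; it does not address why the connection between client and job server is lost.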