[
https://issues.apache.org/jira/browse/BEAM-14080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Beam JIRA Bot updated BEAM-14080:
---------------------------------
Labels: stale-P2 (was: )
> Portable runner does not return job exit status to client after long-running
> job
> --------------------------------------------------------------------------------
>
> Key: BEAM-14080
> URL: https://issues.apache.org/jira/browse/BEAM-14080
> Project: Beam
> Issue Type: Bug
> Components: runner-flink, sdk-py-core
> Affects Versions: 2.36.0
> Reporter: Janek Bevendorff
> Priority: P2
> Labels: stale-P2
>
> I submit Python Beam jobs to our Flink cluster with the PortableRunner
> through a remote job server. If a job finishes within a few seconds or
> minutes, the return status (including a dump of any Python exceptions in case
> there was an error) is returned to the client upon completion.
> If, however, the job runs for longer (say, hours), the client and job
> server seem to lose their connection. The client then hangs forever, even
> long after the actual job has completed, until I press Ctrl+C to terminate
> it (which has no effect whatsoever on the actual job).
> Example pseudo job:
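> {code:python}
> import apache_beam as beam
>
> print('Job started')
> # The with block runs the pipeline on exit and waits for it to finish.
> with beam.Pipeline() as pipeline:
>     pipeline | DoSomething()  # placeholder for the actual transforms
> print('Job finished'){code}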
> If the pipeline finishes quickly, it looks like this from the client's
> perspective:
> {code:java}
> $ python3 myjob.py
> Job started
> Job finished
> $ _{code}
> If the job runs for longer, then the {{with}} statement never finishes and I
> have to abort the Python script with Ctrl+C:
> {code:java}
> $ python3 myjob.py
> Job started
> ^C
> $ _{code}
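
One possible client-side mitigation, not taken from the report, is to bound the wait instead of blocking indefinitely. The following is a minimal sketch under stated assumptions: `wait_with_deadline` and `get_state` are hypothetical names, the set of terminal states is assumed to match the state names Beam reports, and in a real client `get_state` might be something like `lambda: result.state` on the object returned by `pipeline.run()`. It only helps if querying the state does not itself block on the same stale connection.

```python
import time

# Assumption: these mirror the terminal state names reported for a Beam job.
TERMINAL_STATES = {"DONE", "FAILED", "CANCELLED", "UPDATED", "DRAINED"}


def wait_with_deadline(get_state, timeout_s, poll_s=1.0):
    """Poll get_state() until it returns a terminal state or timeout_s elapses.

    Returns the last observed state; the caller decides how to handle a
    non-terminal result (for example, log it and exit with an error code).
    """
    deadline = time.monotonic() + timeout_s
    state = get_state()
    while state not in TERMINAL_STATES and time.monotonic() < deadline:
        time.sleep(poll_s)
        state = get_state()
    return state


if __name__ == "__main__":
    # Stand-in for a real job: reports RUNNING twice, then DONE.
    fake_states = iter(["RUNNING", "RUNNING", "DONE"])
    print(wait_with_deadline(lambda: next(fake_states), timeout_s=5, poll_s=0.01))
```

With this pattern the script exits after at most roughly `timeout_s` seconds even if the final status never arrives, at the cost of having to pick a deadline longer than the expected job runtime.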