[ https://issues.apache.org/jira/browse/BEAM-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906252#comment-16906252 ]
Oded Valtzer commented on BEAM-6777:
------------------------------------

Do you know if there is a ticket for this ticket/PR somewhere?

> SDK Harness Resilience
> ----------------------
>
>                 Key: BEAM-6777
>                 URL: https://issues.apache.org/jira/browse/BEAM-6777
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Sam Rohde
>            Assignee: Yueyang Qiu
>            Priority: Major
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> If the Python SDK Harness crashes in any way (user code exception, OOM, etc)
> the job will hang and waste resources. The fix is to add a daemon in the SDK
> Harness and Runner Harness to communicate with Dataflow to restart the VM
> when stuckness is detected.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)