[
https://issues.apache.org/jira/browse/BEAM-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sam Whittle updated BEAM-11400:
-------------------------------
Fix Version/s: 2.27.0
Resolution: Fixed
Status: Resolved (was: Open)
> StreamingDataflowWorker stuck commits logic triggers exceptions if commits
> eventually complete
> ----------------------------------------------------------------------------------------------
>
> Key: BEAM-11400
> URL: https://issues.apache.org/jira/browse/BEAM-11400
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: P2
> Fix For: 2.27.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Commits that do not complete within a timeout are treated as stuck, cancelled,
> and considered lost. This shows up in the logs as:
> Detected key with sharding key -6893288510319386341 stuck in COMMITTING
> state, completing it with error.
> However, if the commit was not lost but merely very slow, the following error
> occurs when it eventually completes:
> Exception while processing commit response {}
> "java.lang.NullPointerException
>   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$ComputationState.completeWork(StreamingDataflowWorker.java:2246)
> The error occurs on the commit stream, which finishes processing the current
> batch of responses and then throws. The stream therefore completes with an
> error, and all of its other pending commits are resent. If there were a large
> number of commits on the stream, progress is slow: only a couple complete
> before everything is retried. This slowdown can cause further commits to
> exceed the timeout, creating a feedback loop.
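
The failure mode described above can be illustrated with a minimal sketch. This is not the actual change made for this issue, and `CommitTracker`, `startCommit`, `failStuckCommit`, and the map layout are hypothetical names introduced only for illustration. The assumption is that a late commit response for work already failed as stuck can be skipped rather than asserted on, so a single slow commit no longer fails the whole stream.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for per-computation commit tracking (not Beam's code). */
final class CommitTracker {
  // Sharding key -> work token of the commit currently in flight for that key.
  private final Map<Long, Long> activeWork = new HashMap<>();

  /** Records that a commit for the given key is now in flight. */
  synchronized void startCommit(long shardingKey, long workToken) {
    activeWork.put(shardingKey, workToken);
  }

  /** Called when a commit exceeds the timeout: forget it so the key is no longer blocked. */
  synchronized void failStuckCommit(long shardingKey) {
    activeWork.remove(shardingKey);
  }

  /**
   * Handles a commit response. Returns false, instead of throwing, when the
   * commit is no longer tracked (for example, it was already failed as stuck),
   * so the caller can skip the late response and keep the stream open.
   */
  synchronized boolean completeWork(long shardingKey, long workToken) {
    Long active = activeWork.get(shardingKey);
    if (active == null || active != workToken) {
      return false; // Late or unknown response: skip rather than fail the stream.
    }
    activeWork.remove(shardingKey);
    return true;
  }
}
```

With this shape, a response that arrives after `failStuckCommit` yields `false` from `completeWork`, and the caller can log it and continue with the remaining responses in the batch.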