[
https://issues.apache.org/jira/browse/BEAM-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593582#comment-15593582
]
Eugene Kirpichov commented on BEAM-787:
---------------------------------------
I think I found the bug:
https://github.com/apache/incubator-beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java#L268
- handleResult() applies state updates, including updating the watermark.
And *only after that* at lines 269 and onwards, we commit output bundles.
If the watermark refreshes itself in the meantime, output elements may be
rejected by LateDataDroppingDoFnRunner.
The sad part is, I'm not sure how to fix this. Ideally we should apply state
updates atomically with output bundles instead. Will look tomorrow.
> org.apache.beam.runners.direct.SplittableDoFnTest.testPairWithIndexBasic
> flaking
> --------------------------------------------------------------------------------
>
> Key: BEAM-787
> URL: https://issues.apache.org/jira/browse/BEAM-787
> Project: Beam
> Issue Type: Bug
> Components: testing
> Affects Versions: Not applicable
> Reporter: Daniel Halperin
> Assignee: Eugene Kirpichov
> Priority: Critical
>
> CC [~tgroh]
> {code}
> Expected: iterable over [<KV{a, 0}>, <KV{bb, 0}>, <KV{bb, 1}>, <KV{ccccc,
> 0}>, <KV{ccccc, 1}>, <KV{ccccc, 2}>, <KV{ccccc, 3}>, <KV{ccccc, 4}>] in any
> order
> but: No item matches: <KV{ccccc, 4}> in [<KV{ccccc, 2}>, <KV{ccccc, 3}>,
> <KV{ccccc, 1}>, <KV{bb, 1}>, <KV{a, 0}>, <KV{bb, 0}>, <KV{ccccc, 0}>]
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)