[
https://issues.apache.org/jira/browse/BEAM-11833?focusedWorklogId=555692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-555692
]
ASF GitHub Bot logged work on BEAM-11833:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Feb/21 08:56
Start Date: 22/Feb/21 08:56
Worklog Time Spent: 10m
Work Description: je-ik commented on pull request #14013:
URL: https://github.com/apache/beam/pull/14013#issuecomment-783209882
> The reason why we want to call
> `watermarkEstimator.setWatermark(currentRestriction.getWatermark());` when
> `tryClaim()` returns `false` is for tracking the watermark when returning
> `ProcessContinuation.resume()`. This can happen when there are no output
> records from the reader and we want to read again later.
That seems incorrect. When tryClaim returns `false`, it is part of the
contract to return `ProcessContinuation.done()`:
https://github.com/apache/beam/blob/aaad864c9acb22e35050f974a7ac74fb7638f085/runners/core-java/src/main/java/org/apache/beam/runners/core/OutputAndTimeBoundedSplittableProcessElementInvoker.java#L221
When the reader returns `false`, we do not fail the claim; instead we go through
the `out[0] == null` branch in `processElement`.
I think there should be no reason to enforce:
a) returning ProcessContinuation.done(), and
b) manually setting the watermark to BoundedWindow.TIMESTAMP_MAX_VALUE
because that seems redundant.
You are right that the watermark is read *before* the call to `trySplit` (I
overlooked that). That probably means we *must* set the watermark both *before*
and *after* the `tryClaim` loop.
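The distinction drawn above between a failed claim and a successful claim that yields no record can be illustrated with a small, self-contained sketch. This is a toy model under stated assumptions, not Beam code: `ClaimLoopSketch`, `ToyTracker`, `Continuation` and the `splitAfterClaims` parameter are hypothetical names invented for illustration, and the watermark handling discussed in this thread is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: none of these types are Beam classes. */
final class ClaimLoopSketch {

  /** Hypothetical stand-in for the two continuation outcomes named in the comment. */
  enum Continuation { DONE, RESUME }

  /** Hypothetical tracker over a fixed list of records. */
  static final class ToyTracker {
    private final List<String> records;
    private final int splitAfterClaims; // claims fail once this many records were claimed
    private int position = 0;

    ToyTracker(List<String> records, int splitAfterClaims) {
      this.records = records;
      this.splitAfterClaims = splitAfterClaims;
    }

    /**
     * Returns false only when the restriction has been split away. If the reader
     * simply has nothing to return, the claim succeeds and out[0] is set to null.
     */
    boolean tryClaim(String[] out) {
      if (position >= splitAfterClaims) {
        return false;
      }
      out[0] = position < records.size() ? records.get(position++) : null;
      return true;
    }
  }

  /** Runs the claim loop and reports how the caller should continue. */
  static Continuation process(ToyTracker tracker, List<String> output) {
    String[] out = new String[1];
    boolean claimed;
    while ((claimed = tracker.tryClaim(out)) && out[0] != null) {
      output.add(out[0]);
    }
    // Failed claim: finish. Successful claim with no record: come back later.
    return claimed ? Continuation.RESUME : Continuation.DONE;
  }

  public static void main(String[] args) {
    List<String> first = new ArrayList<>();
    Continuation a = process(new ToyTracker(Arrays.asList("a", "b"), Integer.MAX_VALUE), first);
    System.out.println(a + " " + first); // RESUME [a, b]

    List<String> second = new ArrayList<>();
    Continuation b = process(new ToyTracker(Arrays.asList("a", "b"), 1), second);
    System.out.println(b + " " + second); // DONE [a]
  }
}
```

In the first run the reader is exhausted without a split, so the last claim succeeds with `out[0] == null` and the loop asks to be resumed. In the second run the simulated split makes the claim itself fail, which is the case where the contract requires finishing.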
Issue Time Tracking
-------------------
Worklog Id: (was: 555692)
Time Spent: 1.5h (was: 1h 20m)
> UnboundedSourceAsSDFRestrictionTracker reports incorrect watermark after
> failed claim
> -------------------------------------------------------------------------------------
>
> Key: BEAM-11833
> URL: https://issues.apache.org/jira/browse/BEAM-11833
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.28.0
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: P1
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> After being split by a call to {{trySplit}}, the watermark reported by
> {{UnboundedSourceAsSDFRestrictionTracker.currentRestriction().getWatermark()}}
> is {{BoundedWindow.TIMESTAMP_MAX_VALUE}}, which is incorrect.