[
https://issues.apache.org/jira/browse/BEAM-11366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266641#comment-17266641
]
Beam JIRA Bot commented on BEAM-11366:
--------------------------------------
This issue is assigned but has not received an update in 30 days so it has been
labeled "stale-assigned". If you are still working on the issue, please give an
update and remove the label. If you are no longer working on the issue, please
unassign so someone else may work on it. In 7 days the issue will be
automatically unassigned.
> Dataflow and KafkaIO or KinesisIO has erratic watermark progress due to
> ReaderCache timeouts
> --------------------------------------------------------------------------------------------
>
> Key: BEAM-11366
> URL: https://issues.apache.org/jira/browse/BEAM-11366
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: P2
> Labels: stale-assigned
>
> The ReaderCache has a default timeout of 1 minute. If the source is not
> polled within that time, the UnboundedReader is destroyed and will be
> reinitilized from the checkpoint when next polled. This is too short for runs
> where the source is not polled for a while due to other keys to process. In
> particular this seems to affect Streaming engine pipelines with backlog.
> From an initial look, recovery from the Kafka UnboundedReader checkpoint does
> not seem to use the previous watermark for initializing the timestamp policy
> and perhaps the timestamp policy will return unknown watermark if it has not
> yet observed a record.
> So possible fixes would be:
> - make cache timeout configurable, or maybe dynamic, and increase it
> - ensure that KafkaIO and KinesisIO recovery from checkpoints initializes the
> watermark properly so that cache eviction does not matter for watermark
> smoothness
> - ensure that cache eviction happens in the background instead of blocking
> threads acquiring readers which might not expect blocking
--
This message was sent by Atlassian Jira
(v8.3.4#803005)