[
https://issues.apache.org/jira/browse/HUDI-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-4615:
-----------------------------
Reviewers: sivabalan narayanan
> Fix empty commits being made by deltastreamer with S3EventsSource when there
> is no data in SQS on starting a new pipeline
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: HUDI-4615
> URL: https://issues.apache.org/jira/browse/HUDI-4615
> Project: Apache Hudi
> Issue Type: Bug
> Components: deltastreamer
> Reporter: sivabalan narayanan
> Assignee: Vinish Reddy
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.12.1
>
>
> When a new deltastreamer is started with S3EventsSource, the checkpoint is
> Option.empty(). If the source finds no data on consumption, it still
> returns "val=0" as the checkpoint. Deltastreamer therefore assumes the
> checkpoint has changed and makes an empty commit. This needs fixing.
>
> [https://github.com/apache/hudi/blob/0d0a4152cfd362185066519ae926ac4513c7a152/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/S3EventsMetaSelector.java#L151]
>
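
The empty-commit behaviour described above can be modelled in a few lines. This is only a sketch under stated assumptions, not the Hudi code or the actual patch: it uses java.util.Optional in place of Hudi's Option, treats the checkpoint as a plain numeric string, and every class and method name here (EmptyPollCheckpointSketch, reportedNextCheckpoint, guardedNextCheckpoint, checkpointChanged) is hypothetical. One possible guard is to hand back the incoming checkpoint untouched when nothing was read from the queue.

```java
import java.util.List;
import java.util.Objects;
import java.util.Optional;

public class EmptyPollCheckpointSketch {

    // Simplified model of the reported behaviour (hypothetical, not Hudi code):
    // an absent checkpoint is defaulted to 0 and always returned as a value.
    static Optional<String> reportedNextCheckpoint(Optional<String> last, List<Long> eventTimes) {
        long newest = last.map(Long::parseLong).orElse(0L);
        for (long t : eventTimes) {
            newest = Math.max(newest, t);
        }
        return Optional.of(String.valueOf(newest));
    }

    // Guarded variant (an assumption about one possible fix): when no events
    // were read, return the incoming checkpoint unchanged, even if it is empty.
    static Optional<String> guardedNextCheckpoint(Optional<String> last, List<Long> eventTimes) {
        if (eventTimes.isEmpty()) {
            return last;
        }
        return reportedNextCheckpoint(last, eventTimes);
    }

    // Stand-in for the "has the checkpoint changed?" decision that leads to a commit.
    static boolean checkpointChanged(Optional<String> before, Optional<String> after) {
        return !Objects.equals(before, after);
    }

    public static void main(String[] args) {
        Optional<String> fresh = Optional.empty();   // brand-new pipeline
        List<Long> noEvents = List.<Long>of();       // nothing in the queue

        // Reported behaviour: empty -> "0" looks like a change, so a commit follows.
        System.out.println(checkpointChanged(fresh, reportedNextCheckpoint(fresh, noEvents)));
        // Guarded behaviour: empty -> empty, so no commit is triggered.
        System.out.println(checkpointChanged(fresh, guardedNextCheckpoint(fresh, noEvents)));
    }
}
```

Running main prints true for the reported behaviour and false for the guarded one. When events are present, both variants advance the checkpoint in the same way, so the guard only affects the empty-poll case.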