[
https://issues.apache.org/jira/browse/HUDI-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas Jiang updated HUDI-6317:
---------------------------------
Summary: Streaming read should skip compaction and clustering instants to
avoid duplicates (was: Streaming read should skip clustering instants to avoid
duplicated reading)
> Streaming read should skip compaction and clustering instants to avoid
> duplicates
> ---------------------------------------------------------------------------------
>
> Key: HUDI-6317
> URL: https://issues.apache.org/jira/browse/HUDI-6317
> Project: Apache Hudi
> Issue Type: Bug
> Components: flink
> Reporter: Nicholas Jiang
> Assignee: Nicholas Jiang
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.0
>
>
> Currently, read.streaming.skip_clustering defaults to false, so a streaming
> read can pick up the file slices replaced by clustering. For example, when
> the data of day T-1 is clustered, the streaming read may read the T-1 data
> again, which produces duplicates. Streaming read should therefore skip
> clustering instants in all cases to avoid reading the replaced file slices.
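
Until the default changes, a reader can set the skip options explicitly on the
table definition. The following is only a minimal sketch: it assembles a Flink
SQL DDL string in Python so the option keys are visible in one place. The table
name, columns, and path are hypothetical, and build_ddl is an illustrative
helper, not part of Hudi or Flink. The option keys
read.streaming.skip_clustering and read.streaming.skip_compaction are the ones
this issue's description and summary refer to.

```python
# Sketch only: the table name, schema, and path below are hypothetical.
STREAMING_READ_OPTIONS = {
    "connector": "hudi",
    "path": "file:///tmp/hudi/demo_table",     # hypothetical location
    "table.type": "MERGE_ON_READ",
    "read.streaming.enabled": "true",
    # Skip instants that only rewrite existing data, so the streaming
    # read does not emit the same records a second time.
    "read.streaming.skip_compaction": "true",
    "read.streaming.skip_clustering": "true",
}


def build_ddl(table, columns, primary_key, options):
    """Render a CREATE TABLE statement from column and option mappings."""
    lines = [f"{name} {sql_type}" for name, sql_type in columns]
    lines.append(f"PRIMARY KEY ({primary_key}) NOT ENFORCED")
    cols = ",\n  ".join(lines)
    opts = ",\n  ".join(f"'{key}' = '{value}'" for key, value in options.items())
    return f"CREATE TABLE {table} (\n  {cols}\n) WITH (\n  {opts}\n)"


ddl = build_ddl(
    "demo_table",
    [("id", "BIGINT"), ("name", "STRING"), ("ts", "TIMESTAMP(3)")],
    "id",
    STREAMING_READ_OPTIONS,
)
print(ddl)
```

The resulting string could be passed to a Flink SQL client or to a table
environment's SQL execution call. Whether both options are needed depends on
the table type and the services enabled on it.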
--
This message was sent by Atlassian Jira
(v8.20.10#820010)