hd zhou created HUDI-3644:
-----------------------------
Summary: hoodie log scan bug cause data duplication
Key: HUDI-3644
URL: https://issues.apache.org/jira/browse/HUDI-3644
Project: Apache Hudi
Issue Type: Bug
Reporter: hd zhou
AbstractHoodieLogRecordReader
{code:java}
if (!completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
    || inflightInstantsTimeline.containsInstant(instantTime)) {
  // hit an uncommitted block possibly from a failed write, move to the next
  // one and skip processing this one
  continue;
}
{code}
When completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
returns true, the log block is merged. This is not always correct.
Consider a log file block that is appended successfully, after which its
deltacommit is rolled back. If that instant time is not before the start of the
active timeline, the log block is still merged, which causes data duplication.
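
To make the check easier to reason about, here is a minimal, self-contained sketch of the condition above. ToyTimeline and LogBlockFilterSketch are hypothetical stand-ins, not Hudi classes, and the semantics given to containsOrBeforeTimelineStarts (instant present, or sorting before the earliest instant in the timeline) are an assumption based on the method name. The instant times are made up for illustration.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical toy stand-in for a Hudi timeline; NOT the real Hudi class.
final class ToyTimeline {
    private final TreeSet<String> instants;

    ToyTimeline(List<String> instantTimes) {
        this.instants = new TreeSet<>(instantTimes);
    }

    boolean containsInstant(String instantTime) {
        return instants.contains(instantTime);
    }

    // Assumed semantics: true if the instant is present, or if it sorts
    // before the earliest instant still held by this timeline.
    boolean containsOrBeforeTimelineStarts(String instantTime) {
        return instants.contains(instantTime)
            || (!instants.isEmpty() && instantTime.compareTo(instants.first()) < 0);
    }
}

final class LogBlockFilterSketch {
    // Mirrors the condition quoted from AbstractHoodieLogRecordReader:
    // returns true when the log block would be skipped, false when merged.
    static boolean shouldSkip(ToyTimeline completed, ToyTimeline inflight, String instantTime) {
        return !completed.containsOrBeforeTimelineStarts(instantTime)
            || inflight.containsInstant(instantTime);
    }

    public static void main(String[] args) {
        // Made-up instant times: 003 and 005 completed, 006 still inflight.
        ToyTimeline completed = new ToyTimeline(List.of("003", "005"));
        ToyTimeline inflight = new ToyTimeline(List.of("006"));

        // 001 and 004 stand for rolled-back deltacommits: they appear in
        // neither timeline, but their log blocks are still on disk.
        for (String instantTime : List.of("001", "003", "004", "006")) {
            System.out.println(instantTime + " skipped: "
                + shouldSkip(completed, inflight, instantTime));
        }
    }
}
```

In this toy model, the block for rolled-back instant 004 is skipped, while the block for rolled-back instant 001 is merged because 001 sorts before the first completed instant. Whether the real reader behaves the same way depends on the actual timeline implementation and on how the rollback is recorded.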
--
This message was sent by Atlassian Jira
(v8.20.1#820001)