[ https://issues.apache.org/jira/browse/HUDI-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17225017#comment-17225017 ]
Vinoth Chandar commented on HUDI-1323:
--------------------------------------
{code:java}
if (r.getBlockType() != CORRUPT_BLOCK
    && !HoodieTimeline.compareTimestamps(r.getLogBlockHeader().get(INSTANT_TIME),
        HoodieTimeline.LESSER_THAN_OR_EQUALS, this.latestInstantTime)) {
  // hit a block with instant time greater than should be processed, stop processing further
  break;
}
{code}
The log scanner already stops at the provided `latestInstantTime`, so it may be
sufficient to pass the correct value in to the metadata reader.
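
To make the fencing behaviour concrete, here is a minimal standalone sketch of the same stop condition. It is not Hudi code: `InstantFenceSketch`, `Block` and `scanUpTo` are hypothetical names, and it assumes instant times are fixed-width timestamp strings that order lexicographically. Corrupt blocks are simply skipped here, which is a simplification of what a real scanner would do.

```java
import java.util.ArrayList;
import java.util.List;

public class InstantFenceSketch {

    /** Hypothetical stand-in for a log block: an instant time plus a corruption flag. */
    static final class Block {
        final String instantTime;
        final boolean corrupt;

        Block(String instantTime, boolean corrupt) {
            this.instantTime = instantTime;
            this.corrupt = corrupt;
        }
    }

    /**
     * Walks the blocks in order and returns the instant times of the blocks
     * that fall at or before latestInstantTime. Scanning stops at the first
     * non-corrupt block whose instant time is greater than the fence.
     */
    static List<String> scanUpTo(List<Block> blocks, String latestInstantTime) {
        List<String> processed = new ArrayList<>();
        for (Block b : blocks) {
            if (b.corrupt) {
                // Assumption for this sketch: a corrupt block has no usable
                // header, so it is skipped without checking its instant time.
                continue;
            }
            if (b.instantTime.compareTo(latestInstantTime) > 0) {
                // Block is newer than the fence: stop processing further.
                break;
            }
            processed.add(b.instantTime);
        }
        return processed;
    }
}
```

With this shape, the caller decides the fence: if the metadata reader is handed the latest completed instant from the data timeline, blocks written by later (possibly uncommitted) instants are never read.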
> Fence metadata reads using latest data timeline commit times!
> -------------------------------------------------------------
>
> Key: HUDI-1323
> URL: https://issues.apache.org/jira/browse/HUDI-1323
> Project: Apache Hudi
> Issue Type: Sub-task
> Reporter: Prashant Wason
> Assignee: Vinoth Chandar
> Priority: Major
>
> Problem D: We need to fence metadata reads using the latest data timeline commit
> times, and limit them to handing out only files that belong to a committed instant
> on the data timeline. Otherwise, the metadata table can hand uncommitted files to
> the cleaner etc. and cause us to delete legitimate latest file slices, i.e. data loss.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)