[
https://issues.apache.org/jira/browse/HUDI-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Danny Chen closed HUDI-5517.
----------------------------
Fix Version/s: 0.13.1, 0.14.0
Resolution: Fixed
Fixed via master branch: 77039ae734aead741e8f528637342a2538b4c456
> HoodieTimeline support filter instants by state transition time
> ---------------------------------------------------------------
>
> Key: HUDI-5517
> URL: https://issues.apache.org/jira/browse/HUDI-5517
> Project: Apache Hudi
> Issue Type: New Feature
> Components: core, incremental-query
> Reporter: Hui An
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.13.1, 0.14.0
>
>
> The Hudi timeline can miss instants when we pull incrementally from an
> upstream Hudi table that is written by several writers.
> For example, say two writers, w1 and w2, are writing to the Hudi table, and
> the end timestamp of the last successful incremental pull is 001.
> w1 is writing 002 and w2 is writing 003. If w2 finishes before w1, the
> incremental pull end timestamp is updated to 003, and w1's commit 002 is
> skipped because its instant time is earlier than w2's.
> We need to use the commit end time (state transition time) to filter
> commits when pulling incrementally. Because w2's state transition time is
> earlier than w1's, w1's commit, which completes later, is not skipped.
> This relates to HUDI-1623, but rather than adding the end time to the end
> of each commit, it uses `FileStatus.getModificationTime` to represent the
> end time.
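
The scenario in the description can be reproduced with a small simulation. This is only an illustrative sketch, not Hudi code: the `Instant` class, both `pull_by_*` helpers, and the integer completion times (100, 200, 300) are hypothetical stand-ins, with `transition_time` playing the role of the file modification time mentioned in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    instant_time: str     # assigned when the write starts, e.g. "002"
    transition_time: int  # hypothetical completion time; stands in for the file modification time


def pull_by_instant_time(completed, checkpoint):
    """Return completed instants whose instant time is after the checkpoint."""
    hits = [i for i in completed if i.instant_time > checkpoint]
    return sorted(hits, key=lambda i: i.instant_time)


def pull_by_transition_time(completed, checkpoint):
    """Return completed instants whose completion time is after the checkpoint."""
    hits = [i for i in completed if i.transition_time > checkpoint]
    return sorted(hits, key=lambda i: i.transition_time)


w1 = Instant("002", transition_time=300)  # started first, finished last
w2 = Instant("003", transition_time=200)  # started last, finished first

# First pull: only w2 has completed. The previous checkpoint is instant 001,
# assumed here to have completed at time 100.
first_by_instant = pull_by_instant_time([w2], checkpoint="001")
first_by_transition = pull_by_transition_time([w2], checkpoint=100)

# Second pull: w1 has completed too. Each checkpoint advances to the last
# instant consumed by the first pull.
second_by_instant = pull_by_instant_time(
    [w1, w2], checkpoint=first_by_instant[-1].instant_time)
second_by_transition = pull_by_transition_time(
    [w1, w2], checkpoint=first_by_transition[-1].transition_time)

print([i.instant_time for i in second_by_instant])     # [] : commit 002 is missed
print([i.instant_time for i in second_by_transition])  # ['002'] : commit 002 is picked up
```

In this sketch, filtering by instant time never returns 002 once the checkpoint has advanced to 003, while filtering by completion time returns it on the next pull.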
--
This message was sent by Atlassian Jira
(v8.20.10#820010)