[
https://issues.apache.org/jira/browse/HUDI-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
satish updated HUDI-689:
------------------------
Description:
was:
example timeline:
t0 -> create bucket1.parquet
t1 -> create bucket1.log and append updates to it
t2 -> request compaction
t3 -> create bucket2.parquet
If the compaction requested at t2 takes a long time, incremental reads using
HoodieParquetInputFormat can skip the data ingested at t1, leading to apparent
'data loss'. (The data is still on disk, but incremental readers won't see it
because it's in a log file and the readers have moved on to t3.)
To work around this problem, we want to stop returning data belonging to commits
later than t1. After the compaction is complete, incremental readers would see
the updates in t2, t3, and so on.
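
The workaround described above could be sketched roughly as follows. This is
only an illustration of the idea, not Hudi code: the Instant class, its fields,
and both helper functions are hypothetical names, and integer timestamps stand
in for real commit times. The assumption is that the reader treats the earliest
pending compaction as a barrier and does not return instants at or beyond it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline entry (not a Hudi class)."""
    timestamp: int   # 0 for t0, 1 for t1, ...
    action: str      # "commit", "deltacommit" or "compaction"
    completed: bool


def earliest_pending_compaction(timeline: List[Instant]) -> Optional[int]:
    """Return the timestamp of the oldest compaction that has not finished."""
    pending = [i.timestamp for i in timeline
               if i.action == "compaction" and not i.completed]
    return min(pending) if pending else None


def visible_instants(timeline: List[Instant], last_read: int) -> List[Instant]:
    """Completed instants newer than last_read, stopping before the barrier."""
    barrier = earliest_pending_compaction(timeline)
    return [i for i in timeline
            if i.completed
            and i.timestamp > last_read
            and (barrier is None or i.timestamp < barrier)]


# Timeline from the example: compaction requested at t2 is still pending.
timeline = [
    Instant(0, "commit", True),        # t0: bucket1.parquet
    Instant(1, "deltacommit", True),   # t1: bucket1.log
    Instant(2, "compaction", False),   # t2: compaction requested
    Instant(3, "commit", True),        # t3: bucket2.parquet
]

# While the compaction is pending, nothing later than t1 is returned.
during = [i.timestamp for i in visible_instants(timeline, last_read=0)]

# Once the compaction completes, t2 and t3 become visible as well.
done = [Instant(2, "compaction", True) if i.timestamp == 2 else i
        for i in timeline]
after = [i.timestamp for i in visible_instants(done, last_read=0)]
```

With this filter the reader's checkpoint cannot advance past t1 while the
compaction is pending, so it picks up t2 and t3 on a later poll instead of
jumping straight to t3.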
> Fix hudi cli commands with overlapping words
> --------------------------------------------
>
> Key: HUDI-689
> URL: https://issues.apache.org/jira/browse/HUDI-689
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Reporter: satish
> Assignee: satish
> Priority: Critical
> Labels: pull-request-available
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)