steven zhang created HUDI-1428:
----------------------------------
Summary: Clean old fileslice is invalid
Key: HUDI-1428
URL: https://issues.apache.org/jira/browse/HUDI-1428
Project: Apache Hudi
Issue Type: Bug
Components: Cleaner
Affects Versions: 0.6.0, 0.7.0
Reporter: steven zhang
Fix For: 0.7.0
Steps to reproduce:
Use a MERGE_ON_READ table and set hoodie.cleaner.commits.retained=2.
Insert into the table three times, writing to the same partition each time.
This creates the following parquet files:
a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet
a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet
a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet
a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet
The file with instant time 20201201200149 is the newest.
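To make the listing easier to read, here is a minimal sketch that splits each name into its parts, assuming the usual Hudi base file layout of <fileId>_<writeToken>_<instantTime>.parquet. The class and method names (BaseFileNameSketch, parse, sortedByInstant) are hypothetical and are not part of Hudi.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class BaseFileNameSketch {
    // Hypothetical holder for the three parts of a base file name.
    static final class BaseFile {
        final String fileId;
        final String writeToken;
        final String instantTime;

        BaseFile(String fileId, String writeToken, String instantTime) {
            this.fileId = fileId;
            this.writeToken = writeToken;
            this.instantTime = instantTime;
        }
    }

    // The four file names reported in this issue.
    static final List<String> FILES = Arrays.asList(
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet",
        "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet");

    // Parses "<fileId>_<writeToken>_<instantTime>.parquet" (assumed layout).
    static BaseFile parse(String name) {
        String stem = name.substring(0, name.lastIndexOf('.'));
        String[] parts = stem.split("_");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Unexpected file name: " + name);
        }
        return new BaseFile(parts[0], parts[1], parts[2]);
    }

    // Returns the parsed files ordered oldest to newest. Instant times are
    // fixed-width digit strings, so string order matches time order.
    static List<BaseFile> sortedByInstant(List<String> names) {
        List<BaseFile> out = new ArrayList<>();
        for (String n : names) {
            out.add(parse(n));
        }
        out.sort(Comparator.comparing((BaseFile f) -> f.instantTime));
        return out;
    }

    public static void main(String[] args) {
        for (BaseFile f : sortedByInstant(FILES)) {
            System.out.println(f.instantTime + " " + f.fileId + " " + f.writeToken);
        }
    }
}
```

All four names share the same file ID and differ only in write token and instant time, so they are successive versions of one file group.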
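The old parquet files are not deleted when client.clean() is called.
The reason is that CleanPlanner initializes commitTimeline with
hoodieTable.getCompletedCommitTimeline(),
and getCompletedCommitTimeline() does not include DELTA_COMMIT_ACTION instants, so the delta commits written to a MERGE_ON_READ table are not considered by the cleaner.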
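The effect of the action filter can be illustrated with a small, self-contained sketch. This is not Hudi code: the Instant class and the filterByActions helper are hypothetical stand-ins, and the sample timeline assumes that all four instants above were written as delta commits. Only the action names "commit" and "deltacommit" are taken from Hudi's timeline constants.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class TimelineFilterSketch {
    // Action names as used by Hudi's timeline constants.
    static final String COMMIT_ACTION = "commit";
    static final String DELTA_COMMIT_ACTION = "deltacommit";

    // Hypothetical stand-in for a completed instant on the timeline.
    static final class Instant {
        final String timestamp;
        final String action;

        Instant(String timestamp, String action) {
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    // Assumption: every write in the reproduction was a delta commit.
    static List<Instant> sampleTimeline() {
        return Arrays.asList(
            new Instant("20201201193835", DELTA_COMMIT_ACTION),
            new Instant("20201201195219", DELTA_COMMIT_ACTION),
            new Instant("20201201195638", DELTA_COMMIT_ACTION),
            new Instant("20201201200149", DELTA_COMMIT_ACTION));
    }

    // Keeps only the timestamps of instants whose action is in the given set.
    static List<String> filterByActions(List<Instant> instants, Set<String> actions) {
        return instants.stream()
            .filter(i -> actions.contains(i.action))
            .map(i -> i.timestamp)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> timeline = sampleTimeline();

        // A timeline restricted to "commit" sees none of the delta commits.
        Set<String> commitOnly = new HashSet<>(Arrays.asList(COMMIT_ACTION));
        System.out.println("commit only: " + filterByActions(timeline, commitOnly));

        // Including "deltacommit" makes all four instants visible.
        Set<String> withDelta =
            new HashSet<>(Arrays.asList(COMMIT_ACTION, DELTA_COMMIT_ACTION));
        System.out.println("commit + deltacommit: " + filterByActions(timeline, withDelta));
    }
}
```

Under these assumptions the commit-only view is empty, which would leave a planner with no retained commits to compare against and therefore nothing it considers old enough to clean.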