steven zhang created HUDI-1428:
----------------------------------

             Summary:  Clean old fileslice is invalid 
                 Key: HUDI-1428
                 URL: https://issues.apache.org/jira/browse/HUDI-1428
             Project: Apache Hudi
          Issue Type: Bug
          Components: Cleaner
    Affects Versions: 0.6.0, 0.7.0
            Reporter: steven zhang
             Fix For: 0.7.0


Steps to reproduce:

Use a MERGE_ON_READ table and set hoodie.cleaner.commits.retained=2.

Insert into the table three times, writing to the same partition each time.

This creates the following parquet files:

a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet

a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet

a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet

a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet

The file with commit time 20201201200149 is the newest.
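
To make the listing above easier to read, here is a minimal sketch that pulls the file id and commit time out of each name and picks the newest file. It assumes the base file name layout <fileId>_<writeToken>_<instantTime>.parquet; BaseFileNames and its methods are hypothetical helpers written for this illustration and are not part of Hudi.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative only: BaseFileNames is a hypothetical helper, not a Hudi class.
class BaseFileNames {
    // Assumes the layout <fileId>_<writeToken>_<instantTime>.parquet.
    static String fileId(String fileName) {
        return fileName.substring(0, fileName.indexOf('_'));
    }

    // The commit time is the last underscore-separated field before the extension.
    static String instantTime(String fileName) {
        String stem = fileName.substring(0, fileName.lastIndexOf('.'));
        return stem.substring(stem.lastIndexOf('_') + 1);
    }

    // Commit times are fixed-width yyyyMMddHHmmss strings, so string order is time order.
    static String newest(List<String> fileNames) {
        return fileNames.stream()
                .max(Comparator.comparing(BaseFileNames::instantTime))
                .orElseThrow(() -> new IllegalArgumentException("no files"));
    }

    public static void main(String[] args) {
        List<String> files = List.of(
                "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-213-1740_20201201200149.parquet",
                "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-176-1702_20201201195638.parquet",
                "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-139-1664_20201201195219.parquet",
                "a2c57b73-5ba9-4744-9027-640075a179ec-0_0-103-1632_20201201193835.parquet");
        for (String f : files) {
            System.out.println(fileId(f) + " @ " + instantTime(f));
        }
        System.out.println("newest: " + newest(files));
    }
}
```

All four names share the same file id, so they are successive versions of one file group, which is why the older versions would be expected to become cleaning candidates.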

The old parquet files are not deleted when client.clean() is called.

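The reason is that CleanPlanner initializes its commitTimeline with
hoodieTable.getCompletedCommitTimeline(), and getCompletedCommitTimeline()
does not include DELTA_COMMIT_ACTION, so the delta commits written to a
MERGE_ON_READ table are not visible to the cleaner.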
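
The effect of the missing action can be shown with a small stand-alone model. This is a sketch, not Hudi code: TimelineSketch, Instant and completedTimestamps are hypothetical names, the action strings "commit" and "deltacommit" are assumed values, and it assumes every write in the reproduction was recorded as a delta commit.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative model only; none of these types are Hudi classes.
class TimelineSketch {
    // Assumed action names, used here only to label the instants.
    static final String COMMIT_ACTION = "commit";
    static final String DELTA_COMMIT_ACTION = "deltacommit";

    static final class Instant {
        final String timestamp;
        final String action;

        Instant(String timestamp, String action) {
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    // Returns the timestamps of the instants whose action is in the allowed set.
    static List<String> completedTimestamps(List<Instant> instants, Set<String> actions) {
        return instants.stream()
                .filter(i -> actions.contains(i.action))
                .map(i -> i.timestamp)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumption: each write in the reproduction produced a delta commit.
        List<Instant> instants = List.of(
                new Instant("20201201193835", DELTA_COMMIT_ACTION),
                new Instant("20201201195219", DELTA_COMMIT_ACTION),
                new Instant("20201201195638", DELTA_COMMIT_ACTION),
                new Instant("20201201200149", DELTA_COMMIT_ACTION));

        // A timeline filtered on commits only sees nothing to clean.
        System.out.println(completedTimestamps(instants, Set.of(COMMIT_ACTION)));

        // Including delta commits exposes all four instants to the retention check.
        System.out.println(completedTimestamps(instants,
                Set.of(COMMIT_ACTION, DELTA_COMMIT_ACTION)));
    }
}
```

Under these assumptions a planner that only looks at the commit-only view never finds more than the retained number of commits, which would match the behaviour described above.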



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
