HunterHunter created HUDI-4271:
----------------------------------

             Summary: NoSuchElementException: FileID xx of partition path 
xx does not exist. is thrown by HoodieMergeHandle.getLatestBaseFile even though 
the FileID exists in the path. 
                 Key: HUDI-4271
                 URL: https://issues.apache.org/jira/browse/HUDI-4271
             Project: Apache Hudi
          Issue Type: Bug
          Components: flink
            Reporter: HunterHunter


**While debugging, I found that the next commit after a clean completes 
throws the exception.**
I found that `HoodieTableFileSystemView.partitionToFileGroupsMap` had lost the 
file group information of the last commit instant.
(If I execute `hoodieTable.getHoodieView().reset()` after the exception is thrown, 
retrying `getLatestBaseFile` works.)
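The workaround described above (reset the view, then retry the lookup) can be sketched roughly as follows. This is only an illustration of the pattern, not Hudi code: `ViewResetRetry` and `getWithViewReset` are hypothetical names, the `lookup` argument stands in for the `getLatestBaseFile` call, and `resetView` stands in for `hoodieTable.getHoodieView().reset()`. It hides the symptom and does not fix the stale file system view.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class ViewResetRetry {

    /**
     * Hypothetical helper: runs the lookup once; if it fails with
     * NoSuchElementException, resets the view and retries exactly once.
     * A second failure is propagated to the caller.
     */
    public static <T> T getWithViewReset(Supplier<T> lookup, Runnable resetView) {
        try {
            return lookup.get();
        } catch (NoSuchElementException e) {
            // Assumption: the cached view is stale, so rebuild it before retrying.
            resetView.run();
            return lookup.get();
        }
    }

    public static void main(String[] args) {
        // Simulated stale view: the first lookup fails until the view is reset.
        boolean[] stale = {true};
        String baseFile = getWithViewReset(
            () -> {
                if (stale[0]) {
                    throw new NoSuchElementException(
                        "FileID xx of partition path xx does not exist.");
                }
                return "base-file";
            },
            () -> stale[0] = false);
        System.out.println(baseFile);
    }
}
```

In a real writer the two lambdas would wrap the actual Hudi calls named in this report; whether a reset at that point is safe with concurrent async cleaning is not something this sketch establishes.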
Hudi table config:
{code:java}
            "'table.type' = 'COPY_ON_WRITE',\n" +
                "'hoodie.parquet.small.file.limit' = '20', \n" +
                "'write.operation' = 'insert', \n" +
                "'write.insert.cluster' = 'true', \n" +
                "'hoodie.datasource.write.hive_style_partitioning' = 'true',\n" +
                "'write.task.max.size' = '4096', \n" +
                "'write.merge.max_memory'= '2048',\n" +
                "'write.precombine' = 'true',\n" +
                "'write.tasks' = '1',\n" +
                "'write.bucket_assign.tasks' = '1',\n" +
                "'hive_sync.skip_ro_suffix' = 'true',\n" +
                "'write.ignore.failed' = 'true',\n" +
                "'clean.async.enabled' = 'true',\n" +
                "'clean.retain_commits' = '6' \n" + {code}
{code:java}
The determining factors are:
'hoodie.parquet.small.file.limit' = '20' -- triggers new file generation
and
'clean.async.enabled' = 'true' -- triggers async clean
'clean.retain_commits' = '6'  {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
