zhangyue19921010 commented on pull request #3240:
URL: https://github.com/apache/hudi/pull/3240#issuecomment-877251884


   Hi @leesf, thanks for your review. Yes, the behavior `when clustering plan 
contains the small files, the new insert should not get into small files` only 
takes effect when `this.config.isClusteringEnabled()` is true for the insert 
job, as linked here: 
https://github.com/apache/hudi/blob/650c4455c600b0346fed8b5b6aa4cc0bf3452e8c/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java#L149
   
   What I mean is that this restriction is too narrow: other jobs, such as 
HoodieClusteringJob, can also generate a clustering plan. The insert job 
mentioned above is not aware of such a plan, so it cannot use the 
`filterSmallFilesInClustering` function to filter out the small files that 
belong to it.
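
   To make the gap concrete, here is a minimal standalone sketch of the gating 
described above. It is not the actual `UpsertPartitioner` code: the class 
`SmallFileFilterSketch`, the method `smallFilesForInsert`, the plain string 
file IDs and the simplified signature of `filterSmallFilesInClustering` are 
all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified model of the behavior discussed in this comment.
// Hypothetical names and types; this is NOT Hudi's real implementation.
final class SmallFileFilterSketch {

    // Drops every small file whose ID is part of a pending clustering plan.
    static List<String> filterSmallFilesInClustering(Set<String> pendingClusteringFileIds,
                                                     List<String> smallFiles) {
        List<String> kept = new ArrayList<>();
        for (String fileId : smallFiles) {
            if (!pendingClusteringFileIds.contains(fileId)) {
                kept.add(fileId);
            }
        }
        return kept;
    }

    // The filter only runs when the insert job itself has clustering enabled,
    // even if another job has already scheduled a clustering plan.
    static List<String> smallFilesForInsert(boolean clusteringEnabledInInsertJob,
                                            Set<String> pendingClusteringFileIds,
                                            List<String> smallFiles) {
        if (clusteringEnabledInInsertJob) {
            return filterSmallFilesInClustering(pendingClusteringFileIds, smallFiles);
        }
        return new ArrayList<>(smallFiles);
    }

    public static void main(String[] args) {
        // Assume a separate clustering job has scheduled a plan covering file-2.
        Set<String> pending = Set.of("file-2");
        List<String> small = List.of("file-1", "file-2");

        // Clustering enabled in the insert job: file-2 is excluded.
        System.out.println(smallFilesForInsert(true, pending, small));  // [file-1]
        // Clustering disabled in the insert job: file-2 can still receive inserts.
        System.out.println(smallFilesForInsert(false, pending, small)); // [file-1, file-2]
    }
}
```

   In the second call the plan exists, but the insert job never consults it, 
which is the case I am concerned about.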



