zhangyue19921010 commented on pull request #3240: URL: https://github.com/apache/hudi/pull/3240#issuecomment-877251884
Hi @leesf, thanks for your review. Yes, the rule `when clustering plan contains the small files, the new insert should not get into small files` only takes effect when `this.config.isClusteringEnabled()` is true for the insert job, as linked here: https://github.com/apache/hudi/blob/650c4455c600b0346fed8b5b6aa4cc0bf3452e8c/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java#L149

What I mean is that this restriction is not sufficient. Other jobs, such as HoodieClusteringJob, can also generate a clustering plan. An insert job that does not have clustering enabled in its own config is not aware of that plan, so it cannot use `filterSmallFilesInClustering` to filter out the small files that the plan covers.
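
To illustrate the gap, here is a minimal, self-contained sketch. It is not Hudi's actual code: the class `SmallFileFilterSketch`, both method names, and the use of plain string file IDs are assumptions made for illustration only. The first method models the config-guarded behavior described above; the second shows one possible alternative that filters regardless of the insert job's own config, which is not necessarily what this PR implements.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified stand-in for the behavior discussed in this comment.
// All names here are hypothetical; this is not Hudi's implementation.
final class SmallFileFilterSketch {

    // Config-guarded behavior: small files in a pending clustering plan are
    // excluded only when the insert job itself has clustering enabled.
    static List<String> guardedByConfig(boolean clusteringEnabled,
                                        Set<String> pendingClusteringFileIds,
                                        List<String> smallFileIds) {
        if (!clusteringEnabled) {
            // The plan is ignored, so inserts may still target these files.
            return smallFileIds;
        }
        return alwaysFiltered(pendingClusteringFileIds, smallFileIds);
    }

    // Possible alternative: always exclude files that belong to a pending
    // clustering plan, no matter which job created the plan.
    static List<String> alwaysFiltered(Set<String> pendingClusteringFileIds,
                                       List<String> smallFileIds) {
        return smallFileIds.stream()
                .filter(id -> !pendingClusteringFileIds.contains(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "f1" is part of a plan scheduled by a separate clustering job.
        Set<String> pending = Set.of("f1");
        List<String> smallFiles = List.of("f1", "f2");

        // Insert job without clustering enabled: "f1" is still a candidate.
        System.out.println(guardedByConfig(false, pending, smallFiles));
        // Filtering independent of the insert job's config: "f1" is skipped.
        System.out.println(alwaysFiltered(pending, smallFiles));
    }
}
```

With clustering disabled on the insert job, the first call still returns `f1` as an insert target even though a separate job has scheduled it for clustering, which is the case I am concerned about.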
