YuweiXiao commented on issue #5770: URL: https://github.com/apache/hudi/issues/5770#issuecomment-1163823698
Clustering is a table service that reorganizes the layout of the data files, so it is probably not relevant in your case. As for the `average record size`: on the initial write there are no previous commits to compute an average from, so Hudi uses `hoodie.copyonwrite.record.size.estimate` (default 1KB) to estimate the total file size. The code for file size management is in `UpsertPartitioner::assignInserts`. You could also check the logs to see whether the setting takes effect.
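To make the estimate concrete, here is a rough sketch of the arithmetic, not the actual Hudi code. The function name and parameters are made up for illustration, and the 120MB target file size is an assumption about `hoodie.parquet.max.file.size`; please check the value configured for your table.

```python
def estimate_records_per_file(
    max_file_size_bytes=120 * 1024 * 1024,  # assumed target file size, verify against your config
    record_size_estimate_bytes=1024,        # hoodie.copyonwrite.record.size.estimate (1KB default)
    observed_avg_record_size_bytes=None,    # average from previous commits, None on the first write
):
    """Hypothetical helper: roughly how many inserts fit into one data file."""
    # On the initial write there is no commit history, so fall back to the configured estimate.
    if observed_avg_record_size_bytes:
        avg_size = observed_avg_record_size_bytes
    else:
        avg_size = record_size_estimate_bytes
    return max_file_size_bytes // avg_size


# First write, no history: the 1KB estimate is used.
print(estimate_records_per_file())
# Later write where previous commits show records averaging 512 bytes.
print(estimate_records_per_file(observed_avg_record_size_bytes=512))
```

If your real records are much smaller or larger than 1KB, the first write will produce files that are correspondingly smaller or larger than the target, which is why tuning the estimate can help.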
