cocopc edited a comment on issue #1737: URL: https://github.com/apache/hudi/issues/1737#issuecomment-645076628
@bhasudha With COW, `hoodie.cleaner.commits.retained=2` helps me control the number of parquet files, and `hoodie.parquet.max.file.size=120M` controls the size of each parquet file: data from a new batch is appended to the parquet files written by the previous batch until a file grows to 120M. With MOR, however, each batch creates a new parquet file, and data from a new batch is not appended to the small parquet files left by the previous batch. `hoodie.logfile.max.size=1G` only takes effect when each batch is very large; when each batch is small (e.g. 1KB), small parquet files are still created. I have not found a config that controls the number of log files, or something like a `logfile.min.size` that a log file must reach before it is rolled over to the next version.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
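The difference described in the comment above can be illustrated with a small toy model. This is only a sketch of the reported behaviour, not Hudi's actual file-sizing code: the `write_batches` helper and its `append_to_small_files` flag are hypothetical names introduced for illustration, and the sketch assumes the size limit is expressed in bytes (so "120M" is written as `120 * MB`).

```python
# Toy model only: this is NOT Hudi's implementation, just the behaviour reported above.
MB = 1024 * 1024

# Settings quoted in the comment, written as writer options.
# Assumption: the size limit is given in bytes, so "120M" becomes 120 * MB.
cow_options = {
    "hoodie.cleaner.commits.retained": "2",
    "hoodie.parquet.max.file.size": str(120 * MB),
}


def write_batches(batch_sizes, max_file_size, append_to_small_files):
    """Return the resulting file sizes after writing each batch in order.

    Hypothetical helper: with append_to_small_files=True, a new batch first
    tops up the most recent file until it reaches max_file_size (the COW
    behaviour described above); with False, every batch starts a new file
    (the MOR behaviour the comment reports).
    """
    files = []
    for size in batch_sizes:
        remaining = size
        if append_to_small_files and files and files[-1] < max_file_size:
            added = min(max_file_size - files[-1], remaining)
            files[-1] += added
            remaining -= added
        # Whatever does not fit goes into new files of at most max_file_size.
        while remaining > 0:
            chunk = min(remaining, max_file_size)
            files.append(chunk)
            remaining -= chunk
    return files


max_size = int(cow_options["hoodie.parquet.max.file.size"])
small_batches = [1024] * 10  # ten batches of 1KB each

cow_like = write_batches(small_batches, max_size, append_to_small_files=True)
mor_like = write_batches(small_batches, max_size, append_to_small_files=False)
print(len(cow_like), len(mor_like))  # 1 10
```

In this model, ten 1KB batches end up in a single 10KB file when small files are topped up, but in ten separate 1KB files when they are not, which is the small-file build-up the comment describes for MOR.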
