parisni commented on issue #5767: URL: https://github.com/apache/hudi/issues/5767#issuecomment-1364484016
Well, we eventually figured out the cause of this. The number of log files was growing because of an inflight commit in the timeline (the OP was missing this). As a result, once the number of log files exceeded 50, the MDT could not be compacted unless `spark.hadoop.fs.s3a.connection.maximum` was increased. To fix this, we ended up enabling automatic cleaning, which cleaned up the inflight commit and let compaction reduce the number of log files.
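For reference, a minimal sketch of the two settings involved; the specific value for the connection pool is illustrative, not taken from the issue:

```properties
# Enable Hudi's automatic cleaning so stale/inflight state is cleaned up
# and compaction can keep the number of MDT log files down.
hoodie.clean.automatic=true

# Workaround mentioned above: raise the S3A connection pool limit so
# compaction can open many log files concurrently. 200 is an assumed
# example value, not the one used in the issue.
spark.hadoop.fs.s3a.connection.maximum=200
```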
