Leoyzen commented on issue #7546: URL: https://github.com/apache/hudi/issues/7546#issuecomment-1373053740
@danny0405 Here is the situation I ran into:

1. I started a job with service mode enabled. It is quite a large job (200+ file groups of 1 GB+ each, 100+ compaction tasks).
2. The first round (loading all instants) finished, and the second round (newly added compaction tasks) started rolling back the tasks that had just been completed in the first round.
3. Looking into the log, I found that nothing is committed after each compaction task. So on entering the second round, the job has to roll back all the tasks that were just completed and run them again (the files have been created, but with no instant.commit file).
4. The dirty files keep the second-round compaction failing (the final parquet file already exists). I had to replace CREATE with OVERWRITE in the code to avoid the failure.
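
The failure mode in step 4 can be illustrated with a minimal sketch. It uses plain `java.nio.file` as a stand-in and is not Hudi or Parquet code: the class name `CreateVsOverwrite`, its methods, and the file name `base-file.parquet` are all hypothetical, and the comment does not say which CREATE/OVERWRITE constants were actually changed. The sketch only shows why a create-only write fails when a retry finds a file left behind by an earlier, uncommitted attempt, while an overwrite-style write succeeds.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; this is not Hudi or Parquet code.
public class CreateVsOverwrite {

    // "CREATE"-style write: fails if the target file already exists.
    static void writeCreate(Path target, String data) throws IOException {
        Files.write(target, data.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    // "OVERWRITE"-style write: replaces whatever an earlier attempt left behind.
    static void writeOverwrite(Path target, String data) throws IOException {
        Files.write(target, data.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    // Simulates a second-round retry over a file written by a first round
    // that never committed. Returns true if the retry succeeded.
    static boolean retry(boolean overwrite) throws IOException {
        Path dir = Files.createTempDirectory("compaction-demo");
        Path target = dir.resolve("base-file.parquet");
        writeCreate(target, "first round"); // file exists, but nothing was committed
        try {
            if (overwrite) {
                writeOverwrite(target, "second round");
            } else {
                writeCreate(target, "second round");
            }
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // the leftover file blocks the retry
        } finally {
            // Clean up the temporary file and directory.
            Files.deleteIfExists(target);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("retry with create-only: " + retry(false));
        System.out.println("retry with overwrite:   " + retry(true));
    }
}
```

In this sketch the create-only retry reports failure and the overwrite retry reports success. Overwriting hides the symptom; the leftover files still come from work that finished without a commit, which is the underlying problem described in step 3.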
