Leoyzen commented on issue #7546:
URL: https://github.com/apache/hudi/issues/7546#issuecomment-1373053740

   @danny0405 Here is the situation I ran into:
   
   1. Start a job with service mode enabled. It is quite a large job (200+ 
file groups of 1 GB+ each, 100+ compaction tasks).
   2. The first round (loading all instants) finished, and the second round 
(newly added compaction tasks) started to roll back the tasks that had just 
been completed in the first round.
   3. Looking into the log, I found that nothing is committed after each 
compaction task. So on entering the second round, the job has to roll back all 
the tasks that were just completed and run them again (the files have been 
created, but there is no instant.commit file).
   4. The dirty files keep the second-round compaction failing (the final 
parquet file already exists), so I had to replace CREATE with OVERWRITE in the 
code to avoid the failure.
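
To illustrate step 4, here is a minimal sketch of why a leftover file from an uncommitted attempt breaks a create-only write but not an overwriting one. It is not Hudi code: it uses plain Python file modes as a stand-in for the CREATE and OVERWRITE write modes mentioned above, and the function names and the file name `fg1_base.parquet` are hypothetical.

```python
import os
import tempfile


def write_base_file(path, payload, overwrite=False):
    """Write a (simulated) compaction output file.

    overwrite=False stands in for a CREATE-style write: it fails if the
    file already exists. overwrite=True stands in for an OVERWRITE-style
    write: it replaces whatever is there.
    """
    mode = "wb" if overwrite else "xb"  # "x" raises FileExistsError on an existing file
    with open(path, mode) as f:
        f.write(payload)


def run_demo():
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical output file of one compaction task.
        target = os.path.join(tmp, "fg1_base.parquet")

        # Round 1: the file is written, but no commit is recorded.
        write_base_file(target, b"round-1")

        # Round 2: the task is re-executed with create-only semantics
        # and trips over the leftover file.
        try:
            write_base_file(target, b"round-2")
            results["create"] = "ok"
        except FileExistsError:
            results["create"] = "failed"

        # Round 2 again, this time allowing the leftover file to be replaced.
        write_base_file(target, b"round-2", overwrite=True)
        with open(target, "rb") as f:
            results["overwrite"] = f.read()
    return results


if __name__ == "__main__":
    print(run_demo())
```

Overwriting only hides the symptom in this sketch. The underlying problem described in step 3, that completed tasks are not committed before the next round starts, is unchanged.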


