guanziyue edited a comment on pull request #4444: URL: https://github.com/apache/hudi/pull/4444#issuecomment-1044857803
Thinking this over again: a log file is not always as fail-safe as we expected, so we may need more to make it correct.

Option 1: Stay focused on this problem and treat such a log file as a data file, since the two play exactly the same role. Partially generated log files can then be deleted when Hudi uses marker files to clean up data files.

Option 2: Have a place outside the log file itself that records the correct results of log writing, for example the meta table. Commit meta is not a good choice: the relevant append results can be archived before compaction of the fileGroup is triggered, and commit meta is not designed for queries on the fileGroup dimension. We may first adopt a viable interim option and fix this completely once the meta table is universally used.

Option 3: Have a mechanism that makes log blocks fail-safe. I would like to try this now to see whether it can be solved in the short term. I want log blocks to be written strictly sorted by commit time (currently they can be unsorted). Then, before starting to write, I would write a defensive rollback block that rolls back any failed task attempt for this commit.
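To make option 3 concrete, here is a minimal, language-neutral sketch of the defensive rollback idea. It is a toy model and not Hudi code: `Block`, `append_attempt`, `valid_payloads`, and the `fail_after` parameter are hypothetical names invented for illustration, and the log is modeled as a plain in-memory list. It assumes each task attempt first appends a rollback block targeting its own commit time, and that a reader drops any earlier data blocks for that commit when it sees the rollback.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    kind: str                      # "data" or "rollback" (hypothetical block kinds)
    instant: str                   # commit time of the writer that appended the block
    target: Optional[str] = None   # rollback blocks only: the commit time being invalidated
    payload: str = ""


def append_attempt(log: List[Block], instant: str, payloads: List[str],
                   fail_after: Optional[int] = None) -> None:
    """Append one task attempt: a defensive rollback block first, then data blocks.

    fail_after simulates a crash after that many data blocks were written.
    """
    log.append(Block("rollback", instant, target=instant))
    for i, payload in enumerate(payloads):
        if fail_after is not None and i >= fail_after:
            return  # simulated crash: the attempt leaves partial blocks behind
        log.append(Block("data", instant, payload=payload))


def valid_payloads(log: List[Block]) -> List[str]:
    """Scan the log in order, discarding data blocks hit by a later rollback."""
    kept: List[Block] = []
    for block in log:
        if block.kind == "rollback":
            kept = [b for b in kept if b.instant != block.target]
        else:
            kept.append(block)
    return [b.payload for b in kept]


log: List[Block] = []
append_attempt(log, "001", ["a", "b"])
append_attempt(log, "002", ["c", "d"], fail_after=1)  # first attempt crashes after "c"
append_attempt(log, "002", ["c", "d"])                # retry of the same commit
print(valid_payloads(log))  # prints ['a', 'b', 'c', 'd']
```

The partial "c" block from the failed attempt stays in the log, but the retry's leading rollback block causes the reader to ignore it. The sketch does not cover a commit whose last attempt also fails; those blocks would still have to be filtered by some other means.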
