guanziyue edited a comment on pull request #4444:
URL: https://github.com/apache/hudi/pull/4444#issuecomment-1044857803


   Thinking it over again, we may find that the log file is not always as 
fail-safe as we expected, so we may need more to make it correct:
   option 1: Still focusing on this problem, we can treat such a log file as a 
data file, since the two play exactly the same role. We can delete partially 
written log files when Hudi uses marker files to clean up data files.
   option 2: Keep the correct results of log writing somewhere outside the log 
file itself, for example in the meta table. Commit meta is not a good choice: 
the relevant append results can be archived before compaction of the fileGroup 
is triggered, and commit meta is not designed for queries along the fileGroup 
dimension. 
   We may first adopt a viable option and fix this completely once the meta 
table is universally used.
   option 3: Have a mechanism that makes log blocks fail-safe. I would like to 
try this now to see whether it can be solved in the short term. I want log 
blocks to be written strictly sorted by commit time (currently they can be 
unsorted). Then, before starting to write, I would write a defensive rollback 
block that rolls back any failed task attempt for this commit. 
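
   To make option 3 concrete, here is a minimal sketch of the idea in Python. 
It is only a toy model, not Hudi code: `Block`, `LogFile`, `start_attempt`, 
`write_data` and `read_valid_records` are hypothetical names, and it assumes 
commit times are fixed-width strings that sort lexicographically.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    kind: str                      # "data" or "rollback" (hypothetical block kinds)
    instant: str                   # commit time the block belongs to, or targets
    records: List[str] = field(default_factory=list)


class LogFile:
    """Toy in-memory log file: an append-only list of blocks."""

    def __init__(self) -> None:
        self.blocks: List[Block] = []

    def append(self, block: Block) -> None:
        # Enforce that blocks are written in commit-time order.
        if self.blocks and block.instant < self.blocks[-1].instant:
            raise ValueError("blocks must be appended in commit-time order")
        self.blocks.append(block)


def start_attempt(log: LogFile, instant: str) -> None:
    # Defensive rollback: invalidate anything an earlier failed attempt
    # of the same commit may have left behind, before writing new data.
    log.append(Block("rollback", instant))


def write_data(log: LogFile, instant: str, records: List[str]) -> None:
    log.append(Block("data", instant, list(records)))


def read_valid_records(log: LogFile) -> List[str]:
    valid: List[Block] = []
    for block in log.blocks:
        if block.kind == "rollback":
            # Drop data blocks of the targeted commit seen so far.
            valid = [b for b in valid if b.instant != block.instant]
        else:
            valid.append(block)
    return [r for b in valid for r in b.records]
```

   With this model, if an attempt for commit "002" writes a partial block and 
fails, the retry first appends a rollback block for "002", so a reader that 
scans the blocks in order ignores the partial block and only sees the data 
written by the retry.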


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
