guanziyue edited a comment on pull request #4444:
URL: https://github.com/apache/hudi/pull/4444#issuecomment-1044103610


   > Thanks, I saw your description in JIRA:
   > 
   > > In attempt 1, we write three records 5,4,3 to
   > > fileID_1_log.1_attempt1. But this attempt failed. Spark then retries
   > > with a second task attempt (attempt 2), in which we write four records
   > > 1,2,3,4 to fileID_1_log.1_attempt2. Then we find, by calling canWrite,
   > > that this file group is large enough. So Hudi writes record 5 to
   > > fileID_2_log.1_attempt2 and finishes this commit.
   > 
   > Do you mean that attempt 1 writes a complete log block to the log file
   > and then fails? Can we write a rollback block when the Spark task fails
   > over?
   > 
   > If we make the change this way, we lose the ability to do precise file
   > size control/bin-packing.
   
   Thanks, I agree with your concern. I will dig into the marker file
   mechanism to find out whether there is a way to detect previous failed
   task attempts. I am not very familiar with marker files yet, and will
   update this as soon as I come up with a good solution.
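
To make the quoted scenario concrete, here is a rough Python sketch of it. It is only a toy model, not Hudi code: `write_attempt` and `MAX_RECORDS_PER_GROUP` are hypothetical names, the 4-record limit is an assumed stand-in for the size check done via canWrite, and it assumes the file left behind by the failed attempt stays visible.

```python
# Toy model of the two task attempts described in the quote (not Hudi code).
MAX_RECORDS_PER_GROUP = 4  # hypothetical stand-in for the canWrite size check


def write_attempt(records, attempt):
    """Write records in order, rolling over to the next file group when full."""
    files = {}
    group = 1
    for rec in records:
        name = f"fileID_{group}_log.1_attempt{attempt}"
        if len(files.get(name, [])) >= MAX_RECORDS_PER_GROUP:
            group += 1  # current file group is "large enough", move on
            name = f"fileID_{group}_log.1_attempt{attempt}"
        files.setdefault(name, []).append(rec)
    return files


storage = {}
# Attempt 1 wrote 5,4,3 and then failed; its file is assumed to remain.
storage.update(write_attempt([5, 4, 3], attempt=1))
# Attempt 2 sees the records in a different order and succeeds.
storage.update(write_attempt([1, 2, 3, 4, 5], attempt=2))

# File groups (the number after "fileID_") that contain record 5.
groups_with_record_5 = sorted(
    {name.split("_")[1] for name, recs in storage.items() if 5 in recs}
)

for name, recs in sorted(storage.items()):
    print(name, recs)
print("file groups containing record 5:", groups_with_record_5)
```

Under these assumptions, record 5 sits in file group 1 (from the failed attempt) and in file group 2 (from the successful one), which is why being aware of earlier failed attempts matters here.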


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

