zhangyue19921010 commented on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-992236967


   Hi @yihua, thanks a lot for your attention. 
   
   I agree with you: deleting archived files does lose historical instant 
information and affects some Hudi functions.
   The current implementation is simple, but in my experience it solves most 
of the problems.
   
   At present, using Hudi still requires users to make their own judgments, 
such as:
   1. How to configure the cleaner to delete data, given that deleting data 
affects time travel. 
   2. How to configure Hudi to archive instants, given that once an instant is 
archived, it affects the use of the active timeline. 
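
   To make the two choices above concrete, here is a minimal sketch of the 
related writer options. The three keys are existing Hudi configs; the values 
are arbitrary examples, and `check_retention` is a hypothetical helper (not 
part of Hudi) that encodes the ordering I understand Hudi to expect between 
them.

```python
# Illustrative Hudi writer options; the values are arbitrary examples.
retention_options = {
    # Cleaner: how many commits to retain, which bounds time travel.
    "hoodie.cleaner.commits.retained": 10,
    # Archival: bounds on how many instants stay in the active timeline.
    "hoodie.keep.min.commits": 20,
    "hoodie.keep.max.commits": 30,
}


def check_retention(opts):
    """Hypothetical helper, not a Hudi API.

    Returns True when cleaner retention < min commits kept < max commits kept,
    which is the ordering assumed here.
    """
    cleaner = opts["hoodie.cleaner.commits.retained"]
    keep_min = opts["hoodie.keep.min.commits"]
    keep_max = opts["hoodie.keep.max.commits"]
    return cleaner < keep_min < keep_max


if __name__ == "__main__":
    print(check_retention(retention_options))
```

   Smaller values save storage but shorten how far back time travel and the 
active timeline can reach, which is the same kind of trade-off this PR exposes 
for archive files.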
   
   Users still need a clear understanding of their configuration: enabling a 
limit on the number of archive files loses historical instant information, 
just as the cleaner limits time travel.
   
   Fortunately, users can choose whether to use this feature based on their own 
circumstances. Users who need to keep all instant information can simply leave 
it disabled. Users who do not care about instants after they are archived can 
turn it on and set a small value.
   
   On the other hand, I think the loss of information is inevitable, and we 
cannot keep all the data forever. The questions are when and how.
   
   Of course, the improvement you mentioned is very reasonable, such as having 
Hudi implement appending to archive files on DFS that does not support append.
   
   Do you think we need to get it done in this PR, or can we go step by 
step toward the final state?



