zhangyue19921010 commented on pull request #4078: URL: https://github.com/apache/hudi/pull/4078#issuecomment-992236967
Hi @yihua, thanks a lot for your attention. I agree with your point: deleting archived files does lose historical instant information and affects some Hudi functions. The current implementation is simple, but in my experience it solves most of the problems.

At present, using Hudi already requires users to make their own judgment calls, for example: 1. How to configure the cleaner to delete data, given that deleting data affects time travel. 2. How to configure Hudi to archive instants, given that once an instant is archived it affects the use of the active timeline. Users need a clear understanding of their configuration in both cases. Limiting the number of archive files loses historical instant information in the same way that cleaning limits time travel.

Fortunately, users can choose whether to use this feature based on their own circumstances. If they need to keep all instant information, they can simply leave it disabled. If they do not care about instants after they are archived, they can turn it on and keep the value small.

On the other hand, I think some loss of information is inevitable, since we cannot keep all the data forever. The questions are when and how.

Of course, the improvement you mentioned is very reasonable, such as letting Hudi implement an append function for archive files on DFS implementations that do not support append. Do you think we need to get it done in this PR, or can we go step by step toward the final state?
