VitoMakarevich opened a new issue, #10964: URL: https://github.com/apache/hudi/issues/10964
**Describe the problem you faced**

Hello, this is a follow-up to https://github.com/apache/hudi/issues/10878. We managed to run clustering, but I keep thinking about a potential recovery plan in case it fails.

The behavior I know of: when `.commit.requested` and `.commit.inflight` have been created but `.commit` has not, the subsequent write performs a rollback. This works for normal commits.

Clustering behaves differently:

- If the job stops before `.replacecommit.inflight` is created, a subsequent write fails if it affects a partition present in `.replacecommit.requested`. This is controlled by [hoodie.clustering.updates.strategy](https://hudi.apache.org/docs/configurations/#hoodieclusteringupdatesstrategy). In that case I can only either run clustering from the CLI or just delete the instant (can you confirm? From the code it looks safe if there is no `.inflight`).
- If the job fails after it starts writing files (after `.replacecommit.inflight` is created, but before `.replacecommit` is created), which choices do I have? As far as I can tell from the code, there is no automatic rollback for `replacecommit`, and `hudi-cli` supports rollback only for finished instants.

Given this, can you answer two questions:

1. If clustering failed after `.replacecommit.requested` but before `.replacecommit.inflight`, is it safe to just delete the commit file itself? You recently added this PR, and it looks to be doing exactly this: https://github.com/apache/hudi/pull/10645/files
2. If clustering failed after `.replacecommit.inflight` but before `.replacecommit`, what are the recovery steps?
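
To make the two failure cases concrete, here is a minimal sketch that classifies incomplete `replacecommit` instants from a listing of timeline file names. It relies only on the file suffixes mentioned above. The helper name `pending_replacecommits` and the instant times in the usage example are hypothetical. How the listing of the timeline folder is obtained (local filesystem, S3, etc.) is left out, and the sketch does not check whether a given `replacecommit` was actually produced by clustering.

```python
from collections import defaultdict

# Timeline file suffixes for a replacecommit instant, as named in this issue.
REQUESTED = ".replacecommit.requested"
INFLIGHT = ".replacecommit.inflight"
COMPLETED = ".replacecommit"


def pending_replacecommits(timeline_files):
    """Return {instant_time: furthest_state} for replacecommits that never completed.

    Hypothetical helper: `timeline_files` is a plain list of file names
    taken from the table's timeline folder.
    """
    states = defaultdict(set)
    for name in timeline_files:
        # endswith() is exact here: "x.replacecommit.requested" does not end
        # with ".replacecommit", so the three suffixes never overlap.
        for suffix, state in (
            (REQUESTED, "requested"),
            (INFLIGHT, "inflight"),
            (COMPLETED, "completed"),
        ):
            if name.endswith(suffix):
                states[name[: -len(suffix)]].add(state)
                break

    pending = {}
    for instant, seen in states.items():
        if "completed" in seen:
            continue  # finished instant, nothing to recover
        pending[instant] = "inflight" if "inflight" in seen else "requested"
    return pending


# Usage with made-up instant times:
files = [
    "20240401101500.commit",
    "20240401120000.replacecommit.requested",
    "20240401130000.replacecommit.requested",
    "20240401130000.replacecommit.inflight",
    "20240401140000.replacecommit.requested",
    "20240401140000.replacecommit.inflight",
    "20240401140000.replacecommit",
]
print(pending_replacecommits(files))
```

An instant reported as `requested` corresponds to question 1, and one reported as `inflight` corresponds to question 2.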

**Environment Description**

* Hudi version : 0.12.2
* Spark version : 3.3.0

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
