kbuci opened a new issue, #17901: URL: https://github.com/apache/hudi/issues/17901
### Task Description

**What needs to be done:**

- Add a new policy for `hoodie.clean.failed.writes.policy`:
  - `LAZY_WITH_PREWRITE`: Same as `LAZY`, except that `rollbackFailedWrites` will also be attempted in `startCommit` before creating the new instant.
- Add a new clean config that, if enabled, attempts a `clean` in `startCommit` before creating the new instant.

**Why this task is needed:**

- We have had cases where write ingestion jobs repeatedly fail. Because we use the `LAZY` policy, files from these failed writes were not implicitly rolled back between these ingestion jobs. This leads to an unbounded buildup of files in the DFS partition directories, which can cause them to hit file quota limits. To mitigate this, we had to manually run clean jobs or temporarily set the policy to `EAGER`.
- We want to handle potential future cases where write jobs keep failing after completing the commit but before attempting the post-commit clean, which can have the same impact as above. For example, infra issues could cause Spark jobs to fail or time out when the Hudi write reaches the post-commit phase.

We have added this functionality to our internal 0.x Hudi build to address these issues; we can upstream it once we achieve consensus.

### Task Type

Code improvement/refactoring

### Related Issues

**Parent feature issue:** (if applicable)

**Related issues:**

NOTE: Use the `Relationships` button to add parent/blocking issues after the issue is created.
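
To make the proposed ordering concrete, here is a minimal, self-contained sketch of the control flow described above. It is not Hudi's write client: the `PreWriteSketch` and `Client` classes, the `cleanBeforeWrite` flag (the issue does not name the new clean config), and the string-based action log are all hypothetical stand-ins. Only the policy names and the `startCommit`, `rollbackFailedWrites`, and `clean` step names come from the issue. `EAGER` is left out because the issue does not describe its behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposal; none of these types exist in Hudi.
class PreWriteSketch {

    // LAZY exists today; LAZY_WITH_PREWRITE is the policy proposed in the issue.
    enum FailedWritesPolicy { LAZY, LAZY_WITH_PREWRITE }

    static final class Client {
        final FailedWritesPolicy policy;
        // Assumed name for the proposed clean config; the issue leaves it unnamed.
        final boolean cleanBeforeWrite;
        // Records the order of operations so the flow can be inspected.
        final List<String> actions = new ArrayList<>();
        private int nextInstant = 1;

        Client(FailedWritesPolicy policy, boolean cleanBeforeWrite) {
            this.policy = policy;
            this.cleanBeforeWrite = cleanBeforeWrite;
        }

        // Models startCommit: optional pre-write maintenance, then the new instant.
        String startCommit() {
            if (policy == FailedWritesPolicy.LAZY_WITH_PREWRITE) {
                // Proposed: attempt the rollback before creating the new instant.
                actions.add("rollbackFailedWrites");
            }
            if (cleanBeforeWrite) {
                // Proposed: attempt a clean before creating the new instant.
                actions.add("clean");
            }
            String instant = String.format("%03d", nextInstant++);
            actions.add("createInstant:" + instant);
            return instant;
        }
    }

    public static void main(String[] args) {
        Client client = new Client(FailedWritesPolicy.LAZY_WITH_PREWRITE, true);
        client.startCommit();
        System.out.println(client.actions);
    }
}
```

With plain `LAZY` and the flag disabled, `startCommit` in this sketch only creates the instant, which mirrors the situation in the issue where files from repeatedly failing jobs are never cleaned up between runs.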
