kbuci opened a new issue, #17901:
URL: https://github.com/apache/hudi/issues/17901

   ### Task Description
   
   **What needs to be done:**
   
   Add a new policy for `hoodie.clean.failed.writes.policy`:
   - `LAZY_WITH_PREWRITE`: Same as `LAZY`, except that `rollbackFailedWrites` 
will also be attempted in `startCommit` before creating the new instant
   
   Add a new clean config that, if enabled, attempts a `clean` in `startCommit` 
before creating the new instant
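
The intended `startCommit` flow can be sketched roughly as below. This is a toy model in Python for brevity, not Hudi code: the class `WriteClientSketch`, the flag `clean_before_start_commit` (a placeholder for the not-yet-named clean config), and the ordering of rollback before clean are all assumptions made for illustration. Only `LAZY` and the proposed `LAZY_WITH_PREWRITE` are modelled; the other existing policies are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class FailedWritesCleaningPolicy(Enum):
    # Only the two policies discussed here; others are omitted from the sketch.
    LAZY = "LAZY"
    LAZY_WITH_PREWRITE = "LAZY_WITH_PREWRITE"  # proposed in this issue


@dataclass
class WriteClientSketch:
    """Hypothetical stand-in for a write client; records actions instead of doing I/O."""
    policy: FailedWritesCleaningPolicy
    # Placeholder name for the proposed clean config (the issue does not name it).
    clean_before_start_commit: bool = False
    actions: List[str] = field(default_factory=list)

    def rollback_failed_writes(self) -> None:
        self.actions.append("rollbackFailedWrites")

    def clean(self) -> None:
        self.actions.append("clean")

    def start_commit(self) -> List[str]:
        # Proposed: with LAZY_WITH_PREWRITE, attempt a rollback of failed
        # writes before the new instant is created.
        if self.policy is FailedWritesCleaningPolicy.LAZY_WITH_PREWRITE:
            self.rollback_failed_writes()
        # Proposed: optionally attempt a clean before the new instant is created.
        # Running it after the rollback is an assumption of this sketch.
        if self.clean_before_start_commit:
            self.clean()
        self.actions.append("createInstant")
        return self.actions


if __name__ == "__main__":
    client = WriteClientSketch(
        FailedWritesCleaningPolicy.LAZY_WITH_PREWRITE,
        clean_before_start_commit=True,
    )
    print(client.start_commit())
```

With plain `LAZY` and the flag left off, `start_commit` in this sketch only creates the instant, which matches the current behaviour described in this issue.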
   
   **Why this task is needed:**
   - We have had cases where write ingestion jobs repeatedly fail. Because we 
use the `LAZY` policy, files from these failed writes were not implicitly rolled 
back between ingestion jobs. This leads to an unbounded buildup of files in the 
DFS partition directories, which can then hit file quota limits. 
To mitigate this, we had to manually run clean jobs or temporarily set the policy to `EAGER`.
   - We want to handle potential future cases where write jobs keep 
failing after completing the commit but before attempting the 
post-commit clean, which can have the same impact as above. For example, 
infra issues can cause Spark jobs to fail or time out when the Hudi 
write reaches the post-commit phase.
   
   We have added this functionality to our internal 0.x Hudi build to address 
these issues and can upstream it once we reach consensus.
   
   
   ### Task Type
   
   Code improvement/refactoring
   
   ### Related Issues
   
   **Parent feature issue:** (if applicable )
   **Related issues:**
   NOTE: Use `Relationships` button to add parent/blocking issues after issue 
is created.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
