prashantwason opened a new pull request, #8604:
URL: https://github.com/apache/hudi/pull/8604

   [HUDI-6151] Rollback previously applied commits to MDT when operations are 
retried.
   
   ### Change Logs
   
   Operations such as Clean and Compaction are retried after a failure with the 
same instant time. If the previous run committed to the MDT but failed to 
commit to the dataset, the retry reuses the same instantTime and applies 
duplicate updates to the MDT.
   
   Currently, we simply delete the completed deltacommit without rolling it 
back, which leaves its log blocks in place.
   
   To handle this, we detect a replay of an operation and roll back any changes 
from that operation in the MDT.
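
   The detect-and-roll-back flow described above can be illustrated with a toy 
in-memory model. This is a minimal sketch, not the code in this PR: the class 
`MdtReplaySketch` and its methods are hypothetical names, not Hudi APIs, and 
rollback is modeled here as simply dropping the earlier attempt's log blocks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for the metadata table (MDT). Hypothetical names, not Hudi APIs.
final class MdtReplaySketch {
    // Instant times of deltacommits that completed on the MDT.
    private final Set<String> completedDeltaCommits = new HashSet<>();
    // Log blocks written so far, each tagged as "<instantTime>:<payload>".
    private final List<String> logBlocks = new ArrayList<>();

    // Applies an operation to the MDT. Returns true if the instant time was
    // already committed, i.e. this call is a replay of an earlier attempt.
    boolean commit(String instantTime, List<String> payloads) {
        boolean replay = completedDeltaCommits.contains(instantTime);
        if (replay) {
            // Undo the earlier attempt before re-applying, instead of only
            // deleting its completed deltacommit marker.
            rollback(instantTime);
        }
        for (String payload : payloads) {
            logBlocks.add(instantTime + ":" + payload);
        }
        completedDeltaCommits.add(instantTime);
        return replay;
    }

    // Assumption: rollback removes both the marker and the blocks it wrote.
    void rollback(String instantTime) {
        logBlocks.removeIf(block -> block.startsWith(instantTime + ":"));
        completedDeltaCommits.remove(instantTime);
    }

    // Returns a copy of the log blocks currently present.
    List<String> logBlocks() {
        return new ArrayList<>(logBlocks);
    }
}
```

   In this model, calling `commit` twice with the same instant time and payloads 
leaves a single copy of each log block, whereas deleting only the entry in 
`completedDeltaCommits` before the retry would leave two.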
   
   ### Impact
   
   Fixes the issue of duplicate log blocks being written to the MDT. Such 
duplicates are detrimental to indexes where duplicates are not allowed.
   
   ### Risk level (write none, low medium or high below)
   
   None. A unit test has been added.
   
   ### Documentation Update
   
   None
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
