bhasudha commented on code in PR #9709:
URL: https://github.com/apache/hudi/pull/9709#discussion_r1327962056


##########
website/docs/rollbacks.md:
##########
@@ -0,0 +1,67 @@
+---
+title: Partially Failed Commits
+toc: true
+---
+
+## Partially failed commits
+
+Your pipelines could fail due to numerous reasons like crashes, valid bugs in the code, unavailability of any external
+third party system (like lock provider), or user could kill mid-way to change some properties. A well designed system should
+detect such partially failed commits and ensure dirty data is not exposed to the read queries and also clean them up.
+We have already took a peek into Hudi’s timeline which forms the core for reader and writer isolation. If a commit has
+not transitioned to complete as per the hudi timeline, the readers will ignore the data from the respective write.
+And so partially failed writes are never read by any readers (for all query types). But the curious question is, how
+does the partially written data is eventually deleted? Does it require manual command to be executed from time to time
+or should it be automatically handled by the system?
+
+### Handling partially failed commits
+Hudi has a lot of platformization built in so as to ease the operationalization of lakehouse tables. Once such feature

Review Comment:
   ```suggestion
   Hudi has a lot of platformization built in so as to ease the operationalization of lakehouse tables. One such feature
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
