VitoMakarevich commented on issue #10964:
URL: https://github.com/apache/hudi/issues/10964#issuecomment-2040415437

   Update: I dug into the code of `clusteringHandleUpdate` and see the following behavior:
   - Updates rejected: the write fails.
   - Updates accepted and `hoodie.clustering.rollback.pending.replacecommit.on.conflict` is `true`: the pending clustering instants that conflict with the updated records are rolled back.
   - Updates accepted and `hoodie.clustering.rollback.pending.replacecommit.on.conflict` is `false`: the pending clustering instants are left on the commit timeline, and the updates are written to the previous files.
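
   The three cases above can be sketched as a small decision function. This is only a model of my reading of the code, not Hudi's actual implementation: the function name `outcome` and the `updates_accepted` flag are made up for illustration, and only the config key comes from Hudi.

```python
# Config key taken from the comment; everything else here is illustrative.
ROLLBACK_KEY = "hoodie.clustering.rollback.pending.replacecommit.on.conflict"


def outcome(updates_accepted: bool, options: dict) -> str:
    """Model what happens when an update hits a file group under pending clustering."""
    if not updates_accepted:
        # The update strategy rejects the update, so the whole write fails.
        return "write fails"
    # The comment notes `true` is non-default, so an unset key is treated as false.
    rollback = str(options.get(ROLLBACK_KEY, "false")).lower() == "true"
    if rollback:
        return "conflicting pending clustering instants rolled back"
    return "pending clustering instants kept; updates written to previous files"
```

   For example, `outcome(True, {ROLLBACK_KEY: "true"})` corresponds to the second case in the list.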
   
   So it looks like switching these two settings:
   
[hoodie.clustering.updates.strategy](https://hudi.apache.org/docs/configurations/#hoodieclusteringupdatesstrategy)
 -> 
`org.apache.hudi.client.clustering.update.strategy.SparkAllowUpdateStrategy` 
(non-default)
   
[hoodie.clustering.rollback.pending.replacecommit.on.conflict](https://hudi.apache.org/docs/configurations/#hoodieclusteringrollbackpendingreplacecommitonconflict)
 -> `true` (non-default)
   is generally safe when all operations run inline and there is a single writer.
   E.g. if a commit fails in the middle of clustering, the subsequent commit 
will synchronously roll back the pending clustering instants and write the 
updates into the old files.
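
   A minimal sketch of how the two settings could be passed to a Spark writer as options. The helper name `clustering_conflict_options` is hypothetical; the two keys are the ones linked above, and the strategy class is left as a parameter rather than hard-coded.

```python
STRATEGY_KEY = "hoodie.clustering.updates.strategy"
ROLLBACK_KEY = "hoodie.clustering.rollback.pending.replacecommit.on.conflict"


def clustering_conflict_options(strategy_class: str, rollback_on_conflict: bool = True) -> dict:
    """Build the two clustering-related writer options discussed above.

    strategy_class is the fully qualified update strategy class name.
    """
    return {
        STRATEGY_KEY: strategy_class,
        # Writer options are passed as strings, so render the flag as "true"/"false".
        ROLLBACK_KEY: str(rollback_on_conflict).lower(),
    }


# Assumed usage with an existing DataFrame `df` and table path `path`:
#   opts = clustering_conflict_options(strategy_class)
#   df.write.format("hudi").options(**opts).mode("append").save(path)
```

   The other Hudi write options (table name, record key, and so on) still have to be supplied alongside these two.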
   
   Can someone confirm? @nsivabalan 

