nsivabalan commented on code in PR #12236:
URL: https://github.com/apache/hudi/pull/12236#discussion_r1992435832


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -298,12 +314,56 @@ protected HoodieWriteMetadata<O> compact(String compactionInstantTime, boolean s
       table.getMetaClient().reloadActiveTimeline();
     }
     compactionTimer = metrics.getCompactionCtx();
+    // start commit in MDT if enabled
+    Option<HoodieTableMetadataWriter> metadataWriterOpt = getMetadataWriterFunc.apply(compactionInstantTime, table.getMetaClient());
+    if (metadataWriterOpt.isPresent()) {

Review Comment:
   Table services are handled this way (differently from ingestion commits) 
because scheduling and execution can happen separately. With MDT, if we start 
the commit when compaction is scheduled in the data table and defer the 
execution until later, some other thread in MDT could detect failed heartbeats 
for the corresponding DC in MDT and trigger a rollback. So we defer starting 
the DC in MDT for data table table services until the execution of the table 
service actually starts. That way we know the heartbeats will be continuous, 
and if anything fails mid-way, it will get lazily rolled back. 
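   
   To make the timing concern concrete, here is a minimal sketch in Java. It 
is a toy model, not Hudi's heartbeat implementation: `HeartbeatSketch`, `beat`, 
`looksFailed`, and the 60 second timeout are hypothetical names and values 
picked only for illustration. 

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of heartbeat-based failure detection (hypothetical, not Hudi's classes). */
final class HeartbeatSketch {
  private final long timeoutMs;
  private final Map<String, Long> lastBeatMs = new HashMap<>();

  HeartbeatSketch(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  /** Records a heartbeat for the instant at the given simulated time. */
  void beat(String instant, long nowMs) {
    lastBeatMs.put(instant, nowMs);
  }

  /** An instant looks failed once its last heartbeat is older than the timeout. */
  boolean looksFailed(String instant, long nowMs) {
    Long last = lastBeatMs.get(instant);
    return last != null && nowMs - last > timeoutMs;
  }

  public static void main(String[] args) {
    long executionStartMs = 600_000; // execution begins 10 minutes after scheduling

    // Case 1: DC started at scheduling time (t = 0), nothing heartbeats until execution.
    HeartbeatSketch early = new HeartbeatSketch(60_000);
    early.beat("001", 0);
    System.out.println(early.looksFailed("001", executionStartMs)); // true

    // Case 2: DC started only when execution starts, so heartbeats are continuous.
    HeartbeatSketch deferred = new HeartbeatSketch(60_000);
    deferred.beat("001", executionStartMs);
    System.out.println(deferred.looksFailed("001", executionStartMs + 30_000)); // false
  }
}
```

   In case 1 another thread checking heartbeats would consider the DC failed 
and could roll it back before execution ever starts; in case 2 it would not. 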
   
   But I wanted to jam on something here. Can we completely disable auto 
rollbacks in MDT? The data table writer is the only one that can trigger 
rollbacks for the commit it's currently dealing with. 
   
   What this means is: 
   When an ingestion commit in DT fails mid-way in MDT:
    - The respective DC in MDT will stay inflight until the rollback in the 
data table kicks in. When that rollback reaches the MDT layer, it can roll 
back as usual. 
   
   For compaction and clustering:
    - Compaction in DT fails mid-way while writing to the respective DC in 
MDT. The DC will stay inflight until the next attempt of the DT compaction 
resumes. On resuming, hudi triggers a rollback of the compaction commit in DT, 
which then gets applied to MDT as well, i.e. it rolls back the compaction 
commit there too. The compaction in DT then goes through a 2nd attempt, which 
in turn gets applied as a DC in MDT. 
   
   So, if any table service or ingestion commit stays inflight in the data 
table for a long duration, this could also mean an inflight hanging around in 
MDT. 
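   
   The proposed flow can be sketched as a small state model. This is only an 
illustration under my own assumptions: `DtDrivenRollbackSketch`, its methods, 
and the `State` enum are hypothetical and do not correspond to Hudi classes. 
The one rule it encodes is that MDT never rolls back by itself; a rollback 
reaches MDT only when the data table applies it. 

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: rollbacks in MDT are driven only by the data table (hypothetical names). */
final class DtDrivenRollbackSketch {
  enum State { INFLIGHT, COMPLETED, ROLLED_BACK }

  private final Map<String, State> dataTable = new HashMap<>();
  private final Map<String, State> metadataTable = new HashMap<>();

  /** Starts an action in DT and the corresponding delta commit in MDT. */
  void start(String instant) {
    dataTable.put(instant, State.INFLIGHT);
    metadataTable.put(instant, State.INFLIGHT);
  }

  /** Completes the action in DT and its delta commit in MDT. */
  void complete(String instant) {
    dataTable.put(instant, State.COMPLETED);
    metadataTable.put(instant, State.COMPLETED);
  }

  /** Rollback initiated by the DT writer; it is then applied to MDT as well. */
  void rollbackFromDataTable(String instant) {
    dataTable.put(instant, State.ROLLED_BACK);
    metadataTable.put(instant, State.ROLLED_BACK);
  }

  /** With auto rollbacks disabled, an MDT-side check leaves inflight commits alone. */
  void mdtAutoRollbackCheck(String instant) {
    // Intentionally a no-op under the proposal.
  }

  State dtState(String instant) {
    return dataTable.get(instant);
  }

  State mdtState(String instant) {
    return metadataTable.get(instant);
  }

  public static void main(String[] args) {
    DtDrivenRollbackSketch s = new DtDrivenRollbackSketch();
    s.start("c1");                     // first compaction attempt, fails mid-way
    s.mdtAutoRollbackCheck("c1");
    System.out.println(s.mdtState("c1")); // INFLIGHT
    s.rollbackFromDataTable("c1");     // next attempt first rolls back in DT, then MDT
    System.out.println(s.mdtState("c1")); // ROLLED_BACK
    s.start("c1");                     // 2nd attempt
    s.complete("c1");
    System.out.println(s.mdtState("c1")); // COMPLETED
  }
}
```

   The first print is the case from the paragraph above: as long as nobody 
retries or rolls back in DT, the inflight stays in MDT. 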
   
   
    



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

