kbuci commented on code in PR #10965:
URL: https://github.com/apache/hudi/pull/10965#discussion_r1568070078


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##########
@@ -1135,8 +1138,36 @@ protected void 
completeLogCompaction(HoodieCommitMetadata metadata, HoodieTable
    */
   protected HoodieWriteMetadata<O> compact(String compactionInstantTime, 
boolean shouldComplete) {
     HoodieTable table = createTable(config, context.getHadoopConf().get());
+    Option<HoodieInstant> instantToCompactOption = 
Option.fromJavaOptional(table.getActiveTimeline()
+        .filterCompletedAndCompactionInstants()
+        .getInstants()
+        .stream()
+        .filter(instant -> 
HoodieActiveTimeline.EQUALS.test(instant.getTimestamp(), compactionInstantTime))
+        .findFirst());
+    try {
+      // Transaction serves to ensure only one compact job for this instant 
will start heartbeat, and any other concurrent
+      // compact job will abort if they attempt to execute compact before 
heartbeat expires
+      // Note that as long as all jobs for this table use this API for 
compact, then this alone should prevent
+      // compact rollbacks from running concurrently to compact commits.
+      txnManager.beginTransaction(instantToCompactOption, 
txnManager.getLastCompletedTransactionOwner());

Review Comment:
   > Even on REQUESTED: since we are taking a lock and checking for heart beat 
client, wouldn't that ensure only one writer can proceed and the other writer 
will fail.
   
   > why would (3) not see heartbeat? we are taking a block and start emitting 
heart beats right.
   
   Oh in step (3) the job (B) will see a heartbeat. Just to clarify the 
scenario here only applies if we were to have jobs ignore heartbeat check if 
the compact plan is not inflight yet (in case that might have been the source 
of confusion here).
   If you meant why Job (C) in (7)  doesn't see heartbeat, then the reason for 
that is that Job (A) created a heartbeat but Job (B) did not, so once (A) stops 
then the heartbeat is expired and (C) has no way to know that there is a job 
(B) working on this plan. At first glance it might seem like having Job (B) 
also emit a heartbeat (during Step (2)) would avoid this issue, but that would 
cause another issue: when Job (A) terminates it would clean up the heartbeat 
file. So since there is no semaphore-like or shared-exclusive lock-like 
functionality in heartbeating I'm not sure how or if we can have concurrent 
compact jobs share or ignore a heartbeat.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to