yihua commented on code in PR #18439:
URL: https://github.com/apache/hudi/pull/18439#discussion_r3271475399


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/StorageBasedLockProvider.java:
##########
@@ -424,22 +445,60 @@ private boolean believesLockMightBeHeld() {
    * Unlock by marking our current lock file "expired": true.
    */
   @Override
-  public synchronized void unlock() {
-    assertHeartbeatManagerExists();
-    if (!believesLockMightBeHeld()) {
-      return;
+  public void unlock() {
+    ExpireLockResult expireResult;
+    // Snapshot the exact StorageLockFile we intend to expire. Identity 
comparison in the retry
+    // path guards against a concurrent tryLock() that, if our original lock 
had expired during
+    // the 1s sleep, could have installed a fresh lock that the retry must NOT 
mark expired.
+    StorageLockFile lockToExpire;
+    synchronized (this) {
+      assertHeartbeatManagerExists();
+      if (!believesLockMightBeHeld()) {
+        return;
+      }
+
+      // Try to stop the heartbeat first
+      if (heartbeatManager.hasActiveHeartbeat()) {
+        logger.debug("Owner {}: Gracefully shutting down heartbeat.", ownerId);
+        if (!heartbeatManager.stopHeartbeat(true)) {
+          
hoodieLockMetrics.ifPresent(HoodieLockMetrics::updateLockReleaseFailureMetric);
+          throw new 
HoodieLockException(generateLockStateMessage(FAILED_TO_RELEASE));
+        }
+      }
+
+      // Then expire the current lock.
+      lockToExpire = getLock();
+      expireResult = tryExpireCurrentLock(false);
     }
-    boolean believesNoLongerHoldsLock = true;
 
-    // Try to stop the heartbeat first
-    if (heartbeatManager.hasActiveHeartbeat()) {
-      logger.debug("Owner {}: Gracefully shutting down heartbeat.", ownerId);
-      believesNoLongerHoldsLock &= heartbeatManager.stopHeartbeat(true);
+    // If throttled, retry once after sleeping outside the monitor to avoid 
blocking other threads.
+    // Note: when unlock() is called via close() -> shutdown(), the outer 
synchronized caller still
+    // holds the provider monitor through reentrant locking, so other threads 
remain blocked in
+    // that scenario. This is acceptable since close() is a shutdown path, not 
the hot path.
+    if (expireResult == ExpireLockResult.THROTTLED) {
+      logger.warn("Owner {}: Lock expiration was throttled, retrying after {} 
seconds.",
+          ownerId, THROTTLE_RETRY_DELAY_SECONDS);
+      try {
+        TimeUnit.SECONDS.sleep(THROTTLE_RETRY_DELAY_SECONDS);
+      } catch (InterruptedException ie) {

Review Comment:
   Could we use 3 exponential retries to be more robust against throttling 
errors?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to