[ 
https://issues.apache.org/jira/browse/HDDS-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen resolved HDDS-6377.
------------------------------
    Resolution: Fixed

> Redundant loop while doing triggerHeartbeat in DatanodeStateMachine
> -------------------------------------------------------------------
>
>                 Key: HDDS-6377
>                 URL: https://issues.apache.org/jira/browse/HDDS-6377
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Janus Chow
>            Assignee: Janus Chow
>            Priority: Major
>              Labels: pull-request-available
>
> The code related to checking heartbeat is as follows.
>  
> {code:java}
> L1   while (context.getState() != DatanodeStates.SHUTDOWN) {
> L2     try {
> L3       LOG.debug("Executing cycle Number : {}", context.getExecutionCount());
> L4       long heartbeatFrequency = context.getHeartbeatFrequency();
> L5       nextHB.set(Time.monotonicNow() + heartbeatFrequency);
> L6       context.execute(executorService, heartbeatFrequency,
> L7           TimeUnit.MILLISECONDS);
> L8     } catch (InterruptedException e) {
> L9       // Someone has sent interrupt signal, this could be because
> L10      // 1. Trigger heartbeat immediately
> L11      // 2. Shutdown has been initiated.
> L12      Thread.currentThread().interrupt();
> L13    } catch (Exception e) {
> L14      LOG.error("Unable to finish the execution.", e);
> L15    }
> L16
> L17    now = Time.monotonicNow();
> L18    if (now < nextHB.get()) {
> L19      if (!Thread.interrupted()) {
> L20        try {
> L21          Thread.sleep(nextHB.get() - now);
> L22        } catch (InterruptedException e) {
> L23          // triggerHeartbeat is called during the sleep
> L24          Thread.currentThread().interrupt();
> L25        }
> L26      }
> L27    }
> {code}
> The redundant loop iteration happens as follows:
>  # triggerHeartbeat() is called while stateMachineThread is sleeping at L21.
>  # The InterruptedException is caught at L22; throwing it resets the
> thread's "interrupted" status to false.
>  # L24 sets the "interrupted" status back to true.
>  # Back at the top of the while loop, in the try block at L2, the call at L6
> is interrupted immediately because the status is true. Control goes to L8,
> and L12 sets the "interrupted" status to true again.
>  # At L19, Thread.interrupted() returns true, so the sleep is skipped and the
> loop moves on to its next iteration. That call also resets the "interrupted"
> status to false.
>  # In the try block at L2, the "interrupted" status is now false, so the
> heartbeat is finally executed.
> The problem is step 3: there is no need to set the "interrupted" status back
> to true at L24. Without it, the next loop iteration executes the heartbeat
> directly.
>  
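
The steps above can be reproduced in isolation with a small sketch. It is not the Ozone code: the class and method names are hypothetical, and Thread.sleep stands in for the blocking calls at L6 and L21, on the assumption that context.execute() reacts to a pending interrupt the same way. It counts how many passes of the loop are needed before the heartbeat body runs, with and without the re-interrupt at L24.

```java
// Illustrative sketch only, not the Ozone source. Thread.sleep stands in for
// the blocking calls at L6 and L21 (an assumption made for this example).
class HeartbeatLoopSketch {

    /**
     * Returns how many passes of the while loop are needed before the
     * heartbeat body runs, depending on whether the catch block at L22
     * re-asserts the interrupt (the L24 behaviour).
     */
    static int passesUntilHeartbeat(boolean reassertAtL24) {
        // Simulate triggerHeartbeat() arriving during the sleep at L21:
        // with the flag already set, Thread.sleep throws immediately.
        Thread.currentThread().interrupt();
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            // Throwing InterruptedException cleared the flag (step 2).
            if (reassertAtL24) {
                Thread.currentThread().interrupt(); // step 3
            }
        }

        int passes = 0;
        while (true) {
            passes++;
            try {
                Thread.sleep(1);   // stand-in for context.execute(...) at L6
                return passes;     // the heartbeat ran on this pass
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // L12
            }
            // L19: interrupted() reads and clears the flag; the real loop
            // sleeps here only when it returns false.
            Thread.interrupted();
        }
    }

    public static void main(String[] args) {
        System.out.println("with L24 re-interrupt:    " + passesUntilHeartbeat(true));
        System.out.println("without L24 re-interrupt: " + passesUntilHeartbeat(false));
    }
}
```

Run as is, the sketch reports 2 passes with the re-interrupt and 1 without, which matches the extra iteration described in steps 4 and 5.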



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
