Ivan Andika created HDDS-9823:
---------------------------------

             Summary: Pipeline failure should trigger heartbeat immediately
                 Key: HDDS-9823
                 URL: https://issues.apache.org/jira/browse/HDDS-9823
             Project: Apache Ozone
          Issue Type: Improvement
          Components: Ozone Datanode, SCM
            Reporter: Ivan Andika
            Assignee: Ivan Andika


XceiverServerRatis#handlePipelineFailure is called in the following CSM
(ContainerStateMachine) failure scenarios:
 * XceiverServerRatis#handleNodeSlowness
 ** From StateMachine#notifyFollowerSlowness
 * XceiverServerRatis#handleNoLeader
 ** From StateMachine#notifyExtendedNoLeader
 * XceiverServerRatis#handleInstallSnapshotFromLeader
 ** From StateMachine#notifyInstallSnapshotFromLeader

The possible issue is that XceiverServerRatis#handlePipelineFailure does not 
trigger a heartbeat to SCM immediately. Instead, the pipeline close action 
is only sent with the next scheduled heartbeat (default 60s). Until then, SCM 
might still allocate blocks to these "failed" pipelines, which might impact 
clients writing to these blocks.

To minimize the impact on clients, I suggest triggering a heartbeat 
immediately for every pipeline close action.
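
The proposal can be illustrated with a small, self-contained Java model. It
is only a sketch of the idea, not the Ozone implementation: HeartbeatModel,
addPipelineCloseAction, sendHeartbeat and the triggerImmediately flag are
hypothetical names invented for illustration, and the real datanode code
paths may look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative model only; none of these types are Ozone classes. */
public class PipelineFailureHeartbeatSketch {

  /** Hypothetical stand-in for the datanode's pending actions and heartbeat sender. */
  static class HeartbeatModel {
    // Close actions queued for the next heartbeat (deduplicated by pipeline ID).
    private final Set<String> pendingCloseActions = new LinkedHashSet<>();
    // Each entry records the close actions carried by one heartbeat.
    private final List<List<String>> sentHeartbeats = new ArrayList<>();

    void addPipelineCloseAction(String pipelineId) {
      pendingCloseActions.add(pipelineId);
    }

    /** Sends one heartbeat that carries, and then clears, all pending actions. */
    void sendHeartbeat() {
      sentHeartbeats.add(new ArrayList<>(pendingCloseActions));
      pendingCloseActions.clear();
    }

    int sentCount() {
      return sentHeartbeats.size();
    }

    int pendingCount() {
      return pendingCloseActions.size();
    }

    List<String> lastHeartbeat() {
      return sentHeartbeats.get(sentHeartbeats.size() - 1);
    }
  }

  /**
   * Hypothetical failure handler. With triggerImmediately == false the close
   * action waits for the next scheduled heartbeat (the behavior described in
   * this issue); with true it is sent right away (the proposed behavior).
   */
  static void handlePipelineFailure(HeartbeatModel hb, String pipelineId,
      boolean triggerImmediately) {
    hb.addPipelineCloseAction(pipelineId);
    if (triggerImmediately) {
      hb.sendHeartbeat();
    }
  }

  public static void main(String[] args) {
    HeartbeatModel hb = new HeartbeatModel();

    handlePipelineFailure(hb, "pipeline-1", false);
    System.out.println("Deferred: heartbeats sent = " + hb.sentCount()
        + ", pending actions = " + hb.pendingCount());

    handlePipelineFailure(hb, "pipeline-2", true);
    System.out.println("Immediate: heartbeats sent = " + hb.sentCount()
        + ", last heartbeat carried " + hb.lastHeartbeat());
  }
}
```

In this model the immediate heartbeat also flushes any close action that was
still waiting, so SCM would learn about every failed pipeline as soon as the
next failure is reported rather than after the full heartbeat interval.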

 


