[jira] [Created] (HDDS-9823) Pipeline failure should trigger heartbeat immediately

Ivan Andika (Jira) Mon, 04 Dec 2023 00:28:12 -0800

Ivan Andika created HDDS-9823:
---------------------------------

             Summary: Pipeline failure should trigger heartbeat immediately
                 Key: HDDS-9823
                 URL: https://issues.apache.org/jira/browse/HDDS-9823
             Project: Apache Ozone
          Issue Type: Improvement
          Components: Ozone Datanode, SCM
            Reporter: Ivan Andika
            Assignee: Ivan Andika



XceiverServerRatis#handlePipelineFailure is called in the following 
ContainerStateMachine (CSM) failure scenarios:
 * XceiverServerRatis#handleNodeSlowness
 ** From StateMachine#notifyFollowerSlowness
 * XceiverServerRatis#handleNoLeader
 ** From StateMachine#notifyExtendedNoLeader
 * XceiverServerRatis#handleInstallSnapshotFromLeader
 ** From StateMachine#notifyInstallSnapshotFromLeader

The possible issue is that XceiverServerRatis#handlePipelineFailure does not 
trigger a heartbeat to SCM immediately. Instead, the pipeline close action is 
only queued and is sent with the next scheduled heartbeat (default 60s). 
During this interval, SCM might still allocate blocks to these "failed" 
pipelines, which can affect clients writing to those blocks.

To minimize the impact on clients, I suggest triggering a heartbeat 
immediately whenever a pipeline close action is queued.
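
A minimal sketch of the idea, not the actual Ozone code: HeartbeatSender, 
RatisServer, addCloseActionIfAbsent and the triggerImmediately flag are 
hypothetical stand-ins used only to contrast the current behavior (the close 
action waits for the next scheduled heartbeat) with the proposed one (the 
heartbeat is sent as soon as the action is queued).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PipelineFailureHeartbeatSketch {

  /** Hypothetical stand-in for the datanode's heartbeat path to SCM. */
  static class HeartbeatSender {
    // Close actions waiting to be reported, keyed by pipeline ID.
    private final Map<String, String> pendingCloseActions = new LinkedHashMap<>();
    // Every heartbeat "sent" so far, with the actions it carried.
    private final List<Map<String, String>> sentHeartbeats = new ArrayList<>();

    /** Queues a close action unless one is already pending for the pipeline. */
    synchronized void addCloseActionIfAbsent(String pipelineId, String reason) {
      pendingCloseActions.putIfAbsent(pipelineId, reason);
    }

    /** Sends all pending actions; called by the periodic timer or on demand. */
    synchronized void sendHeartbeat() {
      sentHeartbeats.add(new LinkedHashMap<>(pendingCloseActions));
      pendingCloseActions.clear();
    }

    synchronized int heartbeatsSent() {
      return sentHeartbeats.size();
    }

    synchronized int pendingCount() {
      return pendingCloseActions.size();
    }
  }

  /** Hypothetical stand-in for XceiverServerRatis. */
  static class RatisServer {
    private final HeartbeatSender sender;
    private final boolean triggerImmediately;

    RatisServer(HeartbeatSender sender, boolean triggerImmediately) {
      this.sender = sender;
      this.triggerImmediately = triggerImmediately;
    }

    void handlePipelineFailure(String pipelineId, String reason) {
      sender.addCloseActionIfAbsent(pipelineId, reason);
      if (triggerImmediately) {
        // Proposed change: do not wait for the next scheduled heartbeat.
        sender.sendHeartbeat();
      }
    }
  }

  public static void main(String[] args) {
    HeartbeatSender current = new HeartbeatSender();
    new RatisServer(current, false).handlePipelineFailure("pipeline-1", "no leader");
    System.out.println("current:  sent=" + current.heartbeatsSent()
        + " pending=" + current.pendingCount());

    HeartbeatSender proposed = new HeartbeatSender();
    new RatisServer(proposed, true).handlePipelineFailure("pipeline-1", "no leader");
    System.out.println("proposed: sent=" + proposed.heartbeatsSent()
        + " pending=" + proposed.pendingCount());
  }
}
```

In the sketch, the close action in the "current" case stays pending until 
something else calls sendHeartbeat(), which models the window in which SCM 
could still allocate blocks to the failed pipeline.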

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

