Ivan Andika created HDDS-9823:
---------------------------------
Summary: Pipeline failure should trigger heartbeat immediately
Key: HDDS-9823
URL: https://issues.apache.org/jira/browse/HDDS-9823
Project: Apache Ozone
Issue Type: Improvement
Components: Ozone Datanode, SCM
Reporter: Ivan Andika
Assignee: Ivan Andika
XceiverServerRatis#handlePipelineFailure is called in the following ContainerStateMachine (CSM) failure scenarios:
* XceiverServerRatis#handleNodeSlowness
** From StateMachine#notifyFollowerSlowness
* XceiverServerRatis#handleNoLeader
** From StateMachine#notifyExtendedNoLeader
* XceiverServerRatis#handleInstallSnapshotFromLeader
** From StateMachine#notifyInstallSnapshotFromLeader
The possible issue is that XceiverServerRatis#handlePipelineFailure does not
trigger a heartbeat to SCM immediately. Instead, the pipeline close action is
sent only with the next regular heartbeat (default 60s). During this window,
SCM might still allocate blocks on these "failed" pipelines, which might
affect clients writing to those blocks.
To minimize the impact on clients, I suggest triggering a heartbeat
immediately for every pipeline close action.
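
To illustrate the proposal, here is a minimal, self-contained sketch of a heartbeat loop that can be woken early when a close action is queued. It is not the Ozone implementation: the class HeartbeatSketch, its methods, and the "CLOSE <pipelineId>" string are hypothetical stand-ins, and the list of sent batches stands in for the heartbeat RPC to SCM. Only the 60s interval is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of a datanode heartbeat loop; these are not Ozone's classes. */
public class HeartbeatSketch implements Runnable {
    private final Object lock = new Object();
    private final List<String> pendingActions = new ArrayList<>(); // guarded by lock
    private final List<List<String>> sent = new CopyOnWriteArrayList<>();
    private final long intervalNanos;
    private boolean wakeRequested;  // guarded by lock
    private boolean running = true; // guarded by lock

    public HeartbeatSketch(long intervalMillis) {
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
    }

    /** Queues a close action and wakes the loop rather than waiting for the next tick. */
    public void handlePipelineFailure(String pipelineId) {
        synchronized (lock) {
            pendingActions.add("CLOSE " + pipelineId);
            wakeRequested = true;
            lock.notifyAll();
        }
    }

    public void stop() {
        synchronized (lock) {
            running = false;
            lock.notifyAll();
        }
    }

    /** Sends a heartbeat every interval, or earlier when a wake-up was requested. */
    @Override
    public void run() {
        try {
            while (true) {
                List<String> batch;
                synchronized (lock) {
                    long deadline = System.nanoTime() + intervalNanos;
                    long remaining = intervalNanos;
                    while (running && !wakeRequested && remaining > 0) {
                        TimeUnit.NANOSECONDS.timedWait(lock, remaining);
                        remaining = deadline - System.nanoTime();
                    }
                    if (!running) {
                        return;
                    }
                    wakeRequested = false;
                    batch = new ArrayList<>(pendingActions);
                    pendingActions.clear();
                }
                sent.add(batch); // stands in for the heartbeat RPC to SCM
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs the loop with a 60s interval and returns what the first heartbeat carried. */
    public static List<String> demo() {
        HeartbeatSketch hb = new HeartbeatSketch(60_000);
        Thread loop = new Thread(hb, "heartbeat");
        loop.setDaemon(true);
        loop.start();
        hb.handlePipelineFailure("pipeline-1");
        try {
            // Give up after 5s; without the wake-up nothing would be sent for 60s.
            long giveUp = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (hb.sent.isEmpty() && System.nanoTime() < giveUp) {
                Thread.sleep(10);
            }
            hb.stop();
            loop.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return hb.sent.isEmpty() ? new ArrayList<>() : hb.sent.get(0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the wake-up flag is set under the same lock the loop waits on, so a failure reported just before the loop starts waiting is not lost. Whether the real datanode should also coalesce several failures into one early heartbeat is a design question this sketch does not answer.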