[PR] HDDS-9823. Pipeline failure should trigger heartbeat immediately [ozone]

via GitHub Mon, 04 Dec 2023 06:15:22 -0800


ivandika3 opened a new pull request, #5725:
URL: https://github.com/apache/ozone/pull/5725


   ## What changes were proposed in this pull request?
   
   XceiverServerRatis#handlePipelineFailure is called in the following CSM failure scenarios:
   
   - XceiverServerRatis#handleNodeSlowness
      - From StateMachine#notifyFollowerSlowness 
      - Timeout is set by hdds.ratis.rpc.slowness.timeout (default value 300s)
         - Note: Ratis default value is 60s
   - XceiverServerRatis#handleNoLeader
      - From StateMachine#notifyExtendedNoLeader
      - Timeout is set by hdds.ratis.notification.no-leader.timeout (default value 300s)
         - Note: Ratis default value is 60s
   - XceiverServerRatis#handleInstallSnapshotFromLeader
      - From StateMachine#notifyInstallSnapshotFromLeader
   
   Currently, XceiverServerRatis#handlePipelineFailure does not trigger a 
heartbeat to SCM immediately. Instead, the pipeline close action is only sent 
with the next heartbeat (default 60s). Until then, SCM might still allocate 
blocks to these "failed" pipelines, which can affect clients writing to those 
blocks.
   
   To minimize the impact on the client and on the datanodes in the failed 
pipeline, I suggest that the datanode send the pipeline close action 
immediately, by triggering a heartbeat, whenever the close action is raised 
due to a pipeline failure.
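
The idea above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Ozone code: the class `HeartbeatSketch`, the method `awaitAndSendHeartbeat`, and the `"CLOSE <pipelineId>"` strings are hypothetical stand-ins. The sketch only assumes that heartbeats run on a fixed interval and that a failure handler can wake the heartbeat loop early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of an interruptible heartbeat wait; not Ozone's classes. */
class HeartbeatSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition wake = lock.newCondition();
  // Actions waiting to be carried by the next heartbeat.
  private final Queue<String> pendingActions = new ConcurrentLinkedQueue<>();
  private final long intervalMillis;
  private boolean triggered = false;

  HeartbeatSketch(long intervalMillis) {
    this.intervalMillis = intervalMillis;
  }

  /** Queue a close action for the failed pipeline and wake the heartbeat loop. */
  void handlePipelineFailure(String pipelineId) {
    pendingActions.add("CLOSE " + pipelineId);
    triggerHeartbeat();
  }

  /** Ask for a heartbeat now instead of at the end of the interval. */
  void triggerHeartbeat() {
    lock.lock();
    try {
      triggered = true;
      wake.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /**
   * One iteration of the heartbeat loop: wait for the interval to elapse or
   * for an explicit trigger, then drain and return the queued actions.
   */
  List<String> awaitAndSendHeartbeat() {
    lock.lock();
    try {
      long remainingNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
      while (!triggered && remainingNanos > 0) {
        remainingNanos = wake.awaitNanos(remainingNanos);
      }
      triggered = false;
    } catch (InterruptedException e) {
      // Preserve the interrupt status and send whatever is queued.
      Thread.currentThread().interrupt();
    } finally {
      lock.unlock();
    }
    List<String> batch = new ArrayList<>();
    String action;
    while ((action = pendingActions.poll()) != null) {
      batch.add(action);
    }
    return batch;
  }
}
```

With an interval of 60s, a call to `handlePipelineFailure("pipeline-1")` makes the next `awaitAndSendHeartbeat()` return right away with the close action, rather than after the full interval. Without a trigger, the loop still sends a (possibly empty) heartbeat once the interval elapses.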
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-9823
   
   ## How was this patch tested?
   
   Existing tests.
   
   Clean CI run: https://github.com/ivandika3/ozone/actions/runs/7084351468


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

