Ivan Andika created HDDS-10717:
----------------------------------
Summary: nodeFailureTimeoutMs should be initialized before syncTimeoutRetry
Key: HDDS-10717
URL: https://issues.apache.org/jira/browse/HDDS-10717
Project: Apache Ozone
Issue Type: Bug
Components: DN, Ozone Datanode
Affects Versions: 1.4.0
Reporter: Ivan Andika
Assignee: Ivan Andika
The Ratis WriteLog retry count was found to be "0/0", which means WriteLog
is not retried at all and the datanode instead triggers a pipeline failure
to close the pipeline. This might explain why the datanodes send so many
pipeline close events during periods of high IO.
The cause is that nodeFailureTimeoutMs is initialized after newRaftProperties
and setStateMachineDataConfigurations are called, so syncTimeoutRetry is
calculated before nodeFailureTimeoutMs has been set.
The ordering needs to be fixed so that syncTimeoutRetry is calculated
correctly (30 retries by default).
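
The ordering problem can be illustrated with a minimal, self-contained sketch.
This is not the Ozone code: the class names, the computeSyncTimeoutRetry
helper, and the two timeout constants are hypothetical, and the division used
to derive the retry count is an assumption chosen so that the result matches
the default of 30 mentioned above. It only shows how reading a field before it
is assigned in a constructor silently yields 0 retries.

```java
import java.util.concurrent.TimeUnit;

public class InitOrderSketch {
    // Illustrative values only (assumptions, not taken from the report).
    static final long NODE_FAILURE_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(300);
    static final long SYNC_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(10);

    /** Mirrors the reported ordering: the retry count is derived first. */
    static class BuggyServer {
        private long nodeFailureTimeoutMs; // Java default value is 0
        final int syncTimeoutRetry;

        BuggyServer() {
            // Reads nodeFailureTimeoutMs while it still holds the default 0.
            this.syncTimeoutRetry = computeSyncTimeoutRetry();
            this.nodeFailureTimeoutMs = NODE_FAILURE_TIMEOUT_MS;
        }

        private int computeSyncTimeoutRetry() {
            return (int) (nodeFailureTimeoutMs / SYNC_TIMEOUT_MS);
        }
    }

    /** Same logic with the timeout initialized before it is used. */
    static class FixedServer {
        private final long nodeFailureTimeoutMs;
        final int syncTimeoutRetry;

        FixedServer() {
            this.nodeFailureTimeoutMs = NODE_FAILURE_TIMEOUT_MS;
            this.syncTimeoutRetry = computeSyncTimeoutRetry();
        }

        private int computeSyncTimeoutRetry() {
            return (int) (nodeFailureTimeoutMs / SYNC_TIMEOUT_MS);
        }
    }

    static int buggyRetries() {
        return new BuggyServer().syncTimeoutRetry;
    }

    static int fixedRetries() {
        return new FixedServer().syncTimeoutRetry;
    }

    public static void main(String[] args) {
        System.out.println("buggy syncTimeoutRetry = " + buggyRetries()); // 0
        System.out.println("fixed syncTimeoutRetry = " + fixedRetries()); // 30
    }
}
```

Under these assumed values the first ordering produces 0 retries and the
second produces 30. The compiler does not flag the first ordering because the
field is read through a method rather than directly in the constructor.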