[ 
https://issues.apache.org/jira/browse/HDDS-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandakumar updated HDDS-11693:
------------------------------
    Description: 
This parent Jira tracks the SCM Safemode improvements.

Currently, Safemode Rule validation is driven by the processing of reports 
sent by Datanodes. This is unnecessary and introduces additional complexity, 
because every rule maintains the same information that the NodeManager, 
PipelineManager and ContainerManager already maintain.

Since the information about Datanodes/Pipelines/Containers is maintained in 
two different places, the copies can go out of sync and introduce bugs into 
the safemode logic (e.g. HDDS-5263).

We will move to a simpler model where the SafemodeManager does not care about 
report processing (Safemode Rules will not process any Datanode Reports) and 
instead checks NodeManager, PipelineManager and ContainerManager to validate 
the exit rule.

With this change, the Safemode logic will become simpler and easier to follow.
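
The proposed model could be sketched roughly as below. This is only an 
illustration of the idea, not the actual Ozone code: the NodeView, 
PipelineView and ContainerView interfaces, the SafemodeExitCheck class, the 
method names and the threshold values are all hypothetical stand-ins for 
whatever the real NodeManager, PipelineManager and ContainerManager expose. 
The point it shows is that a rule keeps no state of its own and processes no 
Datanode report; it only queries the managers when asked.

```java
import java.util.List;

/** Hypothetical read-only view of what NodeManager already tracks. */
interface NodeView {
  int healthyNodeCount();
}

/** Hypothetical read-only view of what PipelineManager already tracks. */
interface PipelineView {
  int openPipelineCount();
}

/** Hypothetical read-only view of what ContainerManager already tracks. */
interface ContainerView {
  int totalContainers();
  int containersWithReplica();
}

/** A stateless exit rule: it asks a manager, it never processes a report. */
interface ExitRule {
  boolean validate();
}

/** Hypothetical stand-in for the exit check done by the SafemodeManager. */
final class SafemodeExitCheck {
  private final List<ExitRule> rules;

  SafemodeExitCheck(List<ExitRule> rules) {
    this.rules = List.copyOf(rules);
  }

  /** Safemode can be exited only when every rule currently validates. */
  boolean canExit() {
    for (ExitRule rule : rules) {
      if (!rule.validate()) {
        return false;
      }
    }
    return true;
  }

  // The thresholds passed to these factories are illustrative assumptions.

  static ExitRule minHealthyNodes(NodeView nodes, int required) {
    return () -> nodes.healthyNodeCount() >= required;
  }

  static ExitRule minOpenPipelines(PipelineView pipelines, int required) {
    return () -> pipelines.openPipelineCount() >= required;
  }

  static ExitRule containerFraction(ContainerView containers, double cutoff) {
    return () -> containers.containersWithReplica()
        >= cutoff * containers.totalContainers();
  }
}
```

Because each rule reads the managers directly at validation time, there is no 
second copy of the Datanode/Pipeline/Container state that could drift out of 
sync with them.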

  was:
This parent Jira tracks the SCM Safemode improvements.

Currently, the Safemode Rule validation is driven by processing of reports sent 
by Datanodes, this is unnecessary and this also introduces additional 
complexity as every rule is maintaining the same information that the 
NodeManager, PipelineManager & the ContainerManager are maintaining.

Since the information about Datanodes/Pipelines/Containers are maintained at 
two different places, this can go out of sync and introduce bugs into safemode 
logic. (eg: HDDS-5263)

We will move to a simpler model where we SafemodeManager will not care about 
report processing (Safemode Rules will not process any Datanode Reports) but 
check NodeManager, PipelineManager and ContainerManager to validate the exit 
rule.

With this change the Safemode logic will become simple and easy to follow.


> SCM Safemode Improvements
> -------------------------
>
>                 Key: HDDS-11693
>                 URL: https://issues.apache.org/jira/browse/HDDS-11693
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: SCM
>            Reporter: Nandakumar
>            Assignee: Nandakumar
>            Priority: Major


