[
https://issues.apache.org/jira/browse/HDDS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-14084:
----------------------------------
Labels: pull-request-available (was: )
> Highlight Data Nodes with Insufficient Write Space
> --------------------------------------------------
>
> Key: HDDS-14084
> URL: https://issues.apache.org/jira/browse/HDDS-14084
> Project: Apache Ozone
> Issue Type: Improvement
> Affects Versions: 2.0.0, 1.4.1, 2.1.0
> Reporter: Jason O'Sullivan
> Assignee: Jason O'Sullivan
> Priority: Minor
> Labels: pull-request-available
> Fix For: 2.2.0
>
>
> The *Data Node selection process* for forming new *Ratis pipelines* currently
> *silently ignores* nodes that do not meet the minimum space requirements for
> either metadata or data writes. This behaviour makes it difficult for
> operators to quickly identify and troubleshoot Data Nodes that are failing to
> join pipelines.
> The following properties are used by the *Storage Container Manager (SCM)*
> during node selection to enforce minimum space requirements:
> * *Metadata Write Space:*
> ** {{ozone.scm.datanode.ratis.volume.free-space.min}} (Minimum free space
> required on the Data Node's disk volume for *Ratis metadata*.)
> * *Data Write Space:*
> ** {{ozone.scm.container.size}} (The target size of a container, which acts
> as the effective *minimum free space* required to allocate a new container
> for data.)
> Currently, the reason for ignoring a Data Node is only recorded at the
> *DEBUG* log level.
> The proposal is to raise the log level for these specific space-check
> failures to *INFO* or *WARN* within the SCM node selection logic. This change
> will ensure that Data Nodes with insufficient write space (for either
> metadata or data) are *highlighted sooner* in the logs, providing immediate
> visibility to operators without requiring increased log verbosity for the
> entire system.
> *Note:* While this change offers better visibility, it may increase log
> output under heavy load or when a large number of Data Nodes are
> consistently low on space.
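
For illustration only, the proposed behaviour could be sketched roughly as below. This is not the SCM code: the class `SpaceCheckSketch`, its `Node` type, and the methods `hasEnoughSpace` and `eligible` are hypothetical names, the thresholds are passed in as plain byte counts instead of being read from {{ozone.scm.datanode.ratis.volume.free-space.min}} and {{ozone.scm.container.size}}, and `java.util.logging` stands in for whatever logging framework SCM actually uses. The sketch assumes both checks are evaluated so that every failing reason is reported at WARN.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

public class SpaceCheckSketch {
    private static final Logger LOG =
        Logger.getLogger(SpaceCheckSketch.class.getName());

    /** Hypothetical snapshot of one Data Node's free space, in bytes. */
    public static final class Node {
        public final String host;
        public final long metadataFree;
        public final long dataFree;

        public Node(String host, long metadataFree, long dataFree) {
            this.host = host;
            this.metadataFree = metadataFree;
            this.dataFree = dataFree;
        }
    }

    /**
     * Returns true only if the node passes both space checks.
     * Each failing check is logged at WARN (assumed level) so the
     * reason is visible without enabling DEBUG logging.
     */
    public static boolean hasEnoughSpace(Node n, long minMetadataFree,
                                         long containerSize) {
        boolean ok = true;
        if (n.metadataFree < minMetadataFree) {
            LOG.warning(String.format(
                "Skipping datanode %s: metadata free space %d B < required %d B",
                n.host, n.metadataFree, minMetadataFree));
            ok = false;
        }
        if (n.dataFree < containerSize) {
            LOG.warning(String.format(
                "Skipping datanode %s: data free space %d B < container size %d B",
                n.host, n.dataFree, containerSize));
            ok = false;
        }
        return ok;
    }

    /** Filters the candidate list down to nodes that pass both checks. */
    public static List<Node> eligible(List<Node> nodes, long minMetadataFree,
                                      long containerSize) {
        List<Node> out = new ArrayList<>();
        for (Node n : nodes) {
            if (hasEnoughSpace(n, minMetadataFree, containerSize)) {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // Illustrative thresholds only; not taken from any real cluster.
        List<Node> nodes = Arrays.asList(
            new Node("dn1", 2 * gib, 10 * gib),
            new Node("dn2", gib / 2, 10 * gib),   // low metadata space
            new Node("dn3", 2 * gib, 3 * gib));   // low data space
        for (Node n : eligible(nodes, gib, 5 * gib)) {
            System.out.println("eligible: " + n.host);
        }
    }
}
```

In this sketch a node that fails both checks produces two WARN lines per selection attempt, which is the kind of log growth the note above cautions about.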
--
This message was sent by Atlassian Jira
(v8.20.10#820010)