[
https://issues.apache.org/jira/browse/HDDS-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-8062:
---------------------------------
Labels: pull-request-available (was: )
> Persist reason for container replica being marked unhealthy
> -----------------------------------------------------------
>
> Key: HDDS-8062
> URL: https://issues.apache.org/jira/browse/HDDS-8062
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ethan Rose
> Assignee: Ethan Rose
> Priority: Major
> Labels: pull-request-available
> Attachments: container_log_v1.pdf
>
>
> Once a container replica is marked unhealthy by the scanner, it would be
> helpful for debugging to persist the reason it was marked unhealthy. Entries
> written only to the main datanode log eventually roll off, and finding them
> requires extra filtering to work out what happened.
> Reasons for marking unhealthy include:
> * Corrupted block (and which block was corrupted)
> * Corrupted container metadata file
> * Volume failure
> Some options for persisting the information are:
> * Into the .container file itself.
> ** May not work if the container file is corrupted.
> * To the datanode audit log
> ** Would get mixed up with client operations like put block.
> * To a different file within the container
> ** This could be used to track the entire lifecycle of the container, like
> when it was created, closed, replicated, and marked unhealthy.
> * To a dedicated log4j logger that can be configured to go to a different
> file.
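
As a rough illustration of the "different file within the container" option, the sketch below appends one line per event to a per-container log file. Everything here is hypothetical and not taken from the Ozone code base: the class name ContainerEventLog, the UnhealthyReason enum (modelled on the three reasons listed above), the pipe-separated line format, and the temp-file location are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.List;

// Hypothetical sketch: not an Ozone class.
public class ContainerEventLog {

  // Assumed reason codes, mirroring the reasons listed in the issue.
  enum UnhealthyReason { CORRUPT_BLOCK, CORRUPT_CONTAINER_FILE, VOLUME_FAILURE }

  // Builds one log line; the detail (for example, which block) is optional.
  static String formatEntry(Instant time, long containerId,
      UnhealthyReason reason, String detail) {
    String suffix = (detail == null || detail.isEmpty()) ? "" : " | " + detail;
    return time + " | ID=" + containerId + " | UNHEALTHY | " + reason + suffix;
  }

  // Appends the line to the file, creating the file if it does not exist.
  static void append(Path logFile, String entry) throws IOException {
    Files.write(logFile, List.of(entry), StandardCharsets.UTF_8,
        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
  }

  public static void main(String[] args) throws IOException {
    // A temp file stands in for a log file kept alongside the container.
    Path log = Files.createTempFile("container-1-", ".log");
    append(log, formatEntry(Instant.now(), 1L,
        UnhealthyReason.CORRUPT_BLOCK, "block=42"));
    System.out.println(Files.readAllLines(log));
  }
}
```

Because the file is only ever appended to, the same mechanism could also record lifecycle events such as create, close, and replicate, which is the advantage the issue notes for this option.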