[
https://issues.apache.org/jira/browse/HDDS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-7097:
---------------------------------
Labels: pull-request-available (was: )
> Container scanner log output lacks useful information
> -----------------------------------------------------
>
> Key: HDDS-7097
> URL: https://issues.apache.org/jira/browse/HDDS-7097
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ethan Rose
> Assignee: Dave Teng
> Priority: Major
> Labels: pull-request-available
>
> Currently, the output from the container scanner may look like this:
> {code}
> 2022-08-04 14:16:37,702 WARN
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer: Moving
> container
> /hadoop-ozone/datanode/data/hdds/CID-5612c780-06f8-4ac5-9eae-498159abd009/current/containerDir1/1008
> to state UNHEALTHY from state:UNHEALTHY
> Trace:java.base/java.lang.Thread.getStackTrace(Thread.java:1606)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.markContainerUnhealthy(KeyValueContainer.java:335)
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.markContainerUnhealthy(KeyValueHandler.java:1017)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.markContainerUnhealthy(ContainerController.java:116)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.runIteration(ContainerDataScanner.java:108)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.run(ContainerDataScanner.java:81)
> ...
> 2022-08-04 14:30:19,407 ERROR
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerCheck: Corruption
> detected in container: [2] Exception: [null]
> {code}
> There are numerous problems with this:
> - The previous container state is not logged. The new unhealthy state is
> incorrectly logged as the previous state.
> - The exception identifying the corruption only has its message printed. The
> exception object itself should be logged to better identify the failure and
> to catch cases like the one above, where there is no exception message
> (probably caused by a bug).
> - The stack trace of the call to {{KeyValueContainer#markContainerUnhealthy}}
> is logged, which is both verbose and not useful.
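
The three points above can be illustrated with a minimal sketch. It is not taken from the Ozone codebase: the class ContainerLogSketch, its State enum, and its method names are hypothetical, and it uses java.util.logging only so that it runs without dependencies. It shows the previous state being captured before it is overwritten, the exception object being handed to the logger, and no stack trace of the caller being added to the state-change message.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a container; not Ozone's KeyValueContainer.
class ContainerLogSketch {
    // Simplified state set, assumed for illustration only.
    public enum State { OPEN, CLOSED, UNHEALTHY }

    private static final Logger LOG =
        Logger.getLogger(ContainerLogSketch.class.getName());

    private final long containerId;
    private State state;

    public ContainerLogSketch(long containerId, State initialState) {
        this.containerId = containerId;
        this.state = initialState;
    }

    public State getState() {
        return state;
    }

    // Capture the previous state before overwriting it, so the message
    // reports the real transition. No stack trace of the caller is logged.
    public String markUnhealthy() {
        State previous = state;
        state = State.UNHEALTHY;
        String message = "Moving container " + containerId
            + " to state " + state + " from state " + previous;
        LOG.warning(message);
        return message;
    }

    // Pass the Throwable itself to the logger, so its class and stack trace
    // are recorded even when getMessage() returns null.
    public void reportCorruption(Throwable cause) {
        LOG.log(Level.SEVERE,
            "Corruption detected in container " + containerId, cause);
    }

    public static void main(String[] args) {
        ContainerLogSketch container = new ContainerLogSketch(2L, State.CLOSED);
        container.markUnhealthy();
        // An exception with no message, like the "Exception: [null]" case.
        container.reportCorruption(new IOException());
    }
}
```

With the throwable passed as an argument, the log record carries the exception class and its own stack trace, which is the information missing from the "Exception: [null]" line quoted above.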