Re: [PR] HDDS-12156. Add container health task metrics in Recon. [ozone]

via GitHub Tue, 18 Feb 2025 06:55:34 -0800


devmadhuu commented on code in PR #7786:
URL: https://github.com/apache/ozone/pull/7786#discussion_r1959912326



##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/fsck/ContainerHealthTask.java:
##########
@@ -192,30 +198,58 @@ private void checkAndProcessContainers(
   }
 
   private void logUnhealthyContainerStats(
-      Map<UnHealthyContainerStates, Map<String, Long>>
-          unhealthyContainerStateStatsMap) {
-    unhealthyContainerStateStatsMapForTesting =
-        new HashMap<>(unhealthyContainerStateStatsMap);
+      Map<UnHealthyContainerStates, Map<String, Long>> unhealthyContainerStateStatsMap) {
+
+    unhealthyContainerStateStatsMapForTesting = new HashMap<>(unhealthyContainerStateStatsMap);
+
     // If any EMPTY_MISSING containers, then it is possible that such
     // containers got stuck in the closing state which never got
     // any replicas created on the datanodes. In this case, we log it as
     // EMPTY_MISSING in unhealthy container statistics but do not add it to the table.
-    unhealthyContainerStateStatsMap.entrySet().forEach(stateEntry -> {
-      UnHealthyContainerStates unhealthyContainerState = stateEntry.getKey();
-      Map<String, Long> containerStateStatsMap = stateEntry.getValue();
-      StringBuilder logMsgBuilder =
-          new StringBuilder(unhealthyContainerState.toString());
-      logMsgBuilder.append(" **Container State Stats:** \n\t");
-      containerStateStatsMap.entrySet().forEach(statsEntry -> {
-        logMsgBuilder.append(statsEntry.getKey());
-        logMsgBuilder.append(" -> ");
-        logMsgBuilder.append(statsEntry.getValue());
-        logMsgBuilder.append(" , ");
-      });
-      LOG.info(logMsgBuilder.toString());
+    unhealthyContainerStateStatsMap.forEach((unhealthyContainerState, containerStateStatsMap) -> {
+      // Reset metrics to zero if the map is empty for MISSING or UNDER_REPLICATED

Review Comment:
   Good question, @devabhishekpal. The idea behind adding these two metrics is
to integrate them with an alerting mechanism in the future: these two UNHEALTHY
states are the worrisome ones and need attention. I think we can also add one
for `mis-replication`. `over-replication` seems less troublesome, since the
cluster just has extra replicas and the Replication Manager will balance them
out over time.
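
To make the "reset to zero" behaviour concrete, here is a minimal, self-contained sketch of what I have in mind. It is not the code in this PR: the class `ContainerHealthGaugesSketch`, the nested `State` enum, and the `"COUNT"` stats key are hypothetical stand-ins, and plain `AtomicLong` values stand in for the real metrics gauges. The point is only that an alert-facing gauge must drop back to 0 when a state has no entry in the current run, otherwise a stale value would keep the alert firing.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: names and the "COUNT" key are assumptions, not Recon's API.
class ContainerHealthGaugesSketch {

  // Stand-in for the unhealthy container states discussed in this thread.
  enum State { MISSING, UNDER_REPLICATED, MIS_REPLICATED, OVER_REPLICATED }

  // Only the states we want to alert on get a gauge.
  private final Map<State, AtomicLong> gauges = new EnumMap<>(State.class);

  ContainerHealthGaugesSketch() {
    gauges.put(State.MISSING, new AtomicLong());
    gauges.put(State.UNDER_REPLICATED, new AtomicLong());
  }

  // Overwrite every tracked gauge on each run; a state that is absent from
  // the stats map (or has no count) is reset to zero instead of left stale.
  void update(Map<State, Map<String, Long>> statsByState) {
    gauges.forEach((state, gauge) -> {
      Map<String, Long> stats = statsByState.get(state);
      long count = (stats == null) ? 0L : stats.getOrDefault("COUNT", 0L);
      gauge.set(count);
    });
  }

  // Returns the current gauge value, or 0 for states that are not tracked.
  long get(State state) {
    AtomicLong gauge = gauges.get(state);
    return gauge == null ? 0L : gauge.get();
  }
}
```

Adding `mis-replication` later would then just be one more `gauges.put(...)` entry in the constructor.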



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]