ArafatKhan2198 commented on code in PR #10565:
URL: https://github.com/apache/ozone/pull/10565#discussion_r3465223514


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java:
##########
@@ -391,7 +391,18 @@ public void markContainerUnhealthy() throws 
StorageContainerException {
     writeLock();
     final State prevState = containerData.getState();
     try {
-      updateContainerState(UNHEALTHY);
+      if (!getContainerFile().getParentFile().exists()) {

Review Comment:
   PR description contradicts the implementation. The "What changes were 
proposed" text says the fix calls mkdirs() to recreate the metadata directory 
and persist the .container file so the UNHEALTHY state survives a restart. The 
actual diff does the opposite: it skips the file write and explicitly does not 
recreate the directory or persist.



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainer.java:
##########
@@ -759,6 +760,35 @@ public void testReportOfUnhealthyContainer(
     assertNotNull(keyValueContainer.getContainerReport());
   }
 
+  /**

Review Comment:
   Can we test this out locally as well? by manually deleting the metadata 
folder?



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java:
##########


Review Comment:
   > // Attempting to write the .container file would fail, and writing a 
partial file // with only the state field is more harmful than not writing one 
at all.
   > 
   
   `ContainerDataYaml.createContainerFile()` serializes the **entire** 
in-memory `containerData` (all fields loaded at startup), not "only the state 
field." So the "partial file with only the state field" rationale doesn't hold.



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueContainer.java:
##########
@@ -759,6 +760,35 @@ public void testReportOfUnhealthyContainer(
     assertNotNull(keyValueContainer.getContainerReport());
   }
 
+  /**
+   * When a container's metadata directory is missing (MISSING_METADATA_DIR 
detected by the scanner),
+   * markContainerUnhealthy must succeed without throwing. Writing a partial 
.container file with only
+   * the state field would lose other metadata and is more harmful than 
writing nothing. The in-memory
+   * UNHEALTHY state is sufficient for SCM to receive it via ICR and schedule 
deletion.
+   */
+  @ContainerTestVersionInfo.ContainerTest
+  public void testMarkUnhealthyWithMissingMetadataDir(ContainerTestVersionInfo 
versionInfo) throws Exception {
+    init(versionInfo);
+    keyValueContainer.create(volumeSet, volumeChoosingPolicy, scmId);
+
+    // Simulate MISSING_METADATA_DIR using the same corruption helper used in 
scanner tests.
+    File metadataDir = new File(keyValueContainerData.getMetadataPath());
+    assertTrue(metadataDir.exists(), "Metadata dir should exist before 
corruption");
+    MISSING_METADATA_DIR.applyTo(keyValueContainer);
+
+    // markContainerUnhealthy must not throw even though the metadata dir is 
absent.
+    keyValueContainer.markContainerUnhealthy();
+
+    // In-memory state must be UNHEALTHY.
+    assertEquals(ContainerProtos.ContainerDataProto.State.UNHEALTHY,
+        keyValueContainer.getContainerState());
+
+    // The metadata directory must NOT be recreated; no partial .container 
file should be written.
+    assertFalse(metadataDir.exists(), "Metadata dir should not be recreated by 
markContainerUnhealthy");

Review Comment:
   `assertFalse(keyValueContainer.getContainerFile().exists())` (`:788`) is 
trivially true regardless of the fix - the file can't exist when its parent dir 
was just deleted, and the *old* buggy code also never created it (it threw). 
The assertion that carries the fix is "does not throw" (`:780`) plus the state 
check (`:783`). The file/dir assertions mainly serve as a regression guard 
against a future `mkdirs`/persist approach - fine to keep, but they don't prove 
the bug is fixed on their own.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to