[ 
https://issues.apache.org/jira/browse/HDFS-16547?focusedWorklogId=765117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-765117
 ]

ASF GitHub Bot logged work on HDFS-16547:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/May/22 20:16
            Start Date: 02/May/22 20:16
    Worklog Time Spent: 10m 
      Work Description: xkrogen commented on code in PR #4201:
URL: https://github.com/apache/hadoop/pull/4201#discussion_r863153584


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:
##########
@@ -1899,6 +1899,10 @@ synchronized void transitionToStandby() throws IOException {
   synchronized void transitionToObserver() throws IOException {
     String operationName = "transitionToObserver";
     namesystem.checkSuperuserPrivilege(operationName);
+    if (namesystem.isInSafeMode()) {

Review Comment:
   I think we can guard this by `notBecomeActiveInSafemode`. Though the config 
claims to be about "active" status, the logic in `monitorHealth` just generally 
considers a standby NN as unhealthy if it's in safemode, and I think the intent 
here is the same as with that config.
   
   
   Also: `namesystem.isInSafeMode()` -> `isInSafeMode()` (`NameNode` redefines 
this method)



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSZKFailoverController.java:
##########
@@ -301,6 +304,40 @@ public void testManualFailoverWithDFSHAAdmin() throws Exception {
     waitForHAState(1, HAServiceState.STANDBY);
   }
 
+  /**
+   * Tests that a Namenode in safe mode should not be transfer to observer state.
+   */
+  @Test
+  public void testManualFailoverWithDFSHAAdminInSafemode() throws Exception {
+    startCluster();
+    NamenodeProtocols nn1 = cluster.getNameNode(1).getRpcServer();
+
+    // Enter safe mode.
+    nn1.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER, false);
+    // Test NameNodeRpcServer.
+    LambdaTestUtils.intercept(SafeModeException.class,
+        "Cannot transition to observer. Name node is in safe mode",
+        () -> nn1.transitionToObserver(
+            new StateChangeRequestInfo(RequestSource.REQUEST_BY_USER_FORCED)));
+
+    // Test DFSHAAdmin.
+    DFSHAAdmin tool = new DFSHAAdmin();
+    tool.setConf(conf);
+    System.setIn(new ByteArrayInputStream("yes\n".getBytes()));
+    int result = tool.run(
+        new String[]{"-transitionToObserver", "-forcemanual", "nn2"});
+    assertEquals("State transition returned: " + result, -1, result);

Review Comment:
   This should be in `TestDFSHAAdminMiniCluster`



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSZKFailoverController.java:
##########
@@ -301,6 +304,40 @@ public void testManualFailoverWithDFSHAAdmin() throws Exception {
     waitForHAState(1, HAServiceState.STANDBY);
   }
 
+  /**
+   * Tests that a Namenode in safe mode should not be transfer to observer state.
+   */
+  @Test
+  public void testManualFailoverWithDFSHAAdminInSafemode() throws Exception {
+    startCluster();
+    NamenodeProtocols nn1 = cluster.getNameNode(1).getRpcServer();
+
+    // Enter safe mode.
+    nn1.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_ENTER, false);
+    // Test NameNodeRpcServer.
+    LambdaTestUtils.intercept(SafeModeException.class,
+        "Cannot transition to observer. Name node is in safe mode",
+        () -> nn1.transitionToObserver(
+            new StateChangeRequestInfo(RequestSource.REQUEST_BY_USER_FORCED)));

Review Comment:
   This should probably be in `TestHASafeMode`, where we already have 
`testTransitionToActiveWhenSafeMode`



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java:
##########
@@ -1899,6 +1899,10 @@ synchronized void transitionToStandby() throws IOException {
   synchronized void transitionToObserver() throws IOException {
     String operationName = "transitionToObserver";
     namesystem.checkSuperuserPrivilege(operationName);
+    if (namesystem.isInSafeMode()) {
+      throw namesystem.newSafemodeException("Cannot transition to " +
+          OBSERVER_STATE);

Review Comment:
   Consolidate this logic with the exception from `transitionToActive`:
   ```java
       if (notBecomeActiveInSafemode && isInSafeMode()) {
         throw new ServiceFailedException(getRole() + " still not leave safemode");
       }
   ```
   I don't think we need to make `newSafemodeException` public; it seems fine to 
just throw a new `ServiceFailedException` here, as `transitionToActive` does.
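
A rough sketch of what that consolidation could look like, using simplified stand-ins rather than the real Hadoop classes. `SafeModeGuardSketch`, its string-valued state, the nested `ServiceFailedException`, and the helper `checkNotInSafeMode` are hypothetical names for illustration; only the guard condition and the message are taken from `transitionToActive`.

```java
import java.io.IOException;

/** Simplified stand-in for NameNode; NOT the real Hadoop class. */
class SafeModeGuardSketch {

  /** Stand-in for org.apache.hadoop.ha.ServiceFailedException. */
  static class ServiceFailedException extends IOException {
    ServiceFailedException(String message) {
      super(message);
    }
  }

  // Mirrors the NameNode field of the same name; here it is set by the constructor.
  private final boolean notBecomeActiveInSafemode;
  // Stand-in for the real safe mode status.
  private final boolean inSafeMode;
  // Assumption: HA state reduced to a plain string for the sketch.
  private String state = "standby";

  SafeModeGuardSketch(boolean notBecomeActiveInSafemode, boolean inSafeMode) {
    this.notBecomeActiveInSafemode = notBecomeActiveInSafemode;
    this.inSafeMode = inSafeMode;
  }

  boolean isInSafeMode() {
    return inSafeMode;
  }

  String getRole() {
    return "NameNode";
  }

  String getState() {
    return state;
  }

  /** Hypothetical shared helper: one guard for both transitions. */
  private void checkNotInSafeMode() throws ServiceFailedException {
    if (notBecomeActiveInSafemode && isInSafeMode()) {
      throw new ServiceFailedException(getRole() + " still not leave safemode");
    }
  }

  synchronized void transitionToActive() throws ServiceFailedException {
    checkNotInSafeMode();
    state = "active";
  }

  synchronized void transitionToObserver() throws ServiceFailedException {
    checkNotInSafeMode();
    state = "observer";
  }
}
```

With the config enabled and the node in safe mode, both transitions fail with the same exception type and the state is left unchanged; with the config disabled, the transition goes through as before.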





Issue Time Tracking
-------------------

    Worklog Id:     (was: 765117)
    Time Spent: 50m  (was: 40m)

> [SBN read] Namenode in safe mode should not be transfered to observer state
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-16547
>                 URL: https://issues.apache.org/jira/browse/HDFS-16547
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, when a Namenode is in safe mode (during startup or after entering 
> safe mode manually), it can still be transitioned to Observer by command. This 
> Observer node may receive many requests and throw a SafeModeException for each, 
> which causes unnecessary failover on the client.
> So a Namenode in safe mode should not be transitioned to observer state.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
