[ https://issues.apache.org/jira/browse/AMBARI-19289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814244#comment-15814244 ]
Hadoop QA commented on AMBARI-19289:
------------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12846481/AMBARI-19289_branch-2.5.01.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in ambari-server.

Test results: https://builds.apache.org/job/Ambari-trunk-test-patch/9973//testReport/
Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/9973//console

This message is automatically generated.

> HDFS Service check fails if previous active NN is down
> ------------------------------------------------------
>
>                 Key: AMBARI-19289
>                 URL: https://issues.apache.org/jira/browse/AMBARI-19289
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.4.2
>            Reporter: Weiwei Yang
>            Assignee: Weiwei Yang
>         Attachments: AMBARI-19289_branch-2.5.01.patch,
> AMBARI-19289_trunk.01.patch, AMBARI-19289_trunk.02.patch
>
>
> *Reproduce steps*
> # Enable namenode HA
> # Shut down the active namenode; the standby takes over
> # Run the HDFS service check
> The HDFS service check script uses
> {{hdfs dfsadmin -fs hdfs://mycluster -safemode get | grep OFF}}
> to check whether the namenode is out of safemode. However, this command fails if
> the 1st NN is down, without checking the state of the 2nd NN. This is likely an HDFS
> bug similar to HDFS-8277.
> *Proposal*
> There are several approaches to fix this:
> # Loop over each namenode address and get the safemode state with
> {{hdfs dfsadmin -fs hdfs://nn_host:8020 -safemode get | grep OFF}}. As long as one NN
> returns OFF, consider DFS out of safemode and continue with the rest of the check.
> However, is it really necessary to add such complexity to a service check?
> # Remove the safemode check code. If HDFS is in safemode, read/write
> operations will fail anyway, so the service check won't pass.
> I prefer #2 because it makes the script simpler and it works in all cases.
> Note that this is a service check: it should pass as long as HDFS is in a working
> state. It is not a namenode check.
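
For reference, approach #1 from the proposal could be sketched roughly as below. This is only an illustration, not the code in the attached patches: the function names, the 30-second timeout, and the `nn1.example.com` / `nn2.example.com` hosts are hypothetical, and it assumes Python 3.7+ with the `hdfs` CLI on the PATH. It mirrors the `grep OFF` test by looking for "OFF" in the command output.

```python
import subprocess


def safemode_is_off(namenode_uri, run=subprocess.run):
    """Return True if this namenode answers and reports safemode OFF.

    `run` is injectable so the logic can be exercised without a cluster.
    """
    cmd = ["hdfs", "dfsadmin", "-fs", namenode_uri, "-safemode", "get"]
    try:
        # The timeout value is an arbitrary choice for this sketch.
        result = run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # CLI missing or namenode not responding: treat as "not OFF".
        return False
    # Same test as `| grep OFF` in the original check.
    return result.returncode == 0 and "OFF" in result.stdout


def any_namenode_out_of_safemode(namenode_uris, run=subprocess.run):
    """True as soon as one namenode reports OFF (approach #1)."""
    return any(safemode_is_off(uri, run) for uri in namenode_uris)


if __name__ == "__main__":
    # Hypothetical HA pair; port 8020 is taken from the issue text.
    uris = ["hdfs://nn1.example.com:8020", "hdfs://nn2.example.com:8020"]
    print(any_namenode_out_of_safemode(uris))
```

Because `any()` short-circuits, the loop stops at the first namenode that reports OFF, so a down namenode costs at most one failed call or timeout. That extra handling is the complexity the proposal questions, which is the argument for approach #2.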