[ https://issues.apache.org/jira/browse/HDFS-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HDFS-3071:
------------------------------
Attachment: hdfs-3071.txt
Found one more issue in manual testing, so I went back and added automated
tests for this feature. I fixed TestDFSHAAdminMiniCluster to actually record
the error output, and added an assertion that the output is correct for the
safemode case. I also tested locally and ran all the HA tests in both common
and HDFS.
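
For readers who want the general shape of "record the error output and assert
on it", here is a minimal, self-contained sketch. It is not the code from the
attached patch: ErrCaptureSketch, runFailover, and captureErrorOutput are
hypothetical names, and the message format is an assumption made only for
illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class ErrCaptureSketch {
    // Hypothetical stand-in for the admin tool: it writes an error line to
    // the supplied stream and returns a non-zero exit code.
    static int runFailover(PrintStream err, String target, String reason) {
        // Assumed message format, not necessarily what haadmin prints.
        err.println("Failover failed: " + target
            + " is not ready to become active: " + reason);
        return -1;
    }

    // Runs the stand-in command and returns everything it wrote to its
    // error stream, so a test can assert on the text.
    static String captureErrorOutput(String target, String reason) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream err = new PrintStream(buf, true);
        runFailover(err, target, reason);
        err.flush();
        return buf.toString();
    }

    public static void main(String[] args) {
        String out = captureErrorOutput("nn1", "Safe mode is ON");
        // The check a test would make for the safemode case.
        if (!out.contains("Safe mode is ON")) {
            throw new AssertionError("reason missing from output: " + out);
        }
        System.out.println(out.trim());
    }
}
```

Passing the stream in, rather than writing to System.err directly, is what
makes the output recordable without touching global state.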
> haadmin failover command does not provide enough detail for when target NN is
> not ready to be active
> ----------------------------------------------------------------------------------------------------
>
> Key: HDFS-3071
> URL: https://issues.apache.org/jira/browse/HDFS-3071
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ha
> Affects Versions: 0.24.0
> Reporter: Philip Zeyliger
> Assignee: Todd Lipcon
> Attachments: hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt,
> hdfs-3071.txt, hdfs-3071.txt, hdfs-3071.txt
>
>
> When running the failover command, you can get an error message like the
> following:
> {quote}
> $ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
> Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
> {quote}
> Unfortunately, the error message doesn't describe why that node isn't ready
> to be active. In my case, the target namenode's logs don't indicate anything
> either. It turned out that the issue was "Safe mode is ON.Resources are low
> on NN. Safe mode must be turned off manually.", but ideally the user would be
> told that at the time of the failover.
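
The request in the description above amounts to carrying the target's reason
along with the failure. A rough sketch of that idea follows, assuming a
readiness check that throws an exception with a human-readable reason.
ReadinessSketch, NotReadyException, checkReady, and tryFailover are all
hypothetical names, and none of this is taken from the Hadoop source or the
attached patch.

```java
public class ReadinessSketch {
    // Hypothetical exception that carries why the target cannot go active.
    static class NotReadyException extends Exception {
        NotReadyException(String reason) {
            super(reason);
        }
    }

    // Hypothetical readiness check: a non-null safe mode tip means the node
    // is in safe mode and therefore not ready.
    static void checkReady(String safeModeTip) throws NotReadyException {
        if (safeModeTip != null) {
            throw new NotReadyException(
                "The NameNode is in safe mode. " + safeModeTip);
        }
    }

    // Builds the line shown to the user, appending the reason on failure
    // instead of dropping it.
    static String tryFailover(String target, String safeModeTip) {
        try {
            checkReady(safeModeTip);
            return "Failover to " + target + " successful";
        } catch (NotReadyException e) {
            return "Failover failed: " + target
                + " is not ready to become active: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryFailover("xxx.yyy/1.2.3.4:8020",
            "Resources are low on NN. Safe mode must be turned off manually."));
    }
}
```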