Hello.
I have resolved this issue. The fencing method via SSH keys doesn't work
properly when the host is down. It uses nc to connect and doesn't terminate
the connection after the timeout, even with
dfs.ha.fencing.ssh.connect-timeout set.
So I use the following block:
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>
With it, the cluster switches to the active NameNode when the host or the
service is down. HA for the ResourceManager also switches correctly.
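
For comparison, here is a sketch of a variation on the block above, not something tested in this thread. dfs.ha.fencing.methods accepts several methods, one per line, which are tried in order until one succeeds, so sshfence can be listed first with shell(/bin/true) as the last resort. This only helps if the SSH attempt actually gives up after the timeout, which is exactly what did not happen here. The key file path is a hypothetical placeholder, and 30000 ms is just an example timeout value.

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- Methods are tried top to bottom; /bin/true always reports success -->
  <value>sshfence
shell(/bin/true)</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- Hypothetical path: use the key of the user that runs the NameNode -->
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <!-- SSH connect timeout in milliseconds (example value) -->
  <value>30000</value>
</property>
```

Note that shell(/bin/true) performs no real fencing: it only tells the failover controller that fencing succeeded, so the old active NameNode is never actually stopped.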
On 22.05.2015 14:15, [email protected] wrote:
Thanks for the reply, Chris. Now I understand. Look, I have 2 NameNodes
(the maximum in an HA cluster), each with ZKFC started. So when the host
with one NameNode halts, there is ONLY one ZKFC running, and it cannot
elect a leader.
When I try to run a ZKFC on the DataNodes or the ResourceManager host, I
get an error:
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: Could not get the namenode ID of this node. You may run zkfc on the node other than namenode.
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:128)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
(-u) 999990
virtual memory (kbytes, -v) unlimited