Hi,
Do you mean that the Active NameNode you killed did not transition to STANDBY?
>>> The killed NameNode will not come back as standby on its own; you need to
>>> start it again manually.
Automatic failover means that when the Active goes down, the Standby
transitions to Active automatically. It does not restart the killed process
and make it the Active again.
Please refer to the following doc for the same (section: "Verifying automatic
failover"):
http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html
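As a rough illustration, the verification step from that doc can be sketched as shell commands. This is only a sketch: "nn1"/"nn2" are assumed NameNode IDs taken from dfs.ha.namenodes.<nameservice> in hdfs-site.xml, so substitute your own.

```shell
# Sketch of verifying automatic failover. "nn1"/"nn2" are assumed
# NameNode IDs (dfs.ha.namenodes.<nameservice>); substitute your own.

state_of() {
  # Print the HA state ("active" or "standby") of the given NameNode ID.
  hdfs haadmin -getServiceState "$1"
}

# 1. Find which NameNode is currently active:
#      state_of nn1 ; state_of nn2
# 2. On that host, kill the NameNode JVM to simulate a crash:
#      kill -9 <namenode-pid>
# 3. Shortly afterwards the other NameNode should report "active";
#    the killed one stays down until you restart it manually.
```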
OR
Do you mean that the Standby NameNode did not transition to ACTIVE?
>>>> Please check the ZKFC logs; from the NameNode logs you pasted it is hard
>>>> to tell why this did not happen.
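If it helps, here is a small hypothetical helper for pulling the lines that usually explain a failed takeover (fencing problems, ZooKeeper session loss, fatal errors) out of a ZKFC log. The log path and grep patterns are assumptions, not exact Hadoop output.

```shell
# Hypothetical helper: extract the likely-relevant lines (fencing
# problems, ZooKeeper session loss, fatal errors) from a ZKFC log file.

zkfc_errors() {
  grep -E 'FATAL|ERROR|Fencing|Session expired' "$1"
}

# Example (the path is an assumption; check your $HADOOP_HOME/logs dir):
#   zkfc_errors "$HADOOP_HOME/logs/hadoop-hdfs-zkfc-standbynode.log"
```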
Thanks & Regards
Brahma Reddy Battula
________________________________
From: [email protected] [[email protected]]
Sent: Monday, August 04, 2014 4:38 PM
To: [email protected]
Cc: [email protected]
Subject: Hadoop 2.4.1 Verifying Automatic Failover Failed: Unable to trigger a roll of the active NN
Hi,
I have set up a Hadoop 2.4.1 HA cluster using the Quorum Journal Manager, and
I am verifying automatic failover. After I killed the NameNode process on the
active node, the standby NameNode did not fail over to active.
Please advise.
Regards,
Arthur
2014-08-04 18:54:40,453 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.net.ConnectException: Call From standbynode to activenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1414)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy16.rollEditLog(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:139)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:313)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:604)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:699)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    ... 11 more
2014-08-04 18:55:03,458 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing from activenode:54571 Call#17 Retry#1: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-08-04 18:55:06,683 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing from activenode:54571 Call#17 Retry#3: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-08-04 18:55:16,643 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from activenode:54602 Call#0 Retry#1: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-08-04 18:55:19,530 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing from activenode:54610 Call#17 Retry#5: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
2014-08-04 18:55:20,756 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from activenode:54602 Call#0 Retry#3: org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby