Harsh J created HADOOP-13515:
--------------------------------
Summary: Redundant transitionToActive call can cause a NameNode to
crash
Key: HADOOP-13515
URL: https://issues.apache.org/jira/browse/HADOOP-13515
Project: Hadoop Common
Issue Type: Bug
Components: ha
Affects Versions: 2.5.0
Reporter: Harsh J
Priority: Minor
The situation in parts is similar to HADOOP-8217, but the cause is different
and so is the result.
Consider this situation:
- At the beginning NN1 is Active, NN2 is Standby
- ZKFC1 faces a ZK disconnect (not a session timeout, just a socket disconnect)
and thereby reconnects
{code}
2016-08-11 07:00:46,068 INFO org.apache.zookeeper.ClientCnxn: Client session
timed out, have not heard from server in 4000ms for sessionid
0x4566f0c97500bd9, closing socket connection and attempting reconnect
2016-08-11 07:00:46,169 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session
disconnected. Entering neutral mode...
…
2016-08-11 07:00:46,610 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session
connected.
{code}
- The reconnection on the ZKFC1 triggers the elector code, and the elector
re-run finds that NN1 should be the new active (a redundant decision cause NN1
is already active)
{code}
2016-08-11 07:00:46,615 INFO org.apache.hadoop.ha.ActiveStandbyElector:
Checking for any old active which needs to be fenced...
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old
node exists: …
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old
node has our own data, so don't need to fence it.
{code}
- The ZKFC1 sets the new ZK data, and fires a NN1 RPC call of transitionToActive
{code}
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing
znode /hadoop-ha/nameservice1/ActiveBreadCrumb to indicate that the local node
is the most recent active...
2016-08-11 07:00:46,649 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 175:
Call -> nn01/10.10.10.10:8022: transitionToActive {reqInfo { reqSource:
REQUEST_BY_ZKFC }}
{code}
- At the same time as the transitionToActive call is in progress at NN1, but
not complete yet, the ZK session of ZKFC1 is timed out by ZK Quorum, and a
watch notification is sent to ZKFC2
{code}
2016-08-11 07:01:00,003 DEBUG org.apache.zookeeper.ClientCnxn: Got notification
sessionid:0x4566f0c97500bde
2016-08-11 07:01:00,004 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent
state:SyncConnected type:NodeDeleted
path:/hadoop-ha/nameservice1/ActiveStandbyElectorLock for sessionid
0x4566f0c97500bde
{code}
- ZKFC2 responds by marking NN2 as standby, which succeeds (NN hasn't handled
transitionToActive call yet due to busy status, but has handled
transitionToStandby before it)
{code}
2016-08-11 07:01:00,013 INFO org.apache.hadoop.ha.ActiveStandbyElector:
Checking for any old active which needs to be fenced...
2016-08-11 07:01:00,018 INFO org.apache.hadoop.ha.ZKFailoverController: Should
fence: NameNode at nn01/10.10.10.10:8022
2016-08-11 07:01:00,020 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412:
Call -> nn01/10.10.10.10:8022: transitionToStandby {reqInfo { reqSource:
REQUEST_BY_ZKFC }}
2016-08-11 07:01:03,880 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call:
transitionToStandby took 3860ms
{code}
- ZKFC2 then marks NN2 as active, and NN2 begins its transition (is in midst of
it, not done yet at this point)
{code}
2016-08-11 07:01:03,894 INFO org.apache.hadoop.ha.ZKFailoverController: Trying
to make NameNode at nn02/11.11.11.11:8022 active...
2016-08-11 07:01:03,895 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412:
Call -> nn02/11.11.11.11:8022: transitionToActive {reqInfo { reqSource:
REQUEST_BY_ZKFC }}
…
{code}
{code}
2016-08-11 07:01:09,558 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required
for active state
…
2016-08-11 07:01:19,968 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing
edit logs at txnid 5635
{code}
- At the same time in parallel NN1 processes the transitionToActive requests
finally, and becomes active
{code}
2016-08-11 07:01:13,281 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required
for active state
…
2016-08-11 07:01:19,599 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing
edit logs at txnid 5635
…
2016-08-11 07:01:19,602 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog:
Starting log segment at 5635
{code}
- NN2's active transition fails as a result of this parallel active transition
on NN1 which has completed right before it tries to take over
{code}
2016-08-11 07:01:19,968 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing
edit logs at txnid 5635
2016-08-11 07:01:22,799 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode:
Error encountered requiring NN shutdown. Shutting down immediately.
java.lang.IllegalStateException: Cannot start writing at txid 5635 when there
is a stream available for read:
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@53feee4f
at
org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:312)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1174)
at
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1654)
at
org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
at
org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
at
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1527)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1302)
at
org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
at
org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
{code}
(Reflects as a failure on the ZKFC2 too):
{code}
2016-08-11 07:01:31,598 DEBUG org.apache.hadoop.ipc.Client: closing ipc
connection to nn02/11.11.11.11:8022: null
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)
2016-08-11 07:01:31,626 FATAL org.apache.hadoop.ha.ZKFailoverController:
Couldn't make NameNode at nn02/11.11.11.11:8022 active
java.io.IOException: Failed on local exception: java.io.EOFException; Host
Details : local host is: "nn02/11.11.11.11"; destination host is: "nn02":8022;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1415)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy15.transitionToActive(Unknown Source)
at
org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToActive(HAServiceProtocolClientSideTranslatorPB.java:100)
at
org.apache.hadoop.ha.HAServiceProtocolHelper.transitionToActive(HAServiceProtocolHelper.java:48)
at
org.apache.hadoop.ha.ZKFailoverController.becomeActive(ZKFailoverController.java:377)
at
org.apache.hadoop.ha.ZKFailoverController.access$900(ZKFailoverController.java:60)
at
org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.becomeActive(ZKFailoverController.java:868)
at
org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
at
org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)
2016-08-11 07:01:31,665 WARN org.apache.hadoop.ha.ActiveStandbyElector:
Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: Couldn't transition to active
{code}
We should perhaps avoid sending an active RPC if the NN is already active, by
checking the service status before firing the transition RPC inside
ZKFailoverController.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]