[ 
https://issues.apache.org/jira/browse/HADOOP-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579884#comment-14579884
 ] 

lvchuanwen commented on HADOOP-10251:
-------------------------------------

Attaching some logs:

2015-06-10 02:57:53,727 DEBUG org.apache.hadoop.ipc.Server: Successfully 
authorized userInfo {
  effectiveUser: "hdfs"
}
protocol: "org.apache.hadoop.ha.ZKFCProtocol"

2015-06-10 02:57:53,728 DEBUG org.apache.hadoop.ipc.Server:  got #0
2015-06-10 02:57:53,728 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
2 on 8019: org.apache.hadoop.ha.ZKFCProtocol.gracefulFailover from 
10.43.156.196:49132 Call#0 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2015-06-10 02:57:53,730 DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedAction as:hdfs (auth:SIMPLE) 
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
2015-06-10 02:57:53,755 DEBUG org.apache.hadoop.security.Groups: Returning 
fetched groups for 'hdfs'
2015-06-10 02:57:53,755 INFO 
org.apache.hadoop.hdfs.tools.DFSZKFailoverController: Allowed RPC access from 
hdfs (auth:SIMPLE) at 10.43.156.196
2015-06-10 02:57:53,756 DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedAction as:hdfs (auth:SIMPLE) 
from:org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:603)
2015-06-10 02:57:53,760 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: clientPath:null serverPath:null 
finished:false header:: 4,4  replyHeader:: 4,38654797504,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,F  response:: 
#aa636c75737465723139351236e6e321a67a646831393620ffffffa84628ffffffd33e,s{38654797500,38654797500,1433876199859,1433876199859,0,0,0,93891338261633392,31,0,38654797500}
 
2015-06-10 02:57:53,768 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNode: 
Setting fs.defaultFS to hdfs://zdh196:9000
2015-06-10 02:57:53,770 INFO org.apache.hadoop.ha.ZKFailoverController: Asking 
NameNode at zdh196/10.43.156.196:9000 to cede its active state for 10000ms
2015-06-10 02:57:53,772 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:53,779 DEBUG org.apache.hadoop.ipc.Client: The ping interval 
is 60000 ms.
2015-06-10 02:57:53,779 DEBUG org.apache.hadoop.ipc.Client: Connecting to 
zdh196/10.43.156.196:8019
2015-06-10 02:57:53,781 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:8019 from hdfs: starting, having 
connections 2
2015-06-10 02:57:53,781 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:8019 from hdfs sending #147
2015-06-10 02:57:53,969 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x34dcf74b50a05d7
2015-06-10 02:57:53,970 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock for sessionid 
0x34dcf74b50a05d7
2015-06-10 02:57:53,974 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Watcher event type: NodeDeleted with state:SyncConnected for 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock connectionState: CONNECTED 
for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:53,979 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:8019 from hdfs got value #147
2015-06-10 02:57:53,979 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 5,1  replyHeader:: 5,38654797507,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},1
  response:: '/hadoop-ha/cluster195/ActiveStandbyElectorLock 
2015-06-10 02:57:53,979 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
cedeActive took 200ms
2015-06-10 02:57:53,982 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED  for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:53,982 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2015-06-10 02:57:53,985 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: clientPath:null serverPath:null 
finished:false header:: 6,4  replyHeader:: 6,38654797507,-101  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,F  response::  
2015-06-10 02:57:53,996 INFO org.apache.hadoop.ha.ActiveStandbyElector: No old 
node to fence
2015-06-10 02:57:53,996 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing 
znode /hadoop-ha/cluster195/ActiveBreadCrumb to indicate that the local node is 
the most recent active...
2015-06-10 02:57:54,002 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: clientPath:null serverPath:null 
finished:false header:: 7,1  replyHeader:: 7,38654797508,0  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},0
  response:: '/hadoop-ha/cluster195/ActiveBreadCrumb 
2015-06-10 02:57:54,006 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Becoming active for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000

2015-06-10 02:57:54,008 INFO org.apache.hadoop.ha.ZKFailoverController: Trying 
to make NameNode at zdh195/10.43.156.195:9000 active...
2015-06-10 02:57:54,010 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:54,012 DEBUG org.apache.hadoop.ipc.Client: The ping interval 
is 60000 ms.
2015-06-10 02:57:54,012 DEBUG org.apache.hadoop.ipc.Client: Connecting to 
zdh195/10.43.156.195:9000
2015-06-10 02:57:54,014 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs: starting, having 
connections 3
2015-06-10 02:57:54,014 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #148
2015-06-10 02:57:54,084 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #149

2015-06-10 02:57:56,071 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #148
2015-06-10 02:57:56,071 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #149
2015-06-10 02:57:56,071 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToActive took 2059ms
2015-06-10 02:57:56,071 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 1988ms
2015-06-10 02:57:56,072 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #150
2015-06-10 02:57:56,073 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #150
2015-06-10 02:57:56,073 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 2ms
2015-06-10 02:57:56,073 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh195/10.43.156.195:9000 to active state

2015-06-10 02:57:56,079 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 8,3  replyHeader:: 8,38654797508,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,T  response:: 
s{38654797507,38654797507,1433876273975,1433876273975,0,0,0,238074455480993239,31,0,38654797507}
 
2015-06-10 02:57:56,081 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
StatNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000

2015-06-10 02:57:56,092 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:8019 from hdfs got value #151
2015-06-10 02:57:56,092 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
cedeActive took 18ms
2015-06-10 02:57:56,092 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully became active. Successfully transitioned NameNode at 
zdh195/10.43.156.195:9000 to active state
2015-06-10 02:57:56,093 DEBUG org.apache.hadoop.ipc.Server: Served: 
gracefulFailover queueTime= 20 procesingTime= 2345
2015-06-10 02:57:56,094 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
2 on 8019: responding to org.apache.hadoop.ha.ZKFCProtocol.gracefulFailover 
from 10.43.156.196:49132 Call#0 Retry#0
2015-06-10 02:57:56,095 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
2 on 8019: responding to org.apache.hadoop.ha.ZKFCProtocol.gracefulFailover 
from 10.43.156.196:49132 Call#0 Retry#0 Wrote 32 bytes.
2015-06-10 02:57:56,120 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 
for port 8019: disconnecting client 10.43.156.196:49132. Number of active 
connections: 0
2015-06-10 02:57:57,077 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #152
2015-06-10 02:57:57,079 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #152
2015-06-10 02:57:57,079 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 3ms
2015-06-10 02:57:57,080 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #153
2015-06-10 02:57:57,081 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #153
2015-06-10 02:57:57,082 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 3ms
2015-06-10 02:57:57,082 ERROR org.apache.hadoop.ha.ZKFailoverController: Local 
service NameNode at zdh195/10.43.156.195:9000 has changed the serviceState to 
active. Expected was standby. Quitting election marking fencing necessary.
2015-06-10 02:57:57,082 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Yielding from election
2015-06-10 02:57:57,085 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Terminating ZK connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:57,085 DEBUG org.apache.zookeeper.ZooKeeper: Closing session: 
0x34dcf74b50a05d7
2015-06-10 02:57:57,085 DEBUG org.apache.zookeeper.ClientCnxn: Closing client 
for session: 0x34dcf74b50a05d7
2015-06-10 02:57:57,088 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x34dcf74b50a05d7
2015-06-10 02:57:57,089 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock for sessionid 
0x34dcf74b50a05d7
2015-06-10 02:57:57,090 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x34dcf74b50a05d7, packet:: clientPath:null serverPath:null 
finished:false header:: 9,-11  replyHeader:: 9,38654797511,0  request:: null 
response:: null
2015-06-10 02:57:57,090 DEBUG org.apache.zookeeper.ClientCnxn: Disconnecting 
client for session: 0x34dcf74b50a05d7
2015-06-10 02:57:57,090 INFO org.apache.zookeeper.ZooKeeper: Session: 
0x34dcf74b50a05d7 closed
2015-06-10 02:57:57,090 DEBUG org.apache.zookeeper.ClientCnxn: An exception was 
thrown while closing send thread for session 0x34dcf74b50a05d7 : Unable to read 
additional data from server sessionid 0x34dcf74b50a05d7, likely server has 
closed socket
2015-06-10 02:57:57,091 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
Ignoring stale result from old client with sessionId 0x34dcf74b50a05d7
2015-06-10 02:57:57,091 INFO org.apache.zookeeper.ClientCnxn: EventThread shut 
down
2015-06-10 02:57:58,091 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #154
2015-06-10 02:57:58,093 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #154
2015-06-10 02:57:58,093 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 3ms
2015-06-10 02:57:58,094 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #155
2015-06-10 02:57:58,095 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #155
2015-06-10 02:57:58,096 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 3ms
2015-06-10 02:57:58,096 DEBUG org.apache.hadoop.ha.ZKFailoverController: 
rechecking for electability from bad state
2015-06-10 02:57:58,099 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Attempting active election for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,102 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Establishing zookeeper connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,102 INFO org.apache.zookeeper.ZooKeeper: Initiating client 
connection, connectString=zdh196:2181,zdh195:2181,zdh197:2181 
sessionTimeout=10000 
watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@31b40327
2015-06-10 02:57:58,105 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server zdh195/10.43.156.195:2181. Will not attempt to 
authenticate using SASL (unknown error)
2015-06-10 02:57:58,106 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to zdh195/10.43.156.195:2181, initiating session
2015-06-10 02:57:58,106 DEBUG org.apache.zookeeper.ClientCnxn: Session 
establishment request sent on zdh195/10.43.156.195:2181
2015-06-10 02:57:58,109 INFO org.apache.zookeeper.ClientCnxn: Session 
establishment complete on server zdh195/10.43.156.195:2181, sessionid = 
0x24d91acb5cb1967, negotiated timeout = 10000
2015-06-10 02:57:58,112 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Created new connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,116 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Watcher event type: None with state:SyncConnected for path:null 
connectionState: TERMINATED for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,116 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session 
connected.
2015-06-10 02:57:58,117 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 1,1  replyHeader:: 1,38654797515,-110  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},1
  response::  
2015-06-10 02:57:58,120 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: -110 for path: 
/hadoop-ha/cluster195/ActiveStandbyElectorLock connectionState: CONNECTED  for 
elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,123 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Becoming standby for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,123 INFO org.apache.hadoop.ha.ZKFailoverController: ZK 
Election indicated that NameNode at zdh195/10.43.156.195:9000 should become 
standby
2015-06-10 02:57:58,125 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:58,125 DEBUG org.apache.hadoop.ipc.Client: The ping interval 
is 60000 ms.
2015-06-10 02:57:58,126 DEBUG org.apache.hadoop.ipc.Client: Connecting to 
zdh195/10.43.156.195:9000
2015-06-10 02:57:58,128 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #156
2015-06-10 02:57:58,128 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs: starting, having 
connections 4
2015-06-10 02:57:58,132 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #156
2015-06-10 02:57:58,132 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToStandby took 7ms
2015-06-10 02:57:58,132 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh195/10.43.156.195:9000 to standby state
2015-06-10 02:57:58,135 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Monitoring active leader for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,138 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 2,3  replyHeader:: 2,38654797515,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,T  response:: 
s{38654797512,38654797512,1433876277092,1433876277092,0,0,0,165948932252965222,31,0,38654797512}
 
2015-06-10 02:57:58,141 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
StatNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,850 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x24d91acb5cb1967
2015-06-10 02:57:58,850 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock for sessionid 
0x24d91acb5cb1967
2015-06-10 02:57:58,854 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Watcher event type: NodeDeleted with state:SyncConnected for 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock connectionState: CONNECTED 
for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,858 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 3,1  replyHeader:: 3,38654797517,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},1
  response:: '/hadoop-ha/cluster195/ActiveStandbyElectorLock 
2015-06-10 02:57:58,861 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED  for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:57:58,861 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2015-06-10 02:57:58,863 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: clientPath:null serverPath:null 
finished:false header:: 4,4  replyHeader:: 4,38654797517,0  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,F  response:: 
#aa636c75737465723139351236e6e321a67a646831393620ffffffa84628ffffffd33e,s{38654797508,38654797513,1433876273998,1433876277194,1,0,0,0,31,0,38654797508}
 
2015-06-10 02:57:58,864 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old 
node exists: 0a0a636c757374657231393512036e6e321a067a646831393620a84628d33e
2015-06-10 02:57:58,867 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNode: 
Setting fs.defaultFS to hdfs://zdh196:9000
2015-06-10 02:57:58,868 INFO org.apache.hadoop.ha.ZKFailoverController: Should 
fence: NameNode at zdh196/10.43.156.196:9000
2015-06-10 02:57:58,870 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:58,871 DEBUG org.apache.hadoop.ipc.Client: The ping interval 
is 60000 ms.
2015-06-10 02:57:58,871 DEBUG org.apache.hadoop.ipc.Client: Connecting to 
zdh196/10.43.156.196:9000
2015-06-10 02:57:58,872 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:9000 from hdfs sending #157
2015-06-10 02:57:58,873 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:9000 from hdfs: starting, having 
connections 5
2015-06-10 02:57:58,949 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:9000 from hdfs got value #157
2015-06-10 02:57:58,949 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToStandby took 79ms
2015-06-10 02:57:58,949 DEBUG org.apache.hadoop.ipc.Client: stopping client 
from cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:58,949 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh196/10.43.156.196:9000 to standby 
state without fencing
2015-06-10 02:57:58,949 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing 
znode /hadoop-ha/cluster195/ActiveBreadCrumb to indicate that the local node is 
the most recent active...
2015-06-10 02:57:58,954 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: clientPath:null serverPath:null 
finished:false header:: 5,5  replyHeader:: 5,38654797518,0  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,1
  response:: 
s{38654797508,38654797518,1433876273998,1433876278951,2,0,0,0,31,0,38654797508} 
2015-06-10 02:57:58,955 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Becoming active for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000



2015-06-10 02:57:58,955 INFO org.apache.hadoop.ha.ZKFailoverController: Trying 
to make NameNode at zdh195/10.43.156.195:9000 active...
2015-06-10 02:57:58,956 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:57:58,957 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #158

2015-06-10 02:57:59,113 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #159
2015-06-10 02:58:00,545 DEBUG org.apache.hadoop.ipc.Server: IPC Server idle 
connection scanner for port 8019: task running
2015-06-10 02:58:00,573 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #159
2015-06-10 02:58:00,573 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #158
2015-06-10 02:58:00,574 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 1461ms
2015-06-10 02:58:00,574 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToActive took 1617ms
2015-06-10 02:58:00,574 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh195/10.43.156.195:9000 to active state
2015-06-10 02:58:00,574 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #160

2015-06-10 02:58:00,576 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #160
2015-06-10 02:58:00,576 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 2ms
2015-06-10 02:58:00,577 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Monitoring active leader for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:00,580 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 6,3  replyHeader:: 6,38654797520,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,T  response:: 
s{38654797517,38654797517,1433876278854,1433876278854,0,0,0,165948932252965223,31,0,38654797517}
 

2015-06-10 02:58:01,578 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #161
2015-06-10 02:58:01,580 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #161
2015-06-10 02:58:01,580 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 2ms
2015-06-10 02:58:01,581 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #162
2015-06-10 02:58:01,582 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #162
2015-06-10 02:58:01,582 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 2ms
2015-06-10 02:58:01,583 ERROR org.apache.hadoop.ha.ZKFailoverController: Local 
service NameNode at zdh195/10.43.156.195:9000 has changed the serviceState to 
active. Expected was standby. Quitting election marking fencing necessary.
2015-06-10 02:58:01,583 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Yielding from election
2015-06-10 02:58:01,585 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Terminating ZK connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:01,586 DEBUG org.apache.zookeeper.ZooKeeper: Closing session: 
0x24d91acb5cb1967
2015-06-10 02:58:01,586 DEBUG org.apache.zookeeper.ClientCnxn: Closing client 
for session: 0x24d91acb5cb1967
2015-06-10 02:58:01,588 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x24d91acb5cb1967
2015-06-10 02:58:01,588 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock for sessionid 
0x24d91acb5cb1967
2015-06-10 02:58:01,589 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1967, packet:: clientPath:null serverPath:null 
finished:false header:: 7,-11  replyHeader:: 7,38654797521,0  request:: null 
response:: null
2015-06-10 02:58:01,589 DEBUG org.apache.zookeeper.ClientCnxn: Disconnecting 
client for session: 0x24d91acb5cb1967
2015-06-10 02:58:01,589 DEBUG org.apache.zookeeper.ClientCnxn: An exception was 
thrown while closing send thread for session 0x24d91acb5cb1967 : Unable to read 
additional data from server sessionid 0x24d91acb5cb1967, likely server has 
closed socket
2015-06-10 02:58:01,589 INFO org.apache.zookeeper.ZooKeeper: Session: 
0x24d91acb5cb1967 closed
2015-06-10 02:58:01,590 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
Ignoring stale result from old client with sessionId 0x24d91acb5cb1967
2015-06-10 02:58:01,590 INFO org.apache.zookeeper.ClientCnxn: EventThread shut 
down
2015-06-10 02:58:02,590 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #163
2015-06-10 02:58:02,592 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #163
2015-06-10 02:58:02,593 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getServiceStatus took 3ms
2015-06-10 02:58:02,593 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #164
2015-06-10 02:58:02,595 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #164
2015-06-10 02:58:02,595 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
monitorHealth took 2ms
2015-06-10 02:58:02,595 DEBUG org.apache.hadoop.ha.ZKFailoverController: 
rechecking for electability from bad state
2015-06-10 02:58:02,598 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Attempting active election for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,600 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Establishing zookeeper connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,600 INFO org.apache.zookeeper.ZooKeeper: Initiating client 
connection, connectString=zdh196:2181,zdh195:2181,zdh197:2181 
sessionTimeout=10000 
watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@6d4f0202
2015-06-10 02:58:02,603 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server zdh195/10.43.156.195:2181. Will not attempt to 
authenticate using SASL (unknown error)
2015-06-10 02:58:02,603 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to zdh195/10.43.156.195:2181, initiating session
2015-06-10 02:58:02,603 DEBUG org.apache.zookeeper.ClientCnxn: Session 
establishment request sent on zdh195/10.43.156.195:2181
2015-06-10 02:58:02,606 INFO org.apache.zookeeper.ClientCnxn: Session 
establishment complete on server zdh195/10.43.156.195:2181, sessionid = 
0x24d91acb5cb1969, negotiated timeout = 10000
2015-06-10 02:58:02,610 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Created new connection for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,613 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Watcher event type: None with state:SyncConnected for path:null 
connectionState: TERMINATED for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,614 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session 
connected.
2015-06-10 02:58:02,614 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1969, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 1,1  replyHeader:: 1,38654797525,-110  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},1
  response::  
2015-06-10 02:58:02,618 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: -110 for path: 
/hadoop-ha/cluster195/ActiveStandbyElectorLock connectionState: CONNECTED  for 
elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,621 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Becoming standby for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,621 INFO org.apache.hadoop.ha.ZKFailoverController: ZK 
Election indicated that NameNode at zdh195/10.43.156.195:9000 should become 
standby
2015-06-10 02:58:02,622 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:58:02,622 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #165
2015-06-10 02:58:02,624 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs got value #165
2015-06-10 02:58:02,624 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToStandby took 2ms
2015-06-10 02:58:02,624 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh195/10.43.156.195:9000 to standby state
2015-06-10 02:58:02,625 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Monitoring active leader for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:02,627 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1969, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 2,3  replyHeader:: 2,38654797525,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,T  response:: 
s{38654797522,38654797522,1433876281592,1433876281592,0,0,0,165948932252965224,31,0,38654797522}
 
2015-06-10 02:58:02,628 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
StatNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:03,185 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x24d91acb5cb1969
2015-06-10 02:58:03,185 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock for sessionid 
0x24d91acb5cb1969
2015-06-10 02:58:03,188 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Watcher event type: NodeDeleted with state:SyncConnected for 
path:/hadoop-ha/cluster195/ActiveStandbyElectorLock connectionState: CONNECTED 
for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:03,191 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1969, packet:: 
clientPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock 
serverPath:/hadoop-ha/cluster195/ActiveStandbyElectorLock finished:false 
header:: 3,1  replyHeader:: 3,38654797527,0  request:: 
'/hadoop-ha/cluster195/ActiveStandbyElectorLock,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,v{s{31,s{'world,'anyone}}},1
  response:: '/hadoop-ha/cluster195/ActiveStandbyElectorLock 
2015-06-10 02:58:03,194 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: 0 for path: /hadoop-ha/cluster195/ActiveStandbyElectorLock 
connectionState: CONNECTED  for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000
2015-06-10 02:58:03,194 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2015-06-10 02:58:03,196 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1969, packet:: clientPath:null serverPath:null 
finished:false header:: 4,4  replyHeader:: 4,38654797527,0  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,F  response:: 
#aa636c75737465723139351236e6e321a67a646831393620ffffffa84628ffffffd33e,s{38654797508,38654797523,1433876273998,1433876281689,3,0,0,0,31,0,38654797508}
 
2015-06-10 02:58:03,198 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old 
node exists: 0a0a636c757374657231393512036e6e321a067a646831393620a84628d33e
2015-06-10 02:58:03,201 DEBUG org.apache.hadoop.hdfs.server.namenode.NameNode: 
Setting fs.defaultFS to hdfs://zdh196:9000
2015-06-10 02:58:03,202 INFO org.apache.hadoop.ha.ZKFailoverController: Should 
fence: NameNode at zdh196/10.43.156.196:9000
2015-06-10 02:58:03,204 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:58:03,205 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:9000 from hdfs sending #166
2015-06-10 02:58:03,269 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh196/10.43.156.196:9000 from hdfs got value #166
2015-06-10 02:58:03,270 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToStandby took 65ms
2015-06-10 02:58:03,270 DEBUG org.apache.hadoop.ipc.Client: stopping client 
from cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:58:03,270 INFO org.apache.hadoop.ha.ZKFailoverController: 
Successfully transitioned NameNode at zdh196/10.43.156.196:9000 to standby 
state without fencing
2015-06-10 02:58:03,270 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing 
znode /hadoop-ha/cluster195/ActiveBreadCrumb to indicate that the local node is 
the most recent active...
2015-06-10 02:58:03,274 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply 
sessionid:0x24d91acb5cb1969, packet:: clientPath:null serverPath:null 
finished:false header:: 5,5  replyHeader:: 5,38654797528,0  request:: 
'/hadoop-ha/cluster195/ActiveBreadCrumb,#aa636c75737465723139351236e6e311a67a646831393520ffffffa84628ffffffd33e,3
  response:: 
s{38654797508,38654797528,1433876273998,1433876283271,4,0,0,0,31,0,38654797508} 
2015-06-10 02:58:03,276 DEBUG org.apache.hadoop.ha.ActiveStandbyElector: 
Becoming active for elector id=1893177739 
appData=0a0a636c757374657231393512036e6e311a067a646831393520a84628d33e 
cb=Elector callbacks for NameNode at zdh195/10.43.156.195:9000

2015-06-10 02:58:03,277 INFO org.apache.hadoop.ha.ZKFailoverController: Trying 
to make NameNode at zdh195/10.43.156.195:9000 active...
2015-06-10 02:58:03,278 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@7007cf85
2015-06-10 02:58:03,278 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(256152889) connection to zdh195/10.43.156.195:9000 from hdfs sending #167


> Both NameNodes could be in STANDBY State if SNN network is unstable
> -------------------------------------------------------------------
>
>                 Key: HADOOP-10251
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10251
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.2.0
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>             Fix For: 2.5.0
>
>         Attachments: HADOOP-10251.patch, HADOOP-10251.patch, 
> HADOOP-10251.patch, HADOOP-10251.patch, HADOOP-10251.patch
>
>
> Following corner scenario happened in one of our cluster.
> 1. NN1 was Active and NN2 was Standby
> 2. NN2 machine's network was slow 
> 3. NN1 got shutdown.
> 4. NN2 ZKFC got the notification and trying to check for old active for 
> fencing. (This took little more time, again due to slow network)
> 5. In between, NN1 got restarted by our automatic monitoring, and ZKFC made 
> it Active.
> 6. Now NN2 ZKFC got Old Active as NN1 and it did graceful fencing of NN1 to 
> STANBY.
> 7. Before writing ActiveBreadCrumb to ZK, NN2 ZKFC got session timeout and 
> got shutdown before making NN2 Active.
> *Now cluster having both NameNodes as STANDBY.*
> NN1 ZKFC still thinks that its nameNode is in Active state. 
> NN2 ZKFC waiting for election.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to