From the log, it looks like the connection between the NameNodes and the ZooKeeper quorum is not stable, and the ZKFC's ZooKeeper session timed out. Check the logs of the ZooKeeper servers; you may find errors about connection failures there.
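To see how often this happens, one option is to scan the zkfc log for the session-timeout message. The following is only a minimal sketch: the pattern is copied from the log line quoted in this thread, while the function name `find_session_timeouts` and the sample list are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Pattern based on the ZooKeeper client message quoted in this thread.
# Assumption: each log entry sits on a single line in the real log file
# (the mail client wrapped it in the quoted text).
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Client session timed out, have not heard from server in "
    r"(?P<ms>\d+)ms for sessionid (?P<sid>0x[0-9a-fA-F]+)"
)


def find_session_timeouts(lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (timestamp, silent_ms, session_id) for each timeout entry."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append(
                (match.group("ts"), int(match.group("ms")), match.group("sid"))
            )
    return hits


# Sample input: the two zkfc log lines from the original message, unwrapped.
SAMPLE = [
    "2014-08-01 04:21:03,601 INFO org.apache.zookeeper.ClientCnxn: Client "
    "session timed out, have not heard from server in 46910ms for sessionid "
    "0x147000ee1f70137, closing socket connection and attempting reconnect",
    "2014-08-01 04:21:03,703 INFO org.apache.hadoop.ha.ActiveStandbyElector: "
    "Session disconnected. Entering neutral mode...",
]

if __name__ == "__main__":
    for ts, silent_ms, sid in find_session_timeouts(SAMPLE):
        print(f"{ts}  session {sid}  no response for {silent_ms} ms")
```

To use it on a real log, pass an open file object instead of `SAMPLE`. The timestamps it reports can then be compared with the ZooKeeper server logs for the same period.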
On Fri, Aug 1, 2014 at 2:06 PM, cho ju il <[email protected]> wrote:

> Why did HA suddenly switch over?
>
> The active NameNode (host1) of my Hadoop HA cluster suddenly switched to the standby NameNode (host2).
>
> I could not find any error in the Hadoop logs (on any server) to identify the root cause.
>
> The following entries appeared frequently in the NameNode HDFS logs, and none of the applications could read the HDFS files.
>
> *** namenode log
>
> 2014-08-01 04:20:39,133 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
>
> 2014-08-01 04:20:39,151 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 19 millisecond(s).
>
> 2014-08-01 04:21:03,608 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for active state
>
> 2014-08-01 04:21:03,728 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>
> *** zkfc log
>
> 2014-08-01 04:21:03,601 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 46910ms for sessionid 0x147000ee1f70137, closing socket connection and attempting reconnect
>
> 2014-08-01 04:21:03,703 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...

--
Regards
Gordon Wang
