Hi,

I have YARN ResourceManager HA set up. HA worked fine initially, but after some time I noticed that restarting one ResourceManager causes the other ResourceManager to be stopped/killed. Below is what I see in the logs on the killed ResourceManager instance. I am using Hadoop version 2.5.1, if that helps.
Has anyone seen this before? Any ideas on how to go about this?

Thanks,
Nikhil

-----
2015-02-24 16:47:37,555 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2015-02-24 16:47:37,555 INFO org.apache.hadoop.ha.ActiveStandbyElector: Deleting bread-crumb of active node...
2015-02-24 16:47:37,555 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-24 16:47:37,580 INFO org.apache.zookeeper.ZooKeeper: Session: 0x14b997543fd001e closed
2015-02-24 16:47:37,580 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x14b997543fd001e
2015-02-24 16:47:37,580 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-02-24 16:47:37,580 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2015-02-24 16:47:37,581 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ResourceManager metrics system...
2015-02-24 16:47:37,587 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system stopped.
2015-02-24 16:47:37,588 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ResourceManager metrics system shutdown complete.
2015-02-24 16:47:37,588 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$VerifyActiveStatusThread thread interrupted! Exiting!
2015-02-24 16:47:37,616 INFO org.apache.zookeeper.ZooKeeper: Session: 0x24b13ab5b4c069a closed
2015-02-24 16:47:37,616 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-02-24 16:47:37,616 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher is draining to stop, igonring any new events.
2015-02-24 16:47:37,617 WARN org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher: org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2015-02-24 16:47:37,618 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2015-02-24 16:47:37,622 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2015-02-24 16:47:37,622 INFO org.apache.hadoop.ipc.Server: Stopping server on 8030
2015-02-24 16:47:37,623 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-24 16:47:37,627 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8030
2015-02-24 16:47:37,627 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-24 16:47:37,629 INFO org.apache.hadoop.ipc.Server: Stopping server on 8031
2015-02-24 16:47:37,633 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8031
2015-02-24 16:47:37,633 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-24 16:47:37,634 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: NMLivelinessMonitor thread interrupted
-----
