[ 
https://issues.apache.org/jira/browse/YARN-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609197#comment-14609197
 ] 

Xuan Gong commented on YARN-3871:
---------------------------------

From the RM logs:
{code}
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss
{code}

It looks like the RM lost its connection to ZK. Could you check whether 
ZooKeeper is down?
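
One way to run that check is sketched below, assuming the ZooKeeper servers accept the "ruok" four-letter command (on newer ZooKeeper releases it may need to be whitelisted). The function names parse_connect_string and probe are made up for this illustration and are not part of Hadoop or ZooKeeper. The sketch only separates a refused TCP connection from a server that accepts the socket but is not answering.

```python
import socket


def parse_connect_string(connect_string, default_port=2181):
    """Split a ZooKeeper connect string into (host, port) pairs.

    Hypothetical helper: drops an optional chroot suffix ("/path") and
    does not handle IPv6 literals.
    """
    hosts_part = connect_string.split("/", 1)[0]
    servers = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            servers.append((host, int(port)))
        else:
            # No explicit port given, fall back to the default client port.
            servers.append((entry, default_port))
    return servers


def probe(host, port, timeout=3.0):
    """Send "ruok" to one server and describe what came back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except ConnectionRefusedError:
        # Nothing is listening on the port (ZooKeeper process not running).
        return "connection refused"
    except OSError as exc:
        # DNS failure, timeout, reset, and similar socket errors.
        return f"error: {exc}"
    if reply == b"imok":
        return "ok"
    # The socket opened but the server closed it or sent something else.
    return "connected, but no 'imok' reply"
```

To use it, pass the connectString value from the RM log to parse_connect_string and call probe on each (host, port) pair. With the behaviour in the attached log, piripiri1 and piripiri2 would be expected to report "connection refused", while piripiri3 accepts the socket and then closes it.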

> ResourceManager down after Blueprint install 
> ---------------------------------------------
>
>                 Key: YARN-3871
>                 URL: https://issues.apache.org/jira/browse/YARN-3871
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>         Environment: ambari-2.1.0-1295, hdp-2.3.0.0-2497, sles11sp3
>            Reporter: Zack Marsh
>         Attachments: yarn-yarn-resourcemanager-piripiri3.log, 
> yarn-yarn-resourcemanager-piripiri3.out
>
>
> On a 3-Master HDP 2.3 cluster installed with HDP-2.3.0.0-2482 and 
> Ambari-2.1.0-1266, the YARN ResourceManager was down following the Blueprint 
> install.
> It's important to note that nothing failed during the Blueprint install. The 
> ResourceManager shut down because it was unable to connect to ZooKeeper.
> Excerpt from the ResourceManager log:
> {code}
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client 
> environment:java.library.path=:/usr/hdp/2.3.0.0-2482/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.0.0-2482/hadoop/lib/native:/usr/hdp/2.3.0.0-2482/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.0.0-2482/hadoop/lib/native
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/tmp
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA>
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:os.name=Linux
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:os.arch=amd64
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client 
> environment:os.version=3.0.101-0.50.TDC.1.R.0-default
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:user.name=yarn
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client environment:user.home=/home/yarn
> 2015-06-26 03:35:47,188 INFO  zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client 
> environment:user.dir=/usr/hdp/2.3.0.0-2482/hadoop-yarn
> 2015-06-26 03:35:47,190 INFO  zookeeper.ZooKeeper 
> (ZooKeeper.java:<init>(438)) - Initiating client connection, 
> connectString=piripiri2.labs.teradata.com:2181,piripiri1.labs.teradata.com:2181,piripiri3.labs.teradata.com:2181
>  sessionTimeout=10000 
> watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@59d2103b
> 2015-06-26 03:35:47,209 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri2.labs.teradata.com/39.0.40.2:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:47,276 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:47,380 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri3.labs.teradata.com/39.0.40.3:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:47,381 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:primeConnection(852)) - Socket connection established to 
> piripiri3.labs.teradata.com/39.0.40.3:2181, initiating session
> 2015-06-26 03:35:47,452 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1098)) - Unable to read additional data from server 
> sessionid 0x0, likely server has closed socket, closing socket connection and 
> attempting reconnect
> 2015-06-26 03:35:48,067 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri1.labs.teradata.com/39.0.40.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:48,378 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:49,914 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri2.labs.teradata.com/39.0.40.2:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:49,915 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:50,028 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri3.labs.teradata.com/39.0.40.3:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:50,028 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:primeConnection(852)) - Socket connection established to 
> piripiri3.labs.teradata.com/39.0.40.3:2181, initiating session
> 2015-06-26 03:35:50,030 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1098)) - Unable to read additional data from server 
> sessionid 0x0, likely server has closed socket, closing socket connection and 
> attempting reconnect
> 2015-06-26 03:35:50,133 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri1.labs.teradata.com/39.0.40.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:50,134 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:52,064 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri2.labs.teradata.com/39.0.40.2:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:52,065 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:52,901 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri3.labs.teradata.com/39.0.40.3:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:52,901 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:primeConnection(852)) - Socket connection established to 
> piripiri3.labs.teradata.com/39.0.40.3:2181, initiating session
> 2015-06-26 03:35:52,902 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1098)) - Unable to read additional data from server 
> sessionid 0x0, likely server has closed socket, closing socket connection and 
> attempting reconnect
> 2015-06-26 03:35:53,570 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri1.labs.teradata.com/39.0.40.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:53,571 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:55,541 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri2.labs.teradata.com/39.0.40.2:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:55,542 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:56,513 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri3.labs.teradata.com/39.0.40.3:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:56,514 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:primeConnection(852)) - Socket connection established to 
> piripiri3.labs.teradata.com/39.0.40.3:2181, initiating session
> 2015-06-26 03:35:56,515 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1098)) - Unable to read additional data from server 
> sessionid 0x0, likely server has closed socket, closing socket connection and 
> attempting reconnect
> 2015-06-26 03:35:56,821 INFO  zookeeper.ClientCnxn 
> (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server 
> piripiri1.labs.teradata.com/39.0.40.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 2015-06-26 03:35:56,822 WARN  zookeeper.ClientCnxn 
> (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, 
> closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
>         at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at 
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2015-06-26 03:35:57,205 ERROR ha.ActiveStandbyElector 
> (ActiveStandbyElector.java:waitForZKConnectionEvent(1044)) - Connection timed 
> out: couldn't connect to ZooKeeper in 10000 milliseconds
> 2015-06-26 03:35:57,396 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) 
> - Session: 0x0 closed
> 2015-06-26 03:35:57,397 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(512)) 
> - EventThread shut down
> 2015-06-26 03:35:57,403 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService failed 
> in state INITED; cause: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.waitForZKConnectionEvent(ActiveStandbyElector.java:1047)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.access$400(ActiveStandbyElector.java:1018)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.getNewZooKeeper(ActiveStandbyElector.java:633)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.createConnection(ActiveStandbyElector.java:767)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.<init>(ActiveStandbyElector.java:227)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:92)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:149)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:261)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
> 2015-06-26 03:35:57,404 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state 
> INITED; cause: org.apache.hadoop.service.ServiceStateException: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
>         at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:149)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:261)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.waitForZKConnectionEvent(ActiveStandbyElector.java:1047)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.access$400(ActiveStandbyElector.java:1018)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.getNewZooKeeper(ActiveStandbyElector.java:633)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.createConnection(ActiveStandbyElector.java:767)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.<init>(ActiveStandbyElector.java:227)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:92)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 7 more
> 2015-06-26 03:35:57,404 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in 
> state INITED; cause: org.apache.hadoop.service.ServiceStateException: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
>         at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:149)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:261)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.waitForZKConnectionEvent(ActiveStandbyElector.java:1047)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.access$400(ActiveStandbyElector.java:1018)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.getNewZooKeeper(ActiveStandbyElector.java:633)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.createConnection(ActiveStandbyElector.java:767)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.<init>(ActiveStandbyElector.java:227)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:92)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 7 more
> 2015-06-26 03:35:57,405 INFO  resourcemanager.ResourceManager 
> (ResourceManager.java:transitionToStandby(1068)) - Transitioning to standby 
> state
> 2015-06-26 03:35:57,405 INFO  resourcemanager.ResourceManager 
> (ResourceManager.java:transitionToStandby(1075)) - Transitioned to standby 
> state
> 2015-06-26 03:35:57,405 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:main(1230)) - Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
>         at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:149)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:261)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1226)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.waitForZKConnectionEvent(ActiveStandbyElector.java:1047)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.access$400(ActiveStandbyElector.java:1018)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.getNewZooKeeper(ActiveStandbyElector.java:633)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.createConnection(ActiveStandbyElector.java:767)
>         at 
> org.apache.hadoop.ha.ActiveStandbyElector.<init>(ActiveStandbyElector.java:227)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:92)
>         at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 7 more
> 2015-06-26 03:35:57,407 INFO  resourcemanager.ResourceManager 
> (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down ResourceManager at piripiri3/39.0.40.3
> ************************************************************/
> {code}
> This issue was observed again on a 3-Master cluster installed with 
> HDP-2.3.0.0-2497 and Ambari-2.1.0-1295.
> YARN logs attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
