[ https://issues.apache.org/jira/browse/STORM-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Gresch reassigned STORM-3124: ----------------------------------- Assignee: Aaron Gresch > Failures talking to Pacemaker > ----------------------------- > > Key: STORM-3124 > URL: https://issues.apache.org/jira/browse/STORM-3124 > Project: Apache Storm > Issue Type: Bug > Affects Versions: 2.0.0 > Reporter: Aaron Gresch > Assignee: Aaron Gresch > Priority: Critical > > {code:java} > 2018-06-25 20:21:05.220 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 7 more attempts. > 2018-06-25 20:21:06.220 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:06.220 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 6 more attempts. > 2018-06-25 20:21:07.220 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:07.221 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 5 more attempts. > 2018-06-25 20:21:08.221 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:08.221 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 4 more attempts. > 2018-06-25 20:21:09.222 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:09.222 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 3 more attempts. > 2018-06-25 20:21:10.222 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:10.222 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 2 more attempts. > 2018-06-25 20:21:11.223 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:11.223 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 1 more attempts. > 2018-06-25 20:21:12.223 o.a.s.p.PacemakerClient timer [ERROR] error > attempting to write to a channel Timed out waiting for channel ready.. > 2018-06-25 20:21:12.223 o.a.s.p.PacemakerClient timer [ERROR] Not getting > response or getting null response. Making 0 more attempts. > 2018-06-25 20:21:13.224 o.a.s.p.PacemakerClientPool timer [WARN] Failed to > connect to the pacemaker server openqe74blue-n2.blue.ygrid.yahoo.com > 2018-06-25 20:21:13.229 o.a.s.d.n.Nimbus pool-37-thread-481 [INFO] > uploadedJar > /home/y/var/storm/nimbus/inbox/stormjar-c5893ba3-21c6-4397-84e2-54aab8e091a9.jar > 2018-06-25 20:21:13.225 o.a.s.d.n.Nimbus timer [ERROR] Error while processing > event > java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.storm.pacemaker.PacemakerConnectionException: Failed to connect to > any Pacemaker. > at > org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$37(Nimbus.java:2773) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at org.apache.storm.StormTimer$1.run(StormTimer.java:110) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226) > [storm-client-2.0.0.y.jar:2.0.0.y] > Caused by: java.lang.RuntimeException: > org.apache.storm.pacemaker.PacemakerConnectionException: Failed to connect to > any Pacemaker. > at > org.apache.storm.cluster.PaceMakerStateStorage.get_worker_hb_children(PaceMakerStateStorage.java:214) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.StormClusterStateImpl.heartbeatStorms(StormClusterStateImpl.java:482) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.topoIdsToClean(Nimbus.java:897) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at org.apache.storm.daemon.nimbus.Nimbus.doCleanup(Nimbus.java:2469) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$37(Nimbus.java:2771) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > ... 2 more > Caused by: org.apache.storm.pacemaker.PacemakerConnectionException: Failed to > connect to any Pacemaker. > at > org.apache.storm.pacemaker.PacemakerClientPool.sendAll(PacemakerClientPool.java:71) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.PaceMakerStateStorage.get_worker_hb_children(PaceMakerStateStorage.java:199) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.StormClusterStateImpl.heartbeatStorms(StormClusterStateImpl.java:482) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.topoIdsToClean(Nimbus.java:897) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at org.apache.storm.daemon.nimbus.Nimbus.doCleanup(Nimbus.java:2469) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.lambda$launchServer$37(Nimbus.java:2771) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > ... 2 more > 2018-06-25 20:21:13.231 o.a.s.u.Utils timer [ERROR] Halting process: Error > while processing event > java.lang.RuntimeException: Halting process: Error while processing event > at org.apache.storm.utils.Utils.exitProcess(Utils.java:470) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.lambda$new$17(Nimbus.java:490) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:253) > [storm-client-2.0.0.y.jar:2.0.0.y] > 2018-06-25 20:21:13.232 o.a.s.d.n.Nimbus Thread-12 [INFO] Shutting down master > 2018-06-25 20:21:13.232 o.a.s.u.Utils Thread-13 [INFO] Halting after 10 > seconds > 2018-06-25 20:21:13.677 o.a.s.d.n.Nimbus pool-37-thread-481 [INFO] desired > replication count 1 achieved, current-replication-count for conf key = 1, > current-replication-count for code key = 1, current-replication-count for jar > key = 1 > 2018-06-25 20:21:13.678 o.a.s.d.n.Nimbus pool-37-thread-481 [WARN] Topology > submission exception. (topology name='run') > java.lang.IllegalStateException: instance must be started before calling this > method > at > org.apache.storm.shade.org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444) > ~[shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkExists(CuratorFrameworkImpl.java:432) > ~[shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.existsNode(ClientZookeeper.java:144) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.mkdirsImpl(ClientZookeeper.java:288) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.mkdirs(ClientZookeeper.java:70) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.ZKStateStorage.mkdirs(ZKStateStorage.java:114) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.PaceMakerStateStorage.mkdirs(PaceMakerStateStorage.java:69) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.StormClusterStateImpl.setupHeatbeats(StormClusterStateImpl.java:435) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.submitTopologyWithOpts(Nimbus.java:3024) > [storm-server-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.generated.Nimbus$Processor$submitTopologyWithOpts.getResult(Nimbus.java:3511) > [storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.generated.Nimbus$Processor$submitTopologyWithOpts.getResult(Nimbus.java:3490) > [storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.security.auth.sasl.SaslTransportPlugin$TUGIWrapProcessor.process(SaslTransportPlugin.java:147) > [storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:291) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_131] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_131] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131] > 2018-06-25 20:21:13.680 o.a.s.t.ProcessFunction pool-37-thread-481 [ERROR] > Internal error processing submitTopologyWithOpts > java.lang.RuntimeException: java.lang.IllegalStateException: instance must be > started before calling this method > at > org.apache.storm.daemon.nimbus.Nimbus.submitTopologyWithOpts(Nimbus.java:3049) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.generated.Nimbus$Processor$submitTopologyWithOpts.getResult(Nimbus.java:3511) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.generated.Nimbus$Processor$submitTopologyWithOpts.getResult(Nimbus.java:3490) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.security.auth.sasl.SaslTransportPlugin$TUGIWrapProcessor.process(SaslTransportPlugin.java:147) > [storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:291) > [shaded-deps-2.0.0.y.jar:2.0.0.y] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_131] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_131] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131] > Caused by: java.lang.IllegalStateException: instance must be started before > calling this method > at > org.apache.storm.shade.org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444) > ~[shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl.checkExists(CuratorFrameworkImpl.java:432) > ~[shaded-deps-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.existsNode(ClientZookeeper.java:144) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.mkdirsImpl(ClientZookeeper.java:288) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.zookeeper.ClientZookeeper.mkdirs(ClientZookeeper.java:70) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.ZKStateStorage.mkdirs(ZKStateStorage.java:114) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.PaceMakerStateStorage.mkdirs(PaceMakerStateStorage.java:69) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.cluster.StormClusterStateImpl.setupHeatbeats(StormClusterStateImpl.java:435) > ~[storm-client-2.0.0.y.jar:2.0.0.y] > at > org.apache.storm.daemon.nimbus.Nimbus.submitTopologyWithOpts(Nimbus.java:3024) > ~[storm-server-2.0.0.y.jar:2.0.0.y] > ... 9 more > {code} > We're having sporadic failures talking to Pacemaker. This callstack shows us > unable to launch topologies. -- This message was sent by Atlassian JIRA (v7.6.3#76005)