I think this would happen in these scenarios:

* the FQDN cannot be reached from all members of the cluster (I had to use a hosts file, as it was resolving intranet AWS hostnames); quick checks for this and the next item are sketched below
* the appropriate ports are blocked
* the ZooKeeper data is corrupted

In the last case, this can happen if you upgraded the cluster, say from 0.10.0 to 1.0.2. The ZooKeeper data needs to be purged completely, and Storm 1.0.2 will populate it again. Unfortunately, Apache Storm does not seem to check the schema; it just assumes the schema version is correct, so this definitely happens in upgrade scenarios. In my case, I went onto the ZK box, stopped the services, ran rm -rf on the data store, started ZK again, bounced the Nimbus services (and the supervisors), and it all worked again (roughly the steps sketched below).
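Concretely, the purge looked roughly like this. The dataDir path and service names here are assumptions for illustration: your dataDir is whatever zoo.cfg says, and the stop/start commands depend on your init system.

    # On the ZooKeeper host(s):
    service zookeeper stop
    rm -rf /var/lib/zookeeper/version-2    # assumed dataDir; wipes ZK state, keeps myid
    service zookeeper start

    # Then bounce Storm so it repopulates ZooKeeper:
    service storm-nimbus restart
    service storm-supervisor restart

    # Sanity check that Storm recreated its znode:
    zkCli.sh -server <zookeeper_host_fqdn>:2181 ls /<zknode_name>

Be aware this wipes all Storm state held in ZooKeeper, so expect to resubmit your topologies afterwards.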
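For the first two scenarios, a quick check from every node in the cluster usually narrows it down. A minimal sketch, assuming the placeholder FQDNs from the config quoted below and the stock ports (6627 for nimbus.thrift.port, 2181 for ZooKeeper's clientPort):

    # Run these from each cluster member; substitute the real hostnames.
    getent hosts <my_nimbus_host_fqdn>    # does the FQDN resolve on this box?
    nc -zv <my_nimbus_host_fqdn> 6627     # Nimbus thrift port (default 6627)
    nc -zv <zookeeper_host_fqdn> 2181     # ZooKeeper client port (default 2181)

    # If DNS won't resolve the intranet names, pin them in /etc/hosts,
    # e.g. (IP address is made up):
    #   10.0.1.5    <my_nimbus_host_fqdn>

If any of those fail on any member, fix name resolution or the firewall before suspecting ZooKeeper.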
- Joaquin Menchaca

> On Nov 4, 2016, at 3:44 AM, Назар Кушпір <[email protected]> wrote:
>
> I'm using Storm 1.0.2 and ZooKeeper 3.3.5, and sometimes I get an error on
> the Storm UI:
>
> "org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find
> leader nimbus from seed hosts ["<my_nimbus_host_fqdn>"]. Did you specify
> a valid list of nimbus hosts for config nimbus.seeds?"
>
> Also, in nimbus.log I got the following messages:
>
> [timer] ERROR o.a.s.b.BlobStoreUtils - Could not update the blob with
> key<topology_name>-1-1475258141-stormcode.ser
>
> --- storm.yaml ---
> storm.zookeeper.servers:
>   - "<zookeeper_host_fqdn>"
>
> storm.zookeeper.root: "/<zknode_name>"
> storm.local.dir: "/data/storm"
>
> nimbus.seeds: ["<my_nimbus_host_fqdn>"]
>
> supervisor.slots.ports:
>   - 6700
>   - 6701
>   - 6702
>   - 6703
>
> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
> supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
> worker.childopts: "-Xmx5120m -Djava.net.preferIPv4Stack=true
>   -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps
>   -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
>   -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
>   -XX:HeapDumpPath=artifacts/heapdump"
>
> # Set limit for Spout's output queue.
> topology.max.spout.pending: 500
>
> supervisor.worker.timeout.secs: 900
> nimbus.task.timeout.secs: 600
> nimbus.supervisor.timeout.secs: 600
> -----------------------------
>
> What is the reason for these errors? Is it ZooKeeper or Storm, or maybe
> both? Or is it something else? And what is the fix for it?
>
> Thanks!
