Hi Martin,

Did you keep the configuration file (storm.yaml) from 1.0.4? If so, what is the value of "storm.local.dir"? On Linux, error code 13 is EACCES (permission denied), and a directory-related bug was fixed in Storm 1.0.5.
https://issues.apache.org/jira/browse/STORM-2660

So if "storm.local.dir" is set to a relative path, in Storm 1.0.4 it was resolved relative to the working directory from which Nimbus was started, whereas in Storm 1.0.5 and newer releases it is resolved relative to the Storm home directory. I'm still puzzled, though, since the Supervisor in Storm 1.0.4 already behaved the way Nimbus does now.

Does clearing all state from disk and ZooKeeper resolve the issue, or does it still persist?

Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Sep 19, 2017 at 5:36 PM, Martin Burian <[email protected]> wrote:

> I updated our cluster from storm 1.0.4 to 1.0.5. The supervisors are fine,
> but the nimbus keeps dying every 10s. It just dies silently, there are no
> errors in the logs, nor in the JVM stdout. Nimbus exits with status 13.
> Logs follow:
>
> ...
> 2017-09-19 09:51:20.200 o.a.s.n.NimbusInfo main [INFO] Overriding nimbus host to storm.local.hostname -> 172.17.0.3
> 2017-09-19 09:51:20.311 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl main [INFO] Starting
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:host.name=85c13f835de1
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.version=1.8.0_121
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.vendor=Oracle Corporation
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.class.path=/opt/apache-storm-1.0.5/lib/objenesis-2.1.jar:/opt/apache-storm-1.0.5/lib/log4j-slf4j-impl-2.8.jar:/opt/apache-storm-1.0.5/lib/kryo-3.0.3.jar:/opt/apache-storm-1.0.5/lib/disruptor-3.3.2.jar:/opt/apache-storm-1.0.5/lib/asm-5.0.3.jar:/opt/apache-storm-1.0.5/lib/log4j-core-2.8.jar:/opt/apache-storm-1.0.5/lib/minlog-1.3.0.jar:/opt/apache-storm-1.0.5/lib/slf4j-api-1.7.21.jar:/opt/apache-storm-1.0.5/lib/reflectasm-1.10.1.jar:/opt/apache-storm-1.0.5/lib/storm-core-1.0.5.jar:/opt/apache-storm-1.0.5/lib/storm-rename-hack-1.0.5.jar:/opt/apache-storm-1.0.5/lib/clojure-1.7.0.jar:/opt/apache-storm-1.0.5/lib/log4j-over-slf4j-1.6.6.jar:/opt/apache-storm-1.0.5/lib/servlet-api-2.5.jar:/opt/apache-storm-1.0.5/lib/log4j-api-2.8.jar:/opt/apache-storm-1.0.5/lib/airbrake-java.jar:/opt/apache-storm-1.0.5/conf
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.io.tmpdir=/tmp
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:java.compiler=<NA>
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.name=Linux
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.arch=amd64
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:os.version=4.11.6-3-ARCH
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.name=storm
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.home=/home/storm
> 2017-09-19 09:51:20.320 o.a.s.s.o.a.z.ZooKeeper main [INFO] Client environment:user.dir=/home/storm
> 2017-09-19 09:51:20.321 o.a.s.s.o.a.z.ZooKeeper main [INFO] Initiating client connection, connectString=172.17.0.2:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@455c1d8c
> 2017-09-19 09:51:20.357 o.a.s.s.o.a.z.ClientCnxn main-SendThread(172.17.0.2:2181) [INFO] Opening socket connection to server 172.17.0.2/172.17.0.2:2181. Will not attempt to authenticate using SASL (unknown error)
> 2017-09-19 09:51:20.366 o.a.s.b.FileBlobStoreImpl main [INFO] Creating new blob store based in /home/storm/data/blobs
> 2017-09-19 09:51:20.393 o.a.s.d.nimbus main [INFO] Using custom scheduler: tparking.storm.scheduler.StaticScheduler
> 2017-09-19 09:51:31.406 o.a.s.d.nimbus main [INFO] Starting Nimbus with conf {"topology.builtin.metrics.bucket.size.secs" 60, "nimbus.childopts" "-Xmx1024m
> ...
>
> The thing is that the problem persists even after downgrade back to 1.0.4.
> I cleared all the state from the disk before both the up- and downgrade,
> everything in the nimbus data dir and all the zookeeper state.
>
> Does anyone have an idea about what's going on?
>
> Thanks in advance, Martin
>
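
P.S. To illustrate the "storm.local.dir" change described in the reply above, here is a minimal Python sketch of the two resolution rules. It is not Storm's actual code. The helper name resolve_local_dir and the relative value "data" are hypothetical and used only for illustration; the working directory and install prefix are the ones that appear in the quoted log (user.dir and java.class.path).

```python
import errno
import os
from pathlib import PurePosixPath


def resolve_local_dir(local_dir: str, base_dir: str) -> str:
    """Resolve local_dir against base_dir; absolute paths are returned unchanged."""
    path = PurePosixPath(local_dir)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(base_dir) / path)


# Hypothetical relative setting (an assumption, not taken from any storm.yaml).
local_dir = "data"
working_dir = "/home/storm"             # user.dir in the quoted log
storm_home = "/opt/apache-storm-1.0.5"  # install prefix seen in java.class.path

old_rule = resolve_local_dir(local_dir, working_dir)  # rule described for Nimbus in 1.0.4
new_rule = resolve_local_dir(local_dir, storm_home)   # rule described for 1.0.5 and newer

print("1.0.4 rule:", old_rule)
print("1.0.5 rule:", new_rule)
# errno 13 is the EACCES constant mentioned in the reply.
print("errno 13:", errno.errorcode[13], "-", os.strerror(errno.EACCES))
```

If the two rules give different directories, it may be worth checking that the user running Nimbus can write to the one the new release would pick. An absolute "storm.local.dir" resolves to the same path under both rules.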
