Yeah, we had to disable issues on nathanmarz/storm in order to get everyone to start using the Apache infrastructure. An unfortunate side effect of that was that now links to old github issues return a 404.
All the github issues before the move have been migrated to JIRA now (https://issues.apache.org/jira/browse/STORM). You may have to do some searching to find them, but James Xu did a lot of work to ensure that most of the comments got transferred to JIRA. I’ll consolidate them under a single JIRA when I get the chance, but the relevant issues are: STORM-202, STORM-105, STORM-29, STORM-143, and STORM-95 Or use this query: https://issues.apache.org/jira/browse/STORM-202?jql=project%20%3D%20STORM%20AND%20text%20~%20%22stdout%22 It would be great if we could come up with a proper solution. For those who have been bitten by this it’s painful — zero indication of what the issue is. At the very least, it should be well documented. Heisenbugs suck. ;) - Taylor On Feb 13, 2014, at 11:10 AM, Marc Vaillant <[email protected]> wrote: > I was going to mention something similar. The stdout/stderr issue is > insidious. Only happens in cluster mode and while it was fixed for some > stdout/stderr pathways, others have not been fixed AFAIK. Taylor has > mentioned one, another is stdout/stderr coming from native code via JNI > (i.e. printf, std::cout, std::cerr from C and C++). Unfortunately, I > seem to be unable to access the original issue thread from github that I > tried to reopen about 9 months ago. > > Marc > > On Wed, Feb 12, 2014 at 03:50:52PM -0500, P. Taylor Goetz wrote: >> It looks like you have GC logging turned on, but you haven t specified an >> output file: >> >> "worker.childopts" "-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails >> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=1%ID% -Xmx1200m >> >> I ve run into this before: >> >> "The root cause was that GC logging for workers had been turned on without >> specifying an output file. As a result the GC logging went to standard output >> without being redirected to logback. Eventually the buffer filled and the JVM >> would hang (and thus stop heartbeating) and be killed. The amount of time the >> worker would last depended on memory pressure and allocated heap size >> (obviously, in hindsight the more GC, the more GC logging, the faster the >> buffer fills). >> >> The symptoms were workers timing out and being killed for no apparent >> reason. >> >> I would try with GC logging turned off or redirected to a file. >> >> - Taylor >> >> On Feb 12, 2014, at 3:12 PM, Mark Hu <[email protected]> wrote: >> >> >> Hi guys, >> >> >> Just an update. >> >> 1. After investigating the hang, it looked like the Spout was properly >> sending tuples to a bolt, but the bolt would never acknowledge the >> incoming >> tuple. After switching from using the Netty transport layer back to 0MQ, >> recovery and reliability guarantees were restored. Please note that we're >> using jruby-1.7.3 on top of storm java - could this cause issues with >> netty? >> 2. Can anyone confirm if the above worker restart behaviour is normal? >> >> Thanks, >> Mark >> >> >> >> On Feb 6, 2014, at 10:20 PM, Mark Hu <[email protected]> wrote: >> >> >> HISTORY: >> I have past experience with Storm 0.8.2 and encountered silent worker >> death >> during development. After experimenting with various parallelism, worker, >> and max_spout configurations, we were able to find one that worked well >> with high reliability despite this issue. >> >> The silent worker death basically involves a worker not responding and >> waiting 30 seconds before starting to process work again. A full load test >> reveals no data loss here. >> >> I have this topology running well for the past 3 months and the only >> errors >> were due to the user. >> >> PRESENT: >> I am now testing Storm 0.9.0.1 and it appears the silent worker death >> issue >> is still present, but now the topology hangs. Here are the details: >> >> Type and description for my test topology >> 1. One spout - makes web calls to a rest service to retrieve ids. >> 2. One bolt - matchRetrieveBolt - retrieves blobs for those ids. >> 3. 50 bolts - matchProcessBolt - does some processing on the blobs. >> >> Storm performed swimmingly for about 63,000 blobs before it hung. After >> examining the logs, it appears that one worker was processing both bolt >> type 2 and bolt type 3. For some reason, a reset occurs and the bolts >> reinitialize but become unresponsive. The spout continues to send work but >> times out and retries. >> >> Here is the data for worker-6710.log: >> >> >> 2014-02-07 01:31:26 MatchProcessBolt [INFO] [BOLT-MatchProcessBolt] >> bitfield already allocated, ttl: -1 >> 2014-02-07 01:31:26 MatchProcessBolt [INFO] [BOLT-MatchProcessBolt] All >> done. Acking now >> 2014-02-07 01:31:26 MatchRetrieveBolt [INFO] [BOLT-1] process: first = >> 66300, last = 66350 >> 2014-02-07 01:31:26 MatchRetrieveBolt [INFO] [BOLT-1] url: SOMEURL >> >> >> As per the above, this worker is performing bolt1 and bolt2 work. >> >> Suddenly, a restart occurs as per the below: >> >> >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:host.name=dev-shared2-supervisor-i-e67015c7 >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.version=1.6.0_27 >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.vendor=Sun Microsystems Inc. >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.class.path=/opt/storm/storm-0.9.0.1/storm-netty-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/storm-console-logging-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/storm-core-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/lib/jline-0.9.94.jar:/opt/storm/storm-0.9.0.1/lib/servlet-api-2.5.jar:/opt/storm/storm-0.9.0.1/lib/minlog-1.2.jar:/opt/storm/storm-0.9.0.1/lib/curator-framework-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/snakeyaml-1.11.jar:/opt/storm/storm-0.9.0.1/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/storm-0.9.0.1/lib/clj-time-0.4.1.jar:/opt/storm/storm-0.9.0.1/lib/ring-core-1.1.5.jar:/opt/storm/storm-0.9.0.1/lib/mockito-all-1.9.5.jar:/opt/storm/storm-0.9.0.1/lib/tools.logging-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/disruptor-2.10.1.jar:/opt/storm/storm-0.9.0.1/lib/zookeeper-3.3.3.jar:/opt/storm/storm-0.9.0.1/lib/servlet-api-2.5-20081211.jar:/opt/storm/storm-0.9.0.1/lib/carbonite-1.5.0.jar:/opt/storm/storm-0.9.0.1/lib/guava-13.0.jar:/opt/storm/storm-0.9.0.1/lib/commo> >> >> ns-fileupload-1.2.1.jar:/opt/storm/storm-0.9.0.1/lib/joda-time-2.0.jar:/opt/storm/storm-0.9.0.1/lib/jgrapht-0.8.3.jar:/opt/storm/storm-0.9.0.1/lib/netty-3.6.3.Final.jar:/opt/storm/storm-0.9.0.1/lib/clojure-complete-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/tools.cli-0.2.2.jar:/opt/storm/storm-0.9.0.1/lib/logback-classic-1.0.6.jar:/opt/storm/storm-0.9.0.1/lib/ring-servlet-0.3.11.jar:/opt/storm/storm-0.9.0.1/lib/asm-4.0.jar:/opt/storm/storm-0.9.0.1/lib/commons-exec-1.1.jar:/opt/storm/storm-0.9.0.1/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/storm-0.9.0.1/lib/commons-lang-2.5.jar:/opt/storm/storm-0.9.0.1/lib/commons-logging-1.1.1.jar:/opt/storm/storm-0.9.0.1/lib/clout-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/clj-stacktrace-0.2.2.jar:/opt/storm/storm-0.9.0.1/lib/reflectasm-1.07-shaded.jar:/opt/storm/storm-0.9.0.1/lib/junit-3.8.1.jar:/opt/storm/storm-0.9.0.1/lib/jzmq-2.1.0.jar:/opt/storm/storm-0.9.0.1/lib/jetty-6.1.26.jar:/opt/storm/storm-0.9.0.1/lib/kryo-2.17.jar:/opt/storm/storm-0.9.0.1/lib/ring-devel-0.3.11.jar:/o> >> >> pt/storm/storm-0.9.0.1/lib/hiccup-0.3.6.jar:/opt/storm/storm-0.9.0.1/lib/compojure-1.1.3.jar:/opt/storm/storm-0.9.0.1/lib/jetty-util-6.1.26.jar:/opt/storm/storm-0.9.0.1/lib/commons-io-1.4.jar:/opt/storm/storm-0.9.0.1/lib/slf4j-api-1.6.5.jar:/opt/storm/storm-0.9.0.1/lib/clojure-1.4.0.jar:/opt/storm/storm-0.9.0.1/lib/tools.nrepl-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/curator-client-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/httpcore-4.1.jar:/opt/storm/storm-0.9.0.1/lib/logback-core-1.0.6.jar:/opt/storm/storm-0.9.0.1/lib/commons-codec-1.4.jar:/opt/storm/storm-0.9.0.1/lib/json-simple-1.1.jar:/opt/storm/storm-0.9.0.1/lib/core.incubator-0.1.0.jar:/opt/storm/storm-0.9.0.1/lib/libthrift7-0.7.0-2.jar:/opt/storm/storm-0.9.0.1/lib/tools.macro-0.1.0.jar:/opt/storm/storm-0.9.0.1/lib/httpclient-4.1.1.jar:/opt/storm/storm-0.9.0.1/lib/objenesis-1.2.jar:/opt/storm/storm-0.9.0.1/lib/math.numeric-tower-0.0.1.jar:/opt/storm/storm-0.9.0.1/conf:/mnt/storm/supervisor/stormdist/match_parser-12-1391735866/stormjar.jar >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.io.tmpdir=/tmp >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:java.compiler=<NA> >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:os.name=Linux >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:os.arch=amd64 >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:os.version=3.2.0-48-virtual >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:user.name=storm >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:user.home=/home/storm >> 2014-02-07 01:32:00 o.a.z.ZooKeeper [INFO] Client >> environment:user.dir=/opt/storm/storm-0.9.0.1 >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:host.name=dev-shared2-supervisor-i-e67015c7 >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.version=1.6.0_27 >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.vendor=Sun Microsystems Inc. >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.home=/usr/lib/jvm/java-6-openjdk-amd64/jre >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.class.path=/opt/storm/storm-0.9.0.1/storm-netty-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/storm-console-logging-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/storm-core-0.9.0.1.jar:/opt/storm/storm-0.9.0.1/lib/jline-0.9.94.jar:/opt/storm/storm-0.9.0.1/lib/servlet-api-2.5.jar:/opt/storm/storm-0.9.0.1/lib/minlog-1.2.jar:/opt/storm/storm-0.9.0.1/lib/curator-framework-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/snakeyaml-1.11.jar:/opt/storm/storm-0.9.0.1/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/storm-0.9.0.1/lib/clj-time-0.4.1.jar:/opt/storm/storm-0.9.0.1/lib/ring-core-1.1.5.jar:/opt/storm/storm-0.9.0.1/lib/mockito-all-1.9.5.jar:/opt/storm/storm-0.9.0.1/lib/tools.logging-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/disruptor-2.10.1.jar:/opt/storm/storm-0.9.0.1/lib/zookeeper-3.3.3.jar:/opt/storm/storm-0.9.0.1/lib/servlet-api-2.5-20081211.jar:/opt/storm/storm-0.9.0.1/lib/carbonite-1.5.0.jar:/opt/storm/storm-0.9.0.1/lib/guava-13.0.jar:/opt/storm/storm-0.9.0.1/l> >> >> ib/commons-fileupload-1.2.1.jar:/opt/storm/storm-0.9.0.1/lib/joda-time-2.0.jar:/opt/storm/storm-0.9.0.1/lib/jgrapht-0.8.3.jar:/opt/storm/storm-0.9.0.1/lib/netty-3.6.3.Final.jar:/opt/storm/storm-0.9.0.1/lib/clojure-complete-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/tools.cli-0.2.2.jar:/opt/storm/storm-0.9.0.1/lib/logback-classic-1.0.6.jar:/opt/storm/storm-0.9.0.1/lib/ring-servlet-0.3.11.jar:/opt/storm/storm-0.9.0.1/lib/asm-4.0.jar:/opt/storm/storm-0.9.0.1/lib/commons-exec-1.1.jar:/opt/storm/storm-0.9.0.1/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/storm-0.9.0.1/lib/commons-lang-2.5.jar:/opt/storm/storm-0.9.0.1/lib/commons-logging-1.1.1.jar:/opt/storm/storm-0.9.0.1/lib/clout-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/clj-stacktrace-0.2.2.jar:/opt/storm/storm-0.9.0.1/lib/reflectasm-1.07-shaded.jar:/opt/storm/storm-0.9.0.1/lib/junit-3.8.1.jar:/opt/storm/storm-0.9.0.1/lib/jzmq-2.1.0.jar:/opt/storm/storm-0.9.0.1/lib/jetty-6.1.26.jar:/opt/storm/storm-0.9.0.1/lib/kryo-2.17.jar:/opt/storm/storm-0.9.0.1/lib/ring-devel-0.3.1> >> >> 1.jar:/opt/storm/storm-0.9.0.1/lib/hiccup-0.3.6.jar:/opt/storm/storm-0.9.0.1/lib/compojure-1.1.3.jar:/opt/storm/storm-0.9.0.1/lib/jetty-util-6.1.26.jar:/opt/storm/storm-0.9.0.1/lib/commons-io-1.4.jar:/opt/storm/storm-0.9.0.1/lib/slf4j-api-1.6.5.jar:/opt/storm/storm-0.9.0.1/lib/clojure-1.4.0.jar:/opt/storm/storm-0.9.0.1/lib/tools.nrepl-0.2.3.jar:/opt/storm/storm-0.9.0.1/lib/curator-client-1.0.1.jar:/opt/storm/storm-0.9.0.1/lib/httpcore-4.1.jar:/opt/storm/storm-0.9.0.1/lib/logback-core-1.0.6.jar:/opt/storm/storm-0.9.0.1/lib/commons-codec-1.4.jar:/opt/storm/storm-0.9.0.1/lib/json-simple-1.1.jar:/opt/storm/storm-0.9.0.1/lib/core.incubator-0.1.0.jar:/opt/storm/storm-0.9.0.1/lib/libthrift7-0.7.0-2.jar:/opt/storm/storm-0.9.0.1/lib/tools.macro-0.1.0.jar:/opt/storm/storm-0.9.0.1/lib/httpclient-4.1.1.jar:/opt/storm/storm-0.9.0.1/lib/objenesis-1.2.jar:/opt/storm/storm-0.9.0.1/lib/math.numeric-tower-0.0.1.jar:/opt/storm/storm-0.9.0.1/conf:/mnt/storm/supervisor/stormdist/match_parser-12-1391735866/stormjar.jar >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.io.tmpdir=/tmp >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:java.compiler=<NA> >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:os.name=Linux >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:os.arch=amd64 >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:os.version=3.2.0-48-virtual >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:user.name=storm >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:user.home=/home/storm >> 2014-02-07 01:32:00 o.a.z.s.ZooKeeperServer [INFO] Server >> environment:user.dir=/opt/storm/storm-0.9.0.1 >> 2014-02-07 01:32:01 b.s.d.worker [INFO] Launching worker for >> match_parser-12-1391735866 on f1b66213-7237-483e-a5db-e543bb36bcd9:6710 with >> id 80483ed6-2c51-4f86-9d06-9d2bed705748 and conf {"dev.zookeeper.path" >> "/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil, >> "topology.builtin.metrics.bucket.size.secs" 60, >> "topology.fall.back.on.java.serialization" true, >> "topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000, >> "topology.skip.missing.kryo.registrations" false, >> "storm.messaging.netty.client_worker_threads" 1, "ui.childopts" "-Xmx768m", >> "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, >> "topology.trident.batch.emit.interval.millis" 500, >> "nimbus.monitor.freq.secs" 10, "logviewer.childopts" "-Xmx128m", >> "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", >> "topology.executor.send.buffer.size" 1024, "storm.local.dir" "/mnt/storm", >> "storm.messaging.netty.buffer_size" 5242880, >> "supervisor.worker.start.timeout.secs" 120, >> "topology.enable.message.timeouts" true, "nim> bus.cleanup.inbox.freq.secs" >> 600, "nimbus.inbox.jar.expiration.secs" 3600, "drpc.worker.threads" 64, >> "topology.worker.shared.thread.pool.size" 4, "nimbus.host" "10.0.5.50", >> "storm.messaging.netty.min_wait_ms" 100, "storm.zookeeper.port" 2181, >> "transactional.zookeeper.port" nil, "topology.executor.receive.buffer.size" >> 1024, "transactional.zookeeper.servers" nil, "storm.zookeeper.root" >> "/storm", "storm.zookeeper.retry.intervalceiling.millis" 30000, >> "supervisor.enable" true, "storm.messaging.netty.server_worker_threads" 1, >> "storm.zookeeper.servers" ["10.0.6.131" "10.0.6.130" "10.0.6.55"], >> "transactional.zookeeper.root" "/transactional", "topology.acker.executors" >> nil, "topology.transfer.buffer.size" 1024, "topology.worker.childopts" nil, >> "drpc.queue.size" 128, "worker.childopts" "-verbose:gc >> -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote >> -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=1%ID% -Xmx1200m", "sup> >> ervisor.heartbeat.frequency.secs" 5, "topology.error.throttle.interval.secs" >> 10, "zmq.hwm" 0, "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, >> "drpc.childopts" "-Xmx768m", "topology.receiver.buffer.size" 8, >> "task.heartbeat.frequency.secs" 3, "topology.tasks" nil, >> "storm.messaging.netty.max_retries" 100, "topology.spout.wait.strategy" >> "backtype.storm.spout.SleepSpoutWaitStrategy", "topology.max.spout.pending" >> nil, "storm.zookeeper.retry.interval" 1000, >> "topology.sleep.spout.wait.strategy.time.ms" 1, "nimbus.topology.validator" >> "backtype.storm.nimbus.DefaultTopologyValidator", "supervisor.slots.ports" >> [6700 6701 6702 6703 6704 6705 6706 6707 6708 6709 6710 6711 6712 6713 6714 >> 6715 6716 6717 6718 6719 6720 6721], "topology.debug" false, >> "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60, >> "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10, >> "topology.workers" 1, "supervisor.childopts" "-verbose:gc >> -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote >> -Dcom.> sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=8091", "nimbus.thrift.port" 6627, >> "topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, >> "topology.tuple.serializer" >> "backtype.storm.serialization.types.ListDelegateSerializer", >> "topology.disruptor.wait.strategy" >> "com.lmax.disruptor.BlockingWaitStrategy", "nimbus.task.timeout.secs" 30, >> "storm.zookeeper.connection.timeout" 15000, "topology.kryo.factory" >> "backtype.storm.serialization.DefaultKryoFactory", "drpc.invocations.port" >> 3773, "logviewer.port" 8000, "zmq.threads" 1, "storm.zookeeper.retry.times" >> 5, "storm.thrift.transport" >> "backtype.storm.security.auth.SimpleTransportPlugin", >> "topology.state.synchronization.timeout.secs" 60, >> "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, >> "storm.messaging.transport" "backtype.storm.messaging.netty.Context", >> "logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms" 1000, >> "drpc.request.timeout.secs> " 600, "storm.local.mode.zmq" false, "ui.port" >> 8080, "nimbus.childopts" "-verbose:gc -XX:+PrintGCTimeStamps >> -XX:+PrintGCDetails -Dcom.sun.management.jmxremote >> -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=8090", "storm.cluster.mode" >> "distributed", "topology.optimize" true, "topology.max.task.parallelism" nil} >> 2014-02-07 01:32:01 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting >> 2014-02-07 01:32:01 o.a.z.ZooKeeper [INFO] Initiating client connection, >> connectString=10.0.6.131:2181,10.0.6.130:2181,10.0.6.55:2181 >> sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@25de152f >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Opening socket connection to >> server /10.0.6.131:2181 >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Socket connection established >> to ip-10-0-6-131.ec2.internal/10.0.6.131:2181, initiating session >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Session establishment >> complete on server ip-10-0-6-131.ec2.internal/10.0.6.131:2181, sessionid = >> 0xf343fa0ee4d600f0, negotiated timeout = 20000 >> 2014-02-07 01:32:01 b.s.zookeeper [INFO] Zookeeper state update: >> :connected:none >> 2014-02-07 01:32:01 o.a.z.ZooKeeper [INFO] Session: 0xf343fa0ee4d600f0 >> closed >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] EventThread shut down >> 2014-02-07 01:32:01 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting >> 2014-02-07 01:32:01 o.a.z.ZooKeeper [INFO] Initiating client connection, >> connectString=10.0.6.131:2181,10.0.6.130:2181,10.0.6.55:2181/storm >> sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@31930611 >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Opening socket connection to >> server /10.0.6.131:2181 >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Socket connection established >> to ip-10-0-6-131.ec2.internal/10.0.6.131:2181, initiating session >> 2014-02-07 01:32:01 o.a.z.ClientCnxn [INFO] Session establishment >> complete on server ip-10-0-6-131.ec2.internal/10.0.6.131:2181, sessionid = >> 0xf343fa0ee4d600f1, negotiated timeout = 20000 >> 2014-02-07 01:32:01 b.s.m.TransportFactory [INFO] Storm peer transport >> plugin:backtype.storm.messaging.netty.Context >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loading executor 2:[2 2] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loaded executor tasks 2:[2 2] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Finished loading executor 2:[2 >> 2] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loading executor __acker:[102 >> 102] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Preparing bolt 2:(2) >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loaded executor tasks >> __acker:[102 102] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Timeouts disabled for executor >> __acker:[102 102] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Preparing bolt __acker:(102) >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Finished loading executor >> __acker:[102 102] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Prepared bolt __acker:(102) >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loading executor 3:[52 52] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loaded executor tasks 3:[52 52] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Finished loading executor 3:[52 >> 52] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Preparing bolt 3:(52) >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loading executor __system:[-1 >> -1] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Loaded executor tasks >> __system:[-1 -1] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Finished loading executor >> __system:[-1 -1] >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Preparing bolt __system:(-1) >> 2014-02-07 01:32:01 b.s.d.worker [INFO] Launching receive-thread for >> f1b66213-7237-483e-a5db-e543bb36bcd9:6710 >> 2014-02-07 01:32:01 b.s.d.executor [INFO] Prepared bolt __system:(-1) >> 2014-02-07 01:32:02 b.s.d.worker [INFO] Worker has topology config >> {"storm.id" "match_parser-12-1391735866", "dev.zookeeper.path" >> "/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil, >> "topology.builtin.metrics.bucket.size.secs" 60, >> "topology.fall.back.on.java.serialization" true, >> "topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000, >> "topology.skip.missing.kryo.registrations" false, >> "storm.messaging.netty.client_worker_threads" 1, "ui.childopts" "-Xmx768m", >> "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, >> "topology.trident.batch.emit.interval.millis" 500, >> "nimbus.monitor.freq.secs" 10, "logviewer.childopts" "-Xmx128m", >> "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", >> "executor.send.buffer.size" 16384, "topology.executor.send.buffer.size" >> 1024, "storm.local.dir" "/mnt/storm", "iwparser_socket" >> "/tmp/iwparser2Sockets/iwparserClients.ipc", >> "storm.messaging.netty.buffer_size" 5242880, >> "supervisor.worker.start.timeout.secs" 120, "topology.enable.message.t> >> imeouts" true, "nimbus.cleanup.inbox.freq.secs" 600, "batch_size" 50, >> "nimbus.inbox.jar.expiration.secs" 3600, "aws_creds.aws_secret_access_key" >> "wmGLQ4ItvdhniRFMJe3yDJvBwTJUTqbuLYAS5ZPU", "drpc.worker.threads" 64, >> "redis.processed.match.log.size" 100000, "redis.key.match.ok" "MATCH-OK", >> "topology.worker.shared.thread.pool.size" 4, "nimbus.host" "10.0.5.50", >> "storm.messaging.netty.min_wait_ms" 100, "storm.zookeeper.port" 2181, >> "transactional.zookeeper.port" nil, "topology.executor.receive.buffer.size" >> 1024, "aws_creds.aws_access_key_id" "AKIAJP4O6MTXON3MLV2Q", >> "executor.receive.buffer.size" 16384, "transactional.zookeeper.servers" nil, >> "storm.zookeeper.root" "/storm", "aws.s3.match_bucket_name" "prod-matches", >> "storm.zookeeper.retry.intervalceiling.millis" 30000, "supervisor.enable" >> true, "storm.messaging.netty.server_worker_threads" 1, >> "storm.zookeeper.servers" ["10.0.6.131" "10.0.6.130" "10.0.6.55"], >> "transactional.zookeeper.root" "/transactional", "topology.acker.executors" >> 50, "topology.kryo.decorators" > (), "topology.name" "match_parser", >> "topology.transfer.buffer.size" 1024, "topology.worker.childopts" >> "-Djruby.compat.version=RUBY1_9", "drpc.queue.size" 128, "worker.childopts" >> "-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails >> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=1%ID% -Xmx1200m", >> "supervisor.heartbeat.frequency.secs" 5, "transfer.buffer.size" 32, >> "topology.error.throttle.interval.secs" 10, "throttle.spout" 1, "zmq.hwm" 0, >> "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, "drpc.childopts" >> "-Xmx768m", "topology.receiver.buffer.size" 8, >> "task.heartbeat.frequency.secs" 3, "topology.tasks" nil, >> "storm.messaging.netty.max_retries" 100, "topology.spout.wait.strategy" >> "backtype.storm.spout.SleepSpoutWaitStrategy", "receiver.buffer.size" 8, >> "debug.logging.redis" true, "topology.max.spout.pending" 1, >> "storm.zookeeper.retry.interval" 1000, >> "topology.sleep.spout.wait.strategy.time.ms" 1> , "platform" "xbox", >> "nimbus.topology.validator" >> "backtype.storm.nimbus.DefaultTopologyValidator", "supervisor.slots.ports" >> [6700 6701 6702 6703 6704 6705 6706 6707 6708 6709 6710 6711 6712 6713 6714 >> 6715 6716 6717 6718 6719 6720 6721], "iwparser_port" 12330, "redis.port" >> 6379, "topology.debug" false, "nimbus.task.launch.secs" 120, >> "nimbus.supervisor.timeout.secs" 60, "debug.logging.stdout" true, >> "topology.kryo.register" nil, "topology.message.timeout.secs" 30, >> "task.refresh.poll.secs" 10, "dw_retries.max_attempts" 6, "topology.workers" >> 50, "supervisor.childopts" "-verbose:gc -XX:+PrintGCTimeStamps >> -XX:+PrintGCDetails -Dcom.sun.management.jmxremote >> -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=8091", "nimbus.thrift.port" 6627, >> "redis_prefix" "prod", "topology.stats.sample.rate" 0.05, >> "worker.heartbeat.frequency.secs" 1, "topology.tuple.serializer" >> "backtype.storm.serialization.types.ListDelegateSerializer", >> "topology.disruptor.w> ait.strategy" >> "com.lmax.disruptor.BlockingWaitStrategy", "http_proxy" "proxy", >> "nimbus.task.timeout.secs" 30, "storm.zookeeper.connection.timeout" 15000, >> "topology.kryo.factory" "backtype.storm.serialization.DefaultKryoFactory", >> "drpc.invocations.port" 3773, "logviewer.port" 8000, "zmq.threads" 1, >> "storm.zookeeper.retry.times" 5, "storm.thrift.transport" >> "backtype.storm.security.auth.SimpleTransportPlugin", >> "topology.state.synchronization.timeout.secs" 60, >> "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, >> "storm.messaging.transport" "backtype.storm.messaging.netty.Context", >> "logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms" 1000, >> "debug.logging.kibana" false, "iwparser_host" "10.200.6.145", >> "drpc.request.timeout.secs" 600, "demonware.endpoint" >> "http://10.0.1.78:9000/matches/", "redis_simple_prefix" "prod", "redis.host" >> "storm090-redis.uvvujo.0001.use1.cache.amazonaws.com", >> "dw_retries.wait_interval" 2, "storm.local.mode.zmq" false, "ui.port" 8080, >> "nimbus.childopts> " "-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails >> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.port=8090", "storm.cluster.mode" >> "distributed", "topology.optimize" true, "topology.max.task.parallelism" >> nil, "statsd.endpoint" "statsd", "redis.log.size" 100000} >> 2014-02-07 01:32:02 b.s.d.worker [INFO] Worker >> 80483ed6-2c51-4f86-9d06-9d2bed705748 for storm match_parser-12-1391735866 on >> f1b66213-7237-483e-a5db-e543bb36bcd9:6710 has finished loading >> 2014-02-07 01:32:08 MatchRetrieveBolt [INFO] [BOLT-1] on_init >> 2014-02-07 01:32:08 MatchProcessBolt [INFO] [BOlT-MatchProcessBolt] >> on_init >> 2014-02-07 01:32:08 b.s.d.executor [INFO] Prepared bolt 2:(2) >> 2014-02-07 01:32:08 b.s.d.executor [INFO] Prepared bolt 3:(52) >> >> >> The bolts have reinitialized. The Storm UI is showing that bolt1 is still >> being handled by this specific worker. >> >> >> >> The spout still sends tuples to matchRetrieveBolt but never gets a >> response. The failure is caught and retries continue. >> >> >> Can anyone comment or provide any thoughts on the above issue? >> >> Thanks, >> Mark >> >> > >
signature.asc
Description: Message signed with OpenPGP using GPGMail
