Hi Geeta,

Check the first line of your error log. It says:

    Remote address is not reachable. We will close this client.

Your remote Nimbus isn't available to receive the topology you are submitting. Make sure you have the right IP address in your /home/yourname/.storm/storm.yaml file.
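If it helps, one quick way to confirm whether the Nimbus address is reachable from the machine you submit from is a plain TCP connect. This is only a rough sketch, not anything Storm ships: the function name is_reachable is made up for illustration, the host is whatever you pass on the command line, and I am assuming Nimbus listens on the default nimbus.thrift.port of 6627 (adjust if your cluster overrides it).

```python
import socket
import sys


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical usage: python check_nimbus.py <nimbus-host> [port]
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    # Assumption: 6627 is the default nimbus.thrift.port.
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 6627
    state = "reachable" if is_reachable(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

A successful connect only shows that something is listening on that address and port; it does not prove the listener is a healthy Nimbus.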

On Mon, May 5, 2014 at 4:18 PM, Geeta Iyer <[email protected]> wrote:

> Hi,
>
> I am exploring storm-0.9.1-incubating for running a topology. The topology
> consists of 1 spout and 4 bolts. I was trying this out on a 10 node
> cluster. When I start streaming messages through the topology, the workers
> fail again and again with the exception:
>
> 2014-05-05 10:40:28 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
> 2014-05-05 10:40:31 b.s.util [ERROR] Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1577.invoke(disruptor.clj:89) ~[na:na]
>     at backtype.storm.util$async_loop$fn__384.invoke(util.clj:433) ~[na:na]
>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>     at backtype.storm.messaging.netty.Client.send(Client.java:125) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398$fn__4399.invoke(worker.clj:319) ~[na:na]
>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398.invoke(worker.clj:308) ~[na:na]
>     at backtype.storm.disruptor$clojure_handler$reify__1560.onEvent(disruptor.clj:58) ~[na:na]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:104) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>     ... 6 common frames omitted
>
> I tried multiple configurations, by running multiple executors/tasks on
> single worker versus assigning one task per worker and running multiple
> workers per node. However, every time, it is the same issue.
>
> The same topology works fine on storm-0.8.2 version with the same amount
> of traffic.
>
> Is there any configuration that needs to be tweaked? Any suggestions will
> be really helpful.
>
> I want to compare the performance of using storm-0.8.2 with 0mq and 0.9.1
> with netty for my topology and see if we can achieve better performance
> with storm 0.9.1
>
> This is what my storm.yaml on supervisor nodes look like currently:
>
> ########### These MUST be filled in for a storm configuration
> storm.zookeeper.servers:
>     - "<zk-hostname-1>"
>     - "<zk-hostname-2>"
>     - "<zk-hostname-3>"
>
> nimbus.host: "<nimbus-host>"
>
> storm.local.dir: "/tmp/forStorm"
>
> supervisor.slots.ports:
>     - 6900
>     - 6901
>     - 6902
>     - 6903
>     - 6904
>
> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
> storm.messaging.netty.server_worker_threads: 1
> storm.messaging.netty.client_worker_threads: 1
> storm.messaging.netty.buffer_size: 5242880
> storm.messaging.netty.max_retries: 10
> storm.messaging.netty.max_wait_ms: 1000
> storm.messaging.netty.min_wait_ms: 100
> worker.childopts: "-Xmx6144m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=1%ID% -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
>
> --
> Thanks,
> Geeta

--
Ebot T.
