Basically, our spout is based on a REST service; that is what accounts for the
HTTP requests. We have the same topology with the exact same configuration
running without any issues on 0.8.2. So is there any configuration on 0.9.1
that I should be tweaking explicitly?
Also, apart from the workers running the REST-based spout, even the workers
that execute the bolt tasks fail consistently.

On Mon, May 5, 2014 at 10:30 PM, Ebot Tabi <[email protected]> wrote:

> Hey Geeta,
> From what I see in the logs, your topology does an HTTP request to an
> external service for some extra data, and it takes a lot of time and sees
> failed connections as well; this ends up causing your topology not to run
> correctly. I would suggest you implement a better way to do the HTTP
> requests. Remember, Storm is pretty fast, and if you have a stream of data
> coming in and you hit the external source for every tuple, that's not good.
>
> On Mon, May 5, 2014 at 4:55 PM, Geeta Iyer <[email protected]> wrote:
>
>> I am attaching the logs of one of the workers. Hope that helps...
>>
>> On Mon, May 5, 2014 at 10:18 PM, Ebot Tabi <[email protected]> wrote:
>>
>>> Can you check the logs on your production server and see why it keeps
>>> restarting? I suspect it could be ZooKeeper, but I am not sure if that
>>> is the case here. If you can get the logs from the production server,
>>> that would be great.
>>>
>>> On Mon, May 5, 2014 at 4:34 PM, Geeta Iyer <[email protected]> wrote:
>>>
>>>> I verified the nimbus hostname in the configuration (nimbus.host:
>>>> "<nimbus-host>"). It is correct. The topology does run for a short
>>>> time and acks a very small number of messages successfully, but as
>>>> time progresses, the workers keep restarting.
>>>>
>>>> On Mon, May 5, 2014 at 9:58 PM, Ebot Tabi <[email protected]> wrote:
>>>>
>>>>> Hi Geeta,
>>>>> Check the first line of your error log; you will see that it says:
>>>>> "Remote address is not reachable. We will close this client."
>>>>> Your remote Nimbus isn't available to receive the topology you are
>>>>> submitting; make sure you have the right IP address in your
>>>>> /home/yourname/.storm/storm.yaml file.
>>>>>
>>>>> On Mon, May 5, 2014 at 4:18 PM, Geeta Iyer <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am exploring storm-0.9.1-incubating for running a topology. The
>>>>>> topology consists of 1 spout and 4 bolts. I was trying this out on
>>>>>> a 10-node cluster. When I start streaming messages through the
>>>>>> topology, the workers fail again and again with this exception:
>>>>>>
>>>>>> 2014-05-05 10:40:28 b.s.m.n.Client [WARN] Remote address is not
>>>>>> reachable. We will close this client.
>>>>>> 2014-05-05 10:40:31 b.s.util [ERROR] Async loop died!
>>>>>> java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>>>>>>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>>>>>>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1577.invoke(disruptor.clj:89) ~[na:na]
>>>>>>     at backtype.storm.util$async_loop$fn__384.invoke(util.clj:433) ~[na:na]
>>>>>>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>>>>>>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
>>>>>> Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>>>>>>     at backtype.storm.messaging.netty.Client.send(Client.java:125) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>>>>>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398$fn__4399.invoke(worker.clj:319) ~[na:na]
>>>>>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398.invoke(worker.clj:308) ~[na:na]
>>>>>>     at backtype.storm.disruptor$clojure_handler$reify__1560.onEvent(disruptor.clj:58) ~[na:na]
>>>>>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:104) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
>>>>>>     ... 6 common frames omitted
>>>>>>
>>>>>> I tried multiple configurations: running multiple executors/tasks on
>>>>>> a single worker versus assigning one task per worker and running
>>>>>> multiple workers per node. However, every time, it is the same issue.
>>>>>>
>>>>>> The same topology works fine on storm-0.8.2 with the same amount of
>>>>>> traffic.
>>>>>>
>>>>>> Is there any configuration that needs to be tweaked? Any suggestions
>>>>>> would be really helpful.
>>>>>>
>>>>>> I want to compare the performance of storm-0.8.2 with 0mq against
>>>>>> 0.9.1 with Netty for my topology, and see whether we can achieve
>>>>>> better performance with storm 0.9.1.
>>>>>>
>>>>>> This is what my storm.yaml on the supervisor nodes currently looks
>>>>>> like:
>>>>>>
>>>>>> ########### These MUST be filled in for a storm configuration
>>>>>> storm.zookeeper.servers:
>>>>>>     - "<zk-hostname-1>"
>>>>>>     - "<zk-hostname-2>"
>>>>>>     - "<zk-hostname-3>"
>>>>>>
>>>>>> nimbus.host: "<nimbus-host>"
>>>>>>
>>>>>> storm.local.dir: "/tmp/forStorm"
>>>>>>
>>>>>> supervisor.slots.ports:
>>>>>>     - 6900
>>>>>>     - 6901
>>>>>>     - 6902
>>>>>>     - 6903
>>>>>>     - 6904
>>>>>>
>>>>>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>>>>> storm.messaging.netty.server_worker_threads: 1
>>>>>> storm.messaging.netty.client_worker_threads: 1
>>>>>> storm.messaging.netty.buffer_size: 5242880
>>>>>> storm.messaging.netty.max_retries: 10
>>>>>> storm.messaging.netty.max_wait_ms: 1000
>>>>>> storm.messaging.netty.min_wait_ms: 100
>>>>>> worker.childopts: "-Xmx6144m -Djava.net.preferIPv4Stack=true
>>>>>>     -Dcom.sun.management.jmxremote.port=1%ID%
>>>>>>     -Dcom.sun.management.jmxremote.ssl=false
>>>>>>     -Dcom.sun.management.jmxremote.authenticate=false"
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Geeta

--
Thanks,
Geeta
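
[Editor's note on Geeta's direct question about 0.9.1 configuration: the
"Remote address is not reachable. We will close this client." warning is
logged by the 0.9.1 Netty client after it has exhausted its reconnection
attempts, and once the client is closed, any executor that still tries to
send through it dies with the "Client is being closed, and does not take
requests any more" stack trace above, taking the worker down with it. With
the settings shown (max_retries: 10, backoff capped at 1000 ms), a client
gives up on a peer after roughly ten seconds, so a single slow worker restart
can cascade across the cluster. A hedged first experiment, not a confirmed
fix, is simply to make the client much more patient; the values below are
illustrative only:

# Illustrative values, not a confirmed fix: let the Netty client keep
# retrying much longer before it declares a peer unreachable and closes.
storm.messaging.netty.max_retries: 30
storm.messaging.netty.max_wait_ms: 2000
storm.messaging.netty.min_wait_ms: 100]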

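[Editor's note on Ebot's point about per-tuple HTTP calls: below is a minimal
sketch of the kind of change he is suggesting. It memoizes responses from the
external REST service inside the bolt, so a hot key costs one HTTP round trip
per worker instead of one per tuple. Everything here (the class name, field
names, and the lookupUrl parameter) is hypothetical and not taken from
Geeta's topology; a real version would also bound the cache and expire
entries.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// Hypothetical enrichment bolt: caches responses from the external REST
// service so a repeated key is fetched over HTTP only once per worker.
public class CachingEnrichBolt extends BaseBasicBolt {
    private final String lookupUrl; // e.g. "http://some-service/lookup?key="
    private transient ConcurrentHashMap<String, String> cache;

    public CachingEnrichBolt(String lookupUrl) {
        this.lookupUrl = lookupUrl;
    }

    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        // Created here rather than in the constructor so the bolt stays
        // serializable when the topology is submitted.
        cache = new ConcurrentHashMap<String, String>();
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getString(0);
        String value = cache.get(key);
        if (value == null) {               // only cache misses hit the network
            value = fetch(key);
            cache.putIfAbsent(key, value);
        }
        collector.emit(new Values(key, value));
    }

    // Blocking HTTP GET with short timeouts so a slow external service
    // cannot stall the executor thread indefinitely.
    private String fetch(String key) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(lookupUrl + key).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            try {
                String line = in.readLine();
                return line == null ? "" : line;
            } finally {
                in.close();
            }
        } catch (Exception e) {
            return "";                     // degrade instead of failing the tuple
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "value"));
    }
}

The same idea applies inside the spout if the REST call happens there; the
essential point is that nextTuple() and execute() should almost never block
on the network.]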