GC can cause a topology to be rebalanced, but if rebalancing happens
frequently, that is probably not normal.
You should check the worker log first, since the worker may be hitting an
error. Then check the supervisor log, since the supervisor may be killing
the worker frequently.
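
To see how late a heartbeat was when the supervisor gave up on the worker, you can compare the two timestamps in the "Shutting down and clearing state" entry. Below is a rough sketch, not an official Storm tool: the regex and the helper name heartbeat_ages are my own, and they assume the log line has the same shape as the one quoted in this thread. The result can be compared with supervisor.worker.timeout.secs in your storm.yaml (I believe the default is 30 seconds, but please verify it for your version).

```python
import re

# Assumption: entries look like the supervisor log line quoted in this thread.
# \s* and DOTALL tolerate the line wraps that mail clients insert.
PATTERN = re.compile(
    r"Current supervisor time:\s*(\d+)\.\s*State:\s*(:[\w-]+),"
    r"\s*Heartbeat:.*?:time-secs\s+(\d+)",
    re.DOTALL,
)


def heartbeat_ages(log_text):
    """Return (state, heartbeat age in seconds) for each matching log entry."""
    results = []
    for match in PATTERN.finditer(log_text):
        supervisor_time = int(match.group(1))
        state = match.group(2)
        heartbeat_time = int(match.group(3))
        results.append((state, supervisor_time - heartbeat_time))
    return results


# The entry from the supervisor log quoted below in this thread.
sample = (
    "2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shutting down and clearing "
    "state for id a8733f89-9f41-4624-9bce-d9f2d8c449ee. Current supervisor time: "
    "1410261997. State: :timed-out, Heartbeat: "
    "#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1410261964, "
    ':storm-id "popeyeTopology-2-1410260830", :executors #{[4 4] [7 7] [-1 -1] '
    "[1 1]}, :port 6700}"
)

for state, age in heartbeat_ages(sample):
    print(state, age)
```

For the quoted entry the last heartbeat was 33 seconds old (1410261997 - 1410261964) when the worker was marked :timed-out. If most entries in your log show an age just above the configured timeout, long pauses in the worker (GC or otherwise) are a plausible cause and the worker's GC log is the next place to look.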


2014-09-09 20:34 GMT+08:00 Vikas Agarwal <[email protected]>:

> Yes, GC might be the issue. I read about it somewhere in the archives:
> http://grokbase.com/t/gg/storm-user/133mdhagd0/worker-dying
>
>
> On Tue, Sep 9, 2014 at 5:09 PM, Palak Shah <[email protected]> wrote:
>
>> I looked at the supervisor and nimbus logs and confirmed that the
>> topology is being rebalanced. It could be because of timeout. Could you
>> help me figure out exactly why this timeout occurs and how I can fix it?
>>
>> I have enabled GC for my workers and supervisor. Could it be the reason
>> that worker is not able to send a heartbeat? I tried increasing the heap
>> size allotted to each worker by tweaking the value of worker.childopts to
>> have "-Xmx768m". But I did not see any difference in behaviour of my
>> topology. How can I fix this issue?
>>
>> here are my supervisor logs -
>>
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shutting down and clearing
>> state for id a8733f89-9f41-4624-9bce-d9f2d8c449ee. Current supervisor time:
>> 1410261997. State: :timed-out, Heartbeat:
>> #backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1410261964,
>> :storm-id "popeyeTopology-2-1410260830", :executors #{[4 4] [7 7] [-1 -1]
>> [1 1]}, :port 6700}
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shutting down
>> 9f26d478-1963-425e-a0c2-139712f32b9e:a8733f89-9f41-4624-9bce-d9f2d8c449ee
>> 2014-09-09 16:56:37 b.s.util [INFO] Error when trying to kill 28950.
>> Process is probably already dead.
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shut down
>> 9f26d478-1963-425e-a0c2-139712f32b9e:a8733f89-9f41-4624-9bce-d9f2d8c449ee
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Launching worker with
>> assignment #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id
>> "popeyeTopology-2-1410260830", :executors ([7 7] [4 4] [1 1])} for this
>> supervisor 9f26d478-1963-425e-a0c2-139712f32b9e on port 6700 with id
>> 78233b86-b7f3-4590-9709-ba0d71130d7e
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Launching worker with
>> command: '/usr/lib/jvm/java-7-oracle/bin/java' '-server' '-Xmx768m'
>> '-verbose:gc' '-XX:+PrintGCTimeStamps' '-XX:+PrintGCDetails'
>> '-Dcom.sun.management.jmxremote' '-Dcom.sun.management.jmxremote.ssl=false'
>> '-Dcom.sun.management.jmxremote.authenticate=false'
>> '-Dcom.sun.management.jmxremote.port=16700'
>> '-Djava.library.path=/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/resources/Linux-amd64:/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/resources:/usr/lib/jvm/java-7-oracle/lib'
>> '-Dlogfile.name=worker-6700.log'
>> '-Dstorm.home=/home/stormcluster/Storm/apache-storm-0.9.2-incubating'
>> '-Dlogback.configurationFile=/home/stormcluster/Storm/apache-storm-0.9.2-incubating/logback/cluster.xml'
>> '-Dstorm.id=popeyeTopology-2-1410260830'
>> '-Dworker.id=78233b86-b7f3-4590-9709-ba0d71130d7e' '-Dworker.port=6700'
>> '-cp'
>> '/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/chill-java-0.3.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-exec-1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-io-2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/joda-time-2.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-servlet-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/scala-library-2.9.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clout-1.0.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/logback-core-1.0.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/json-simple-1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/curator-client-2.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/carbonite-1.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/asm-4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/httpcore-4.3.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jetty-util-6.1.26.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clj-stacktrace-0.2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/log4j-over-slf4j-1.6.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/netty-3.2.2.Final.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/compojure-1.1.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/zookeeper-3.4.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-devel-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jgrapht-core-0.9.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/guava-13.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/objenesis-1.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/minlog-1.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/core.incubator-0.1.0.jar:/home
/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clj-time-0.4.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/servlet-api-2.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/storm-core-0.9.2-incubating.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-logging-1.1.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/logback-classic-1.0.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clojure-1.5.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-jetty-adapter-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/tools.macro-0.1.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/reflectasm-1.07-shaded.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/math.numeric-tower-0.0.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-fileupload-1.2.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/servlet-api-2.5-20081211.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jetty-6.1.26.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jline-2.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/tools.cli-0.2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/slf4j-api-1.6.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/hiccup-0.3.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-codec-1.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/snakeyaml-1.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/zmq.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/netty-3.6.3.Final.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/storm-kafka-0.9.2-incubating.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-core-1.1.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/kafka_2.9.2-0.8.1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-i
ncubating/lib/tools.logging-0.2.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/kryo-2.21.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/disruptor-2.10.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/httpclient-4.3.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-lang-2.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/curator-framework-2.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/conf:/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/stormjar.jar'
>> 'backtype.storm.daemon.worker' 'popeyeTopology-2-1410260830'
>> '9f26d478-1963-425e-a0c2-139712f32b9e' '6700'
>> '78233b86-b7f3-4590-9709-ba0d71130d7e'
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:37 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:38 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:38 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:39 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:39 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:40 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>> 2014-09-09 16:56:40 b.s.d.supervisor [INFO]
>> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>>
>>
>> Thanks,
>> Palak
>>
>> On Mon, Sep 8, 2014 at 6:46 PM, Cyrille Karmann <[email protected]> wrote:
>>
>>> Look at the supervisor logs. There could be a timeout that makes a
>>> supervisor force a worker to shut down, triggering a rebalance.
>>>
>>>
>>>
>>> 2014-09-08 7:17 GMT-04:00 Palak Shah <[email protected]>:
>>>
>>>> I have a topology that uses a Kafka spout to read values from a Kafka
>>>> queue. I used the kafkaSpout that came with storm-0.9.2-incubating.
>>>>
>>>> I observed that the nimbus rebalances the topology very often. The
>>>> topology suddenly shuts down and starts again with the tasks running on
>>>> different machines. I wanted to know why storm nimbus is rebalancing my
>>>> topology, so I observed the storm throughput, latency, load on bolts and
>>>> even system metrics like cpu and memory utilization, but I could not see a
>>>> pattern.
>>>>
>>>> Can someone explain what factors lead to the rebalancing of a
>>>> topology in Storm?
>>>>
>>>> Thanks,
>>>> Palak Shah
>>>>
>>>
>>>
>>>
>>> --
>>> Cyrille Karmann
>>> +1-514-659-1209
>>> [email protected]
>>>
>>
>>
>
>
> --
> Regards,
> Vikas Agarwal
> 91 – 9928301411
>
> InfoObjects, Inc.
> Execution Matters
> http://www.infoobjects.com
> 2041 Mission College Boulevard, #280
> Santa Clara, CA 95054
> +1 (408) 988-2000 Work
> +1 (408) 716-2726 Fax
>
>
