Try reducing the max spout pending and then increasing it gradually. Maybe start with 100.
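[A minimal illustration of the tuning suggestion above, assuming Storm 1.0.x's `org.apache.storm.Config`; the step values are illustrative, not from the thread:]

```java
import org.apache.storm.Config;

// Start max.spout.pending low and raise it step by step between runs,
// watching complete latency and capacity in the Storm UI each time.
Config conf = new Config();
conf.setMaxSpoutPending(100);   // first run: 100
// subsequent runs: 500, 1000, 5000, ... back off when latency climbs
```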
Can you send a screen capture of the Storm UI?

Thanks and Regards,
Devang

On May 19, 2016 10:14 PM, "Alexandre Wilhelm" <[email protected]> wrote:

> Hi,
>
> I have one Kafka partition and one spout, and this spout is written in Go;
> I'm not sure if it's related to the Java process.
>
> --
> Alexandre Wilhelm
>
> On May 19, 2016 at 5:15:09 AM, Devang Shah ([email protected]) wrote:
>
> How many Kafka partitions exist on the topic?
>
> How many Kafka spout instances are within the topology?
>
> Play around with the number of consumers subscribing to the Kafka topic.
> Observe the latency, throughput and CPU usage while changing these
> parameters.
>
> Thanks and Regards,
> Devang
>
> On May 19, 2016 3:09 AM, "Alexandre Wilhelm" <[email protected]> wrote:
>
>> Hi guys,
>>
>> I did some tests again and attached more profiling images (enclosed in
>> the email).
>>
>> Configuration of the topology:
>>
>> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 1000);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_SIZE, 10);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_TIMEOUT_MILLIS, 500);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_WAIT_TIMEOUT_MILLIS, 500);
>> conf.setMaxSpoutPending(10000);
>> conf.put(Config.TASK_HEARTBEAT_FREQUENCY_SECS, 60);
>> conf.setMaxTaskParallelism(1);
>> conf.setNumWorkers(1);
>>
>> Interesting stuff: the profiler shows me a CPU usage of ~20%, but top
>> shows ~70% for the Java process and ~70% for the Python process
>> (storm.py). The PIDs of the Python and Java processes are the same,
>> though. The Go processes, however, are at 0% CPU usage.
>>
>> Does someone know what the issue could be? Once again, I'm running on
>> Storm 1.0.1, with multilang in Go.
>>
>> Thanks,
>> --
>> Alexandre Wilhelm
>>
>> On May 16, 2016 at 9:52:23 AM, [email protected] ([email protected]) wrote:
>>
>> Hi John,
>>
>> The results sent used a default configuration of Storm; the max spout
>> pending was not set to 1 million.
>> I can try with different configurations if you want; what would be a
>> good number?
>>
>> Thanks,
>> Alexandre
>>
>> On May 16, 2016, at 03:58, John Yost <[email protected]> wrote:
>>
>> Hi Alexandre,
>>
>> The max.spout.pending is at 1 million, which is the highest I've ever
>> seen! :) This means you have a whole bunch of messages within the system
>> at any given time, and your topology spends most of its time in the LMAX
>> disruptor layer caching messages as they come in from Netty and before
>> they are sent back out via Netty. The jvisualvm screenshots support this
>> conclusion.
>>
>> I recommend dialing max.spout.pending back to 100K; please report back
>> with those results.
>>
>> --John
>>
>> On Mon, May 16, 2016 at 3:44 AM, Alexandre Wilhelm <[email protected]> wrote:
>>
>>> Hi John,
>>>
>>> Thanks for the reply. These are the results I got; if anything is
>>> missing, just let me know and I'll redo it.
>>> The test: default configuration with the debug flag set to false; I
>>> send 100,000 messages to Kafka, one every 1 ms.
>>>
>>> 1. Just after launching Storm locally
>>>
>>> CPU: ~8%
>>>
>>> 2. When getting messages from Kafka
>>>
>>> CPU: ~100%
>>> Bolt and spout: ~20%
>>> Interestingly, CPU usage is not the same between jvisualvm (~20%) and
>>> top/htop (100%).
>>>
>>> 3. When all messages were sent
>>>
>>> CPU: 60%
>>> Bolt and spout: ~0%
>>> Same as above, CPU usage is not the same between the tool and top.
>>>
>>> 4. After a few minutes
>>>
>>> CPU: 60%
>>> Bolt and spout: ~0%
>>> Same as above, CPU usage is not the same between the tool and top.
>>>
>>> Pictures of profiling after a few minutes are enclosed.
>>>
>>> Any idea?
>>>
>>> Thanks for the help,
>>> --
>>> Alexandre Wilhelm
>>>
>>> On May 14, 2016 at 12:21:46 PM, [email protected] ([email protected]) wrote:
>>>
>>> Hi, I recommend profiling with jvisualvm to see which methods are using
>>> the most CPU time.
>>> This should provide insight into which methods are causing CPU usage
>>> to go up.
>>>
>>> --John
>>>
>>> Sent from my iPhone
>>>
>>> On May 14, 2016, at 2:42 PM, Alexandre Wilhelm <[email protected]> wrote:
>>>
>>> Hi guys,
>>>
>>> I'm a new user of Storm and I need help with an issue I have with
>>> Storm.
>>>
>>> At a glance, I have a topology with three spouts and bolts:
>>>
>>> - Spout -> gets messages from Kafka
>>> - Bolt -> transforms the message from Kafka into well-formed JSON
>>> - Bolt -> sends the JSON to a KairosDB server
>>>
>>> All of these bolts/spouts are written in Go using the library
>>> https://github.com/jsgilmore/gostorm
>>>
>>> Unfortunately, after using the topology for a while, the CPU usage
>>> just grows and grows and never goes down. For instance, after sending
>>> 1 million messages, the CPU stays at 200% usage.
>>>
>>> I tried to play with the configuration and got better results with the
>>> following configuration:
>>>
>>> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 1000);
>>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_SIZE, 10);
>>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_TIMEOUT_MILLIS, 500);
>>> conf.put(Config.TOPOLOGY_DISRUPTOR_WAIT_TIMEOUT_MILLIS, 500);
>>> conf.setMaxSpoutPending(1000000);
>>> conf.put(Config.TASK_HEARTBEAT_FREQUENCY_SECS, 600);
>>>
>>> My CPU usage after 1 million messages will be 50%, but it never goes
>>> down...
>>>
>>> What could be the issue(s) and how can I debug it? I'm pretty new to
>>> the Java/Clojure world as well...
>>>
>>> I'm running on Storm 1.0.1, FYI.
>>>
>>> Thanks in advance,
>>> --
>>> Alexandre Wilhelm
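[A side note on the jvisualvm-vs-top gap reported in this thread: jvisualvm shows CPU for the JVM only, while top also counts the multilang subprocesses Storm spawns (storm.py, the Go executables). One way to cross-check the JVM's own process CPU from inside a worker is the HotSpot-specific `com.sun.management.OperatingSystemMXBean`; a sketch, assuming a HotSpot/OpenJDK JVM:]

```java
import java.lang.management.ManagementFactory;

public class JvmCpu {
    public static void main(String[] args) {
        // getProcessCpuLoad() returns the JVM process's recent CPU load
        // in [0.0, 1.0], or a negative value if not yet available.
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        double load = os.getProcessCpuLoad();
        System.out.println("JVM process CPU load: " + load);
    }
}
```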

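[For reference, the transform bolt described above (raw Kafka message -> well-formed JSON for KairosDB) boils down to something like the following. The field names follow KairosDB's datapoint format (name/timestamp/value plus at least one tag), but the `toJson` helper, its inputs, and the `host` tag are hypothetical, not taken from the actual topology:]

```java
public class ToKairos {
    // Hypothetical transform: build a KairosDB-style JSON datapoint
    // from a metric name, a numeric value and a millisecond timestamp.
    static String toJson(String metric, double value, long tsMillis) {
        return String.format(
            "{\"name\":\"%s\",\"timestamp\":%d,\"value\":%s," +
            "\"tags\":{\"host\":\"local\"}}",
            metric, tsMillis, Double.toString(value));
    }

    public static void main(String[] args) {
        System.out.println(toJson("cpu.load", 0.75, 1463690000000L));
    }
}
```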