How many Kafka partitions exist on the topic? How many Kafka spout instances are there within the topology?
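To make the question concrete, here is a rough sketch of how partitions might end up spread across spout tasks. It assumes a simple round-robin assignment, which may not match what your Kafka spout version actually does. The class name, the method name and the 4-partition / 6-task numbers are all made up for illustration and are not part of Storm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Storm or its Kafka spout.
public class PartitionAssignmentSketch {

    // Assumption: round-robin assignment, so task i reads partitions
    // i, i + numTasks, i + 2 * numTasks, ... while they are below numPartitions.
    static List<Integer> partitionsForTask(int numPartitions, int numTasks, int taskIndex) {
        List<Integer> assigned = new ArrayList<>();
        for (int p = taskIndex; p < numPartitions; p += numTasks) {
            assigned.add(p);
        }
        return assigned;
    }

    public static void main(String[] args) {
        int numPartitions = 4; // made-up example value
        int numTasks = 6;      // made-up example value
        for (int task = 0; task < numTasks; task++) {
            List<Integer> assigned = partitionsForTask(numPartitions, numTasks, task);
            // Tasks with an empty list have nothing to read from Kafka.
            System.out.println("spout task " + task + " -> partitions " + assigned
                    + (assigned.isEmpty() ? " (idle)" : ""));
        }
    }
}
```

Under that assumption, any spout tasks beyond the partition count get nothing to read, so they would add overhead without adding throughput.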
Play around with the number of consumers subscribing to the Kafka topic. Observe the latency, throughput and CPU usage as you change these parameters.

Thanks and Regards,
Devang

On May 19, 2016 3:09 AM, "Alexandre Wilhelm" <[email protected]> wrote:

> Hi guys,
>
> I did some tests again and attached more profiling images (enclosed in
> the email).
>
> Configuration of the topology:
>
> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 1000);
> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_SIZE, 10);
> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_TIMEOUT_MILLIS, 500);
> conf.put(Config.TOPOLOGY_DISRUPTOR_WAIT_TIMEOUT_MILLIS, 500);
> conf.setMaxSpoutPending(10000);
> conf.put(Config.TASK_HEARTBEAT_FREQUENCY_SECS, 60);
> conf.setMaxTaskParallelism(1);
> conf.setNumWorkers(1);
>
> Interestingly, the profiler shows a CPU usage of 20%, but top shows ~70%
> for the Java process and ~70% for the Python process (storm.py). The PID
> of the Python and Java processes is the same, though.
> The Go processes, however, are at 0% CPU usage.
>
> Does someone know what could be the issue? Once again, I’m running on
> Storm 1.0.1, with multilang in Go.
>
> Thanks,
> --
> Alexandre Wilhelm
>
> On May 16, 2016 at 9:52:23 AM, [email protected] (
> [email protected]) wrote:
>
> Hi John,
>
> The results I sent used the default Storm configuration; max spout
> pending was not set to 1 million. I can try different configurations
> if you want. What would be a good number?
>
> Thanks
> Alexandre
>
> On May 16, 2016, at 03:58, John Yost <[email protected]> wrote:
>
> Hi Alexandre,
>
> The max.spout.pending is at 1 million, which is the highest I've ever
> seen! :) This means you have a whole bunch of messages within the system
> at any given time, and your topology spends most of its time in the LMAX
> disruptor layer caching messages as they come in from Netty and before
> they are sent back out via Netty. The jvisualvm screenshots support this
> conclusion.
>
> I recommend dialing max.spout.pending back to 100K; please report
> back with those results.
>
> --John
>
> On Mon, May 16, 2016 at 3:44 AM, Alexandre Wilhelm <
> [email protected]> wrote:
>
>> Hi John,
>>
>> Thanks for the reply. These are the results I got; if something is
>> missing, just let me know and I’ll redo it.
>> The test: default configuration with the debug flag set to false. I
>> send 100,000 messages to Kafka, one every 1 ms.
>>
>> 1. Just after launching Storm locally
>>
>> CPU : ~8%
>>
>> 2. When getting messages from Kafka
>>
>> CPU : ~100%
>> Bolt and spout : ~20%
>> Interestingly, CPU usage is not the same between jvisualvm (~20%)
>> and top and htop (100%).
>>
>> 3. When all messages have been sent
>>
>> CPU : 60%
>> Bolt and spout : ~0%
>> Same as above, CPU usage is not the same between the tool and top
>>
>> 4. After a few minutes
>>
>> CPU : 60%
>> Bolt and spout : ~0%
>> Same as above, CPU usage is not the same between the tool and top
>>
>> Pictures of the profiling after a few minutes are enclosed.
>>
>> Any idea?
>>
>> Thanks for the help
>> --
>> Alexandre Wilhelm
>>
>> On May 14, 2016 at 12:21:46 PM, [email protected] (
>> [email protected]) wrote:
>>
>> Hi, I recommend profiling with jvisualvm to see which methods are using
>> the most CPU time. This should provide insight into which methods are
>> causing CPU usage to go up.
>>
>> --John
>>
>> Sent from my iPhone
>>
>> On May 14, 2016, at 2:42 PM, Alexandre Wilhelm <
>> [email protected]> wrote:
>>
>> Hi guys,
>>
>> I’m a new user of Storm and I need help with an issue I have with it.
>>
>> At a glance, I have a topology with one spout and two bolts:
>>
>> - Spout -> get messages from Kafka
>> - Bolt -> transform the message from Kafka into well-formed JSON
>> - Bolt -> send the JSON to a KairosDB server
>>
>> All of these bolts/spouts are written in Go using the library
>> https://github.com/jsgilmore/gostorm
>>
>> Unfortunately, after using the topology for a while, the CPU usage just
>> keeps growing and never goes down. For instance, after sending 1
>> million messages, CPU usage stays at 200%.
>>
>> I tried playing with the configuration and got better results with the
>> following settings:
>>
>> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 1000);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_SIZE, 10);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_BATCH_TIMEOUT_MILLIS, 500);
>> conf.put(Config.TOPOLOGY_DISRUPTOR_WAIT_TIMEOUT_MILLIS, 500);
>> conf.setMaxSpoutPending(1000000);
>> conf.put(Config.TASK_HEARTBEAT_FREQUENCY_SECS, 600);
>>
>> With these, CPU usage after 1 million messages is 50%, but it never
>> goes down...
>>
>> What could be the issue(s) and how can I debug that? I’m pretty new to
>> the Java/Clojure world as well...
>>
>> I’m running on Storm 1.0.1, FYI.
>>
>> Thanks in advance,
>> --
>> Alexandre Wilhelm
>>
