This is probably caused by:

https://issues.apache.org/jira/browse/STORM-503

https://issues.apache.org/jira/browse/STORM-350



________________________________
From: Binh Nguyen Van <[email protected]>
Sent: 22 May 2015 04:57
To: [email protected]
Subject: High CPU usage

Hi all,

I am using Storm 0.9.4 to build a Trident topology that uses
OpaqueTridentKafkaSpout to read data from Kafka, processes it, and saves the
result to a database. The topology works, but it uses a lot of CPU. When the
topology is idle (no new data in the spout), each process uses 30-60% of the
CPU, and when it starts consuming data the usage is much higher (160-250%
each). I tried tuning the GC, but that did not help; the CPU usage is still
high. I don't know whether I am missing something or have misconfigured it.
This is the configuration that I am using:

>worker.childopts: "-verbose:gc -Xmx4096m -Xms4096m -Xss256k -XX:NewSize=3200m 
>-XX:MaxNewSize=3200m -XX:MaxPermSize=128m -XX:PermSize=96m -XX:+UseParNewGC 
>-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UseCompressedOops 
>-XX:+CMSParallelRemarkEnabled -XX:-CMSConcurrentMTEnabled 
>-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 
>-XX:MaxTenuringThreshold=4 -XX:SurvivorRatio=9 -Djava.net.preferIPv4Stack=true 
>-Xloggc:/var/log/storm/gc-worker-%ID%.log -XX:GCLogFileSize=1m 
>-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:+PrintGCDateStamps 
>-XX:+PrintGCDetails"
>topology.receiver.buffer.size: 8
>topology.transfer.buffer.size: 1024
>topology.executor.receive.buffer.size: 1024
>topology.executor.send.buffer.size: 2048
>topology.sleep.spout.wait.strategy.time.ms: 100
>storm.messaging.netty.server_worker_threads: 4
>storm.messaging.netty.client_worker_threads: 4
>storm.messaging.netty.buffer_size: 10485760
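
For reference, the heap layout implied by the GC flags above can be worked out
with a little arithmetic. The sketch below is only an approximation: it assumes
the usual HotSpot meaning of SurvivorRatio (eden size divided by the size of
one survivor space, with two survivor spaces in the young generation), and the
JVM may round the actual sizes. The function name is made up for illustration.

```python
# Sizes in MB, copied from the worker.childopts quoted above.
HEAP_MB = 4096        # -Xmx4096m / -Xms4096m
YOUNG_MB = 3200       # -XX:NewSize=3200m / -XX:MaxNewSize=3200m
SURVIVOR_RATIO = 9    # -XX:SurvivorRatio=9


def young_gen_layout(young_mb, survivor_ratio):
    """Split the young generation into eden and one survivor space.

    Assumes young = eden + 2 * survivor and eden / survivor = survivor_ratio.
    """
    survivor = young_mb / (survivor_ratio + 2)
    eden = young_mb - 2 * survivor
    return eden, survivor


old_mb = HEAP_MB - YOUNG_MB
eden_mb, survivor_mb = young_gen_layout(YOUNG_MB, SURVIVOR_RATIO)

print(f"old generation: {old_mb} MB")
print(f"eden: {eden_mb:.0f} MB, each survivor space: {survivor_mb:.0f} MB")
```

Under these assumptions, less than a quarter of the 4096 MB heap is left for
the old generation, which is the space the CMS collector manages.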

The box that I am running on has 24 cores, and the topology has 11 workers and
212 executors. The spout has 5 tasks reading from a topic with 10 partitions,
and the maximum batch size is set to 1MB. With this setting the topology can
pull 7-8MB/sec from Kafka, and the process latency is about 1200-1500ms.
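
To put the per-process figures in context, here is a rough sketch of what they
add up to on the whole box. It assumes that "each process" means each of the 11
worker JVMs and that the percentages are top-style, where 100% is one fully
busy core; neither is stated explicitly above, and the helper name is made up
for illustration.

```python
CORES = 24        # cores on the box
WORKERS = 11      # worker processes in the topology
EXECUTORS = 212   # executors across all workers


def cores_used(per_process_pct, processes=WORKERS):
    """Convert a per-process CPU percentage into a total number of busy cores.

    Assumes 100% means one fully busy core (top-style reporting).
    """
    return per_process_pct * processes / 100.0


idle_low, idle_high = cores_used(30), cores_used(60)
busy_low, busy_high = cores_used(160), cores_used(250)

print(f"idle: {idle_low:.1f} to {idle_high:.1f} of {CORES} cores")
print(f"busy: {busy_low:.1f} to {busy_high:.1f} of {CORES} cores")
print(f"executors per worker: {EXECUTORS / WORKERS:.1f}")
```

If those assumptions hold, the idle topology alone keeps several cores busy,
and the upper end of the loaded range is more than the 24 cores available.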

I can't figure out why the CPU usage is so high. Can anyone please help?

Thank you
-Binh
