Thanks for replying, Eric,

This is quite a critical issue, I think: the CPU usage keeps climbing as I
add more bolts to the topology.
Each worker consumes about 30-40% of a CPU when I have 10 bolts, and this
increases to 100% or more
when I add 20 more bolts (the majority of the bolts perform
persistentAggregate operations), and this
is while the topology is not consuming any data from Kafka.
I tuned the GC so that a young GC happens only about once every 100s and
each one takes about 50-80ms
to complete, so I don't think it is a GC issue.
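
For reference, this is roughly how I am checking the pause times in the GC
logs (a sketch only; the two sample log lines below are made up, and the
real logs go to the -Xloggc path set in the worker.childopts quoted below):

```shell
# Summarize ParNew (young-gen) pauses from a CMS GC log, in milliseconds.
# The sample lines imitate -XX:+PrintGCDetails output; real workers write
# to /var/log/storm/gc-worker-<ID>.log per worker.childopts.
cat > /tmp/gc-sample.log <<'EOF'
2015-05-22T04:57:01.123+0000: 100.001: [GC (Allocation Failure) [ParNew: 2621440K->26214K(2949120K), 0.0612345 secs] 2700000K->110000K(4194304K), 0.0613456 secs]
2015-05-22T04:58:41.456+0000: 200.334: [GC (Allocation Failure) [ParNew: 2621440K->30000K(2949120K), 0.0523456 secs] 2710000K->120000K(4194304K), 0.0524567 secs]
EOF
# Extract each ParNew pause (in seconds) and convert to milliseconds.
sed -n 's/.*ParNew:[^,]*, \([0-9.]*\) secs.*/\1/p' /tmp/gc-sample.log |
  awk '{ printf "%.1f ms\n", $1 * 1000 }'
# -> 61.2 ms
#    52.3 ms
```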


On Mon, May 25, 2015 at 7:34 AM, Eric Ruel <[email protected]>
wrote:

>  probably caused by
>
>
>  https://issues.apache.org/jira/browse/STORM-503
>
> https://issues.apache.org/jira/browse/STORM-350
>
>
>
>  ------------------------------
> *De :* Binh Nguyen Van <[email protected]>
> *Envoyé :* 22 mai 2015 04:57
> *À :* [email protected]
> *Objet :* High CPU usage
>
>  Hi all,
>
>  I am using Storm 0.9.4 to build a Trident topology that uses
> OpaqueTridentKafkaSpout to read data
> from Kafka, processes it, and saves the result to the database. The
> topology is working, but I am seeing
> it use a lot of CPU. When the topology is idle (no new data in the spout),
> each process uses 30-60%
> of a CPU, and when it starts consuming data the CPU usage is much higher
> still (160-250% each).
> I tried to tune the GC but it does not help; the CPU usage is still high.
> I don't know if I am missing
> anything or whether I misconfigured something. This is the configuration
> that I am using:
>
>  >worker.childopts: "-verbose:gc -Xmx4096m -Xms4096m -Xss256k
> -XX:NewSize=3200m -XX:MaxNewSize=3200m -XX:MaxPermSize=128m
> -XX:PermSize=96m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+AggressiveOpts -XX:+UseCompressedOops -XX:+CMSParallelRemarkEnabled
> -XX:-CMSConcurrentMTEnabled -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=75 -XX:MaxTenuringThreshold=4
> -XX:SurvivorRatio=9 -Djava.net.preferIPv4Stack=true
> -Xloggc:/var/log/storm/gc-worker-%ID%.log -XX:GCLogFileSize=1m
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails"
> >topology.receiver.buffer.size: 8
> >topology.transfer.buffer.size: 1024
> >topology.executor.receive.buffer.size: 1024
> >topology.executor.send.buffer.size: 2048
> >topology.sleep.spout.wait.strategy.time.ms: 100
> >storm.messaging.netty.server_worker_threads: 4
> >storm.messaging.netty.client_worker_threads: 4
> >storm.messaging.netty.buffer_size: 10485760
>
>  The box that I am running on has 24 cores, and the topology has 11
> workers and 212 executors. The spout
> has 5 tasks reading from a topic with 10 partitions, and the batch's max
> size is set to 1MB. With this setting
> the topology can pull 7-8MB/sec from Kafka, and the process latency is
> about 1200ms - 1500ms.
>
>  I can't figure out why the CPU is so high. Can anyone please help?
>
>  Thank you
> -Binh
>
>
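
P.S. In case it helps anyone reproduce this: the way I am checking which
threads are burning the CPU is the usual top/jstack combination (a sketch
only; the PID 4242 and thread id 12345 below are made up):

```shell
# Sketch: map a hot worker thread to its Java stack trace.
# 1) Find the busiest threads of a worker JVM (decimal thread ids):
#      top -H -p 4242
# 2) jstack reports thread ids as hex "nid" values, so convert:
printf 'nid=0x%x\n' 12345
# -> nid=0x3039
# 3) Find that thread's stack in a dump:
#      jstack 4242 | grep -A 20 'nid=0x3039'
# If the hot threads are the disruptor/Netty ones, that would point at
# the busy-wait behavior in STORM-503 / STORM-350 that Eric linked.
```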
