[Reposting without attachment since earlier post did not seem to be received by the list]
Hi all,

I have a simple distributed Storm cluster running a simple topology, a slightly modified version of the ExclamationTopology sample. When using Netty, CPU usage maxes out (it goes to 2100% on a machine with 16 hardware threads) and throughput drops drastically (~3,000 TPS). When switching to ZeroMQ with the same configs, CPU usage is normal and throughput is around 100,000 - 170,000 TPS.

*Worker Node Config:*
Intel Xeon CPU E5-2470 0 @ 2.30GHz - 8 cores (16 threads)
RAM - 24 GB
JDK - 1.7.0_45

Netty configs (# of client worker threads, # of server worker threads, buffer size) are at default values.

I found a lot of references to the EPollArrayWrapper JDK bug [1], which seems to be fixed in JDK >= 1.7.0_11. However, a thread dump taken while the CPU was being hogged had the following stack traces.

"New I/O boss #6" prio=10 tid=0x00007f08c839c800 nid=0x7638 runnable [0x00007f08cc8fa000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000f8410cd8> (a sun.nio.ch.Util$2)
        - locked <0x00000000f8410cc8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000f8410bb0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:64)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:409)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:206)
        at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:41)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

"New I/O worker #5" prio=10 tid=0x00007f08c839c000 nid=0x7637 runnable [0x00007f08cc9fb000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000000f839b7d8> (a sun.nio.ch.Util$2)
        - locked <0x00000000f839b7c8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000f839b6b0> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:64)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:409)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:206)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

So I am wondering whether this is the same issue, or whether I have configured something incorrectly. Any idea why this is happening? Any clues or pointers would be much appreciated. :-)

[1] https://github.com/netty/netty/issues/327

Thanks,
Lasantha
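
P.S. For reference, here is a minimal sketch of the Netty transport settings mentioned above (client worker threads, server worker threads, buffer size). The key names and values are what I understand the Storm 0.9.x defaults to be; they are an assumption and are not copied from the cluster's storm.yaml. The class and method names are made up for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: lists the Netty transport settings referred to in the post.
public class NettyTransportSettings {

    // Key names and values are ASSUMED Storm 0.9.x defaults, not read from the cluster.
    public static Map<String, Object> assumedDefaults() {
        Map<String, Object> conf = new LinkedHashMap<String, Object>();
        // Selects the Netty transport instead of ZeroMQ.
        conf.put("storm.messaging.transport", "backtype.storm.messaging.netty.Context");
        // Number of Netty worker threads on the receiving (server) side.
        conf.put("storm.messaging.netty.server_worker_threads", 1);
        // Number of Netty worker threads on the sending (client) side.
        conf.put("storm.messaging.netty.client_worker_threads", 1);
        // Buffer size in bytes (5 * 1024 * 1024).
        conf.put("storm.messaging.netty.buffer_size", 5242880);
        return conf;
    }

    public static void main(String[] args) {
        // Print the settings in a storm.yaml-like "key: value" form.
        for (Map.Entry<String, Object> e : assumedDefaults().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

If any of these differ from what others run with Netty, that would be a useful data point.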
