I tuned a Trident topology (Kafka -> Storm -> Cassandra) to run on 1
worker (so on 1 server), and it works really well.

I then tried to deploy it with 2 workers on 1 server, and with 2 workers on
2 servers. The result is the same in both cases: nothing happens, no tuples
are emitted, and there are no messages in the logs.

A quick profiling run showed me that:

77% of CPU time is in main-SendThread(a.zookeeper.hostname:2181)
org.apache.zookeeper.ClientCnxn$SendThread.run()
sun.nio.ch.SelectorImpl.select()

The rest mainly comes from 2 "New I/O" threads:
org.jboss.netty.channel.socket.nio.SelectorUtil.select()
sun.nio.ch.SelectorImpl.select()
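
Sampling profilers can count a thread sitting in select() as running even when it is idle, so the percentages above may not be real CPU usage. As a cross-check, here is a minimal sketch that reads per-thread CPU time through the standard java.lang.management API. The class name ThreadCpuReport is made up for illustration, and the code only sees the JVM it runs in, so it would have to be called from inside a worker (for example from a bolt, which is an assumption about how it would be wired in).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: lists the threads of the current JVM
// with their state and the CPU time they have actually consumed.
public class ThreadCpuReport {

    /** Returns one line per live thread: name, state, CPU time in ms (-1 if unavailable). */
    public static List<String> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuOk = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        List<String> lines = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            long cpuNanos = cpuOk ? mx.getThreadCpuTime(id) : -1L;
            long cpuMs = cpuNanos < 0 ? -1L : cpuNanos / 1_000_000L;
            lines.add(String.format("%-50s %-14s %d ms",
                    info.getThreadName(), info.getThreadState(), cpuMs));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

If the SendThread and "New I/O" threads show almost no CPU time here, they are probably just waiting for I/O and the topology is stalled for another reason.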

Therefore I am wondering whether the problem could come from one of the
following:

- The ZooKeeper cluster version is 3.4.6, which is different from the 3.3.x
used by Storm 0.9.1-incubating?
But that would be strange, because there is absolutely no problem when using
the same settings with only 1 worker.

- The communication layer is Netty, which may not work well with my
hardware? (Is this possible?)
With only 1 worker, Netty does not seem to be much involved (no inter-worker
communication).
Maybe I should switch to ZeroMQ?
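
Since the Netty layer only matters once workers have to talk to each other, one cheap thing to rule out is whether the worker ports are reachable at all. The following is a minimal sketch of a plain TCP connect test. The class name PortCheck is hypothetical, and the host and ports are assumptions: 6700 and 6701 are taken from the usual supervisor.slots.ports defaults and should be replaced with whatever storm.yaml actually declares.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Storm: checks whether a TCP port accepts connections.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // refused, timed out, or unknown host
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values: pass the other supervisor's hostname as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        int[] ports = {6700, 6701};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = "
                    + canConnect(host, port, 2000));
        }
    }
}
```

Run from each server against the other while the topology is deployed, a false result would point at a firewall or hostname resolution problem rather than at Netty itself.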

Has anyone faced a similar issue? Any pointers? Or is there anything in
particular I should monitor or profile?
