Hi,
We are attempting real-time distributed computing with Storm, and the
solution has only one problem: inter-bolt latency, whether on the same
machine or across machines, ranges between 2 and 250 ms. I am not able to
figure out why. Network latency is under 0.5 ms. By latency, I mean the time
between an emit from one bolt/spout and the message arriving in execute() of
the next bolt.
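
To make that definition concrete, here is a minimal sketch of how such an
emit-to-execute latency can be measured. It does not use Storm at all: a
plain in-JVM queue stands in for the hop between two components, and the
names HopLatencyProbe, Stamped and latencyMillis are hypothetical. It is
only an illustration of the idea (stamp the message at emit time, subtract
on receipt), not the actual code of the topology.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class HopLatencyProbe {

    /** Hypothetical message type: a payload plus the sender's emit timestamp. */
    static final class Stamped {
        final int payload;
        final long emitNanos;

        Stamped(int payload, long emitNanos) {
            this.payload = payload;
            this.emitNanos = emitNanos;
        }
    }

    /** Latency in milliseconds between the emit and the receipt timestamps. */
    static double latencyMillis(long emitNanos, long receiveNanos) {
        return (receiveNanos - emitNanos) / 1_000_000.0;
    }

    public static void main(String[] args) throws InterruptedException {
        // The queue stands in for the buffer between an emitting component
        // and the consuming bolt.
        BlockingQueue<Stamped> queue = new LinkedBlockingQueue<>();

        Thread consumer = new Thread(() -> {
            try {
                Stamped msg = queue.take();
                long receivedAt = System.nanoTime();
                System.out.printf("payload=%d latency=%.3f ms%n",
                        msg.payload, latencyMillis(msg.emitNanos, receivedAt));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // "Emit": stamp the message at the moment it is handed to the queue.
        queue.put(new Stamped(1000, System.nanoTime()));
        consumer.join();
    }
}
```

Note that System.nanoTime() is only comparable within a single JVM. For
hops across workers or machines, the stamp would have to be a wall-clock
time such as System.currentTimeMillis(), and the result is then only as
accurate as the clock synchronization between the hosts.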

I have a topology like the one below:
A (spout) [emits a number, say 1000] -> B (bolt) [receives this number and
divides it into 10 emits of 100 each] -> C (bolt) [receives these emits
and divides each into 10 emits of 10 numbers] -> D (bolt) [does some
computation on the number and emits one message] -> E (bolt) [aggregates
all the data and confirms whether all 1000 messages were processed]

Every bolt takes under 3 msec to complete, so I estimated that end-to-end
processing of the 1000 would take no more than 50 msec, including any
latencies.
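
One rough way to sanity-check an estimate like this is sketched below. It
assumes, and this is only an assumption, that the tuples at each stage are
processed in parallel, so the critical path is one execute() plus one hop
per bolt stage. The class and method names are hypothetical; the inputs
(4 bolt stages, 3 msec per execute, 0.5 msec network latency) are the
figures quoted in this mail.

```java
public class LatencyEstimate {

    /**
     * Critical-path estimate: each bolt stage contributes one hop plus one
     * execute() call. Assumes tuples within a stage run fully in parallel.
     */
    static double criticalPathMillis(int boltStages,
                                     double maxExecuteMillis,
                                     double hopMillis) {
        return boltStages * (maxExecuteMillis + hopMillis);
    }

    public static void main(String[] args) {
        // Bolts B, C, D and E; under 3 msec each; under 0.5 msec per hop.
        double estimate = criticalPathMillis(4, 3.0, 0.5);
        System.out.println("Estimated critical path: " + estimate + " ms");
    }
}
```

Under those assumptions the critical path comes to 14 msec, which leaves
room for queueing and serialization overhead within the 50 msec budget. If
a stage has fewer executors than tuples, the tuples queue up and the real
figure will be higher.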

*Observations*
1. The end-to-end time from Spout A to Bolt E is 200 msec to 3 seconds.
My estimate was under 50 msec, given that each bolt and spout takes under 3
msec to execute, including any latencies.
2. I noticed that most of the time is spent between the emit from a
spout/bolt and execute() of the consuming bolt.
3. Network latency is under 0.5 msec.

I am not able to figure out why it takes so much time to get from a
spout/bolt to the next bolt. I understand that the spout/bolt buffers the
data in a queue and the subsequent bolt then consumes from there.

*Infrastructure*
1. 5 VMs with 4 CPUs and 8 GB RAM. Workers are configured with 1024 MB, and
there are 20 workers overall.

*Test*
1. The test was done with 25 messages sent to the spout over a span of
5 seconds.

*Config values*
Config config = new Config();
config.put(Config.TOPOLOGY_WORKERS, 20);
config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_ACKER_EXECUTORS, 1);
config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);

Please let me know if you have encountered similar issues and any steps you
have taken to mitigate the time taken between spout/bolt and another bolt.

Thanks
Kashyap
