Hello all, I have two questions:
1) How exactly do you measure latency? I am doing the same thing and I have trouble getting the exact latency in milliseconds (mainly because of clock drift).
2) (to Nathan) Is there a difference in speed among the different groupings? For instance, is shuffle grouping faster than direct grouping?

Thanks,
Nick

2015-07-15 17:37 GMT-04:00 Nathan Leung <[email protected]>:

> Two things. Your math may be off depending on parallelism. One emit from A
> becomes 100 emitted from C, and you are joining all of them.
>
> Second, try the default number of ackers (one per worker). All your ack
> traffic is going to a single task.
>
> Also you can try local or shuffle grouping if possible to reduce network
> transfers.
> On Jul 15, 2015 12:45 PM, "Kashyap Mhaisekar" <[email protected]> wrote:
>
>> Hi,
>> We are attempting real-time distributed computing using Storm and the
>> solution has only one problem - inter-bolt latency on the same machine or
>> across machines ranges between 2 - 250 ms. I am not able to figure out why.
>> Network latency is under 0.5 ms. By latency, I mean the time between an
>> emit of one bolt/spout to getting the message in execute() of next bolt.
>>
>> I have a topology like the below -
>> A (Spout) -> (Emits a number say 1000) -> B (bolt) [Receives this number
>> and divides this into 10 emits of 100 each] -> C (bolt) [Receives these
>> emits and divides this to 10 emits of 10 numbers] -> D (bolt) [Does some
>> computation on the number and emits one message] -> E (bolt) [Aggregates
>> all the data and confirms if all the 1000 messages are processed]
>>
>> Every bolt takes under 3 msec to complete and as a result, I estimated
>> that the end to end processing for 1000 takes not more than 50 msec
>> including any latencies.
>>
>> *Observations*
>> 1. The end to end time from Spout A to Bolt E takes 200 msec to 3
>> seconds. My estimate was under 50 msec given that each bolt and spout take
>> under 3 msec to execute including any latencies.
>> 2. I noticed that most of the time is spent between Emit from a
>> Spout/Bolt and execute() of the consuming bolt.
>> 3. Network latency is under 0.5 msec.
>>
>> I am not able to figure out why it takes so much time between a
>> spout/bolt to next bolt. I understand that the spout/bolt buffers the data
>> into a queue and then the subsequent bolt consumes from there.
>>
>> *Infrastructure*
>> 1. 5 VMs with 4 CPU and 8 GB ram. Workers are with 1024 MB and there are
>> 20 workers overall.
>>
>> *Test*
>> 1. The test was done with 25 messages to the spout => 25 messages are
>> sent to spout in a span of 5 seconds.
>>
>> *Config values*
>> Config config = new Config();
>> config.put(Config.TOPOLOGY_WORKERS, 20);
>> config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
>> config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
>> config.put(Config.TOPOLOGY_ACKER_EXECUTORS, 1);
>> config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
>> config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);
>>
>> Please let me know if you have encountered similar issues and any steps
>> you have taken to mitigate the time taken between spout/bolt and another
>> bolt.
>>
>> Thanks
>> Kashyap
>>
>

--
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate
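
On question 1 (measuring latency between machines whose clocks drift), one possible approach, offered only as a sketch and not as something anyone in this thread confirmed using, is to estimate the clock offset between the two hosts with an NTP-style probe exchange and subtract it from the raw timestamp difference. The class, method, and parameter names below are hypothetical and are not Storm APIs, the numbers in main() are made up for illustration, and the estimate assumes the network delay is roughly the same in both directions. When both endpoints run in the same JVM, differences of System.nanoTime() avoid the problem entirely, since that clock is only meaningful within a single JVM.

```java
// Sketch only: all names here are hypothetical helpers, not part of Storm.
final class ClockOffsetSketch {

    /**
     * Estimated (receiver clock - sender clock) in milliseconds.
     * t0: probe sent (sender clock), t1: probe received (receiver clock),
     * t2: reply sent (receiver clock), t3: reply received (sender clock).
     * Assumes the network delay is symmetric in both directions.
     */
    static long estimateOffsetMillis(long t0, long t1, long t2, long t3) {
        return ((t1 - t0) + (t2 - t3)) / 2;
    }

    /** Round-trip network time, excluding the receiver's processing time. */
    static long roundTripMillis(long t0, long t1, long t2, long t3) {
        return (t3 - t0) - (t2 - t1);
    }

    /** One-way latency after mapping the receive time onto the sender's clock. */
    static long oneWayLatencyMillis(long emitOnSenderClock,
                                    long receiveOnReceiverClock,
                                    long offsetMillis) {
        return (receiveOnReceiverClock - offsetMillis) - emitOnSenderClock;
    }

    public static void main(String[] args) {
        // Illustrative probe: receiver clock runs 40 ms ahead, each hop takes 2 ms,
        // and the receiver spends 1 ms before replying.
        long t0 = 1000, t1 = 1042, t2 = 1043, t3 = 1005;
        long offset = estimateOffsetMillis(t0, t1, t2, t3);
        System.out.println("offset ms: " + offset);
        System.out.println("round trip ms: " + roundTripMillis(t0, t1, t2, t3));
        // A tuple stamped 2000 by the sender and seen at 2047 on the receiver's clock.
        System.out.println("one-way ms: " + oneWayLatencyMillis(2000, 2047, offset));
    }
}
```

Because clocks keep drifting, the offset would need to be re-estimated periodically, and the result is only as accurate as the symmetric-delay assumption allows.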
