Storm task ids don't change: https://groups.google.com/forum/#!topic/storm-user/7P23beQIL4c
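For concreteness, here is a minimal sketch of what that stability buys you with direct streams (the point debated below): the producer can resolve the consumer's task ids once in prepare() and then emit to a specific task. This assumes the 0.9.x-era backtype.storm API, and the names "consumer-bolt" and "direct-stream" are invented for illustration:

import java.util.List;
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class DirectProducerBolt extends BaseRichBolt {
    private OutputCollector collector;
    private List<Integer> consumerTasks;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        // Task ids are assigned when the topology is submitted and do not
        // change, even if a task is re-spawned on another worker after a
        // failure, so looking them up once here is enough.
        this.consumerTasks = context.getComponentTasks("consumer-bolt");
    }

    @Override
    public void execute(Tuple input) {
        int value = input.getInteger(0);
        // The producer picks the target task itself; this is where
        // hotspots can come from if the routing logic is skewed.
        int target = consumerTasks.get(value % consumerTasks.size());
        collector.emitDirect(target, "direct-stream", input, new Values(value));
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // A stream must be declared direct for emitDirect to be legal.
        declarer.declareStream("direct-stream", true, new Fields("value"));
    }
}

The consuming bolt would then subscribe on the TopologyBuilder with .directGrouping(<producer's component id>, "direct-stream").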
On Thu, Jul 16, 2015 at 4:28 PM, Andrew Xor <[email protected]> wrote:

Direct grouping, as shown in the Storm docs, means that you have to target a specific task id and use "direct streams". This is error prone, probably increases latency, and might introduce redundancy problems, because the producer of a tuple needs to know the id of the task the tuple has to go to. Imagine a scenario where the receiving task fails for some reason: the producer cannot relay tuples until it receives the re-spawned task's id.

Hope this helps.

Kindly yours,
Andrew Grammenos

On Thu, Jul 16, 2015 at 11:24 PM, Nick R. Katsipoulakis <[email protected]> wrote:

Hello again,

Nathan, I am using direct grouping because the application I am working on has to be able to send tuples directly to specific tasks; in general, it has to control the data flow. Can you please explain why you would not recommend direct grouping? Is there a particular reason in the architecture of Storm?

Thanks,
Nick

2015-07-16 16:20 GMT-04:00 Nathan Leung <[email protected]>:

I would not recommend direct grouping unless you have a good reason for it. Shuffle grouping is essentially random with an even distribution, which makes it easier to characterize its performance. Local-or-shuffle grouping stays in process, so it will generally be faster; however, you have to be careful in certain cases to avoid task starvation (e.g. a Kafka spout with 1 partition on the topic and 1 spout task, feeding 10 bolt "A" tasks in 10 worker processes). Direct grouping depends on your code (i.e. you can create hotspots), fields grouping depends on your key distribution, and so on.

On Thu, Jul 16, 2015 at 3:50 PM, Nick R. Katsipoulakis <[email protected]> wrote:

Hello all,

I have two questions:

1) How exactly do you measure latency? I am doing the same thing and I have a problem getting the exact milliseconds of latency (mainly because of clock drift).
2) (to Nathan) Is there a difference in speed among the different groupings? For instance, is shuffle faster than direct grouping?

Thanks,
Nick

2015-07-15 17:37 GMT-04:00 Nathan Leung <[email protected]>:

Two things. First, your math may be off depending on parallelism: one emit from A becomes 100 tuples emitted from C, and you are joining all of them.

Second, try the default number of ackers (one per worker). All your ack traffic is currently going to a single task.

Also, you can try local-or-shuffle grouping where possible to reduce network transfers.
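On Nick's first question: one common way to get the emit-to-execute number is to stamp each tuple with the wall clock at emit time and subtract in the consumer's execute(). A minimal sketch against the 0.9.x-era backtype.storm API; the field names are invented, and, as Nick notes, a cross-host delta is only as trustworthy as the clock synchronization (e.g. NTP) between the machines:

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class LatencyMeasuringBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        // Assumes the upstream component emitted
        // new Values(payload, System.currentTimeMillis())
        // with declared fields ("payload", "emittedAt").
        long queueAndWireMs = System.currentTimeMillis() - input.getLongByField("emittedAt");
        System.err.println("emit-to-execute latency: " + queueAndWireMs + " ms");
        // Re-stamp before forwarding so the next hop measures only its own leg.
        collector.emit(input, new Values(input.getValue(0), System.currentTimeMillis()));
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("payload", "emittedAt"));
    }
}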
On Jul 15, 2015 12:45 PM, "Kashyap Mhaisekar" <[email protected]> wrote:

Hi,

We are attempting real-time distributed computing using Storm, and the solution has only one problem: inter-bolt latency, on the same machine or across machines, ranges between 2 and 250 ms, and I am not able to figure out why. Network latency is under 0.5 ms. By latency, I mean the time between an emit from one bolt/spout and the message arriving in execute() of the next bolt.

I have a topology like the below:

A (spout) [emits a number, say 1000] -> B (bolt) [receives this number and divides it into 10 emits of 100 each] -> C (bolt) [receives these emits and divides each into 10 emits of 10 numbers] -> D (bolt) [does some computation on each number and emits one message] -> E (bolt) [aggregates all the data and confirms that all 1000 messages were processed]

Every bolt takes under 3 ms to complete, and as a result I estimated that the end-to-end processing of 1000 should take no more than 50 ms, including any latencies.

*Observations*
1. The end-to-end time from spout A to bolt E is 200 ms to 3 seconds. My estimate was under 50 ms, given that each bolt and spout takes under 3 ms to execute, including any latencies.
2. Most of the time is spent between the emit from a spout/bolt and the execute() of the consuming bolt.
3. Network latency is under 0.5 ms.

I am not able to figure out why it takes so much time to get from a spout/bolt to the next bolt. I understand that the spout/bolt buffers the data into a queue and the subsequent bolt consumes from there.

*Infrastructure*
1. 5 VMs with 4 CPUs and 8 GB RAM each. Workers get 1024 MB, and there are 20 workers overall.

*Test*
1. The test was done with 25 messages to the spout => 25 messages sent to the spout over a span of 5 seconds.

*Config values*
Config config = new Config();
config.put(Config.TOPOLOGY_WORKERS, 20);
config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_ACKER_EXECUTORS, 1);
config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);

Please let me know if you have encountered similar issues and any steps you have taken to mitigate the time taken between a spout/bolt and the next bolt.

Thanks,
Kashyap
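Applying Nathan's acker suggestion to the config above would look roughly like this. A sketch only, not a tested tuning; it uses the stock Config helpers rather than raw put() calls:

import backtype.storm.Config;

Config config = new Config();
config.setNumWorkers(20);
// One acker per worker (also the default when TOPOLOGY_ACKER_EXECUTORS is
// left unset) instead of a single acker task, so ack traffic is not
// funneled through one executor.
config.setNumAckers(20);
config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);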
