With a spout parallelism of 2 and topology.max.spout.pending of 5, you can have up to 10 tuples in flight at once (max.spout.pending applies per spout task). Your mergeBolt is taking almost 30 seconds to process each tuple. While that bolt is processing a tuple, any additional tuples routed to that bolt instance have to wait in an internal buffer. That wait time counts toward the overall complete latency.
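
As a rough back-of-the-envelope sketch of that arithmetic (not a Storm API; the function name is made up for illustration, and the mergeBolt executor count is a hypothetical value since it is not stated in this thread), you can estimate how long the last queued tuple might wait. It assumes tuples are spread evenly across bolt executors and ignores transfer and acking overhead:

```python
import math

def estimate_worst_case_latency(spout_tasks, max_spout_pending,
                                bolt_executors, bolt_process_secs):
    """Return (max tuples in flight, rough worst-case complete latency in seconds)."""
    # max.spout.pending is enforced per spout task.
    in_flight = spout_tasks * max_spout_pending
    # Assumption: tuples spread evenly, so each executor works through
    # ceil(in_flight / bolt_executors) tuples one after another.
    queue_depth = math.ceil(in_flight / bolt_executors)
    # The last tuple in the queue waits for every tuple ahead of it.
    return in_flight, queue_depth * bolt_process_secs

# Values from this thread, plus a hypothetical single mergeBolt executor.
print(estimate_worst_case_latency(2, 5, 1, 30))  # (10, 300)
```

Under those assumptions, one executor taking 30 seconds per tuple could leave the last of 10 tuples waiting about 300 seconds, which is why the complete latency can be far larger than the bolt's process latency. Adding bolt executors or lowering max.spout.pending shrinks that estimate.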

-Taylor

> On Sep 29, 2016, at 12:47 PM, Suma Cherukuri <[email protected]> wrote:
>
> Hi,
> I am using storm to do file concatenations on S3. The question is regarding
> the complete latency of the spout. There is a huge difference between the
> bolt process latencies and the spout complete latencies. Can anyone please
> help me understand whats causing this behavior in storm.
>
> Below are the storm configurations:
>
> topology.executor.receive.buffer.size: 128
> topology.executor.send.buffer.size: 128
> topology.receiver.buffer.size: 8
> topology.transfer.buffer.size: 32
> topology.max.spout.pending: 5
> topology.message.timeout.secs: 600
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
>
> Please find the attached screenshot for the latencies.
>
> Thanks
> Suma Cherukuri
>
> <Screen Shot 2016-09-29 at 8.58.40 AM.png>
