With a spout parallelism of 2 and topology.max.spout.pending of 5, you can have
up to 10 tuples in flight at once, because the limit applies per spout task.
Your mergeBolt is taking almost 30 seconds to process each tuple. While that
bolt is processing a tuple, any additional tuples routed to the same bolt
instance have to wait in an internal buffer. That wait time counts toward the
overall complete latency.
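
To put rough numbers on that, here is a small back-of-the-envelope sketch. It
is not Storm code; the function name is made up for illustration. It assumes
all pending tuples are emitted at about the same time, that they are spread
round-robin across the mergeBolt executors, and that each one takes a flat 30
seconds. The number of mergeBolt executors is not given in your message, so it
is a parameter here.

```python
def estimate_complete_latencies(spout_parallelism, max_spout_pending,
                                bolt_executors, seconds_per_tuple):
    """Rough per-tuple complete latency when every in-flight tuple queues
    behind a slow bolt. Hypothetical helper, not part of any Storm API."""
    # max.spout.pending applies per spout task, so the cap multiplies.
    in_flight = spout_parallelism * max_spout_pending
    latencies = []
    for i in range(in_flight):
        # Assumed round-robin routing: tuple i is the (i // executors)-th
        # tuple in its executor's queue (0 means it is processed right away).
        tuples_ahead = i // bolt_executors
        # Time spent waiting in the buffer plus the tuple's own processing.
        latencies.append((tuples_ahead + 1) * seconds_per_tuple)
    return latencies


if __name__ == "__main__":
    for executors in (1, 2):
        lat = estimate_complete_latencies(2, 5, executors, 30)
        print(executors, "executor(s): avg", sum(lat) / len(lat),
              "s, max", max(lat), "s")
```

Under these assumptions, the last tuple in a queue waits for every tuple ahead
of it, so the complete latency can end up several times the bolt's process
latency even though nothing is failing or timing out.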

-Taylor


> On Sep 29, 2016, at 12:47 PM, Suma Cherukuri <[email protected]> wrote:
> 
> Hi,
> I am using Storm to do file concatenations on S3. My question is about 
> the complete latency of the spout. There is a huge difference between the 
> bolt process latencies and the spout complete latencies. Can anyone please 
> help me understand what's causing this behavior in Storm?
> 
> Below are the storm configurations:
> 
> topology.executor.receive.buffer.size: 128
> topology.executor.send.buffer.size: 128
> topology.receiver.buffer.size: 8
> topology.transfer.buffer.size: 32
> topology.max.spout.pending: 5
> topology.message.timeout.secs: 600
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
> 
> Please find the attached screenshot for the latencies.
> 
> Thanks
> Suma Cherukuri
> 
> <Screen Shot 2016-09-29 at 8.58.40 AM.png>

