Try observing garbage collection on the worker. Many of my heartbeat 
timeout issues went away after I switched to the G1 garbage collector (G1GC).
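
For illustration, here is a minimal sketch of how the collector could be switched for one topology. It assumes the topology configuration is built in Java and that the worker JVM flags are passed through Storm's topology.worker.childopts setting. The class name WorkerGcOpts and the choice of GC logging flags are my own assumptions, not something from this thread, and the logging flags shown apply to older (Java 8 era) JVMs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the config entries that ask worker JVMs to use G1.
final class WorkerGcOpts {
    // Storm's per-topology key for extra worker JVM options.
    static final String CHILDOPTS_KEY = "topology.worker.childopts";

    static Map<String, Object> build() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(CHILDOPTS_KEY, String.join(" ",
            "-XX:+UseG1GC",          // switch the worker to the G1 collector
            "-verbose:gc",           // log each collection so pauses can be observed
            "-XX:+PrintGCDetails")); // Java 8 era flag; newer JVMs use -Xlog:gc*
        return conf;
    }

    public static void main(String[] args) {
        // Prints the single config entry that would be merged into the topology config.
        System.out.println(build());
    }
}
```

The resulting map would be merged into the Config object passed when submitting the topology. Watching the GC log first, before changing collectors, should show whether long pauses actually line up with the timeouts.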

From: Anishek Agarwal <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, May 11, 2015 at 04:53
To: "[email protected]" <[email protected]>
Subject: Re: occasionally long running bolt timeout

As you mentioned, if processing a single tuple at the 5th bolt is itself going 
to take long, increasing the parallelism won't help. It only helps if you can 
break up the work the 5th bolt does for a tuple so that each operation exports 
small files. In that case you can add one bolt between the 4th and 5th to do 
the splitting, and then increase the parallelism of the 5th.
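
A rough sketch of the splitting step, in plain Java with no Storm dependency. It assumes the data to export can be represented as a list of records and cut into independent batches. The names ExportChunker and split and the batch size are hypothetical. In a topology, the intermediate bolt would emit one tuple per batch, and the export bolt would upload one small file per tuple.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for the intermediate bolt: cuts one large export into small batches.
final class ExportChunker {
    /** Splits records into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += batchSize) {
            int end = Math.min(start + batchSize, records.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(records.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add(i);
        }
        // Each batch would become one tuple sent to the export bolt.
        for (List<Integer> batch : split(records, 4)) {
            System.out.println(batch);
        }
    }
}
```

With ten records and a batch size of 4, this yields three batches of sizes 4, 4, and 2. Because each batch is a separate tuple, several export bolt instances can work on the same export at once, and no single tuple holds a bolt for the duration of the whole upload.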

On Mon, May 11, 2015 at 8:26 AM, Subrat Basnet <[email protected]> wrote:
Hello everyone,

I have a topology with 5 bolts, all bolts with a parallelism of 2.

Every now and then, depending on the tuple passed, the 5th bolt exports data 
files to AWS, which is a slow process. This results in high process latency, 
which I think leads to a heartbeat timeout.

I understand that increasing the heartbeat timeout might work, but what other 
*better* options do I have to avoid such a timeout/crash?

Will setting the parallelism of the 5th bolt to, say, 7 (vs. 2 on the others) 
work, so that there is always a process ready to export files?

Thanks for your help!

Regards,
Subrat Basnet
