I did some more digging and found that a message is emitted by the Kafka spout and reaches the bolt, but never enters the bolt's execute method. After about a minute, the Kafka spout fails the message. I am using a custom FailureHandler that does not allow Storm to replay the message. The log for one such message is at http://pastebin.com/fYEfWURw and I think this could be related to messages traveling slowly through Storm. Any pointers?
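
To put a rough number on the "traveling slowly" idea, here is a back-of-envelope sketch in Java. It is only an estimate under stated assumptions: a single bolt executor that works through tuples one at a time, the timeout of about a minute observed above, and a purely hypothetical per-tuple execute time of 100 ms. The class and method names are made up for illustration and are not part of Storm. The pending values are the topology.max.spout.pending settings tried earlier in this thread.

```java
public class PendingTimeoutSketch {

    // Worst-case wait for the newest pending tuple if every tuple ahead of it
    // has to be executed first (assumption: one executor, serial processing).
    public static long worstCaseWaitMs(int maxSpoutPending, long perTupleExecuteMs) {
        return (long) maxSpoutPending * perTupleExecuteMs;
    }

    // True if that worst-case wait exceeds the message timeout.
    public static boolean mayTimeOut(int maxSpoutPending, long perTupleExecuteMs, long timeoutSecs) {
        return worstCaseWaitMs(maxSpoutPending, perTupleExecuteMs) > timeoutSecs * 1000L;
    }

    public static void main(String[] args) {
        long timeoutSecs = 60;         // "about a minute", as observed above
        long perTupleExecuteMs = 100;  // hypothetical value, not measured
        int[] pendingValues = {64, 100, 1024}; // values tried in this thread

        for (int pending : pendingValues) {
            System.out.printf(
                "max.spout.pending=%d, worst-case wait=%d ms, may time out=%b%n",
                pending,
                worstCaseWaitMs(pending, perTupleExecuteMs),
                mayTimeOut(pending, perTupleExecuteMs, timeoutSecs));
        }
    }
}
```

If the bolt's real execute latency multiplied by the pending count gets close to the timeout, that would be consistent with tuples timing out while still queued in front of execute, though it would not prove it.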
Thanks.

On Wed, Oct 7, 2015 at 10:56 PM, Rohit Kelkar <[email protected]> wrote:

> Does it matter that I am ack-ing the tick tuples in the execute method of
> the bolt? Will that give rise to double ack-ing of the tuples and create
> some inconsistent state?
>
> Thanks.
>
> On Wed, Oct 7, 2015 at 10:53 PM, Rohit Kelkar <[email protected]>
> wrote:
>
>> Thanks for responding.
>> Kafka Spout parallelism = 1, number of partitions of kafka topic=1,
>> Tried various values for topology.max.spout.pending=64,100, 1024
>>
>> Thanks.
>>
>> On Wed, Oct 7, 2015 at 10:06 PM, Harsha <[email protected]> wrote:
>>
>>> whats your Kafka Spout parallelism and how many partitions you've in
>>> your kafka topic. Also did you try to tune topology.max.spout.pending
>>> -Harsha
>>>
>>>
>>> On Wed, Oct 7, 2015, at 10:54 AM, Rohit Kelkar wrote:
>>>
>>> I have a kafka spout and single bolt topology running on a cluster in
>>> debug mode. I see that the message is sent from kafka spout at 17:02:48 and
>>> received at the bolt at 17:02:56
>>> Here are the relevant logs - http://pastebin.com/WBxmPjLk
>>> I checked the network traffic. No signs of saturation.
>>> Am I filling up the internal buffers of storm? Is there any way to
>>> identify this?
>>>
>>> Thanks.
