My bad ☹ One specific message was causing an out-of-memory exception. After fixing that problem, the topology is working fine now.
Thanks for the guidance.

From: Pablo Recabal [mailto:[email protected]]
Sent: Thursday, June 09, 2016 10:30 PM
To: [email protected]
Subject: Re: Topology gets stuck

Are you seeing any failed tuples at the spout? The replaying of the tuples suggests to me that the bolts have not acked your tuples, either because of a timeout or because of an exception...

2016-06-09 12:52 GMT-04:00 Nitin Gupta <[email protected]>:

A few more pointers to the problem:

I restarted the topology, but the consumer offset still doesn't move. I then cleared all the messages in the queue and cleared the consumer offset in ZooKeeper. I reduced the frequency at which messages are pushed to the queue, and then it seems to work. However, this can be a problem in production, where the number of messages in the queue increases and for some reason the topology doesn't process them. The topology should at least start processing from the last committed offset after a restart.

From: Nitin Gupta [mailto:[email protected]]
Sent: Thursday, June 09, 2016 7:48 PM
To: [email protected]
Subject: RE: Topology gets stuck

Yes, there are failing tuples. The message timeout is set to 300 seconds using the configuration below:

conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS,300);

From: Jungtaek Lim [mailto:[email protected]]
Sent: Thursday, June 09, 2016 7:41 PM
To: [email protected]
Subject: Re: Topology gets stuck

Nitin,

Could you check that your message timeout configuration is set to a long enough time, and also check whether there are failing tuples at that time?

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Jun 9, 2016 at 11:05 PM, Nitin Gupta <[email protected]> wrote:

Thanks, Abhishek, for the guidance.

As I mentioned, the bolt processes the same set of messages again and again, and the consumer offset doesn't move ahead. It processes around 100 messages, stops for a few minutes (around 3-5 minutes), and then processes the same messages again. This cycle keeps repeating, but the consumer offset in ZooKeeper is not updated.

I will check whether it is possible to move to a newer version of Storm, as the system is already in production.

Thanks & Regards,
Nitin Gupta

From: Abhishek Agarwal [mailto:[email protected]]
Sent: Thursday, June 09, 2016 7:26 PM
To: [email protected]
Subject: Re: Topology gets stuck

If the spout is blocked only for a short duration, it may be due to a slow bolt. If it remains blocked forever, there is a possibility of a deadlock. You can check out STORM-1027.

On Thu, Jun 9, 2016 at 6:48 PM, Nitin Gupta <[email protected]> wrote:

Hi Abhishek,

I am using version 0.9.4. From the log, it seems the topology keeps processing a fixed set of messages starting from the last offset at which it got stuck. The size of that set looks equal to the max spout pending setting. It processes these messages but doesn't update ZooKeeper with the processed offset. It then stops for a few minutes and repeats the same process.

The rate at which messages are being written is very high, so the lag is very high. I am not sure whether the KafkaSpout stops processing once a specific number of messages are pending.

Thanks & Regards,
Nitin Gupta

From: Abhishek Agarwal [mailto:[email protected]]
Sent: Thursday, June 09, 2016 6:40 PM
To: [email protected]
Subject: Re: Topology gets stuck

Check the thread dump of your worker process after it gets stuck. Which version do you use?

On Thu, Jun 9, 2016 at 6:36 PM, Nitin Gupta <[email protected]> wrote:

Dear All,

I am using a KafkaSpout to process messages from Kafka. The bolt takes around 700 milliseconds to process a message. I observe that the topology works fine for a few hours, but once the number of pending messages increases, it stops updating the consumer offset in ZooKeeper.

I have been facing this problem for quite a few days. Any help will be highly appreciated.

Thanks & Regards,
Nitin Gupta

--
Regards,
Abhishek Agarwal
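
The figures quoted in this thread (about 700 ms per message in the bolt, roughly 100 messages in flight, a 300-second message timeout) allow a rough sanity check of whether the timeout alone could explain the failed tuples. The sketch below is plain Java arithmetic, not Storm code. It assumes that each bolt executor handles one tuple at a time and that a single executor is in use, neither of which is stated in the thread, and the class and method names are hypothetical.

```java
public class TimeoutCheck {

    /**
     * Worst-case seconds until the last of `pending` in-flight tuples has been
     * processed, assuming `executors` bolt executors that each handle one
     * tuple at a time (an assumption, not something stated in the thread).
     */
    static double worstCaseLatencySecs(int pending, double perTupleMs, int executors) {
        int tuplesPerExecutor = (pending + executors - 1) / executors; // ceiling division
        return tuplesPerExecutor * perTupleMs / 1000.0;
    }

    /**
     * Largest number of in-flight tuples that can still all finish inside the
     * message timeout under the same assumptions.
     */
    static int maxSafePending(int timeoutSecs, double perTupleMs, int executors) {
        int perExecutor = (int) Math.floor(timeoutSecs * 1000.0 / perTupleMs);
        return perExecutor * executors;
    }

    public static void main(String[] args) {
        // Figures from the thread: ~100 pending, ~700 ms per tuple, 300 s timeout.
        // The single executor is an assumed value.
        System.out.println("worst-case latency (s): " + worstCaseLatencySecs(100, 700.0, 1));
        System.out.println("max safe pending:       " + maxSafePending(300, 700.0, 1));
    }
}
```

Under these assumptions, 100 pending tuples need about 70 seconds in the worst case, well below the 300-second timeout, and the timeout would only be exceeded above roughly 428 pending tuples. That would be consistent with the failures coming from an exception in the bolt rather than from the timeout itself.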
