Kashyap,

Yes, you will need to upgrade the Storm version on the cluster as well. Personally, I would run tests to see if it fixes the existing issue before upgrading.
Thanks,
Ganesh

From: Joseph Beard [mailto:[email protected]]
Sent: Friday, September 04, 2015 12:07 PM
To: [email protected]
Subject: Re: Netty reconnect

We also ran into the same issue with Storm 0.9.4. We chose to upgrade to 0.10.0-beta1, which solved the problem and has otherwise been stable for our needs.

Joe

--
Joseph Beard
[email protected]

On Sep 3, 2015, at 10:03 AM, Kashyap Mhaisekar <[email protected]> wrote:

Thanks for the advice. Will upgrade from 0.9.3 to 0.9.4. A lame question: does it mean that the existing clusters need to be rebuilt with 0.9.4?

Thanks
Kashyap

On Sep 3, 2015 08:32, "Nick R. Katsipoulakis" <[email protected]> wrote:

Ganesh,

No, I am not.

Cheers,
Nick

2015-09-03 9:25 GMT-04:00 Ganesh Chandrasekaran <[email protected]>:

Are you using the multilang protocol? I know that after upgrading to 0.9.4 it seemed like I was being affected by this bug - https://issues.apache.org/jira/browse/STORM-738 - and rolled back to the previous stable version of 0.8.2. I did not verify this thoroughly on my cluster, though.

From: Nick R. Katsipoulakis [mailto:[email protected]]
Sent: Thursday, September 03, 2015 9:08 AM
To: [email protected]
Subject: Re: Netty reconnect

Hello again,

I read STORM-404 and saw that it is resolved in version 0.9.4. However, I have version 0.9.4 installed on my cluster, and I have seen similar behavior in my workers. In fact, at random times I would see that some workers were considered dead (Netty was dropping messages) and they would be restarted by Nimbus. Currently, I only see dropped messages but no restarted workers.
FYI, my cluster has the following setup:

* 3X AWS m4.xlarge instances for ZooKeeper and Nimbus
* 4X AWS m4.xlarge instances for Supervisors (each one with 2 workers)

Thanks,
Nick

2015-09-03 8:38 GMT-04:00 Ganesh Chandrasekaran <[email protected]>:

Agreed with Jitendra. We were using version 0.9.3 and facing the same issue of Netty reconnects, which was issue 404 (STORM-404). Upgrading to 0.9.4 fixed the issue.

Thanks,
Ganesh

From: Jitendra Yadav [mailto:[email protected]]
Sent: Thursday, September 03, 2015 8:20 AM
To: [email protected]
Subject: Re: Netty reconnect

I don't know your Storm version, but it's worth checking these JIRAs to see if a similar scenario is occurring.

https://issues.apache.org/jira/browse/STORM-404
https://issues.apache.org/jira/browse/STORM-450

Thanks
Jitendra

On Thu, Sep 3, 2015 at 5:22 PM, John Yost <[email protected]> wrote:

Hi Everyone,

When I see this, it is evidence that one or more of the workers are not starting up, which results in connections either not occurring or reconnects occurring when supervisors kill workers that don't start up properly. I recommend checking the supervisor and nimbus logs to see if there are any root causes other than network issues causing the connect/reconnect.

--John

On Thu, Sep 3, 2015 at 7:32 AM, Nick R. Katsipoulakis <[email protected]> wrote:

Hello Kashyap,

I have been having the same issue for some time now on my AWS cluster. To be honest, I do not know how to resolve it.

Regards,
Nick

2015-09-03 0:07 GMT-04:00 Kashyap Mhaisekar <[email protected]>:

Hi,

Has anyone experienced repeated Netty reconnects? My workers seem to be eternally in a reconnect state and the topology doesn't serve messages at all. It gets connected once in a while and then goes back to reconnecting. Any fixes for this?
"Reconnect started for Netty-Client" Thanks Kashyap -- Nikolaos Romanos Katsipoulakis, University of Pittsburgh, PhD candidate -- Nikolaos Romanos Katsipoulakis, University of Pittsburgh, PhD candidate -- Nikolaos Romanos Katsipoulakis, University of Pittsburgh, PhD candidate
