Re: Zookeeper connection errors in Helix Controller

2019-06-01 Thread Lei Xia
Before our new release is out, if you see that this is a problem in your prod deployment, one thing you may try is to add a newer ZooKeeper version as an explicit dependency in your project; then at build time, Maven (or another build tool) will pick the new version instead of the one specified in
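
As a rough illustration of the workaround suggested in this message, an explicit dependency in the project's own pom.xml might look like the sketch below. This is a minimal sketch, not taken from the thread: the `zookeeper.version` property is a hypothetical placeholder, and the message does not say which ZooKeeper version to use. It relies on Maven's "nearest definition wins" mediation, where a version declared directly in your project takes precedence over one pulled in transitively.

```xml
<!-- Fragment of a project pom.xml (sketch under the assumptions above). -->
<properties>
  <!-- Hypothetical placeholder: set this to the newer ZooKeeper release you have chosen and tested. -->
  <zookeeper.version>REPLACE_WITH_CHOSEN_VERSION</zookeeper.version>
</properties>

<dependencies>
  <!-- Declared directly, so this version wins over the transitive one. -->
  <dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>${zookeeper.version}</version>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` afterwards is one way to check which ZooKeeper version actually ends up on the classpath.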

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread DImuthu Upeksha
Hi Lee, Understood and thanks for the heads up. We are currently in the middle of a production deployment with 0.8.2 and most of the users have already been notified of the schedule. Basically we are happy with the stability and functional correctness of 0.8.2 except for the above-mentioned case where we

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread Hunter Lee
Hey Dimuthu - We are actually in the process of preparing a new release, and this will come with the previously mentioned bug fixes in Task Framework. It also contains various ZK-related fixes - I don't know what your deployment schedule is but it might be worth the wait of another week or so.

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread DImuthu Upeksha
Now I'm seeing the following error in the controller log. Restarting the controller fixed the issue. From time to time we see this in the controller along with ZK connection issues. Is this also something to do with the ZK client version? 2019-05-31 13:21:46,669 [Thread-0-SendThread(localhost:2181)] WARN

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread DImuthu Upeksha
Hi Lei, We use 0.8.2. We initially had 0.8.4, but it contains an issue with the task retry logic, so we downgraded to 0.8.2. We are planning to go into production with 0.8.2 by next week, so can you please advise a better way to solve this without upgrading to 0.8.4? Thanks Dimuthu On Fri, May 31,

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread DImuthu Upeksha
Hi Kishore, Adding -Djute.maxbuffer=49107800 fixed the issue, but now I can see a whole lot of log lines like the following, and the participant executes a batch of Tasks once in a while, with a delay of around 5 minutes in between. 2019-05-31 12:45:58,804 [GenericHelixController-event_process] WARN

Re: Zookeeper connection errors in Helix Controller

2019-05-31 Thread kishore g
Can you grep for zookeeper state in the controller log? On Fri, May 31, 2019 at 7:52 AM DImuthu Upeksha wrote: > Hi Folks, > > I'm getting the following error in the controller log and it seems like the controller is > not moving forward after that point > > 2019-05-31 10:47:37,084 [main] INFO

Zookeeper connection errors in Helix Controller

2019-05-31 Thread DImuthu Upeksha
Hi Folks, I'm getting the following error in the controller log and it seems like the controller is not moving forward after that point 2019-05-31 10:47:37,084 [main] INFO o.a.a.h.i.c.HelixController - Starting helix controller 2019-05-31 10:47:37,089 [main] INFO o.a.a.c.u.ApplicationSettings - Settings