How to avoid CuratorConnectionLossException on leader loss?

Jens Rantil Sun, 13 Sep 2015 09:16:19 -0700

Dear Curator(s),

A couple of days ago we did some maintenance on our ZooKeeper ensemble and
did a rolling restart of each node. Restarting the followers worked like a
charm. However, restarting the leader caused Curator to start throwing/logging
CuratorConnectionLossException exceptions that trickled down to our
application code until a re-election had occurred. Example:


https://gist.github.com/JensRantil/309fa1bf17ee2982b8e7

We were hoping that Curator would gracefully retry until a new leader had
been elected, but I'm sure there is something we need to tweak to keep this
from happening again.

*Question:* To keep this from happening in the future, should we simply
change our retry policy to retry for longer before giving up?

Additional information:

   - Zookeeper version 1.4.5
   - Curator version 2.7.0
   - We are currently using the following retry policy: new
   ExponentialBackoffRetry(1000, 3);
   - Zookeeper configuration all default except initLimit=60 and
   syncLimit=30.
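
For reference, here is a rough way to estimate how long a policy like the one
above keeps retrying before it gives up. It is a minimal sketch, not Curator
code: it assumes that the sleep before retry n (0-based) is
baseSleepMs * max(1, r), with r drawn at random from [0, 2^(n+1)). That formula
is an assumption about ExponentialBackoffRetry and should be checked against
the source of the Curator version in use. RetryWindow and its methods are
hypothetical names made up for this illustration.

```java
// Sketch only: estimates the total time a bounded exponential backoff sleeps.
// ASSUMPTION (not confirmed in this thread): the sleep before retry n (0-based)
// is baseSleepMs * max(1, r), with r drawn uniformly from [0, 2^(n+1)).
// RetryWindow is a hypothetical helper, not part of the Curator API.
final class RetryWindow {

    // Smallest possible total sleep: every random draw yields a factor of 1.
    static long minTotalSleepMs(long baseSleepMs, int maxRetries) {
        return baseSleepMs * maxRetries;
    }

    // Largest possible total sleep: every draw yields the top factor 2^(n+1) - 1.
    static long maxTotalSleepMs(long baseSleepMs, int maxRetries) {
        long total = 0;
        for (int n = 0; n < maxRetries; n++) {
            total += baseSleepMs * ((1L << (n + 1)) - 1);
        }
        return total;
    }

    public static void main(String[] args) {
        // Values matching ExponentialBackoffRetry(1000, 3).
        System.out.println(minTotalSleepMs(1000, 3)); // prints 3000
        System.out.println(maxTotalSleepMs(1000, 3)); // prints 11000
    }
}
```

Under that assumption, ExponentialBackoffRetry(1000, 3) sleeps for somewhere
between 3 and 11 seconds in total (not counting the time each attempt itself
takes), so a leader election that takes longer than that would exhaust the
retries.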

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: [email protected]
Phone: +46 708 84 18 32
Web: www.tink.se

