I’d consider that a bug then. Please open an issue in Jira.

-Jordan

> On Mar 25, 2016, at 11:50 AM, Purshotam Shah <[email protected]> wrote:
>
> But in case of long GC pause, it doesn't.
> This is what we have figured out from our test. If ZK is down, it does retry
> based on the retry policy. But in case of a long GC pause, it doesn't.
> If GC pause > session timeout, then Curator notifies connection lost
> without retrying.
>
> I was thinking that it would be better if we could retry for a GC pause as well.
>
> Thanks,
>
>
> On Friday, March 25, 2016 9:44 AM, Jordan Zimmerman
> <[email protected]> wrote:
>
>
> Curator does retry when the connection is lost, based on the retry policy.
> ConnectionState.LOST implies that the retry policy gave up.
>
> -Jordan
>
>> On Mar 25, 2016, at 11:33 AM, Purshotam Shah <[email protected]> wrote:
>>
>> Thanks for the information. Doesn't it make sense to retry once Curator
>> receives connection lost from the ZK client? We have seen it do so if ZK is
>> down: Curator tries with the retry policy before notifying connection lost.
>>
>> Thanks,
>>
>>
>> On Thursday, March 24, 2016 1:52 PM, Jordan Zimmerman
>> <[email protected]> wrote:
>>
>>
>> The ZooKeeper client (which Curator uses) sends heartbeats to the connected
>> server. The heartbeat is sent every 2/3 of a session. If the heartbeat fails,
>> the connection drops. Please read Tech Note 10 for details:
>> https://cwiki.apache.org/confluence/display/CURATOR/TN10
>>
>> -Jordan
>>
>>> On Mar 24, 2016, at 12:30 PM, Purshotam Shah <[email protected]> wrote:
>>>
>>>
>>> We use Apache Curator to connect to ZK.
>>> We create the Curator client with the following settings.
>>> 1. session timeout = 5 min
>>> 2. connection time = 3 min
>>> 3. Retry = ExponentialBackoffRetry(1000, 10)
>>>
>>> We have also set up a ConnectionStateListener. We use Curator mostly for
>>> distributed locking.
>>> We shut down the system when there is a connection lost.
>>>
>>> We noticed that if there is a long GC pause, we get notified with
>>> ConnectionState.LOST, and this is causing our system to go down.
>>>
>>> We are working to figure out why there is a long GC pause.
>>> My question: even if we have a long GC pause > session timeout, doesn't
>>> Curator use the RetryPolicy to retry before notifying ConnectionState.LOST?
>>>
>>> Thanks,
>>>
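
For readers following the thread, the timing relationships discussed above can be put into a small, purely illustrative model. This is a sketch built only on the statements in the messages (a heartbeat at 2/3 of the session timeout, and a pause longer than the session timeout leading to a lost connection); it is not Curator or ZooKeeper code, and the class `GcPauseModel`, the enum `Outcome`, and the method `classify` are hypothetical names introduced here.

```java
import java.time.Duration;

// Hypothetical, simplified model of the timings described in the thread.
// It does not call Curator or ZooKeeper and makes no claim about their internals.
final class GcPauseModel {

    // Illustrative outcomes; these are not Curator's ConnectionState values.
    enum Outcome { CONNECTED, CONNECTION_DROPPED, SESSION_EXPIRED }

    // Classify a client-side pause against a session timeout.
    // Assumption (from the thread): a heartbeat is due at 2/3 of the session
    // timeout, and a pause longer than the session timeout loses the session.
    static Outcome classify(Duration sessionTimeout, Duration pause) {
        Duration heartbeatDeadline = sessionTimeout.multipliedBy(2).dividedBy(3);
        if (pause.compareTo(sessionTimeout) > 0) {
            return Outcome.SESSION_EXPIRED;      // pause outlived the whole session
        }
        if (pause.compareTo(heartbeatDeadline) > 0) {
            return Outcome.CONNECTION_DROPPED;   // heartbeat missed, session may survive
        }
        return Outcome.CONNECTED;                // pause short enough to go unnoticed
    }

    public static void main(String[] args) {
        Duration session = Duration.ofMinutes(5); // session timeout from the original post
        for (long seconds : new long[] {30, 240, 360}) {
            Duration pause = Duration.ofSeconds(seconds);
            System.out.println(seconds + "s pause -> " + classify(session, pause));
        }
    }
}
```

With the 5 minute session timeout from the original post, the model's heartbeat deadline is 200 seconds, so only pauses longer than 300 seconds fall into the case the thread reports as reaching ConnectionState.LOST without retries.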
