I’d consider that a bug then. Please open an issue in Jira.

-Jordan

> On Mar 25, 2016, at 11:50 AM, Purshotam Shah <[email protected]> wrote:
> 
> But in the case of a long GC pause, it doesn't.
> This is what we have figured out from our test. If ZK is down, it does retry 
> based on the retry policy. But in the case of a long GC pause, it doesn't. If 
> the GC pause exceeds the session timeout, then Curator notifies connection 
> lost without retrying.
> 
> I was thinking that it would be better if we could retry for a GC pause as well.
> 
> Thanks,
> 
> 
> 
> 
> On Friday, March 25, 2016 9:44 AM, Jordan Zimmerman 
> <[email protected]> wrote:
> 
> 
> Curator does retry when the connection is lost, based on the retry policy. 
> ConnectionState.LOST implies that the retry policy gave up.
> 
> -Jordan
> 
>> On Mar 25, 2016, at 11:33 AM, Purshotam Shah <[email protected]> wrote:
>> 
>> Thanks for the information. Doesn't it make sense to retry once Curator 
>> receives connection lost from the ZK client? We have seen it do this: if ZK 
>> is down, Curator retries per the retry policy before notifying connection lost.
>> 
>> Thanks,
>> 
>> 
>> 
>> On Thursday, March 24, 2016 1:52 PM, Jordan Zimmerman 
>> <[email protected]> wrote:
>> 
>> 
>> The ZooKeeper client (which Curator uses) sends heartbeats to the connected 
>> server. The heartbeat is sent every 2/3 of a session. If the heartbeat fails, 
>> the connection drops. Please read Tech Note 10 for details: 
>> https://cwiki.apache.org/confluence/display/CURATOR/TN10
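To put numbers on the 2/3 figure mentioned above, here is a small Java sketch. It takes that figure at face value rather than checking it against the ZooKeeper client source, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.time.Duration;

public class HeartbeatMath {
    // Heartbeat interval, using the 2/3-of-session figure quoted in the thread
    // (an assumption taken from the message, not verified against ZooKeeper).
    static Duration heartbeatInterval(Duration sessionTimeout) {
        return sessionTimeout.multipliedBy(2).dividedBy(3);
    }

    // True when a client-side pause (e.g. a GC pause) is at least one
    // heartbeat interval long, so a heartbeat falls due during the pause.
    static boolean missesHeartbeat(Duration pause, Duration sessionTimeout) {
        return pause.compareTo(heartbeatInterval(sessionTimeout)) >= 0;
    }

    // True when the pause is longer than the whole session timeout.
    static boolean outlastsSession(Duration pause, Duration sessionTimeout) {
        return pause.compareTo(sessionTimeout) > 0;
    }

    public static void main(String[] args) {
        Duration session = Duration.ofMinutes(5);
        Duration pause = Duration.ofMinutes(6); // hypothetical GC pause
        System.out.println("heartbeat interval ms: " + heartbeatInterval(session).toMillis());
        System.out.println("misses heartbeat: " + missesHeartbeat(pause, session));
        System.out.println("outlasts session: " + outlastsSession(pause, session));
    }
}
```

With a 5 min session timeout, the interval under that figure works out to 200 seconds, so any pause longer than the session timeout necessarily spans a missed heartbeat as well.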
>> 
>> -Jordan
>> 
>>> On Mar 24, 2016, at 12:30 PM, Purshotam Shah <[email protected]> wrote:
>>> 
>>> 
>>> We use Apache Curator to connect to ZK.
>>> We create the Curator client with the following settings.
>>> 1. session timeout = 5 min
>>> 2. connection timeout = 3 min
>>> 3. Retry = ExponentialBackoffRetry(1000, 10)
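For a rough sense of how long a policy like the ExponentialBackoffRetry(1000, 10) in the settings above could keep retrying, here is a small Java sketch. It does not call Curator. The formula (base sleep times a random factor below 2^(retry+1), floored at 1) is an assumption modelled on common randomized exponential backoff and is not confirmed here against Curator's source; the class and method names are made up for illustration.

```java
import java.util.Random;

public class BackoffSketch {
    // Assumed formula: sleep = base * max(1, r), with r drawn uniformly
    // from [0, 2^(retryCount + 1)). retryCount is 0-based.
    static long sleepMs(long baseSleepMs, int retryCount, Random random) {
        return baseSleepMs * Math.max(1, random.nextInt(1 << (retryCount + 1)));
    }

    // Largest sleep the assumed formula can produce for a given retry.
    static long maxSleepMs(long baseSleepMs, int retryCount) {
        return baseSleepMs * ((1L << (retryCount + 1)) - 1);
    }

    // Upper bound on total sleep time across all retries.
    static long maxTotalSleepMs(long baseSleepMs, int maxRetries) {
        long total = 0;
        for (int i = 0; i < maxRetries; i++) {
            total += maxSleepMs(baseSleepMs, i);
        }
        return total;
    }

    // Lower bound: every random draw collapses to the base sleep.
    static long minTotalSleepMs(long baseSleepMs, int maxRetries) {
        return baseSleepMs * maxRetries;
    }

    public static void main(String[] args) {
        long base = 1000; // first argument of the policy in the thread
        int retries = 10; // second argument of the policy in the thread
        System.out.println("min total sleep ms: " + minTotalSleepMs(base, retries));
        System.out.println("max total sleep ms: " + maxTotalSleepMs(base, retries));
        Random random = new Random();
        for (int i = 0; i < retries; i++) {
            System.out.println("retry " + i + " sample sleep ms: " + sleepMs(base, i, random));
        }
    }
}
```

Under that assumed formula, 10 retries with a 1000 ms base sleep for at least 10 seconds and at most roughly 34 minutes in total, so the retry window can fall on either side of the 5 min session timeout.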
>>> 
>>> We have also set up a ConnectionStateListener. We use Curator mostly for 
>>> distributed locking. We shut down the system when the connection is lost.
>>> 
>>> We noticed that if there is a long GC pause, we get notified with 
>>> ConnectionState.LOST, and this causes our system to go down.
>>> 
>>> We are working to figure out why there is a long GC pause. 
>>> My question: even if we have a long GC pause > session timeout, doesn't 
>>> Curator use the RetryPolicy to retry before notifying ConnectionState.LOST?
>>> 
>>> Thanks,
>>> 
>> 
>> 
>> 
> 
> 
> 
