Benjamin Jaton created CURATOR-134:
--------------------------------------

             Summary: Curator sends a connection LOST event before sessionTimeout
                 Key: CURATOR-134
                 URL: https://issues.apache.org/jira/browse/CURATOR-134
             Project: Apache Curator
          Issue Type: Bug
          Components: Client
    Affects Versions: 2.6.0
         Environment: Ubuntu 12.04
            Reporter: Benjamin Jaton
            Priority: Critical


Created a Curator client with:
- connection timeout: 10 seconds
- session timeout: 30 seconds
- retry policy: RetryNTimes(3, 10000)

In a scenario where the ensemble is lost, the Curator client sends a LOST event 
in less than the expected 30 seconds:
Fri Aug 01 11:17:19 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:29 PDT 2014 - CURATOR STATE: LOST

The client code is attached. This is the complete output:

Fri Aug 01 11:16:53 PDT 2014 - CURATOR STATE: CONNECTED
Fri Aug 01 11:16:54 PDT 2014 - Creating ZK client...
Fri Aug 01 11:16:54 PDT 2014 - ZK client created...
Fri Aug 01 11:16:54 PDT 2014 - ZOOKEEPER STATE: SyncConnected
Fri Aug 01 11:16:58 PDT 2014 - ZOOKEEPER STATE: Disconnected
Fri Aug 01 11:16:58 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:16 PDT 2014 - CURATOR STATE: RECONNECTED
Fri Aug 01 11:17:17 PDT 2014 - ZOOKEEPER STATE: SyncConnected
Fri Aug 01 11:17:19 PDT 2014 - ZOOKEEPER STATE: Disconnected
Fri Aug 01 11:17:19 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:29 PDT 2014 - CURATOR STATE: LOST

I think that the LOST event is actually 30 seconds away from the very first 
SUSPENDED event, whereas it should be 30 seconds away from the last one.
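
As a quick check of that arithmetic, here is a minimal sketch that measures the 
gaps between the relevant log lines. It only uses the time-of-day values copied 
from the output above (all on the same day); the class and method names are 
hypothetical and nothing here touches Curator itself.

```java
import java.time.Duration;
import java.time.LocalTime;

final class LostEventTimeline {
    // Seconds elapsed between two same-day wall-clock times (HH:mm:ss) taken from the log.
    static long secondsBetween(String from, String to) {
        return Duration.between(LocalTime.parse(from), LocalTime.parse(to)).getSeconds();
    }

    public static void main(String[] args) {
        String firstSuspended = "11:16:58"; // first CURATOR STATE: SUSPENDED
        String lastSuspended = "11:17:19";  // second CURATOR STATE: SUSPENDED
        String lost = "11:17:29";           // CURATOR STATE: LOST

        System.out.println("LOST after first SUSPENDED: "
                + secondsBetween(firstSuspended, lost) + " s");
        System.out.println("LOST after last SUSPENDED:  "
                + secondsBetween(lastSuspended, lost) + " s");
    }
}
```

The gap from the first SUSPENDED is 31 seconds, close to the 30-second session 
timeout, while the gap from the last SUSPENDED is only 10 seconds.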

To reproduce it, I started only 2 ZK servers of a 3-node ensemble, then I 
stopped one of them (-> 1st SUSPENDED), waited for 10-20 seconds, then started 
it and stopped it again.
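
The reproduction sequence can be replayed against two toy models of when LOST 
would be due. This is only a sketch of the suspicion described in this report, 
not Curator's actual implementation: both models, the class name, and the 
event encoding are assumptions made for illustration, and the event times are 
the Curator state changes from the log.

```java
import java.time.LocalTime;
import java.util.Arrays;
import java.util.List;

final class LostDeadlineModel {
    static final long SESSION_TIMEOUT_SECONDS = 30;

    // Hypothetical model A: the deadline is armed at the first SUSPENDED and never reset.
    static LocalTime deadlineFromFirstSuspended(List<String[]> events) {
        for (String[] event : events) {
            if (event[1].equals("SUSPENDED")) {
                return LocalTime.parse(event[0]).plusSeconds(SESSION_TIMEOUT_SECONDS);
            }
        }
        return null; // never suspended, so no deadline
    }

    // Hypothetical model B: RECONNECTED clears the deadline, each SUSPENDED re-arms it.
    static LocalTime deadlineFromLastSuspended(List<String[]> events) {
        LocalTime deadline = null;
        for (String[] event : events) {
            if (event[1].equals("SUSPENDED")) {
                deadline = LocalTime.parse(event[0]).plusSeconds(SESSION_TIMEOUT_SECONDS);
            } else if (event[1].equals("RECONNECTED")) {
                deadline = null;
            }
        }
        return deadline;
    }

    public static void main(String[] args) {
        // Curator state changes copied from the log: {time of day, state}.
        List<String[]> events = Arrays.asList(
                new String[] {"11:16:58", "SUSPENDED"},
                new String[] {"11:17:16", "RECONNECTED"},
                new String[] {"11:17:19", "SUSPENDED"});

        System.out.println("Model A (from first SUSPENDED): " + deadlineFromFirstSuspended(events));
        System.out.println("Model B (from last SUSPENDED):  " + deadlineFromLastSuspended(events));
    }
}
```

Model A gives 11:17:28 and model B gives 11:17:49. The observed LOST at 
11:17:29 is within a second of model A and 20 seconds earlier than model B, 
which is what I would expect to see.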



--
This message was sent by Atlassian JIRA
(v6.2#6252)
