Benjamin Jaton created CURATOR-134:
--------------------------------------

Summary: Curator sends a connection LOST event before sessionTimeout
Key: CURATOR-134
URL: https://issues.apache.org/jira/browse/CURATOR-134
Project: Apache Curator
Issue Type: Bug
Components: Client
Affects Versions: 2.6.0
Environment: Ubuntu 12.04
Reporter: Benjamin Jaton
Priority: Critical

I created a Curator client with:
- connection timeout: 10 seconds
- session timeout: 30 seconds
- retry policy: RetryNTimes(3, 10000)

When the ensemble is lost, the Curator client sends a LOST event in less than the expected 30 seconds:

Fri Aug 01 11:17:19 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:29 PDT 2014 - CURATOR STATE: LOST

The client code is attached. This is the complete output:

Fri Aug 01 11:16:53 PDT 2014 - CURATOR STATE: CONNECTED
Fri Aug 01 11:16:54 PDT 2014 - Creating ZK client...
Fri Aug 01 11:16:54 PDT 2014 - ZK client created...
Fri Aug 01 11:16:54 PDT 2014 - ZOOKEEPER STATE: SyncConnected
Fri Aug 01 11:16:58 PDT 2014 - ZOOKEEPER STATE: Disconnected
Fri Aug 01 11:16:58 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:16 PDT 2014 - CURATOR STATE: RECONNECTED
Fri Aug 01 11:17:17 PDT 2014 - ZOOKEEPER STATE: SyncConnected
Fri Aug 01 11:17:19 PDT 2014 - ZOOKEEPER STATE: Disconnected
Fri Aug 01 11:17:19 PDT 2014 - CURATOR STATE: SUSPENDED
Fri Aug 01 11:17:29 PDT 2014 - CURATOR STATE: LOST

I think that the LOST event is actually fired 30 seconds after the very first SUSPENDED event, whereas it should be fired 30 seconds after the last one.

To reproduce it, I started only 2 ZK servers in a 3-node ensemble, then stopped one of them (-> 1st SUSPENDED), waited 10-20 seconds, then started it and stopped it again.
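
The timing claim can be checked by redoing the arithmetic on the timestamps from the log. The sketch below is only an illustration: LostEventTiming and its methods are hypothetical names, not part of Curator or of the attached client code, and the log has one-second resolution, so a gap of 31 seconds is treated as consistent with the 30-second session timeout.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper, not part of Curator: it only redoes the
// arithmetic on the timestamps reported in the log output.
final class LostEventTiming {
    // Session timeout configured on the client in this report.
    static final Duration SESSION_TIMEOUT = Duration.ofSeconds(30);

    // Time at which LOST would be expected if the session timeout
    // is counted from the given SUSPENDED event.
    static LocalTime expectedLost(LocalTime suspendedAt) {
        return suspendedAt.plus(SESSION_TIMEOUT);
    }

    // Whole seconds elapsed between two events on the same day.
    static long secondsBetween(LocalTime from, LocalTime to) {
        return Duration.between(from, to).getSeconds();
    }

    public static void main(String[] args) {
        // Timestamps copied from the log (Fri Aug 01 2014, PDT).
        LocalTime firstSuspended = LocalTime.of(11, 16, 58);
        LocalTime lastSuspended = LocalTime.of(11, 17, 19);
        LocalTime observedLost = LocalTime.of(11, 17, 29);

        System.out.println("Seconds from first SUSPENDED to LOST: "
                + secondsBetween(firstSuspended, observedLost));
        System.out.println("Seconds from last SUSPENDED to LOST: "
                + secondsBetween(lastSuspended, observedLost));
        System.out.println("LOST expected if counted from last SUSPENDED: "
                + expectedLost(lastSuspended));
    }
}
```

With these timestamps, LOST arrives about 31 seconds after the first SUSPENDED event but only 10 seconds after the last one. Counting the session timeout from the last SUSPENDED event would put LOST at 11:17:49 instead.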