[ 
https://issues.apache.org/jira/browse/CURATOR-246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman updated CURATOR-246:
-------------------------------------
    Comment: was deleted

(was: Implementation notes so far:

* It makes more sense to alter the meaning of the current LOST state than to 
add a new state
* Now is a good time to fix a very old problem. Every API call bottlenecks 
through RetryLoop.callWithRetry(). The first thing this method does is 
client.internalBlockUntilConnectedOrTimedOut(). If the connection doesn't 
succeed, the actual API call will fail and the retry policy will signal a retry 
which again calls client.internalBlockUntilConnectedOrTimedOut(). This is not 
reasonable behavior and makes having a true LOST session event more difficult. 
So, if the new behavior is enabled, a timeout during connection will 
immediately throw KeeperException.ConnectionLossException without retrying
* ConnectionStateManager has been altered so that the event poller will post a 
LOST state if the configured session timeout elapses
* When the new behavior is enabled, the background sync() call is no longer 
made when the Disconnect is received. It is no longer necessary as the 
ConnectionStateManager is now watching for session timeout.
* The Base testing class now runs each test twice: once in the pre-3.0 mode and 
once with enableSessionExpiredState set to true)
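
The session-timeout note above (the event poller posts LOST once the configured session timeout has elapsed while disconnected) can be illustrated with a minimal, self-contained sketch. This is not Curator's ConnectionStateManager code: SketchState, SessionLossTracker, and the injected clock are hypothetical names introduced only for illustration, and the real state machine also has states such as RECONNECTED that are omitted here.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified stand-in for the connection states (not Curator's enum).
enum SketchState { CONNECTED, SUSPENDED, LOST }

// Assumption: a poller calls poll() periodically, as the note describes.
class SessionLossTracker {
    private final long sessionTimeoutMs;
    private final LongSupplier clockMs; // injected so the logic can be tested without sleeping
    private SketchState state = SketchState.CONNECTED;
    private long suspendedAtMs;

    SessionLossTracker(long sessionTimeoutMs, LongSupplier clockMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.clockMs = clockMs;
    }

    // Called when a disconnect is received: remember when the suspension began.
    synchronized void onDisconnected() {
        if (state == SketchState.CONNECTED) {
            state = SketchState.SUSPENDED;
            suspendedAtMs = clockMs.getAsLong();
        }
    }

    // Called when the connection is re-established.
    synchronized void onConnected() {
        state = SketchState.CONNECTED;
    }

    // Called by the poller: promote SUSPENDED to LOST once the session timeout has elapsed.
    synchronized SketchState poll() {
        if (state == SketchState.SUSPENDED
                && clockMs.getAsLong() - suspendedAtMs >= sessionTimeoutMs) {
            state = SketchState.LOST;
        }
        return state;
    }
}
```

With this shape, a short disconnect that ends before the timeout never produces LOST, which is the distinction the notes are after.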

> Parent task for adding a SESSION_LOST connection state, etc.
> ------------------------------------------------------------
>
>                 Key: CURATOR-246
>                 URL: https://issues.apache.org/jira/browse/CURATOR-246
>             Project: Apache Curator
>          Issue Type: New Feature
>          Components: Framework, Recipes
>            Reporter: Dong Lei
>
> Spark now leverages Curator to help manage its connections to ZK and to 
> perform leader election. 
> Currently, whenever a ZK session loses its connection, the 
> ConnectionStateManager notices, marks the state as SUSPENDED, and a 
> new leader election is triggered. 
> This happens even though the ZK session may be able to reconnect to another 
> machine very soon. 
> I wonder if we can tolerate this kind of brief network instability without 
> triggering a leader election, because the upper-layer application's (e.g. 
> Spark's) reaction to a new leader can be very costly. 
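
One way to picture the behavior requested above is a listener that only pauses leader-only work on SUSPENDED and gives up leadership only on LOST. The sketch below is an assumption-laden illustration, not Curator or Spark code: LinkState is a hypothetical stand-in for Curator's ConnectionState, and LeadershipGuard is an invented name. In Curator itself the equivalent callback would be ConnectionStateListener.stateChanged(CuratorFramework, ConnectionState).

```java
// Hypothetical stand-in for Curator's ConnectionState so the sketch compiles on its own.
enum LinkState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Hypothetical holder of leadership status for an application that is currently the leader.
class LeadershipGuard {
    private boolean leader = true;
    private boolean paused = false;

    void stateChanged(LinkState newState) {
        switch (newState) {
            case SUSPENDED:
                // Connection dropped but the session may still be alive:
                // pause leader-only work without giving up leadership.
                paused = true;
                break;
            case CONNECTED:
            case RECONNECTED:
                // Connection is back: resume whatever role we still hold.
                paused = false;
                break;
            case LOST:
                // Session is treated as gone: only now relinquish leadership.
                leader = false;
                paused = true;
                break;
        }
    }

    boolean isLeader() { return leader; }

    boolean isPaused() { return paused; }
}
```

Whether pausing on SUSPENDED is sufficient depends on the application; this only shows where the costly re-election would be deferred until LOST.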



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
