GitHub user Randgalt opened a pull request:
https://github.com/apache/curator/pull/97
[CURATOR-247] Extend Curator's connection state to support SESSION_LOST
This is a significant change. Please review carefully and let's have a lot
of discussion. Because the new behavior is so much better (and more consistent
with expectations), I've enabled it by default.
Major differences from the older behavior are:
* Session/connection timeouts are no longer managed by the low-level
client. They are managed by the CuratorFramework instance. There should be no
noticeable differences.
* Prior to 3.0.0, each iteration of the retry policy would allow the
connection timeout to elapse if the connection hadn't yet succeeded. This meant
that the true connection timeout was the configured value times the maximum
retries in the retry policy. This longstanding issue has been addressed. Now, the
connection timeout can elapse only once for a single API call.
* MOST IMPORTANTLY! Prior to 3.0.0, ConnectionState.LOST did not imply a
lost session (much to the confusion of users). Now, Curator will set the LOST
state only when it believes that the ZooKeeper session has expired. ZooKeeper
connections have a session. When the session expires, clients must take
appropriate action. In Curator, this is complicated by the fact that Curator
internally manages the ZooKeeper connection. Now, Curator will set the LOST
state when any of the following occurs: a) ZooKeeper returns a
Watcher.Event.KeeperState.Expired or KeeperException.Code.SESSIONEXPIRED; b)
Curator closes the internally managed ZooKeeper instance; c) The configured
session timeout elapses during a network partition.
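To make the timeout arithmetic and the LOST conditions above concrete, here is a minimal sketch in plain Java. It is not Curator code and does not use Curator's API: the class ConnectionSketch and all of its method and parameter names are hypothetical, the sleep time between retries is ignored, and treating "elapsed" as "disconnected for at least the session timeout" is an assumption.

```java
// Hypothetical illustration only; none of these names exist in Curator.
final class ConnectionSketch {
    private ConnectionSketch() {
    }

    // Before 3.0.0: each iteration of the retry policy could wait out the
    // full connection timeout, so the effective timeout was the configured
    // value times the maximum retries (sleep between retries is ignored).
    static long worstCaseWaitBefore(long connectionTimeoutMs, int maxRetries) {
        return connectionTimeoutMs * maxRetries;
    }

    // With this change: the connection timeout can elapse only once for a
    // single API call, so maxRetries no longer multiplies the wait.
    static long worstCaseWaitAfter(long connectionTimeoutMs, int maxRetries) {
        return connectionTimeoutMs;
    }

    // The three conditions listed above, expressed as one predicate:
    //   a) ZooKeeper reported that the session expired
    //   b) the internally managed ZooKeeper instance was closed
    //   c) the session timeout elapsed while disconnected (assumed: >=)
    static boolean shouldReportLost(boolean expiredReported,
                                    boolean internalClientClosed,
                                    long disconnectedMs,
                                    long sessionTimeoutMs) {
        return expiredReported
            || internalClientClosed
            || disconnectedMs >= sessionTimeoutMs;
    }

    public static void main(String[] args) {
        long connectionTimeoutMs = 15_000;
        int maxRetries = 3;
        System.out.println("before: " + worstCaseWaitBefore(connectionTimeoutMs, maxRetries) + " ms");
        System.out.println("after:  " + worstCaseWaitAfter(connectionTimeoutMs, maxRetries) + " ms");
        // Disconnected for 10s with a 60s session timeout: not yet LOST.
        System.out.println("lost?   " + shouldReportLost(false, false, 10_000, 60_000));
    }
}
```

With the example values in main (a 15 second connection timeout and 3 retries), the older behavior could block for up to 45 seconds, while the new behavior blocks for at most 15 seconds per API call.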
Something important to consider: given the significance of this change, it
makes sense to have it be part of 3.0.0, but if we merge it into 3.0.0 now it
will be harder to maintain master and 3.0.0 as separate branches. Some git
expertise is needed here on how to manage this.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/curator CURATOR-247
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/curator/pull/97.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #97
----
commit 344634ac6e34e61bc0cc7b41923a1df4089c7948
Author: randgalt <[email protected]>
Date: 2015-08-21T17:10:24Z
First pass at new (optional) definition of state LOST
commit 2343daf29388566b0efa0b0a2ad21574fb534a27
Author: randgalt <[email protected]>
Date: 2015-08-21T20:11:59Z
Merge branch 'CURATOR-3.0' into CURATOR-247
commit 62f3c33cdb556eccf6fe1cc87ee74b3458431777
Author: randgalt <[email protected]>
Date: 2015-08-21T22:35:44Z
Continued work on new LOST behavior. Added some tests. To get correct
behavior it's necessary to not retry connection failures. Retrying connection
failures was never a good idea and here's a good opportunity to fix it as this
requires client action to enable
commit c5a49216cc78b052066661a8ded357e50e0b6313
Author: randgalt <[email protected]>
Date: 2015-08-21T22:37:15Z
license
commit d3170099757c7e17ff8fbee0c37d620aacb60d65
Author: randgalt <[email protected]>
Date: 2015-08-21T22:49:55Z
more tests
commit b8d4c3d77de029917820634fa4ed21be19bbcf2c
Author: randgalt <[email protected]>
Date: 2015-08-21T22:59:07Z
minor reformat
commit 847cc0d2415f59c2943d4a2734564119ffb38bb1
Author: randgalt <[email protected]>
Date: 2015-08-22T15:47:01Z
wip
commit ec2f9bd555d01b324bd5ef690b1036d98e1f3702
Author: randgalt <[email protected]>
Date: 2015-08-22T16:06:33Z
Fixed testRetry() for new LOST behavior
commit 6381ccb6536f4710248a50ae5d0313399bbfe858
Author: randgalt <[email protected]>
Date: 2015-08-22T22:50:09Z
removed some test code
commit e239137019608f02cabb23c27ab13adcef88c027
Author: randgalt <[email protected]>
Date: 2015-08-23T00:06:55Z
major refactoring. Abstracting old/new behavior into a pluggable
ConnectionHandlingPolicy. Also, IMPORTANT, made the new behavior the default.
This needs to be discussed but it's a major improvement and we should default
to it.
commit 30bd7b655d201762d8ff74062964621879ac7134
Author: randgalt <[email protected]>
Date: 2015-08-23T00:29:36Z
further refactoring. Abstracted old framework-level connection handling
into ClassicInternalConnectionHandler. Probably more to do here
commit 23554479597d654fa8318cdc579fc3cc29bc2c54
Author: randgalt <[email protected]>
Date: 2015-08-23T01:10:34Z
Curator has a big problem with thread interrupted states getting cleared.
There are several issues on this (CURATOR-208, CURATOR-205, CURATOR-228,
CURATOR-109
commit 05d241da642c6ba0d16b3ce97557128fad4dfe41
Author: randgalt <[email protected]>
Date: 2015-08-23T01:32:41Z
When the connection timeout elapses and there is more than one server in
the connection string, reset the connection and try again
commit face4034e9fdcc9ffdb394c7c1682add834a1e10
Author: randgalt <[email protected]>
Date: 2015-08-23T02:54:24Z
Longer connection timeout needed
commit 5f094f8bb6dca3c056051cb8800b418839cca0e1
Author: randgalt <[email protected]>
Date: 2015-08-23T12:49:17Z
More refinement of classic/new connection handling. Reworked how the retry
policy is invoked for each. New behavior is now confirmed to be: wait for
connection timeout only once. Some tests will need work due to this
commit e001e0098f64baa8e0b3b887507bc98972c775dc
Author: randgalt <[email protected]>
Date: 2015-08-23T14:33:46Z
more work on repairing tests for new connection handling
commit 1a2a94b625e7e1b5e535414e397e9b3a4173ca1b
Author: randgalt <[email protected]>
Date: 2015-08-23T15:54:29Z
more work on repairing tests for new connection handling
commit 64d966c18b9d18c40e13fda98e52d9253b281086
Author: randgalt <[email protected]>
Date: 2015-08-23T15:57:48Z
doc
commit 9c7cf5d8ba495bccdea2bcb6b377e95f5f99d521
Author: randgalt <[email protected]>
Date: 2015-08-23T16:02:19Z
doc
----