[GitHub] zookeeper pull request #377: [ZOOKEEPER-2901] TTL Nodes don't work with Serv...

2017-10-20 Thread DanBenediktson
Github user DanBenediktson commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r146089767
  
--- Diff: conf/zoo_sample.cfg ---
@@ -6,6 +6,10 @@ initLimit=10
 # The number of ticks that can pass between 
 # sending a request and getting an acknowledgement
 syncLimit=5
+# enable TTL Nodes
+# IMPORTANT: when enabled, your server ID cannot be greater than 254
--- End diff --

Same comment I left on the JIRA ticket: shouldn't it be 127, not 254? My 
understanding of the problem was that only the high bit was being used, not 
the whole byte value of 255.
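
For context, the arithmetic behind the 127-versus-254 question can be sketched as below. This is a minimal illustration under stated assumptions: that the server ID is stored in the top byte of a 64-bit ID, and that the reserved marker is either the single high bit or the full 0xff top byte. The class and method names are hypothetical and are not ZooKeeper APIs.

```java
// Hypothetical sketch: not ZooKeeper code, only bit arithmetic.
final class ServerIdBits {
    // Assumption: the server ID occupies the top 8 bits of a 64-bit ID.
    static long topByteFor(long serverId) {
        return serverId << 56;
    }

    // True if placing this server ID in the top byte sets the high (sign) bit.
    static boolean setsHighBit(long serverId) {
        return (topByteFor(serverId) & 0x8000000000000000L) != 0;
    }

    // True if the top byte ends up being exactly 0xff.
    static boolean fillsTopByte(long serverId) {
        return (topByteFor(serverId) >>> 56) == 0xffL;
    }

    public static void main(String[] args) {
        for (long id : new long[] {127, 128, 254, 255}) {
            System.out.println(id + ": setsHighBit=" + setsHighBit(id)
                    + ", fillsTopByte=" + fillsTopByte(id));
        }
    }
}
```

Under these assumptions, if only the high bit is reserved then the largest safe server ID is 127, whereas if the whole top byte value 0xff is reserved then it is 254.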


---


[GitHub] zookeeper pull request #342: ZOOKEEPER-2488: Synchronized access to shutting...

2017-08-23 Thread DanBenediktson
Github user DanBenediktson commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/342#discussion_r134784530
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java ---
@@ -1155,6 +1134,19 @@ public void run() {
 }
 }
 
+private void electionAndSetCurVote() {
+reconfigFlagClear();
+if (shuttingDownLE) {
+startLeaderElection();
--- End diff --

How come we don't need to set shuttingDownLE back to false here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] zookeeper pull request #330: ZOOKEEPER-2471: ZK Java client should not count...

2017-08-09 Thread DanBenediktson
GitHub user DanBenediktson opened a pull request:

https://github.com/apache/zookeeper/pull/330

ZOOKEEPER-2471: ZK Java client should not count sleep time as connect time

ClientCnxnSocket uses a member variable "now" to track the current time, 
but does not update it after every potentially blocking operation: in 
particular, it does not update it after the random sleep introduced when an 
initial connect attempt fails. As a result, the random sleep time is counted 
towards connect time, so the connection timeout is currently applied 
incorrectly. If ZOOKEEPER-2869 is taken, there is also a very real possibility 
(we have seen it in production) of wedging the ZooKeeper client so that it can 
never successfully reconnect, because its sleep time may grow beyond its 
connection timeout, especially in scenarios where there is a big gap between 
the negotiated session timeout and the client-requested session timeout.

Rather than fixing the bug by adding another "updateNow()" call and keeping 
the brittle "updateNow()" implementation which led to the bug in the first 
place, I have deleted updateNow() and replaced each use of that member 
variable with a read of the current system timestamp whenever the 
implementation needs to know the current time.
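
The effect described above can be illustrated with a small sketch. It uses a fake clock, hypothetical method names, and made-up durations; it is not the ClientCnxnSocket code, only an illustration of why a cached timestamp that is not refreshed after a sleep charges the sleep to the connect budget.

```java
// Hypothetical sketch: not the actual ClientCnxnSocket implementation.
final class StaleNowDemo {
    // Fake clock in milliseconds, so the example is deterministic.
    private static long clockMs = 0;

    private static long currentTime() { return clockMs; }

    // Stands in for any blocking call (backoff sleep, connect, select).
    private static void block(long ms) { clockMs += ms; }

    // Cached-"now" style: the timestamp is captured before the backoff sleep
    // and reused afterwards, so the measured connect time includes the sleep.
    static long connectTimeWithCachedNow(long sleepMs, long connectMs) {
        clockMs = 0;
        long now = currentTime();   // cached value
        block(sleepMs);             // random backoff; "now" is not refreshed
        long connectStart = now;    // stale: still the pre-sleep time
        block(connectMs);           // time actually spent connecting
        now = currentTime();        // refreshed only here
        return now - connectStart;
    }

    // Fresh-read style: the clock is read whenever the time is needed.
    static long connectTimeWithFreshClock(long sleepMs, long connectMs) {
        clockMs = 0;
        block(sleepMs);
        long connectStart = currentTime();
        block(connectMs);
        return currentTime() - connectStart;
    }

    public static void main(String[] args) {
        System.out.println("cached now:  " + connectTimeWithCachedNow(800, 50) + " ms");
        System.out.println("fresh clock: " + connectTimeWithFreshClock(800, 50) + " ms");
    }
}
```

With an 800 ms sleep and a 50 ms connect, the cached variant reports 850 ms of connect time while the fresh-read variant reports 50 ms, which is the difference that matters when the result is compared against a connection timeout.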

Regarding unit testing, this is, IMO, too difficult to test without 
introducing a lot of invasive changes to ClientCnxn.java, seeing as the only 
effective change is that, on connection retry, the random sleep time is no 
longer counted towards a time budget. I can throw a lot of mocks at this, like 
ClientReconnectTest, but I'm still going to be stuck depending on the behavior 
of that randomly-generated sleep time, which is going to be inherently 
unreliable. If a fix is taken for ZOOKEEPER-2869, this should become much 
easier to test, since I will then be able to inject a different backoff sleep 
behavior. Since I'm planning to submit a pull request for that ticket as 
well, maybe as a compromise I can submit a test for this bug fix at that 
time?
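
To illustrate the kind of injection mentioned above, here is a minimal sketch. The Backoff interface, both implementations, and the Reconnector class are hypothetical names invented for this example; nothing here is taken from ZOOKEEPER-2869 or from the ZooKeeper client.

```java
import java.util.Random;

// Hypothetical interface: not part of the ZooKeeper client.
interface Backoff {
    long nextSleepMs();
}

// Production-style behavior: a random sleep below a bound (maxMs must be > 0).
final class RandomBackoff implements Backoff {
    private final Random random = new Random();
    private final int maxMs;

    RandomBackoff(int maxMs) { this.maxMs = maxMs; }

    @Override
    public long nextSleepMs() { return random.nextInt(maxMs); }
}

// Test-style behavior: always the same sleep, so assertions are deterministic.
final class FixedBackoff implements Backoff {
    private final long sleepMs;

    FixedBackoff(long sleepMs) { this.sleepMs = sleepMs; }

    @Override
    public long nextSleepMs() { return sleepMs; }
}

// Hypothetical caller that receives its backoff policy from outside.
final class Reconnector {
    private final Backoff backoff;

    Reconnector(Backoff backoff) { this.backoff = backoff; }

    // Returns how long to wait before the next connection attempt.
    long planRetryDelayMs() { return backoff.nextSleepMs(); }

    public static void main(String[] args) {
        System.out.println(new Reconnector(new FixedBackoff(25)).planRetryDelayMs());
        System.out.println(new Reconnector(new RandomBackoff(1000)).planRetryDelayMs());
    }
}
```

With a fixed policy injected, a test no longer depends on the randomly generated sleep time and can assert on exact timings.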

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DanBenediktson/zookeeper ZOOKEEPER-2471

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/330.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #330


commit 60f38726e7f07b4bb970cc8fb089363ff48eb3df
Author: Dan Benediktson <dbenedikt...@twitter.com>
Date:   2017-08-09T16:41:42Z

ZOOKEEPER-2471: Zookeeper Java client should not count time spent sleeping 
as time spent connecting

Rather than keep the brittle "updateNow()" implementation which led to the 
bug and fixing the bug by adding another "updateNow()" call, I have deleted 
updateNow() and replaced usage of that member variable with actually getting 
the current system timestamp.

This is, IMO, too difficult to test without introducing a lot of invasive 
changes to ClientCnxn.java, seeing as the only effective change is that, on 
connection retry, a random sleep time is no longer counted towards a time 
budget. If a fix is taken for ZOOKEEPER-2869, this should become much easier 
to test, and since I'm planning to submit a pull request for that ticket as 
well, maybe as a compromise I can submit a test for this patch at that time?



