[jira] [Commented] (ZOOKEEPER-2080) Fix deadlock in dynamic reconfiguration

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999296#comment-15999296
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2080:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/247#discussion_r115113425
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -682,27 +682,19 @@ public void setQuorumAddress(InetSocketAddress addr){
 }
 
 public InetSocketAddress getElectionAddress(){
-synchronized (QV_LOCK) {
-return myElectionAddr;
-}
+return myElectionAddr;
--- End diff --

Note this synchronization was introduced as part of ZOOKEEPER-2080 (which 
fixed a separate deadlock issue). The intent was to protect access to this 
state while a reconfig operation is in flight, but my analysis indicates it 
might be totally fine without the synchronization.
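
One reason a getter like this can be safe without the lock: if the field holds 
a reference to an immutable object and is declared volatile, a plain read sees 
the most recently published reference. A minimal sketch of that pattern 
follows. The class name ElectionAddressHolder is hypothetical, and whether 
myElectionAddr is actually volatile in QuorumPeer is an assumption here, not 
something the diff above shows.

```java
import java.net.InetSocketAddress;

// Hypothetical holder class for illustration; this is not the real QuorumPeer.
final class ElectionAddressHolder {
    // Assumption: the field is volatile, so a write by one thread is
    // visible to later reads on other threads without taking a lock.
    private volatile InetSocketAddress electionAddr;

    // Lock-free read: returns whichever reference was published last.
    InetSocketAddress getElectionAddress() {
        return electionAddr;
    }

    // A single reference assignment is atomic; volatile adds visibility.
    void setElectionAddress(InetSocketAddress addr) {
        this.electionAddr = addr;
    }
}
```

This only covers reads of a single field. If callers need several fields to 
change together, a lock (or one immutable snapshot object) is still required.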


> Fix deadlock in dynamic reconfiguration
> ---
>
> Key: ZOOKEEPER-2080
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2080
> Project: ZooKeeper
>  Issue Type: Sub-task
>  Components: server
>Affects Versions: 3.5.2
>Reporter: Ted Yu
>Assignee: Michael Han
> Fix For: 3.5.3, 3.6.0
>
> Attachments: jacoco-ZOOKEEPER-2080.unzip-grows-to-70MB.7z, 
> repro-20150816.log, threaddump.log, ZOOKEEPER-2080.patch, 
> ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch, 
> ZOOKEEPER-2080.patch, ZOOKEEPER-2080.patch
>
>
> I got the following test failure on MacBook with trunk code:
> {code}
> Testcase: testCurrentObserverIsParticipantInNewConfig took 93.628 sec
>   FAILED
> waiting for server 2 being up
> junit.framework.AssertionFailedError: waiting for server 2 being up
>   at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
>   at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2778) Potential server deadlock between follower sync with leader and follower receiving external connection requests.

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999304#comment-15999304
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2778:
---

GitHub user hanm reopened a pull request:

https://github.com/apache/zookeeper/pull/247

ZOOKEEPER-2778: Potential server deadlock between follower sync with leader 
and follower receiving external connection requests.

Remove synchronization requirements on certain methods to prevent a 
deadlock. Current analysis indicates these methods don't require synchronization 
to work properly. The patch was stress tested with 1k runs of the entire unit 
test suite.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hanm/zookeeper ZOOKEEPER-2778

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #247


commit 4453ab2a32f8b1a6d195a65c5466c1749bbb3464
Author: Michael Han 
Date:   2017-05-06T05:13:26Z

Potential server deadlock between follower sync with leader and follower 
receiving external connection requests.
Remove synchronization requirements on certain methods to prevent dead 
lock. Current analysis indicates these methods don't require synchronization 
for them to work properly.




> Potential server deadlock between follower sync with leader and follower 
> receiving external connection requests.
> 
>
> Key: ZOOKEEPER-2778
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2778
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum
>Affects Versions: 3.5.3
>Reporter: Michael Han
>Assignee: Michael Han
>Priority: Critical
>
> It's possible to have a deadlock during the recovery phase. 
> Found this issue by analyzing thread dumps of the "flaky" ReconfigRecoveryTest 
> [1]. Here is a sample thread dump that illustrates the state of the 
> execution:
> {noformat}
> [junit]  java.lang.Thread.State: BLOCKED
> [junit] at org.apache.zookeeper.server.quorum.QuorumPeer.getElectionAddress(QuorumPeer.java:686)
> [junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager.initiateConnection(QuorumCnxManager.java:265)
> [junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:445)
> [junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:369)
> [junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:642)
> [junit] 
> [junit]  java.lang.Thread.State: BLOCKED
> [junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:472)
> [junit] at org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1438)
> [junit] at org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1471)
> [junit] at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:520)
> [junit] at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:88)
> [junit] at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {noformat}
> The deadlock happens between the quorum peer thread, which runs the 
> follower that is syncing with the leader, and the listener thread of the qcm 
> (QuorumCnxManager) of the same quorum peer, which handles incoming 
> connections. To finish syncing with the leader, the follower needs to 
> synchronize on both QV_LOCK and the qcm object it owns; to finish setting up 
> an incoming connection, the listener thread needs to synchronize on both the 
> qcm object the quorum peer owns and the same QV_LOCK. The problem is that the 
> two threads acquire the two locks in different orders, so depending on timing 
> and actual execution order, each thread can end up holding one lock while 
> waiting for the other.
> [1] 
> org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig
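
The hazard described above is the classic two-lock ordering problem. The pull 
request avoids it by dropping synchronization from some methods; the sketch 
below shows a different, general-purpose remedy (both code paths take the 
locks in the same order) purely to illustrate why the order matters. All names 
in it are hypothetical stand-ins, not ZooKeeper code.

```java
// Hypothetical sketch; the lock names mirror the description, not real code.
final class LockOrderSketch {
    private final Object qvLock = new Object();      // stands in for QV_LOCK
    private final Object cnxManager = new Object();  // stands in for the qcm monitor
    private int completed;

    // Path taken by the follower thread while syncing with the leader.
    void syncPath() {
        synchronized (qvLock) {
            synchronized (cnxManager) {
                completed++;
            }
        }
    }

    // Path taken by the listener thread. It uses the SAME order
    // (qvLock, then cnxManager); swapping the two here would allow a deadlock.
    void receivePath() {
        synchronized (qvLock) {
            synchronized (cnxManager) {
                completed++;
            }
        }
    }

    // Runs both paths concurrently and returns the number of completed calls.
    static int runBoth(int iterations) {
        LockOrderSketch s = new LockOrderSketch();
        Thread follower = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.syncPath();
        });
        Thread listener = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.receivePath();
        });
        follower.start();
        listener.start();
        try {
            follower.join();
            listener.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        synchronized (s.qvLock) {
            synchronized (s.cnxManager) {
                return s.completed;
            }
        }
    }
}
```

With a consistent order, neither thread can hold the second lock while waiting 
for the first, so both loops always run to completion.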





[jira] [Commented] (ZOOKEEPER-2778) Potential server deadlock between follower sync with leader and follower receiving external connection requests.

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999303#comment-15999303
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2778:
---

Github user hanm closed the pull request at:

https://github.com/apache/zookeeper/pull/247




[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001538#comment-16001538
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/106#discussion_r115353017
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -809,28 +809,16 @@ synchronized public void stopLeaderElection() {
 responder.interrupt();
--- End diff --

removed.


> org.apache.zookeeper.test.LETest.testLE fails once in a while
> -
>
> Key: ZOOKEEPER-1932
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1932
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: leaderElection
>Affects Versions: 3.5.0
>Reporter: Michi Mutsuzaki
>Assignee: Michael Han
> Fix For: 3.6.0
>
> Attachments: TEST-org.apache.zookeeper.test.LETest.txt, 
> ZOOKEEPER-1932.patch, ZOOKEEPER-1932.patch
>
>
> org.apache.zookeeper.test.LETest.testLE is failing on trunk once in a while. 
> I'm not able to reproduce the failure on my box. I looked at the log, but I 
> couldn't quite figure out what's going on. 
> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2315/testReport/
> Update:
> ==
> Because LE is deprecated, there is not much point in spending effort fixing 
> it, as discussed in the JIRA. Updated the JIRA title to reflect the state of 
> the issue.





[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001536#comment-16001536
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/106#discussion_r115352994
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -930,39 +918,36 @@ protected Election createElectionAlgorithm(int 
electionAlgorithm){
 
 //TODO: use a factory rather than a switch
 switch (electionAlgorithm) {
-case 0:
-le = new LeaderElection(this);
-break;
-case 1:
-le = new AuthFastLeaderElection(this);
-break;
-case 2:
-le = new AuthFastLeaderElection(this, true);
-break;
-case 3:
-qcm = new QuorumCnxManager(this);
-QuorumCnxManager.Listener listener = qcm.listener;
-if(listener != null){
-listener.start();
-FastLeaderElection fle = new FastLeaderElection(this, qcm);
-fle.start();
-le = fle;
-} else {
-LOG.error("Null listener when initializing cnx manager");
-}
-break;
-default:
-assert false;
+case 0:
+assert false : "Leader election algorithm type 0 is not 
supported anymore.";
+break;
+case 1:
+le = new AuthFastLeaderElection(this);
+break;
+case 2:
+le = new AuthFastLeaderElection(this, true);
+break;
+case 3:
+qcm = new QuorumCnxManager(this);
+QuorumCnxManager.Listener listener = qcm.listener;
+if(listener != null){
+listener.start();
+FastLeaderElection fle = new FastLeaderElection(this, 
qcm);
+fle.start();
+le = fle;
+} else {
+LOG.error("Null listener when initializing cnx 
manager");
+}
+break;
+default:
+assert false;
 }
 return le;
 }
 
 @SuppressWarnings("deprecation")
 protected Election makeLEStrategy(){
 LOG.debug("Initializing leader election protocol...");
-if (getElectionType() == 0) {
-electionAlg = new LeaderElection(this);
-}
 return electionAlg;
 }
 
--- End diff --

good catch. fixed.




[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001539#comment-16001539
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/106#discussion_r115353053
  
--- Diff: src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml ---
@@ -948,15 +948,14 @@ server.3=zoo3:2888:3888
 
   (No Java system property)
 
-  Election implementation to use. A value of "0" 
corresponds
-  to the original UDP-based version, "1" corresponds to the
+  Election implementation to use. A value of "1" 
corresponds to the
   non-authenticated UDP-based version of fast leader election, 
"2"
   corresponds to the authenticated UDP-based version of fast
   leader election, and "3" corresponds to TCP-based version of
-  fast leader election. Currently, algorithm 3 is the 
default
+  fast leader election. Currently, algorithm 3 is the 
default.
   
   
-   The implementations of leader election 0, 1, and 2 
are now 
+   The implementations of leader election 1, and 2 are 
now
deprecated . We have the 
intention
   of removing them in the next release, at which point only 
the 
   FastLeaderElection will be available. 
--- End diff --

removed.




[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001545#comment-16001545
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/106#discussion_r115353405
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -930,39 +918,36 @@ protected Election createElectionAlgorithm(int 
electionAlgorithm){
 
 //TODO: use a factory rather than a switch
 switch (electionAlgorithm) {
-case 0:
-le = new LeaderElection(this);
-break;
-case 1:
-le = new AuthFastLeaderElection(this);
-break;
-case 2:
-le = new AuthFastLeaderElection(this, true);
-break;
-case 3:
-qcm = new QuorumCnxManager(this);
-QuorumCnxManager.Listener listener = qcm.listener;
-if(listener != null){
-listener.start();
-FastLeaderElection fle = new FastLeaderElection(this, qcm);
-fle.start();
-le = fle;
-} else {
-LOG.error("Null listener when initializing cnx manager");
-}
-break;
-default:
-assert false;
+case 0:
+assert false : "Leader election algorithm type 0 is not 
supported anymore.";
--- End diff --

I like the idea of catching an invalid electionAlg value when parsing the 
config file. The code now throws an exception when the input value is bad.
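
As a rough illustration of that kind of parse-time check, here is a minimal 
sketch. It assumes, per the diff above, that values 1, 2 and 3 remain valid 
and 0 is rejected. The class and method names are hypothetical, and 
IllegalArgumentException is only a stand-in for whatever exception type the 
real config parser throws.

```java
// Hypothetical helper; not the actual ZooKeeper config parsing code.
final class ElectionAlgSketch {
    // Parses the raw config value and rejects anything outside 1..3,
    // so a bad value fails at startup instead of inside the election switch.
    static int parseElectionAlg(String raw) {
        int alg;
        try {
            alg = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "electionAlg must be an integer, got: " + raw, e);
        }
        if (alg < 1 || alg > 3) {
            throw new IllegalArgumentException(
                    "Invalid electionAlg value " + alg + "; expected 1, 2 or 3");
        }
        return alg;
    }
}
```

Failing while the config file is read gives the operator an error message that 
names the bad setting, which an assert in the switch would not do when 
assertions are disabled.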




[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001546#comment-16001546
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/106
  
@arshadmohammad thanks for the review. The code is updated, please take a look. 
I also updated the JIRA description so it's consistent with the pull request title.




[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002752#comment-16002752
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2779:
---

GitHub user Randgalt opened a pull request:

https://github.com/apache/zookeeper/pull/248

[ZOOKEEPER-2779] Provide a means to disable setting of the Read Only ACL 
for the reconfig node

Provide a means to disable setting of the Read Only ACL for the reconfig 
node added in ZOOKEEPER-2014. That change made it very cumbersome to use the 
reconfig feature and also could worsen security as the entire ZK database is 
open to "super" user while the reconfig node is being changed (the only 
possible method as of ZOOKEEPER-2014).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2779

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #248


commit de07e525ec8b4aa24283eedb159f8df2b576e71b
Author: randgalt 
Date:   2017-05-09T14:12:41Z

Provide a means to disable setting of the Read Only ACL for the reconfig 
node added in ZOOKEEPER-2014. That change made it very cumbersome to use the 
reconfig feature and also could worsen security as the entire ZK database is 
open to "super" user while the reconfig node is being changed (the only 
possible method as of ZOOKEEPER-2014).




> Add option to not set ACL for reconfig node
> ---
>
> Key: ZOOKEEPER-2779
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2779
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.5.3
>Reporter: Jordan Zimmerman
>Assignee: Jordan Zimmerman
> Fix For: 3.5.4, 3.6.0
>
>
> ZOOKEEPER-2014 changed the behavior of the /zookeeper/config node by setting 
> the ACL to {{ZooDefs.Ids.READ_ACL_UNSAFE}}. This change makes it very 
> cumbersome to use the reconfig APIs. It also, perversely, makes security 
> worse as the ZooKeeper instance must be opened to "super" user while enabled 
> reconfig (per {{ReconfigExceptionTest.java}}). Provide a mechanism for savvy 
> users to disable this ACL so that an application-specific custom ACL can be 
> set.





[jira] [Commented] (ZOOKEEPER-2779) Add option to not set ACL for reconfig node

2017-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002757#comment-16002757
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2779:
---

GitHub user Randgalt opened a pull request:

https://github.com/apache/zookeeper/pull/249

[ZOOKEEPER-2779] Branch 3.5 backport

Branch 3.5 backport of https://github.com/apache/zookeeper/pull/248

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2779-3.5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #249


commit 5b59ee2e2cbcc65a76b76cf4fc5f0b6c6a92a980
Author: randgalt 
Date:   2017-05-09T14:12:41Z

Provide a means to disable setting of the Read Only ACL for the reconfig 
node added in ZOOKEEPER-2014. That change made it very cumbersome to use the 
reconfig feature and also could worsen security as the entire ZK database is 
open to "super" user while the reconfig node is being changed (the only 
possible method as of ZOOKEEPER-2014).






[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005777#comment-16005777
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
@hanm Hi, may I ask when this problem will be fixed? And will it be fixed 
in the 3.4.x (stable) version? Thanks.


> recreateSocketAddresses may recreate the unreachable IP address
> ---
>
> Key: ZOOKEEPER-2691
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.8
> Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>Reporter: JiangJiafu
>Priority: Minor
>
> QuorumPeer$QuorumServer.recreateSocketAddresses() is used to resolve the 
> hostname to a new IP address (InetAddress) when any exception happens to the 
> socket. It is very useful when a hostname can be resolved to more than 
> one IP address.
> But the problem is that the Java API InetAddress.getByName(String hostname) 
> always returns the first IP address when the hostname resolves to more 
> than one IP address, and the first IP address may be unreachable forever. For 
> example, suppose a machine has two network interfaces, eth0 with ip1 and eth1 
> with ip2, and the mapping between the hostname and the IP addresses is 
> set in /etc/hosts. When I "close" eth0 with the command "ifdown eth0", 
> InetAddress.getByName(String hostname) still returns ip1, which is 
> unreachable forever.
> So I think it would be better to check the IP addresses with 
> InetAddress.isReachable(int) and choose a reachable one. 
> I have modified the ZooKeeper source code and tested the new code in my own 
> environment, and it works well when I take down some network 
> interfaces using the "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = InetAddress.getByName(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the past?
>         if (this.addr != null) {
>             // Yes, previously the lookup succeeded. Leave things as they are
>             return;
>         }
>         // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
>         this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                    this.electionPort);
>         }
>     }
> }
> {code}
> After my modification:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = getReachableAddress(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the past?
>         if (this.addr != null) {
>             // Yes, previously the lookup succeeded. Leave things as they are
>             return;
>         }
>         // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
>         this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                    this.electionPort);
>         }
>     }
> }
> public InetAddress getReacha
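
The reporter's helper is cut off in the quoted text, so it is not reproduced 
here. Purely to illustrate the idea described (prefer an address that answers 
a reachability probe over the first one returned), here is an independent 
sketch. The class and method names are hypothetical and this is not the 
reporter's implementation. It relies on the standard 
InetAddress.getAllByName(String) and InetAddress.isReachable(int) calls.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; not the code attached to the JIRA issue.
final class ReachableAddressSketch {
    // Returns the first candidate accepted by the probe. Falls back to the
    // first candidate when none passes, and to null when the list is empty.
    static <T> T firstAccepted(List<T> candidates, Predicate<T> probe) {
        for (T candidate : candidates) {
            if (probe.test(candidate)) {
                return candidate;
            }
        }
        return candidates.isEmpty() ? null : candidates.get(0);
    }

    // Resolves every address for the hostname and prefers a reachable one.
    static InetAddress resolveReachable(String hostname, int timeoutMs)
            throws UnknownHostException {
        List<InetAddress> all = Arrays.asList(InetAddress.getAllByName(hostname));
        return firstAccepted(all, addr -> {
            try {
                return addr.isReachable(timeoutMs);
            } catch (IOException e) {
                return false; // treat a probe error as "not reachable"
            }
        });
    }
}
```

One caveat with this approach: isReachable typically relies on ICMP echo or a 
TCP connection to the echo port, so a host that filters those can look 
unreachable even though its ZooKeeper ports are open. The fallback to the 
first address keeps the old behaviour in that case.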

[jira] [Commented] (ZOOKEEPER-2755) Allow to subclass ClientCnxnSocketNetty and NettyServerCnxn in order to use Netty Local transport

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006087#comment-16006087
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2755:
---

Github user eolivelli commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/227#discussion_r115932842
  
--- Diff: src/java/test/org/apache/zookeeper/test/NettyLocalSuiteTest.java 
---
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.zookeeper.test;
+
+import org.junit.runners.Suite;
+
+/**
+ * Run tests with: Netty Client against Netty server
+ */
+@Suite.SuiteClasses({
--- End diff --

tagging @arshadmohammad @hanm  for review/merge

@Randgalt do you think this new feature would be useful for Curator and for 
local testing of apps which use ZooKeeper?


> Allow to subclass ClientCnxnSocketNetty and NettyServerCnxn in order to use 
> Netty Local transport
> -
>
> Key: ZOOKEEPER-2755
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2755
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: java client, server
>Affects Versions: 3.5.2
>Reporter: Enrico Olivelli
>
> ClientCnxnSocketNetty and NettyServerCnxn explicitly use the InetSocketAddress 
> class to work with network addresses.
> We can do a little refactoring to use only SocketAddress and make it possible 
> to create subclasses of ClientCnxnSocketNetty and NettyServerCnxn which 
> leverage built-in Netty 'local' channels. 
> Such Netty local channels do not create real sockets and so allow a simple 
> ZooKeeper server + ZooKeeper client to be run on the same JVM without binding 
> to real TCP endpoints.
> Use cases:
> - Concurrently run, on the same machine, the tests of projects which use 
> ZooKeeper (usually in unit tests the server and the client run inside the 
> same JVM) without dealing with random ports and, in general, using fewer 
> network resources.
> - Run simplified (standalone, all processes in the same JVM) versions of 
> applications which need a working ZooKeeper ensemble to run.
> Note:
> Embedding ZooKeeper server + client on the same JVM has many risks and in 
> general I think we should not encourage users to do so, so in this patch I will 
> not provide official implementations of ClientCnxnSocketNetty and 
> NettyServerCnxn. There will be implementations only inside the test packages, 
> in order to test that most of the features are working with custom socket 
> factories and in particular with the 'LocalAddress' specific subclass of 
> SocketAddress.
> Note:
> the 'Local' sockets feature will be available on Netty 4 too
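
The refactoring idea in the description can be illustrated with a minimal sketch. It is not taken from the patch: `InProcessAddress` and `Endpoint` are hypothetical names, and `InProcessAddress` merely stands in for Netty's `LocalAddress` so that the example needs only the JDK. The point is that code typed against `SocketAddress` accepts both a TCP address and an in-JVM address.

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;

/** Hypothetical in-JVM address; a stand-in for Netty's LocalAddress. */
final class InProcessAddress extends SocketAddress {
    private static final long serialVersionUID = 1L;
    private final String id;

    InProcessAddress(String id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return "local:" + id;
    }
}

/** Hypothetical holder that depends only on SocketAddress, not InetSocketAddress. */
final class Endpoint {
    private final SocketAddress address;

    Endpoint(SocketAddress address) {
        this.address = address;
    }

    /** True only for addresses that would bind or connect a real TCP socket. */
    boolean usesRealSocket() {
        return address instanceof InetSocketAddress;
    }

    String describe() {
        return address.toString();
    }
}
```

With this shape, a test could construct `new Endpoint(new InProcessAddress("zk-test"))` while production code keeps passing an `InetSocketAddress`.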



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006555#comment-16006555
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user arshadmohammad commented on the issue:

https://github.com/apache/zookeeper/pull/106
  
LGTM  +1


> org.apache.zookeeper.test.LETest.testLE fails once in a while
> -
>
> Key: ZOOKEEPER-1932
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1932
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: leaderElection
>Affects Versions: 3.5.0
>Reporter: Michi Mutsuzaki
>Assignee: Michael Han
> Fix For: 3.6.0
>
> Attachments: TEST-org.apache.zookeeper.test.LETest.txt, 
> ZOOKEEPER-1932.patch, ZOOKEEPER-1932.patch
>
>
> org.apache.zookeeper.test.LETest.testLE is failing on trunk once in a while. 
> I'm not able to reproduce the failure on my box. I looked at the log, but I 
> couldn't quite figure out what's going on. 
> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2315/testReport/
> Update:
> ==
> Because LE is deprecated, there is not much point in spending effort fixing 
> it, as discussed in the JIRA. Updated the JIRA title to reflect the state of the 
> issue.





[jira] [Commented] (ZOOKEEPER-1932) org.apache.zookeeper.test.LETest.testLE fails once in a while

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006587#comment-16006587
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1932:
---

Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/106




[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007175#comment-16007175
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/173#discussion_r116099795
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -181,6 +197,33 @@ public void recreateSocketAddresses() {
 }
 }
 
+/**
+ * Resolve the hostname to IP addresses, and find one reachable address.
+ *
+ * @param hostname the name of the host
+ * @param timeout the time, in millseconds, before {@link InetAddress#isReachable}
+ * aborts
+ * @return a reachable IP address. If no such IP address can be found,
+ * just return the first IP address of the hostname.
+ *
+ * @exception UnknownHostException
+ */
+public InetAddress getReachableAddress(String hostname, int timeout) 
+throws UnknownHostException {
+InetAddress[] addresses = InetAddress.getAllByName(hostname);
+for (InetAddress a : addresses) {
+try {
+if (a.isReachable(timeout)) {
--- End diff --

I think this is a valid concern. On top of this, I think we should make 
sure users can fall back to the old behavior if needed. With this patch, 
`isReachable` is always called, regardless of whether the property 
'zookeeper.ipReachableTimeout' is defined. How about something like this:

String timeout = System.getProperty("zookeeper.ipReachableTimeout");
if (timeout == null) {
    // Property not defined: keep the old behavior.
    address = InetAddress.getByName(this.hostname);
} else {
    address = getReachableAddress(this.hostname, Integer.parseInt(timeout));
}
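
A self-contained sketch of the fallback suggested above might look like the following. It is an illustration under assumptions, not the code in the pull request: the class name `AddressResolver` and the method `resolve` are hypothetical, and treating a negative timeout like an undefined property is a choice made here only because `InetAddress.isReachable` rejects negative timeouts.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper; only the property name comes from the review comment. */
final class AddressResolver {
    static final String TIMEOUT_PROPERTY = "zookeeper.ipReachableTimeout";

    /** Returns the first reachable address, or the first resolved address if none responds. */
    static InetAddress getReachableAddress(String hostname, int timeoutMs)
            throws UnknownHostException {
        InetAddress[] addresses = InetAddress.getAllByName(hostname);
        for (InetAddress a : addresses) {
            try {
                if (a.isReachable(timeoutMs)) {
                    return a;
                }
            } catch (IOException e) {
                // Treat a network error as "not reachable" and try the next address.
            }
        }
        return addresses[0];
    }

    /** Uses the old getByName path unless a usable timeout property is set. */
    static InetAddress resolve(String hostname) throws UnknownHostException {
        // Integer.getInteger returns null when the property is missing or not a number.
        Integer timeoutMs = Integer.getInteger(TIMEOUT_PROPERTY);
        if (timeoutMs == null || timeoutMs < 0) {
            // Assumption: negative values fall back to the old behavior.
            return InetAddress.getByName(hostname);
        }
        return getReachableAddress(hostname, timeoutMs);
    }
}
```

Keeping the old path as the default means deployments that never set the property see no behavior change and pay no extra reachability probe.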


> recreateSocketAddresses may recreate the unreachable IP address
> ---
>
> Key: ZOOKEEPER-2691
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.8, 3.4.9, 3.4.10, 3.5.0, 3.5.1, 3.5.2
> Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>Reporter: JiangJiafu
>Priority: Minor
>
> QuorumPeer$QuorumServer.recreateSocketAddresses() is used to resolve the 
> hostname to a new IP address (InetAddress) when any exception happens to the 
> socket. This is very useful when a hostname can be resolved to more than 
> one IP address.
> But the problem is that the Java API InetAddress.getByName(String hostname) will 
> always return the first IP address when the hostname can be resolved to more 
> than one IP address, and the first IP address may be unreachable forever. For 
> example, suppose a machine has two network interfaces, eth0 and eth1, where 
> eth0 has ip1 and eth1 has ip2, and the mapping between the hostname and the IP 
> addresses is set in /etc/hosts. When I "close" eth0 with the command 
> "ifdown eth0", InetAddress.getByName(String hostname) will still return ip1, 
> which is unreachable forever.
> So I think it would be better to check each IP address with 
> InetAddress.isReachable(int) and choose a reachable IP address. 
> I have modified the ZooKeeper source code and tested the new code in my own 
> environment, and it works very well when I take down some network 
> interfaces using the "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
> InetAddress address = null;
> try {
> address = InetAddress.getByName(this.hostname);
> LOG.info("Resolved hostname: {} to address: {}", 
> this.hostname, address);
> this.addr = new InetSocketAddress(address, this.port);
> if (this.electionPort > 0){
> this.electionAddr = new InetSocketAddress(address, 
> this.electionPort);
> }
> } catch (UnknownHostException ex) {
> LOG.warn("Failed to resolve address: {}", this.hostname, ex);
> // Have we succeeded in the past?
> if (this.addr != null) {
> // Yes, previously the lookup succeeded. Leave things as they are
> return;
> }
> // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
> this.addr = InetSocketAddress.createUnresolved(this.hostname, 
> this.port);
> if (this.electionPort > 0){
> this.electionAddr = 
> InetSocketAddress.createUnresolved(this.hostname,
>

[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007180#comment-16007180
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
@JiangJiafu Apologies for the lag on code review. I think this patch still 
needs a little more work before it can be merged:

* Provide a way to use the old address creation function by checking the 
system property (see my comment in the code).
* Documentation (see Abe's comment)
* Typo (see Edward's comment)



[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007456#comment-16007456
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
Sorry for asking, but how do I update the documentation? 
Should I modify the HTML files in the docs directory?



[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007588#comment-16007588
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
@hanm Hi, I have modified the code according to your advice, except for the 
second item:
"Documentation (see Abe's comment)"




[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007652#comment-16007652
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
I'll review later, but a quick answer to your question on how to update 
the docs: no, don't modify the HTML directly. Instead, modify only the source 
of the docs. The sources are in the folder 
src/docs/src/documentation/content/xdocs. In addition, it would be good to 
verify your doc change locally by compiling the doc source with Apache Forrest 
(https://forrest.apache.org/). But please don't include the compiled documents 
(the HTML and PDF files) as part of the patch; you only need to change the 
source of the documents. You can check the commit history of 
src/docs/src/documentation/content/xdocs and learn by example; it should be 
pretty straightforward.




[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007887#comment-16007887
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
@hanm I have updated the documentation. Please review the code and the 
documentation, thank you.


> recreateSocketAddresses may recreate the unreachable IP address
> ---
>
> Key: ZOOKEEPER-2691
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.8, 3.4.9, 3.4.10, 3.5.0, 3.5.1, 3.5.2
> Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>Reporter: JiangJiafu
>Priority: Minor
>
> QuorumPeer$QuorumServer.recreateSocketAddresses() is used to resolve the 
> hostname to a new IP address (InetAddress) when any exception happens to the 
> socket. This is very useful when a hostname can be resolved to more than 
> one IP address.
> But the problem is that the Java API InetAddress.getByName(String hostname) will 
> always return the first IP address when the hostname can be resolved to more 
> than one IP address, and the first IP address may be unreachable forever. For 
> example, suppose a machine has two network interfaces, eth0 and eth1, where 
> eth0 has ip1 and eth1 has ip2, and the mapping between the hostname and the IP 
> addresses is set in /etc/hosts. When I "close" eth0 with the command 
> "ifdown eth0", InetAddress.getByName(String hostname) will still return ip1, 
> which is unreachable forever.
> So I think it would be better to check each IP address with 
> InetAddress.isReachable(int) and choose a reachable IP address. 
> I have modified the ZooKeeper source code and tested the new code in my own 
> environment, and it works very well when I take down some network 
> interfaces using the "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
> InetAddress address = null;
> try {
> address = InetAddress.getByName(this.hostname);
> LOG.info("Resolved hostname: {} to address: {}", 
> this.hostname, address);
> this.addr = new InetSocketAddress(address, this.port);
> if (this.electionPort > 0){
> this.electionAddr = new InetSocketAddress(address, 
> this.electionPort);
> }
> } catch (UnknownHostException ex) {
> LOG.warn("Failed to resolve address: {}", this.hostname, ex);
> // Have we succeeded in the past?
> if (this.addr != null) {
> // Yes, previously the lookup succeeded. Leave things as they are
> return;
> }
> // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
> this.addr = InetSocketAddress.createUnresolved(this.hostname, 
> this.port);
> if (this.electionPort > 0){
> this.electionAddr = 
> InetSocketAddress.createUnresolved(this.hostname,
>
> this.electionPort);
> }
> }
> }
> {code}
> After my modification:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
> InetAddress address = null;
> try {
> address = getReachableAddress(this.hostname);
> LOG.info("Resolved hostname: {} to address: {}", 
> this.hostname, address);
> this.addr = new InetSocketAddress(address, this.port);
> if (this.electionPort > 0){
> this.electionAddr = new InetSocketAddress(address, 
> this.electionPort);
> }
> } catch (UnknownHostException ex) {
> LOG.warn("Failed to resolve address: {}", this.hostname, ex);
> // Have we succeeded in the past?
> if (this.addr != null) {
> // Yes, previously the lookup succeeded. Leave things as they are
> return;
> }
> // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
> this.addr = InetSocketAddress.createUnresolved(this.hostname, 
> this.port);
> if (this.electionPort > 0){
> this.electionAddr = 
> InetSocketAddress.createUnresolved(this.hostname,
>
> this.electionPort);
> }
> }
> }
> public InetAddres

[jira] [Commented] (ZOOKEEPER-2781) Flaky test: testClientAuthAgainstNoAuthServerWithLowerSid

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008906#comment-16008906
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2781:
---

GitHub user afine opened a pull request:

https://github.com/apache/zookeeper/pull/251

ZOOKEEPER-2781 Flaky test: testClientAuthAgainstNoAuthServerWithLowerSid

The flaky test appears to be caused by a race condition in QuorumCnxManager 
that could potentially prevent two servers from connecting to each other. I was 
able to reproduce the issue with a debugger and a little bit of patience. It 
would be great if someone can share a less contrived way to reproduce the same 
issue. Here is the basic order of execution required to reproduce the issue 
between two peers (using lines of code from before this patch).  Point of 
clarification, reaching a line means hitting but not yet executing that line 
(equivalent to setting a breakpoint on that line).

1. peer1 enters `startConnection` and reaches QuorumCnxManager.java:365
2. peer0's Listener enters `handleConnection` and reaches 
QuorumCnxManager.java:506
3. peer0 enters `startConnection` and reaches QuorumCnxManager.java:353
4. peer1's Listener enters `handleConnection` and reaches 
QuorumCnxManager.java:483
5. peer1 executes QuorumCnxManager.java:365 and reaches 
QuorumCnxManager.java:374
6. peer0's Listener executes QuorumCnxManager.java:506 and starts a 
RecvWorker which stops at QuorumCnxManager.java:1027. The Listener reaches 
QuorumCnxManager.java:516.
7. peer1's Listener continues executing from QuorumCnxManager.java:483, 
which removes the SendWorker and RecvWorker for its connection to peer0, and 
reaches QuorumCnxManager.java:493
8. peer0's RecvWorker executes QuorumCnxManager.java:1027; the socket has 
since been closed on peer1, so an exception is thrown:
```
[junit] 2017-05-07 14:48:11,055 [myid:] - WARN  [RecvWorker:1:QuorumCnxManager$RecvWorker@1042] - Connection broken for id 1, my id = 0, error =
[junit] java.io.EOFException
[junit] at java.io.DataInputStream.readInt(DataInputStream.java:392)
[junit] at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1027)
```
9. peer0 executes QuorumCnxManager.java:353 and reaches 
QuorumCnxManager.java:377
10. peer1 executes connectOne at QuorumCnxManager.java:493 and continues 
until it reaches QuorumCnxManager.java:290, which completes the thread
11. peer0's Listener continues executing from QuorumCnxManager.java:516. 
It calls `handleConnection` again but returns at QuorumCnxManager.java:469 since 
peer1 never wrote its sid to the socket.
12. peer0 continues executing from QuorumCnxManager.java:377
13. peer1 continues executing from QuorumCnxManager.java:374


Both threads finish and no quorum has been formed, so
`testClientAuthAgainstNoAuthServerWithLowerSid` times out.
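
For orientation, the sequence above turns on the tie-break that a Listener
applies to incoming connections: as the steps describe, peer1's Listener
removes the workers for the connection opened by peer0 and then calls
connectOne itself. Below is a minimal sketch of that rule, assuming the
decision is a plain comparison of server ids. The class, enum, and method
names are hypothetical and are not taken from QuorumCnxManager.

```java
// Hypothetical model of the connection tie-break; not ZooKeeper source code.
class TieBreakSketch {
    enum Action { KEEP_CONNECTION, DROP_AND_RECONNECT }

    // Assumption: a peer drops an incoming connection when the remote sid is
    // lower than its own, and then dials back; otherwise it keeps the socket.
    static Action onIncomingConnection(long remoteSid, long mySid) {
        return remoteSid < mySid ? Action.DROP_AND_RECONNECT : Action.KEEP_CONNECTION;
    }

    public static void main(String[] args) {
        System.out.println("peer1 receives from peer0: " + onIncomingConnection(0, 1));
        System.out.println("peer0 receives from peer1: " + onIncomingConnection(1, 0));
    }
}
```

The race arises because the drop on one side and the read on the other side
are not ordered with respect to each other; the sketch only captures the
decision itself, not the interleaving.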




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afine/zookeeper ZOOKEEPER-2781

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/251.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #251


commit 5f231641a9496aaf84a0b58d87f5fc365fa9b7e6
Author: Abraham Fine 
Date:   2017-05-12T20:23:55Z

ZOOKEEPER-2781: Flaky test: testClientAuthAgainstNoAuthServerWithLowerSid




> Flaky test: testClientAuthAgainstNoAuthServerWithLowerSid
> -
>
> Key: ZOOKEEPER-2781
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2781
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.10
>Reporter: Abraham Fine
>Assignee: Abraham Fine
>
> Here is an example failing job: 
> https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1489/



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2552) Revisit release note doc and remove the items which are not related to the released version

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009012#comment-16009012
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2552:
---

Github user eribeiro closed the pull request at:

https://github.com/apache/zookeeper/pull/124


> Revisit release note doc and remove the items which are not related to the 
> released version
> ---
>
> Key: ZOOKEEPER-2552
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2552
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.9
>Reporter: Rakesh R
>Assignee: Edward Ribeiro
> Fix For: 3.4.10
>
> Attachments: closed.py, ZOOKEEPER-2552.patch
>
>
> A couple of issues are listed on
> http://zookeeper.apache.org/doc/r3.4.9/releasenotes.html that are either
> 'Open' or 'Patch available'. For example, issues were wrongly marked with the
> "3.4.8" fix version in jira, which caused the trouble.
> This jira is to cross-check all the jira issues present in the release note
> and verify their correctness.




[jira] [Commented] (ZOOKEEPER-2772) Delete node command does not honor Acl policy

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009151#comment-16009151
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2772:
---

GitHub user eribeiro opened a pull request:

https://github.com/apache/zookeeper/pull/252

ZOOKEEPER-2772: Delete node command does not honor Acl policy



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/eribeiro/zookeeper ZOOKEEPER-2772

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/252.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #252


commit ce40dc7e12493817a26a8e45125ecc61d4ac0e80
Author: Edward Ribeiro 
Date:   2017-05-13T05:36:51Z

ZOOKEEPER-2772: Delete node command does not honor Acl policy




> Delete node command does not honor Acl policy
> -
>
> Key: ZOOKEEPER-2772
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2772
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.4.8, 3.4.10
>Reporter: joe smith
>
> I set the ACL so that a node could not be deleted, but I was able to delete
> it regardless.
> I am not familiar with the code, but a reply from Martin on the user@ mailing
> list seems to confirm the issue. I will paste his response below; sorry for
> the long listing.
> Martin's replies are inline, prefixed with: MG>
> --
> From: joe smith
> Sent: Tuesday, May 2, 2017 8:40 AM
> To: u...@zookeeper.apache.org
> Subject: Acl block detete not working
> Hi,
> I'm using 3.4.10 and setting a custom ACL to block deletion of a znode.
> However, I'm able to delete the node even after I've changed the ACL from
> cdrwa to cra.
> Can anyone point out if I missed some step?
> Thanks for the help
> Here is the trace:
> [zk: localhost:2181(CONNECTED) 0] ls /
> [zookeeper]
> [zk: localhost:2181(CONNECTED) 1] create /test "data"
> Created /test
> [zk: localhost:2181(CONNECTED) 2] ls /
> [zookeeper, test]
> [zk: localhost:2181(CONNECTED) 3] addauth myfqdn localhost
> [zk: localhost:2181(CONNECTED) 4] setAcl /test myfqdn:localhost:cra
> cZxid = 0x2
> ctime = Tue May 02 08:28:42 EDT 2017
> mZxid = 0x2
> mtime = Tue May 02 08:28:42 EDT 2017
> pZxid = 0x2
> cversion = 0
> dataVersion = 0
> aclVersion = 1
> ephemeralOwner = 0x0
> dataLength = 4
> numChildren = 0
> MG>in SetAclCommand you can see the acl being parsed and acl being set by
> setAcl into zk object
> List acl = AclParser.parse(aclStr);
> int version;
> if (cl.hasOption("v")) {
>     version = Integer.parseInt(cl.getOptionValue("v"));
> } else {
>     version = -1;
> }
> try {
>     Stat stat = zk.setACL(path, acl, version);
> MG>later on in DeleteCommand there is no check for aforementioned acl
> parameter
> public boolean exec() throws KeeperException, InterruptedException {
>     String path = args[1];
>     int version;
>     if (cl.hasOption("v")) {
>         version = Integer.parseInt(cl.getOptionValue("v"));
>     } else {
>         version = -1;
>     }
>     try {
>         zk.delete(path, version);
>     } catch(KeeperException.BadVersionException ex) {
>         err.println(ex.getMessage());
>     }
>     return false;
> MG>as seen here the testCase works properly saving the Zookeeper object
> LsCommand entity = new LsCommand();
> entity.setZk(zk);
> MG>but setACL does not save the zookeeper object anywhere but instead seems
> to discard zookeeper object with accompanying ACLs
> MG>can you report this bug to Zookeeper?
> https://issues.apache.org/jira/browse/ZOOKEEPER/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel
> MG>Thanks Joe!
> [zk: localhost:2181(CONNECTED) 5] getAcl /test
> 'myfqdn,'localhost
> : cra
> [zk: localhost:2181(CONNECTED) 6] get /test
> data
> cZxid = 0x2
> ctime = Tue May 02 08:28:42 EDT 2017
> mZxid = 0x2
> mtime = Tue May 02 08:28:42 EDT 2017
> pZxid = 0x2
> cversion = 0
> dataVersion = 0
> aclVersion = 1
> ephemeralOwner = 0x0
> dataLength = 4
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 7] set /test "testwrite"
> Authentication is not valid : /test
> [zk: localhost:2181(CONNECTED) 8] delete /test
> [zk: localhost:2181(CONNECTED) 9] ls /
> [zookeeper]
> [zk: 

[jira] [Commented] (ZOOKEEPER-2772) Delete node command does not honor Acl policy

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009154#comment-16009154
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2772:
---

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/252#discussion_r116351552
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -389,8 +389,8 @@ protected void pRequest2Txn(int type, long zxid, 
Request request, Record record,
 parentPath = path.substring(0, lastSlash);
 parentRecord = getRecordForPath(parentPath);
 ChangeRecord nodeRecord = getRecordForPath(path);
-checkACL(zks, parentRecord.acl, ZooDefs.Perms.DELETE,
-request.authInfo);
+checkACL(zks, parentRecord.acl, ZooDefs.Perms.DELETE, 
request.authInfo);
--- End diff --

I see that when we create a znode we need to check the ACL of the parent. 
But do we still need to check the parent when we are deleting? /cc @phunt @fpj 



[jira] [Commented] (ZOOKEEPER-2772) Delete node command does not honor Acl policy

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009157#comment-16009157
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2772:
---

Github user eribeiro commented on the issue:

https://github.com/apache/zookeeper/pull/252
  
If approved, this patch needs to be ported to branch-3.5 and master.


> Delete node command does not honor Acl policy
> -
>
> Key: ZOOKEEPER-2772
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2772
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.4.8, 3.4.10
>Reporter: joe smith
>
> I set the ACL so that a node could not be deleted, but I was able to delete
> it regardless.
> I am not familiar with the code, but a reply from Martin on the user@ mailing
> list seems to confirm the issue. I will paste his response below; sorry for
> the long listing.
> Martin's replies are inline, prefixed with: MG>
> --
> From: joe smith
> Sent: Tuesday, May 2, 2017 8:40 AM
> To: u...@zookeeper.apache.org
> Subject: Acl block detete not working
> Hi,
> I'm using 3.4.10 and setting a custom ACL to block deletion of a znode.
> However, I'm able to delete the node even after I've changed the ACL from
> cdrwa to cra.
> Can anyone point out if I missed some step?
> Thanks for the help
> Here is the trace:
> [zk: localhost:2181(CONNECTED) 0] ls /
> [zookeeper]
> [zk: localhost:2181(CONNECTED) 1] create /test "data"
> Created /test
> [zk: localhost:2181(CONNECTED) 2] ls /
> [zookeeper, test]
> [zk: localhost:2181(CONNECTED) 3] addauth myfqdn localhost
> [zk: localhost:2181(CONNECTED) 4] setAcl /test myfqdn:localhost:cra
> cZxid = 0x2
> ctime = Tue May 02 08:28:42 EDT 2017
> mZxid = 0x2
> mtime = Tue May 02 08:28:42 EDT 2017
> pZxid = 0x2
> cversion = 0
> dataVersion = 0
> aclVersion = 1
> ephemeralOwner = 0x0
> dataLength = 4
> numChildren = 0
> MG>in SetAclCommand you can see the acl being parsed and acl being set by
> setAcl into zk object
> List acl = AclParser.parse(aclStr);
> int version;
> if (cl.hasOption("v")) {
>     version = Integer.parseInt(cl.getOptionValue("v"));
> } else {
>     version = -1;
> }
> try {
>     Stat stat = zk.setACL(path, acl, version);
> MG>later on in DeleteCommand there is no check for aforementioned acl
> parameter
> public boolean exec() throws KeeperException, InterruptedException {
>     String path = args[1];
>     int version;
>     if (cl.hasOption("v")) {
>         version = Integer.parseInt(cl.getOptionValue("v"));
>     } else {
>         version = -1;
>     }
>     try {
>         zk.delete(path, version);
>     } catch(KeeperException.BadVersionException ex) {
>         err.println(ex.getMessage());
>     }
>     return false;
> MG>as seen here the testCase works properly saving the Zookeeper object
> LsCommand entity = new LsCommand();
> entity.setZk(zk);
> MG>but setACL does not save the zookeeper object anywhere but instead seems
> to discard zookeeper object with accompanying ACLs
> MG>can you report this bug to Zookeeper?
> https://issues.apache.org/jira/browse/ZOOKEEPER/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel
> MG>Thanks Joe!
> [zk: localhost:2181(CONNECTED) 5] getAcl /test
> 'myfqdn,'localhost
> : cra
> [zk: localhost:2181(CONNECTED) 6] get /test
> data
> cZxid = 0x2
> ctime = Tue May 02 08:28:42 EDT 2017
> mZxid = 0x2
> mtime = Tue May 02 08:28:42 EDT 2017
> pZxid = 0x2
> cversion = 0
> dataVersion = 0
> aclVersion = 1
> ephemeralOwner = 0x0
> dataLength = 4
> numChildren = 0
> [zk: localhost:2181(CONNECTED) 7] set /test "testwrite"
> Authentication is not valid : /test
> [zk: localhost:2181(CONNECTED) 8] delete /test
> [zk: localhost:2181(CONNECTED) 9] ls /
> [zookeeper]
> [zk: localhost:2181(CONNECTED) 10]
> The auth provider implementation is here:
> http://s000.tinyupload.com/?file_id=42827186839577179157




[jira] [Commented] (ZOOKEEPER-2772) Delete node command does not honor Acl policy

2017-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009304#comment-16009304
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2772:
---

Github user eribeiro commented on the issue:

https://github.com/apache/zookeeper/pull/252
  
As Ben pointed out, the docs at

http://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperProgrammers.html#sc_ACLPermissions

state clearly that DELETE prevents deletion of children (just as CREATE
prevents the creation of children); it does not prevent the deletion of the
znode itself.

So, closing this PR.
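
To make the rule concrete, here is a minimal in-memory sketch of a delete
check that consults the parent's permissions, which is what the
`checkACL(zks, parentRecord.acl, ZooDefs.Perms.DELETE, ...)` call in the diff
does. It is a simplified model, not server code: the class and its methods are
hypothetical, each path carries a single permission mask, and ids and auth
info are ignored. The bit values are assumed to match ZooDefs.Perms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the rule; not the ZooKeeper server code.
class DeletePermSketch {
    // Assumed to be the same bit values as ZooDefs.Perms.
    static final int READ = 1, WRITE = 2, CREATE = 4, DELETE = 8, ADMIN = 16;

    private final Map<String, Integer> perms = new HashMap<>();

    void setPerms(String path, int mask) {
        perms.put(path, mask);
    }

    static String parentOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash == 0 ? "/" : path.substring(0, lastSlash);
    }

    // Deleting a znode consults the DELETE bit on the parent, not on the node.
    boolean canDelete(String path) {
        Integer mask = perms.get(parentOf(path));
        return mask != null && (mask & DELETE) != 0;
    }

    public static void main(String[] args) {
        DeletePermSketch tree = new DeletePermSketch();
        tree.setPerms("/", READ | WRITE | CREATE | DELETE | ADMIN); // cdrwa
        tree.setPerms("/test", CREATE | READ | ADMIN);              // cra
        System.out.println("delete /test allowed: " + tree.canDelete("/test"));
        System.out.println("delete /test/child allowed: " + tree.canDelete("/test/child"));
    }
}
```

In this model, removing DELETE from /test protects its children, while /test
itself stays deletable as long as the root still grants DELETE, which matches
the behaviour in the reporter's trace.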



[jira] [Commented] (ZOOKEEPER-2772) Delete node command does not honor Acl policy

2017-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009305#comment-16009305
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2772:
---

Github user eribeiro closed the pull request at:

https://github.com/apache/zookeeper/pull/252



[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009577#comment-16009577
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

GitHub user JiangJiafu opened a pull request:

https://github.com/apache/zookeeper/pull/253

ZOOKEEPER-2774



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JiangJiafu/zookeeper ZOOKEEPER-2774

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/253.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #253


commit 3ac65ead39fad4f8d9f26365e1bc73f83889f11e
Author: Jiang Jiafu 
Date:   2017-05-13T03:41:52Z

ZOOKEEPER-2774




> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
>Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> Then the ephemeral znode continues to exist for a long time even after the
> session times out.
> I have read the ZooKeeper source code and found this code in
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than
> System.currentTimeMillis(), because the latter can be changed manually or
> automatically by an NTP client.
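
The suggestion above can be sketched as follows. This is a minimal
illustration, assuming a helper that derives elapsed milliseconds from
System.nanoTime(); the class and method names are hypothetical, and the wait
computation only mirrors the shape of the quoted run() loop rather than the
actual patch.

```java
// Hypothetical sketch; names are illustrative, not taken from the patch.
class MonotonicTimeSketch {
    // Elapsed milliseconds from an arbitrary origin. System.nanoTime() is not
    // tied to the wall clock, so the value is only meaningful as a difference
    // between two readings.
    static long currentElapsedTime() {
        return System.nanoTime() / 1_000_000L;
    }

    // How long the expiry loop should wait before the next bucket is due;
    // zero means the bucket is due now.
    static long millisUntil(long nextExpirationTime, long now) {
        return Math.max(0L, nextExpirationTime - now);
    }

    public static void main(String[] args) {
        long start = currentElapsedTime();
        long nextExpirationTime = start + 50;
        System.out.println("wait ms: " + millisUntil(nextExpirationTime, currentElapsedTime()));
    }
}
```

Because both nextExpirationTime and the current reading come from the same
monotonic source, setting the system clock backwards does not stretch the
wait.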




[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009578#comment-16009578
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
Port the code from ZOOKEEPER-1366 to branch-3.4.



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010650#comment-16010650
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

GitHub user arshadmohammad opened a pull request:

https://github.com/apache/zookeeper/pull/254

ZOOKEEPER-2775: ZK Client not able to connect with Xid out of order error

Once the client hits the "Xid out of order" error, it never returns to a
normal state. It keeps trying to connect and fails with the same error.
Recreating or restarting the client is the only workaround for now. This
happens because of a bug in the ZK client code. This MR provides the fix.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arshadmohammad/zookeeper 
ZOOKEEPER-2775-XidOutOfOrder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/254.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #254






> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of our clusters, we observed
> the "Xid out of order" and "Nothing in the queue" errors continuously. The
> ZK client was ultimately unable to connect successfully to the ZK server.
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>   
> *Analysis:* 
> 1) On the first attempt, the client fails to do the SASL login because the 
> network is unreachable.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>   Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client logs in 
> successfully, but the boolean saslLoginFailed is still not reset to false. 
> 3) SASL negotiation between client and server now starts, and during this 
> time no user request is sent (the socket channel stays closed for writes 
> until SASL negotiation completes).
> 4) The server's response to the SASL packet is then processed by the client. 
> The client assumes that tunnelAuthInProgress() has finished (the method checks 
> the saslLoginFailed boolean; since it is true, it assumes negotiation is done), 
> tries to process the response as an ordinary packet, and fails with the 
> errors above. 
> *Solution:*  Reset the saslLoginFailed boolean every time before client login
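
The sequence in the analysis above can be replayed with a small toy model. This is only a sketch under stated assumptions: `SaslFlagSketch`, `negotiationComplete`, `classifySaslReply` and `replay` are hypothetical names invented for illustration, and the check is a simplified stand-in rather than the real ClientCnxn logic. It only shows why a stale saslLoginFailed flag would make the client treat a SASL reply as an ordinary reply.

```java
/** Toy model of the flag handling described in the analysis; not the real ClientCnxn. */
class SaslFlagSketch {
    private boolean saslLoginFailed = false;     // sticky flag named in the report
    private boolean negotiationComplete = false; // hypothetical: set once SASL finishes

    /** Models one connection attempt, optionally resetting the flag first (the proposed fix). */
    void startConnect(boolean loginSucceeds, boolean resetFlagFirst) {
        if (resetFlagFirst) {
            saslLoginFailed = false;
        }
        negotiationComplete = false;
        if (!loginSucceeds) {
            saslLoginFailed = true;
        }
    }

    /** Simplified stand-in for the "tunnelled authentication in progress" check. */
    boolean tunnelAuthInProgress() {
        if (saslLoginFailed) {
            return false; // a failed login is taken to mean no SASL exchange is running
        }
        return !negotiationComplete;
    }

    /** How the client would treat a SASL reply arriving from the server. */
    String classifySaslReply() {
        return tunnelAuthInProgress() ? "sasl-reply" : "user-reply";
    }

    /** Replays the reported sequence: first login fails, second login succeeds. */
    static String replay(boolean resetFlagFirst) {
        SaslFlagSketch client = new SaslFlagSketch();
        client.startConnect(false, resetFlagFirst); // network unreachable
        client.startConnect(true, resetFlagFirst);  // network recovered
        return client.classifySaslReply();
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + replay(false));
        System.out.println("with reset:    " + replay(true));
    }
}
```

In this model, the run without the reset misclassifies the SASL reply as a user reply, which corresponds to the "Xid out of order" path in the report; with the reset the reply is routed to SASL handling.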



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011197#comment-16011197
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116565952
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
--- End diff --

why was this file moved?



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011195#comment-16011195
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116581847
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
+
+    @BeforeClass
+    public static void init() {
+        System.setProperty("zookeeper.authProvider.1", "org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
+        try {
+            File tmpDir = createTmpDir();
+            File saslConfFile = new File(tmpDir, "jaas.conf");
+            FileWriter fwriter = new FileWriter(saslConfFile);
+
+            fwriter.write("" + "Server {\n" + "  org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "  user_super=\"test\";\n" + "};\n" + "Client {\n"
+                    + "   org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "   username=\"super\"\n" + "   password=\"test\";\n" + "};" + "\n");
+            fwriter.close();
+            System.setProperty("java.security.auth.login.config", saslConfFile.getAbsolutePath());
+        } catch (IOException e) {
+            // could not create tmp directory to hold JAAS conf file : test will
+            // fail now.
+        }
+    }
+
+    @AfterClass
+    public static void clean() {
+        System.clearProperty("zookeeper.authProvider.1");
+        System.clearProperty("java.security.auth.login.config");
+    }
+
+    private AtomicInteger authFailed = new AtomicInteger(0);
+
+    @Override
+    protected TestableZooKeeper createClient(String hp) throws IOException, InterruptedException {
+        MyWatcher watcher = new MyWatcher();
+        return createClient(watcher, hp);
+    }
+
+    private class MyWatcher extends CountdownWatcher {
+        @Override
+        public synchronized void process(WatchedEvent event) {
+            if (event.getState() == KeeperState.AuthFailed) {
+                authFailed.incrementAndGet();
+            } else {
+                super.process(event);
+            }
+        }
+    }
+
+    @Test
+    public void testAuth() throws Exception {
+        ZooKeeper zk = createClient();
+        try {
+            zk.create("/path1", null, Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
+            Thread.sleep(1000);
+        } finally {
+            zk.close();
+        }
+    }
+
+    @Test
+    public void testValidSaslIds() throws Exception {
+        ZooKeeper zk = createClient();
+
+        List validIds = new ArrayList();
+        validIds.add("user");
+        validIds.add("service/host.name.com");
+        validIds.add("user@KERB.REALM");
+        validIds.add("service/host.name.com@KERB.REALM");
+
+        int i = 

[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011199#comment-16011199
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116578675
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
--- End diff --

Also, I think the diff would be cleaner if `git mv` were used instead of 
adding one file and deleting the other.



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011196#comment-16011196
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116581496
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
+
+    @BeforeClass
+    public static void init() {
+        System.setProperty("zookeeper.authProvider.1", "org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
+        try {
+            File tmpDir = createTmpDir();
+            File saslConfFile = new File(tmpDir, "jaas.conf");
+            FileWriter fwriter = new FileWriter(saslConfFile);
+
+            fwriter.write("" + "Server {\n" + "  org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "  user_super=\"test\";\n" + "};\n" + "Client {\n"
+                    + "   org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "   username=\"super\"\n" + "   password=\"test\";\n" + "};" + "\n");
+            fwriter.close();
+            System.setProperty("java.security.auth.login.config", saslConfFile.getAbsolutePath());
+        } catch (IOException e) {
+            // could not create tmp directory to hold JAAS conf file : test will
+            // fail now.
+        }
+    }
+
+    @AfterClass
+    public static void clean() {
+        System.clearProperty("zookeeper.authProvider.1");
+        System.clearProperty("java.security.auth.login.config");
+    }
+
+    private AtomicInteger authFailed = new AtomicInteger(0);
+
+    @Override
+    protected TestableZooKeeper createClient(String hp) throws IOException, InterruptedException {
+        MyWatcher watcher = new MyWatcher();
+        return createClient(watcher, hp);
+    }
+
+    private class MyWatcher extends CountdownWatcher {
+        @Override
+        public synchronized void process(WatchedEvent event) {
+            if (event.getState() == KeeperState.AuthFailed) {
+                authFailed.incrementAndGet();
+            } else {
+                super.process(event);
+            }
+        }
+    }
+
+    @Test
+    public void testAuth() throws Exception {
+        ZooKeeper zk = createClient();
+        try {
+            zk.create("/path1", null, Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
+            Thread.sleep(1000);
+        } finally {
+            zk.close();
+        }
+    }
+
+    @Test
+    public void testValidSaslIds() throws Exception {
+        ZooKeeper zk = createClient();
+
+        List validIds = new ArrayList();
+        validIds.add("user");
+        validIds.add("service/host.name.com");
+        validIds.add("user@KERB.REALM");
+        validIds.add("service/host.name.com@KERB.REALM");
+
+        int i = 

[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011198#comment-16011198
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116570478
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1080,6 +1080,8 @@ private void startConnect() throws IOException {
                     zooKeeperSaslClient.shutdown();
                 }
                 zooKeeperSaslClient = new ZooKeeperSaslClient(getServerPrincipal(addr), clientConfig);
+                // SASL login succeeded
+                saslLoginFailed = false;
--- End diff --

I wonder if this will impact `clientTunneledAuthenticationInProgress`? 
Perhaps we should revert this when a new connection attempt starts?




[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011268#comment-16011268
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/253#discussion_r116588671
  
--- Diff: src/java/main/org/apache/zookeeper/ZKUtil.java ---
@@ -120,5 +120,4 @@ public static void deleteRecursive(ZooKeeper zk, final String pathRoot, VoidCall
         }
         return tree;
     }
-
-}
\ No newline at end of file
+}
--- End diff --

unnecessary change


> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
>Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> The ephemeral znode then continues to exist for a long time, even after the 
> session times out.
> I read the ZooKeeper source code and found the following code in 
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than 
> System.currentTimeMillis(), because the latter can be changed manually or 
> automatically by an NTP client. 
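
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the SessionTrackerImpl change itself: `MonotonicExpirySketch`, `elapsedMillis` and `remainingMillis` are hypothetical names. The idea is to take all timestamps from System.nanoTime(), whose values are only meaningful as differences but are not affected by wall-clock adjustments, and to derive the wait time by subtraction.

```java
import java.util.concurrent.TimeUnit;

/** Sketch of expiry arithmetic on a monotonic clock; names are illustrative only. */
class MonotonicExpirySketch {

    /** Milliseconds from a monotonic source; only differences between calls are meaningful. */
    static long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    /** Time left until the deadline, never negative; both values must come from elapsedMillis(). */
    static long remainingMillis(long deadline, long now) {
        return Math.max(0L, deadline - now);
    }

    public static void main(String[] args) throws InterruptedException {
        long expirationInterval = 200L;                  // hypothetical tick length in ms
        long deadline = elapsedMillis() + expirationInterval;

        long toWait = remainingMillis(deadline, elapsedMillis());
        Thread.sleep(toWait);                            // a wall-clock change cannot stretch this wait

        System.out.println("expired: " + (remainingMillis(deadline, elapsedMillis()) == 0L));
    }
}
```

Because the deadline and the current time come from the same monotonic source, setting the system clock backwards would not postpone the computed expiration in this sketch.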





[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011438#comment-16011438
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

GitHub user andschwa opened a pull request:

https://github.com/apache/zookeeper/pull/255

ZOOKEEPER-2756: Add CMake build system for better cross-platform support

ZOOKEEPER-2756: Add CMake build system for better cross-platform support

This notably lacks Solaris and libtool support.

Almost everything else from Autotools has been ported, including 
header/function/library checks, and all targets (zookeeper, hashtable, cli, 
load_gen, and tests).

Both Linux and Windows are supported.

The primary work involved (other than the writing of `CMakeLists.txt`) was 
transitioning the hand-written `winconfig.h` to an auto-generated `config.h` 
file, and guarding code with `#ifdef HAVE_FEATURE`. The `cmake_config.h.in` 
template was modeled after the Autotools config file so that the feature guards 
share the same names.

While `load_gen.c` looks at first glance as if it were ported to Windows, 
it never actually was, so the erroneous `#include "win32port.h"` was removed, 
and the target is not built on Windows.

There are existent warnings which this patch did not attempt to fix, save a 
few easy ones (set but unused `rc` variable).

Fix DLL_EXPORT and USE_STATIC_LIB redefinition.

Some changes to `winconfig.h` necessary to build with Visual Studio 2015 
(and 2017) were included; these originally came from a patch embedded inside 
the Mesos build process.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andschwa/zookeeper ZOOKEEPER-2756

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/255.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #255


commit 932ca61b3f7d3112b5368872a4bfec7523484ee2
Author: Andrew Schwartzmeyer 
Date:   2017-04-10T23:12:40Z

ZOOKEEPER-2756: Add CMake build system for better cross-platform support

This notably lacks Solaris and libtool support.

Almost everything else from Autotools has been ported,
including header/function/library checks, and all targets
(zookeeper, hashtable, cli, load_gen, and tests).

Both Linux and Windows are supported.

The primary work involved (other than the writing of `CMakeLists.txt`)
was transitioning the hand-written `winconfig.h` to an
auto-generated `config.h` file, and guarding code with `#ifdef
HAVE_FEATURE`. The `cmake_config.h.in` template was modeled after
the Autotools config file so that the feature guards share the same
names.

While `load_gen.c` looks at first glance as if it were ported to Windows,
it never actually was, so the erroneous `#include "win32port.h"` was
removed, and the target is not built on Windows.

There are existent warnings which this patch did not attempt to fix,
save a few easy ones (set but unused `rc` variable).

Fix DLL_EXPORT and USE_STATIC_LIB redefinition.

Some changes to `winconfig.h` necessary to build with Visual Studio 2015
(and 2017) were included; these originally came from a patch embedded
inside the Mesos build process.




[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011439#comment-16011439
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

Github user andschwa commented on the issue:

https://github.com/apache/zookeeper/pull/255
  
Tests can be run with the correct environment via `ctest` (`make test` 
won't have the right environment without manual setup).


> Add CMake build system for better cross-platform support
> 
>
> Key: ZOOKEEPER-2756
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2756
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: build, c client
>Affects Versions: 3.5.2
> Environment: Windows and Linux
>Reporter: Andrew Schwartzmeyer
>Assignee: Andrew Schwartzmeyer
>  Labels: build, windows
> Attachments: ZOOKEEPER-2756.patch
>
>
> The C bindings' primary build system is Autotools. This obviously does not 
> work for Windows, and so the original port to Windows simply added a Visual 
> Studio solution to the project, splitting the build system. As new versions 
> of Visual Studio have come along, new (probably auto-converted) solutions 
> have come along (see zookeeper.sln vs zookeeper-vs2013.sln). When Mesos 
> started being ported to Windows, a Visual Studio 2015 solution was needed, 
> and the previous developer created yet another solution, and set up Mesos' 
> build to patch ZooKeeper and add the 2015 solution. Now that Visual Studio 
> 2017 has been released, and in the process of moving Mesos ahead, I realized 
> that I would have to make *yet another* converted solution for ZooKeeper. So 
> instead I tackled the root problem, and ported the Autotools build to CMake, 
> a meta-build system that generates build files for the in-use platform 
> (whether it be Linux or Solaris or MacOS or Windows).
> NOTE: I already have this patch, and will submit it. It has a couple of TODOs, 
> and some other things in it that were necessary for Mesos that may need to be 
> pulled into separate patches.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011442#comment-16011442
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

Github user andschwa commented on the issue:

https://github.com/apache/zookeeper/pull/255
  
@hanm I actually split this into the few commits I would _normally_ have 
posted in my [cmake-pr](https://github.com/andschwa/zookeeper/commits/cmake-pr) 
branch, and then went back to a single commit after re-reading the contributing 
guidelines.




[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011577#comment-16011577
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
Thanks for your review work. @afine 


> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
>Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> Then the ephemeral znode will exist for a long time even after the session times out.
> I have read the ZooKeeper source code and found the following code in 
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than 
> System.currentTimeMillis(), because the latter can be changed manually or 
> automatically by an NTP client. 
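
The suggestion in the description can be illustrated with a small, self-contained sketch. It is not the ZooKeeper patch itself: the class and method names below (MonotonicClockSketch, elapsedMillis, waitMillis) are made up for illustration. It only shows how a wait interval can be computed from System.nanoTime(), whose values are meaningful only as differences and are not affected by wall-clock changes.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; names are hypothetical, not taken from ZooKeeper.
public class MonotonicClockSketch {

    /** Milliseconds from a monotonic source. Only differences between two calls are meaningful. */
    static long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    /** Time left until the deadline, clamped so it is never negative. */
    static long waitMillis(long deadlineMillis, long nowMillis) {
        return Math.max(0L, deadlineMillis - nowMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = elapsedMillis();
        long deadline = start + 50; // next "expiration" 50 ms from now
        Thread.sleep(waitMillis(deadline, start));
        // Changing the system date during the sleep would not affect this difference.
        System.out.println("elapsed ms: " + (elapsedMillis() - start));
    }
}
```

Because both the deadline and the current time come from the same monotonic source, stepping the system clock backwards cannot stretch the wait, which is the failure mode described in the reproduction steps.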





[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012719#comment-16012719
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/173#discussion_r116798516
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -198,6 +246,12 @@ public void recreateSocketAddresses() {
 public long id;
 
 public LearnerType type = LearnerType.PARTICIPANT;
+
+/**
+ * the time, in milliseconds, before {@link InetAddress#isReachable} aborts
+ * in {@link #getReachableAddress}.
+ */
+private int ipReachableTimeout = 0;
--- End diff --

I think we can remove this. This will also remove the four copies of the exact 
same initialization code fragment that you use to initialize the value. We can 
get the system property and parse the value inside recreateSocketAddresses directly.
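
As a rough sketch of what that could look like (not the code in the pull request), the property can be read at the point of use and the reachability check applied to every address the hostname resolves to. The class name ReachableAddressSketch and the helper firstAccepted are hypothetical; treating a timeout of 0 as "check disabled" and falling back to the first resolved address are assumptions made here, not behaviour confirmed by the patch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only; names and fallback behaviour are assumptions.
public class ReachableAddressSketch {

    /** Returns the first candidate accepted by the probe, or the first candidate if none is accepted. */
    static <T> T firstAccepted(List<T> candidates, Predicate<T> probe) {
        for (T candidate : candidates) {
            if (probe.test(candidate)) {
                return candidate;
            }
        }
        return candidates.get(0);
    }

    static InetAddress getReachableAddress(String hostname) throws UnknownHostException {
        // Read the property where it is needed instead of caching it in a field.
        // Integer.getInteger falls back to the default if the property is absent or malformed.
        int timeoutMs = Integer.getInteger("zookeeper.ipReachableTimeout", 0);
        InetAddress[] all = InetAddress.getAllByName(hostname);
        if (timeoutMs <= 0) {
            return all[0]; // assumption: 0 disables the check (same result as getByName)
        }
        return firstAccepted(Arrays.asList(all), address -> {
            try {
                return address.isReachable(timeoutMs);
            } catch (IOException e) {
                return false; // treat a probe error as "not reachable"
            }
        });
    }

    public static void main(String[] args) throws Exception {
        System.out.println(getReachableAddress("localhost"));
    }
}
```

Keeping the selection logic in a small generic helper makes it testable without touching the network.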


> recreateSocketAddresses may recreate the unreachable IP address
> ---
>
> Key: ZOOKEEPER-2691
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.8, 3.4.9, 3.4.10, 3.5.0, 3.5.1, 3.5.2, 3.4.11
> Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>Reporter: JiangJiafu
>Priority: Minor
>
> QuorumPeer$QuorumServer.recreateSocketAddresses() is used to resolve the 
> hostname to a new IP address (InetAddress) when any exception happens to the 
> socket. It is very useful when a hostname can be resolved to more than 
> one IP address.
> But the problem is that the Java API InetAddress.getByName(String hostname) will 
> always return the first IP address when the hostname can be resolved to more 
> than one IP address, and the first IP address may be unreachable forever. For 
> example, suppose a machine has two network interfaces, eth0 and eth1, where eth0 has 
> ip1 and eth1 has ip2, and the relationship between the hostname and the IP addresses is 
> set in /etc/hosts. When I "close" eth0 with the command "ifdown eth0", 
> InetAddress.getByName(String hostname) will still return ip1, which is 
> unreachable forever.
> So I think it would be better to check the IP address with 
> InetAddress.isReachable(int) and choose a reachable IP address. 
> I have modified the ZooKeeper source code and tested the new code in my own 
> environment, and it works very well when I take down some network 
> interfaces using the "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = InetAddress.getByName(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the past?
>         if (this.addr != null) {
>             // Yes, previously the lookup succeeded. Leave things as they are
>             return;
>         }
>         // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
>         this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                    this.electionPort);
>         }
>     }
> }
> {code}
> After my modification:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = getReachableAddress(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0){
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the pas

[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012725#comment-16012725
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/173#discussion_r116799126
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 
---
@@ -117,13 +117,21 @@ private QuorumServer(long id, InetSocketAddress addr,
 this.id = id;
 this.addr = addr;
 this.electionAddr = electionAddr;
+String ipReachableValue = System.getProperty("zookeeper.ipReachableTimeout");
--- End diff --

This code is duplicated four times, not good :) - you can pull this into a 
dedicated function instead. Also make sure to add a try/catch around the parseInt 
call to prevent a crash on illegal input values. 
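
A minimal sketch of such a dedicated function follows. Only the property name zookeeper.ipReachableTimeout comes from the diff; the class name TimeoutPropertySketch, the method names, and the choice to fall back to a default (rather than fail) on bad or negative input are assumptions for illustration.

```java
// Illustrative sketch only; names and fallback policy are assumptions.
public class TimeoutPropertySketch {

    /** Parses a timeout in milliseconds, returning the fallback for null, malformed, or negative input. */
    static int parseTimeout(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : fallback;
        } catch (NumberFormatException e) {
            // An illegal value must not crash the server; use the fallback instead.
            return fallback;
        }
    }

    /** Single place that reads the system property, replacing the duplicated fragments. */
    static int ipReachableTimeout() {
        return parseTimeout(System.getProperty("zookeeper.ipReachableTimeout"), 0);
    }

    public static void main(String[] args) {
        System.out.println("ipReachableTimeout = " + ipReachableTimeout());
    }
}
```

With a helper of this shape, every constructor can call one method instead of repeating the getProperty/parseInt pair.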



[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012761#comment-16012761
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/255
  
Please add Apache License header for CMakeLists.txt and cmake_config.h.in.




[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012786#comment-16012786
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/255
  
@andschwa Have you tested if your patch breaks existing MSVC project files 
(zookeeper.sln, zookeeper.vcproj, Cli.vcxproj)?




[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012859#comment-16012859
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
* Missing changes to ClientBase.java and QuorumPeerMainTest.java.
* File TimeTest.java is missing.




[jira] [Commented] (ZOOKEEPER-2756) Add CMake build system for better cross-platform support

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012900#comment-16012900
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2756:
---

Github user andschwa commented on the issue:

https://github.com/apache/zookeeper/pull/255
  
@hanm I was proposing we replace the existing sln/vcproj/vcxproj entirely, 
as they can be built for any version of VS using CMake. I could record their 
deletion in this PR even. We _could_ keep them and retain compatibility with 
them, but it'd be ugly, as they rely on the manually generated portion of 
`winconfig.h` which I replaced with the auto-generated `config.h`. What do you 
think?

Also, I'm not super comfortable including the changes in 
065056b0b240b7bdff2eebe41db86a9a3ea6ecfc , your thoughts? 




[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013066#comment-16013066
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/254
  
How about removing saslLoginFailed and use zooKeeperSaslClient to track the 
state of sasl login? The invariant would be:
* sasl login failed: zooKeeperSaslClient == null
* sasl login succeeded or in progress: zooKeeperSaslClient != null
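
The invariant above could be sketched as follows. This is not the ClientCnxn code: SaslLoginStateSketch, SaslClientStub, and startLogin are hypothetical names, and the stub stands in for the real SASL client object. The point is only that the failure state is derived from a single reference that is reassigned on every login attempt, so there is no separate boolean to forget to reset.

```java
// Illustrative sketch only; all names are hypothetical stand-ins.
public class SaslLoginStateSketch {

    /** Hypothetical placeholder for the real SASL client object. */
    static final class SaslClientStub {}

    // null     -> SASL login failed (or was never attempted)
    // non-null -> SASL login succeeded or is in progress
    private SaslClientStub saslClient;

    /** Called before every connection attempt; the previous state is always overwritten. */
    void startLogin(boolean loginSucceeds) {
        saslClient = loginSucceeds ? new SaslClientStub() : null;
    }

    /** Derived from the reference, so it cannot go stale across reconnects. */
    boolean saslLoginFailed() {
        return saslClient == null;
    }

    public static void main(String[] args) {
        SaslLoginStateSketch state = new SaslLoginStateSketch();
        state.startLogin(false); // network unreachable
        System.out.println("failed after first attempt: " + state.saslLoginFailed());
        state.startLogin(true);  // network recovered
        System.out.println("failed after second attempt: " + state.saslLoginFailed());
    }
}
```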


> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of our clusters, we observed the Xid 
> out of order and Nothing in the queue errors continuously. In the end the ZK client 
> was not able to connect successfully to the ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>   
> *Analysis:* 
> 1) The first time, the client fails to do SASL login due to the network unreachable 
> problem.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>   Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client is able to 
> log in successfully, but the boolean saslLoginFailed is still not reset to false. 
> 3) Now SASL negotiation between client and server starts, and during 
> this time no user request will be sent (as the socket channel will be 
> closed for write until SASL negotiation completes).
> 4) Now the response from the server for the SASL packet is processed by the client, 
> and the client assumes that tunnelAuthInProgress() is finished (the method checks 
> the saslLoginFailed boolean; since the boolean is true, it assumes it is done) 
> and tries to process the packet as an ordinary packet, which results in the above 
> errors. 
> *Solution:* Reset the saslLoginFailed boolean every time before client login.





[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013195#comment-16013195
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116871274
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
--- End diff --

The file was moved to make SendThread accessible in the test case. It is now moved 
with git mv, which is the right way to move the file.



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013196#comment-16013196
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116871380
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicInteger;
+import static org.junit.Assert.assertTrue;
+
+import org.apache.zookeeper.ClientCnxn.SendThread;
+import org.apache.zookeeper.Watcher.Event.KeeperState;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.data.ACL;
+import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+public class SaslAuthTest extends ClientBase {
+
+    @BeforeClass
+    public static void init() {
+        System.setProperty("zookeeper.authProvider.1",
+                "org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
+        try {
+            File tmpDir = createTmpDir();
+            File saslConfFile = new File(tmpDir, "jaas.conf");
+            FileWriter fwriter = new FileWriter(saslConfFile);
+
+            fwriter.write("" + "Server {\n" + "  org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "  user_super=\"test\";\n" + "};\n" + "Client {\n"
+                    + "   org.apache.zookeeper.server.auth.DigestLoginModule required\n"
+                    + "   username=\"super\"\n" + "   password=\"test\";\n" + "};" + "\n");
+            fwriter.close();
+            System.setProperty("java.security.auth.login.config", saslConfFile.getAbsolutePath());
+        } catch (IOException e) {
+            // could not create tmp directory to hold JAAS conf file : test will
+            // fail now.
+        }
+    }
+
+    @AfterClass
+    public static void clean() {
+        System.clearProperty("zookeeper.authProvider.1");
+        System.clearProperty("java.security.auth.login.config");
+    }
+
+    private AtomicInteger authFailed = new AtomicInteger(0);
+
+    @Override
+    protected TestableZooKeeper createClient(String hp) throws IOException, InterruptedException {
+        MyWatcher watcher = new MyWatcher();
+        return createClient(watcher, hp);
+    }
+
+    private class MyWatcher extends CountdownWatcher {
+        @Override
+        public synchronized void process(WatchedEvent event) {
+            if (event.getState() == KeeperState.AuthFailed) {
+                authFailed.incrementAndGet();
+            } else {
+                super.process(event);
+            }
+        }
+    }
+
+    @Test
+    public void testAuth() throws Exception {
+        ZooKeeper zk = createClient();
+        try {
+            zk.create("/path1", null, Ids.CREATOR_ALL_ACL, CreateMode.PERSISTENT);
+            Thread.sleep(1000);
+        } finally {
+            zk.close();
+        }
+    }
+
+    @Test
+    public void testValidSaslIds() throws Exception {
+        ZooKeeper zk = createClient();
+
+        List<String> validIds = new ArrayList<String>();
+        validIds.add("user");
+        validIds.add("service/host.name.com");
+        validIds.add("user@KERB.REALM");
+        validIds.add("service/host.name.com@KERB.REALM");
+
+

[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013197#comment-16013197
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116871534
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -0,0 +1,187 @@

[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013200#comment-16013200
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116872750
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1080,6 +1080,8 @@ private void startConnect() throws IOException {
     zooKeeperSaslClient.shutdown();
 }
 zooKeeperSaslClient = new ZooKeeperSaslClient(getServerPrincipal(addr), clientConfig);
+// SASL login succeeded
+saslLoginFailed = false;
--- End diff --

This change has an impact on tunnelAuthInProgress. But yes, we should 
initialize the variable when a new connection starts, as this is what the 
tunnelAuthInProgress logic expects.


> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of our clusters, we observed 
> "Xid out of order" and "Nothing in the queue" errors continuously, and the 
> ZK client was ultimately unable to connect to the ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>   
> *Analysis:* 
> 1) The first time, the client fails to do the SASL login because the 
> network is unreachable.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>   Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client logs in 
> successfully, but the boolean saslLoginFailed is still not reset to false. 
> 3) SASL negotiation between client and server now starts, and during this 
> time no user request is sent (the socket channel is closed for write until 
> SASL negotiation completes).
> 4) The server's response to the SASL packet is then processed by the client, 
> which assumes that tunnelAuthInProgress() is finished (the method checks the 
> saslLoginFailed boolean; since it is true, it assumes authentication is 
> done). The client therefore tries to process the packet as an ordinary 
> response packet, which results in the errors above. 
> *Solution:*  Reset the saslLoginFailed boolean every time before the client logs in.
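
The four analysis steps can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: SaslStateModel, startConnect(boolean), route() and the resetFlagOnConnect switch are hypothetical names invented for illustration, not the real ClientCnxn API. Only the saslLoginFailed flag and the tunnelAuthInProgress() check are taken from the description above.

```java
// Simplified, hypothetical model of the client state described in the
// analysis. It is NOT the real ClientCnxn code.
class SaslStateModel {
    private boolean saslLoginFailed = false;
    private boolean negotiationInProgress = false;
    private final boolean resetFlagOnConnect; // true = proposed fix applied

    SaslStateModel(boolean resetFlagOnConnect) {
        this.resetFlagOnConnect = resetFlagOnConnect;
    }

    // Models one connection attempt; loginSucceeds stands in for the
    // outcome of the SASL login.
    void startConnect(boolean loginSucceeds) {
        if (loginSucceeds) {
            if (resetFlagOnConnect) {
                saslLoginFailed = false; // the proposed fix
            }
            negotiationInProgress = true;
        } else {
            saslLoginFailed = true;
            negotiationInProgress = false;
        }
    }

    // Mirrors the check from step 4: a failed login means "auth is over".
    boolean tunnelAuthInProgress() {
        if (saslLoginFailed) {
            return false;
        }
        return negotiationInProgress;
    }

    // Where the next server packet would be dispatched in this model.
    String route() {
        return tunnelAuthInProgress() ? "sasl" : "request-queue";
    }
}

class SaslStateDemo {
    public static void main(String[] args) {
        SaslStateModel buggy = new SaslStateModel(false);
        buggy.startConnect(false); // step 1: network unreachable
        buggy.startConnect(true);  // step 2: network recovered
        System.out.println("without reset: " + buggy.route()); // request-queue

        SaslStateModel fixed = new SaslStateModel(true);
        fixed.startConnect(false);
        fixed.startConnect(true);
        System.out.println("with reset: " + fixed.route()); // sasl
    }
}
```

Without the reset, the SASL response in this model is routed to the ordinary request queue, which corresponds to the situation in which the "Xid out of order" and "Nothing in the queue" errors are reported.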



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013206#comment-16013206
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on the issue:

https://github.com/apache/zookeeper/pull/254
  
zooKeeperSaslClient == null is already used in tunnelAuthInProgress(), 
where it means something quite different from saslLoginFailed being false. 
So I think we cannot make the above change.
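
To illustrate why the two conditions cannot be merged, here is a minimal sketch. It assumes the check treats a failed login as "no authentication in progress" and a not-yet-created SASL client as "authentication about to start". AuthProgressSketch and its nested SaslClient interface are hypothetical stand-ins, not the real classes.

```java
// Hypothetical sketch; the order of checks is an assumption based on this
// thread, not a copy of the ClientCnxn source.
final class AuthProgressSketch {
    // Stand-in for the real SASL client object.
    interface SaslClient {
        boolean negotiationInProgress();
    }

    static boolean tunnelAuthInProgress(boolean saslLoginFailed, SaslClient client) {
        if (saslLoginFailed) {
            return false; // login failed: there is no negotiation to wait for
        }
        if (client == null) {
            return true;  // client not created yet: negotiation is about to start
        }
        return client.negotiationInProgress();
    }
}
```

With a null client the result still depends on the flag: tunnelAuthInProgress(false, null) is true, while tunnelAuthInProgress(true, null) is false. A null check alone therefore cannot stand in for saslLoginFailed.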




[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013219#comment-16013219
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on the issue:

https://github.com/apache/zookeeper/pull/254
  
Thanks @afine @hanm for the reviews. I have addressed the comments; please 
have a look.



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013244#comment-16013244
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116879091
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -16,54 +16,58 @@
  * limitations under the License.
  */
 
-package org.apache.zookeeper.test;
+package org.apache.zookeeper;
+
+import static org.junit.Assert.assertTrue;
 
 import java.io.File;
 import java.io.FileWriter;
 import java.io.IOException;
+import java.lang.reflect.Field;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.concurrent.atomic.AtomicInteger;
 
-import org.apache.zookeeper.CreateMode;
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.TestableZooKeeper;
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ClientCnxn.SendThread;
 import org.apache.zookeeper.Watcher.Event.KeeperState;
 import org.apache.zookeeper.ZooDefs.Ids;
 import org.apache.zookeeper.data.ACL;
 import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
 import org.junit.Assert;
+import org.junit.BeforeClass;
 import org.junit.Test;
 
 public class SaslAuthTest extends ClientBase {
-static {
-
System.setProperty("zookeeper.authProvider.1","org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
-
+@BeforeClass
+public static void init() {
+System.setProperty("zookeeper.authProvider.1",
+
"org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
 try {
 File tmpDir = createTmpDir();
 File saslConfFile = new File(tmpDir, "jaas.conf");
 FileWriter fwriter = new FileWriter(saslConfFile);
 
-fwriter.write("" +
-"Server {\n" +
-"  
org.apache.zookeeper.server.auth.DigestLoginModule required\n" +
-"  user_super=\"test\";\n" +
-"};\n" +
-"Client {\n" +
-"   
org.apache.zookeeper.server.auth.DigestLoginModule required\n" +
-"   username=\"super\"\n" +
-"   password=\"test\";\n" +
-"};" + "\n");
+fwriter.write("" + "Server {\n"
--- End diff --

Nit: not sure whether this change was accidental, but it made readability 
worse.
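
For comparison, a layout with one JAAS line per source line could look like the sketch below. JaasConfigText and build() are hypothetical names used only for illustration, and the indentation inside the strings is a guess, since the whitespace in the quoted diff is not reliable.

```java
// Hypothetical helper showing a one-line-per-fragment layout of the JAAS
// configuration text written by the test; not part of the actual patch.
final class JaasConfigText {
    static String build() {
        return ""
                + "Server {\n"
                + "  org.apache.zookeeper.server.auth.DigestLoginModule required\n"
                + "  user_super=\"test\";\n"
                + "};\n"
                + "Client {\n"
                + "  org.apache.zookeeper.server.auth.DigestLoginModule required\n"
                + "  username=\"super\"\n"
                + "  password=\"test\";\n"
                + "};\n";
    }
}
```

Each source line then maps to exactly one line of the generated jaas.conf, which keeps future diffs of this block small.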



[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013245#comment-16013245
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/254
  
>> So I think we can not do above change.

Correct - I was looking to consolidate unnecessary variables used to encode 
states, but it looks like we do need saslLoginFailed.

LGTM, with just one nit on the test file's readability.




[jira] [Commented] (ZOOKEEPER-2765) modern C++ client

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013512#comment-16013512
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2765:
---

Github user packysauce commented on the issue:

https://github.com/apache/zookeeper/pull/234
  
I'm using this for my own project, and it's missing InitialWatches.h.


> modern C++ client
> -
>
> Key: ZOOKEEPER-2765
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2765
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client
>Reporter: Edward Carter
>Assignee: Edward Carter
>
> We should add a modern C++ (i.e. C++14, C++17, etc.) client library that 
> wraps the existing C client.  A future issue may replace the C client itself.





[jira] [Commented] (ZOOKEEPER-2765) modern C++ client

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013530#comment-16013530
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2765:
---

Github user packysauce commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/234#discussion_r116910884
  
--- Diff: src/contrib/cppclient/detail/BasicZookeeperClient.cpp ---
@@ -0,0 +1,1112 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "zeus/client/detail/BasicZookeeperClient.h"
+#include 
+
+namespace facebook {
+namespace zeus {
+namespace client {
+namespace detail {
+
+std::string BasicZookeeperClient::buildConnectionString(
+const std::vector& servers,
+const std::string& chroot) {
+  std::vector hostPorts;
+  for (const auto& server : servers) {
+hostPorts.push_back(folly::to(
+server.getIPAddress().toFullyQualified(), ':', server.getPort()));
+  }
+  return folly::join(',', hostPorts) + chroot;
+}
+
+Stat BasicZookeeperClient::convertStat(const ::Stat& s) {
+  return Stat{s.czxid,
+  s.mzxid,
+  std::chrono::system_clock::time_point() +
+  std::chrono::milliseconds(s.ctime),
+  std::chrono::system_clock::time_point() +
+  std::chrono::milliseconds(s.mtime),
+  s.version,
+  s.cversion,
+  s.aversion,
+  s.ephemeralOwner,
+  s.dataLength,
+  s.numChildren,
+  s.pzxid};
+}
+
+int BasicZookeeperClient::convertCreateMode(const CreateMode& m) {
+  int flags = 0;
+  if (m.isEphemeral) {
+flags |= ZOO_EPHEMERAL;
+  }
+  if (m.isSequential) {
+flags |= ZOO_SEQUENCE;
+  }
+  return flags;
+}
+
+SessionState BasicZookeeperClient::convertStateType(int state) {
+  if (state == ZOO_EXPIRED_SESSION_STATE) {
+return SessionState::EXPIRED;
+  } else if (state == ZOO_AUTH_FAILED_STATE) {
+return SessionState::AUTH_FAILED;
+  } else if (state == ZOO_CONNECTING_STATE) {
+return SessionState::CONNECTING;
+  } else if (state == ZOO_ASSOCIATING_STATE) {
+return SessionState::ASSOCIATING;
+  } else if (state == ZOO_CONNECTED_STATE) {
+return SessionState::CONNECTED;
+  } else if (state == 0 || state == ZOO_NOTCONNECTED_STATE) {
+return SessionState::DISCONNECTED;
+  } else if (state == ZOO_TIMED_OUT_STATE) {
+return SessionState::TIMED_OUT;
+  } else {
+throw std::runtime_error(
+folly::to("unrecognized ZK state ", state));
+  }
+}
+
+NodeEvent BasicZookeeperClient::convertWatchEventType(
+const char* path,
+int inType,
+int inState,
+size_t index) {
+  WatchEventType type;
+  if (inType == ZOO_CREATED_EVENT) {
+type = WatchEventType::CREATED;
+  } else if (inType == ZOO_DELETED_EVENT) {
+type = WatchEventType::DELETED;
+  } else if (inType == ZOO_CHANGED_EVENT) {
+type = WatchEventType::CHANGED;
+  } else if (inType == ZOO_CHILD_EVENT) {
+type = WatchEventType::CHILD;
+  } else if (inType == ZOO_SESSION_EVENT) {
+type = WatchEventType::SESSION;
+  } else if (inType == ZOO_NOTWATCHING_EVENT) {
+type = WatchEventType::NOT_WATCHING;
+  } else {
+throw std::runtime_error(
+folly::to<std::string>("unexpected watch event type ", inType));
+  }
+
+  return NodeEvent{index, path, type, convertStateType(inState)};
+}
+
+BasicZookeeperClient::BasicZookeeperClient(
+const std::string& connectionString,
+std::chrono::milliseconds sessionTimeout,
+const SessionToken* token,
+InitialWatches&& initialWatches)
+: initialWatches_(std::move(initialWatches)) {
+  std::vector dataWatchPaths;
+  for (const auto& dataWatc

[jira] [Commented] (ZOOKEEPER-2765) modern C++ client

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013531#comment-16013531
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2765:
---

Github user packysauce commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/234#discussion_r116910998
  
--- Diff: src/contrib/cppclient/detail/BasicZookeeperClient.h ---
@@ -0,0 +1,486 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#include 
+#include 
+#include 
+#include 
+#include "zeus/client/ZookeeperClient.h"
+#include "zeus/client/detail/InitialWatches.h"
--- End diff --

File not included.


> modern C++ client
> -
>
> Key: ZOOKEEPER-2765
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2765
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: c client
>Reporter: Edward Carter
>Assignee: Edward Carter
>
> We should add a modern C++ (i.e. C++14, C++17, etc.) client library that 
> wraps the existing C client.  A future issue may replace the C client itself.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2765) modern C++ client

2017-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013551#comment-16013551
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2765:
---

Github user packysauce commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/234#discussion_r116912433
  
--- Diff: src/contrib/cppclient/detail/BasicZookeeperClient.cpp ---
@@ -0,0 +1,1112 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "zeus/client/detail/BasicZookeeperClient.h"
+#include 
+
+namespace facebook {
+namespace zeus {
+namespace client {
+namespace detail {
+
+std::string BasicZookeeperClient::buildConnectionString(
+const std::vector& servers,
+const std::string& chroot) {
+  std::vector<std::string> hostPorts;
+  for (const auto& server : servers) {
+hostPorts.push_back(folly::to<std::string>(
+server.getIPAddress().toFullyQualified(), ':', server.getPort()));
+  }
+  return folly::join(',', hostPorts) + chroot;
+}
+
+Stat BasicZookeeperClient::convertStat(const ::Stat& s) {
+  return Stat{s.czxid,
+  s.mzxid,
+  std::chrono::system_clock::time_point() +
+  std::chrono::milliseconds(s.ctime),
+  std::chrono::system_clock::time_point() +
+  std::chrono::milliseconds(s.mtime),
+  s.version,
+  s.cversion,
+  s.aversion,
+  s.ephemeralOwner,
+  s.dataLength,
+  s.numChildren,
+  s.pzxid};
+}
+
+int BasicZookeeperClient::convertCreateMode(const CreateMode& m) {
+  int flags = 0;
+  if (m.isEphemeral) {
+flags |= ZOO_EPHEMERAL;
+  }
+  if (m.isSequential) {
+flags |= ZOO_SEQUENCE;
+  }
+  return flags;
+}
+
+SessionState BasicZookeeperClient::convertStateType(int state) {
+  if (state == ZOO_EXPIRED_SESSION_STATE) {
+return SessionState::EXPIRED;
+  } else if (state == ZOO_AUTH_FAILED_STATE) {
+return SessionState::AUTH_FAILED;
+  } else if (state == ZOO_CONNECTING_STATE) {
+return SessionState::CONNECTING;
+  } else if (state == ZOO_ASSOCIATING_STATE) {
+return SessionState::ASSOCIATING;
+  } else if (state == ZOO_CONNECTED_STATE) {
+return SessionState::CONNECTED;
+  } else if (state == 0 || state == ZOO_NOTCONNECTED_STATE) {
+return SessionState::DISCONNECTED;
+  } else if (state == ZOO_TIMED_OUT_STATE) {
+return SessionState::TIMED_OUT;
+  } else {
+throw std::runtime_error(
+folly::to<std::string>("unrecognized ZK state ", state));
+  }
+}
+
+NodeEvent BasicZookeeperClient::convertWatchEventType(
+const char* path,
+int inType,
+int inState,
+size_t index) {
+  WatchEventType type;
+  if (inType == ZOO_CREATED_EVENT) {
+type = WatchEventType::CREATED;
+  } else if (inType == ZOO_DELETED_EVENT) {
+type = WatchEventType::DELETED;
+  } else if (inType == ZOO_CHANGED_EVENT) {
+type = WatchEventType::CHANGED;
+  } else if (inType == ZOO_CHILD_EVENT) {
+type = WatchEventType::CHILD;
+  } else if (inType == ZOO_SESSION_EVENT) {
+type = WatchEventType::SESSION;
+  } else if (inType == ZOO_NOTWATCHING_EVENT) {
+type = WatchEventType::NOT_WATCHING;
+  } else {
+throw std::runtime_error(
+folly::to<std::string>("unexpected watch event type ", inType));
+  }
+
+  return NodeEvent{index, path, type, convertStateType(inState)};
+}
+
+BasicZookeeperClient::BasicZookeeperClient(
+const std::string& connectionString,
+std::chrono::milliseconds sessionTimeout,
+const SessionToken* token,
+InitialWatches&& initialWatches)
+: initialWatches_(std::move(initialWatches)) {
+  std::vector dataWatchPaths;
+  for (const auto& dataWatc

[jira] [Commented] (ZOOKEEPER-2785) Server inappropriately throttles connections under load before SASL completes

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013687#comment-16013687
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2785:
---

GitHub user abhishek-chouhan opened a pull request:

https://github.com/apache/zookeeper/pull/256

ZOOKEEPER-2785 Server inappropriately throttles connections under loa…

…d before SASL completes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/abhishek-chouhan/zookeeper ZOOKEEPER-2785

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/256.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #256


commit 52a1037ebe30ece519ee5c1cc6202d0c53235706
Author: Abhishek Singh Chouhan 
Date:   2017-05-17T07:49:07Z

ZOOKEEPER-2785 Server inappropriately throttles connections under load 
before SASL completes




> Server inappropriately throttles connections under load before SASL completes
> -
>
> Key: ZOOKEEPER-2785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2785
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.10
>Reporter: Abhishek Singh Chouhan
>Priority: Critical
> Fix For: 3.4.11
>
>
> When a ZK server is running close to its outstanding-requests limit, the 
> server incorrectly throttles the SASL request. The client then waits for the 
> final SASL packet (the session is already established) and defers all 
> non-priming packets until it arrives, including ping packets. The final 
> packet never arrives, so the client times out, reporting that it has not 
> heard from the server. This is fatal for services such as HBase, which retry 
> a finite number of times and then exit.
> The cause is that in ZooKeeperServer.processPacket(..), in the SASL case, we 
> send the response and also, incorrectly, call 
> cnxn.incrOutstandingRequests(h). That call throttles the connection if we 
> are over the outstanding-requests limit, so the server does not process the 
> next packet from the client. We also have no pending request to send for the 
> connection, and hence never call enableRecv(). We should return after 
> sending the response to the SASL request.
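
The control flow described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the ZooKeeper server code: ThrottleSketch, Conn, LIMIT and both processPacket variants are hypothetical names, and the only thing taken from the description is the idea of returning before the outstanding-request counter is incremented for a SASL packet.

```java
// Toy model of the throttling bug; every name in this class is hypothetical.
final class ThrottleSketch {
    static final int LIMIT = 2; // assumed outstanding-requests limit

    static final class Conn {
        int outstanding;            // requests accepted but not yet answered
        boolean recvEnabled = true; // false means the connection is throttled

        void incrOutstandingRequests() {
            outstanding++;
            if (outstanding > LIMIT) {
                recvEnabled = false; // stop reading from this client
            }
        }
    }

    // Buggy shape: the counter is incremented even for a SASL packet whose
    // response has already been sent, so nothing will ever decrement it.
    static void processPacketBuggy(Conn c, boolean isSasl) {
        c.incrOutstandingRequests();
    }

    // Fixed shape: return right after the SASL response, before counting.
    static void processPacketFixed(Conn c, boolean isSasl) {
        if (isSasl) {
            return; // response already sent; no request is pending
        }
        c.incrOutstandingRequests();
    }
}
```

With the counter already at the limit, the buggy variant disables receiving on a SASL packet, while the fixed variant leaves the connection readable.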





[jira] [Commented] (ZOOKEEPER-2691) recreateSocketAddresses may recreate the unreachable IP address

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013718#comment-16013718
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2691:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/173
  
Please have a look at the new code, thank you.


> recreateSocketAddresses may recreate the unreachable IP address
> ---
>
> Key: ZOOKEEPER-2691
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2691
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.8, 3.4.9, 3.4.10, 3.5.0, 3.5.1, 3.5.2, 3.4.11
> Environment: Centos6.5
> Java8
> ZooKeeper3.4.8
>Reporter: JiangJiafu
>Priority: Minor
>
> QuorumPeer$QuorumServer.recreateSocketAddresses() is used to resolve the 
> hostname to a new IP address (InetAddress) when any exception happens to the 
> socket. This is very useful when a hostname can be resolved to more than one 
> IP address.
> The problem is that the Java API InetAddress.getByName(String hostname) 
> always returns the first IP address when the hostname resolves to more than 
> one, and that first IP address may be unreachable forever. For example, 
> suppose a machine has two network interfaces, eth0 with ip1 and eth1 with 
> ip2, and the mapping between the hostname and the IP addresses is set in 
> /etc/hosts. When I take eth0 down with the command "ifdown eth0", 
> InetAddress.getByName(String hostname) still returns ip1, which is 
> unreachable forever.
> So I think it would be better to check each IP address with 
> InetAddress.isReachable(int) and choose a reachable one. 
> I have modified the ZooKeeper source code and tested the new code in my own 
> environment, and it works well when I take down some network interfaces 
> using the "ifdown" command.
> The original code is:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = InetAddress.getByName(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0) {
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the past?
>         if (this.addr != null) {
>             // Yes, previously the lookup succeeded. Leave things as they are
>             return;
>         }
>         // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
>         this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>         if (this.electionPort > 0) {
>             this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                    this.electionPort);
>         }
>     }
> }
> {code}
> After my modification:
> {code:title=QuorumPeer.java|borderStyle=solid}
> public void recreateSocketAddresses() {
>     InetAddress address = null;
>     try {
>         address = getReachableAddress(this.hostname);
>         LOG.info("Resolved hostname: {} to address: {}", this.hostname, address);
>         this.addr = new InetSocketAddress(address, this.port);
>         if (this.electionPort > 0) {
>             this.electionAddr = new InetSocketAddress(address, this.electionPort);
>         }
>     } catch (UnknownHostException ex) {
>         LOG.warn("Failed to resolve address: {}", this.hostname, ex);
>         // Have we succeeded in the past?
>         if (this.addr != null) {
>             // Yes, previously the lookup succeeded. Leave things as they are
>             return;
>         }
>         // The hostname has never resolved. Create our InetSocketAddress(es) as unresolved
>         this.addr = InetSocketAddress.createUnresolved(this.hostname, this.port);
>         if (this.electionPort > 0) {
>             this.electionAddr = InetSocketAddress.createUnresolved(this.hostname,
>                                                                    this.electionPort);
>         }
>     }
> }
> public InetAddress getReachableAddress(String 
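
The quoted helper is cut off in this archive, so its actual body is not shown here. The following is a minimal sketch of the idea in the description (try every address the hostname resolves to and prefer one that answers a reachability probe), not the code from the patch. ReachableAddressSketch, pickReachable, resolve and timeoutMs are hypothetical names; InetAddress.getAllByName(String) and InetAddress.isReachable(int) are the standard JDK calls.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Predicate;

// Hypothetical helper; not the method from the pull request.
final class ReachableAddressSketch {

    // Returns the first candidate accepted by the probe. If none is accepted,
    // falls back to the first candidate, which matches what
    // InetAddress.getByName would have returned.
    static InetAddress pickReachable(InetAddress[] candidates,
                                     Predicate<InetAddress> reachable) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no candidate addresses");
        }
        for (InetAddress candidate : candidates) {
            if (reachable.test(candidate)) {
                return candidate;
            }
        }
        return candidates[0];
    }

    // Resolves all addresses for the hostname and probes each one with
    // InetAddress.isReachable(int); an I/O error counts as unreachable.
    static InetAddress resolve(String hostname, int timeoutMs)
            throws UnknownHostException {
        return pickReachable(InetAddress.getAllByName(hostname), address -> {
            try {
                return address.isReachable(timeoutMs);
            } catch (IOException e) {
                return false;
            }
        });
    }
}
```

Passing the probe as a Predicate keeps the selection logic testable without touching the network. Note that isReachable may rely on ICMP or a TCP echo attempt, so a firewall can make a usable address look unreachable; the fallback to the first address covers that case.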

[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013746#comment-16013746
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
@hanm, do you mean that all the test source files should be changed if 
they use System.currentTimeMillis()?
Or do you mean that I should change only these two files: ClientBase.java 
and QuorumPeerMainTest.java?




> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
>Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> The ephemeral znode then continues to exist for a long time, even after the 
> session times out.
> I have read the ZooKeeper source code and found this code in 
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than 
> System.currentTimeMillis(), because the latter can be changed manually or 
> automatically by an NTP client. 
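
The suggestion above can be sketched as a small elapsed-time helper. This is an illustration under assumptions, not the patch itself: ElapsedClockSketch, currentElapsedMillis and remainingMillis are hypothetical names. System.nanoTime() is only meaningful for measuring differences within one JVM, which is what an expiration loop needs.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing a clock that does not follow wall-clock changes.
final class ElapsedClockSketch {

    // Milliseconds from an arbitrary origin. Unlike System.currentTimeMillis(),
    // this value is not affected when the system time is set backwards.
    static long currentElapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    // How long an expiration loop should still wait before the deadline;
    // never negative, so it can be checked before calling wait(long).
    static long remainingMillis(long deadlineElapsedMs, long nowElapsedMs) {
        return Math.max(0L, deadlineElapsedMs - nowElapsedMs);
    }
}
```

An expiration loop built this way would compute its next deadline from currentElapsedMillis() and wait for remainingMillis(deadline, now) only when that value is positive, so moving the system clock to an earlier point no longer stretches the wait.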





[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013892#comment-16013892
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r116976191
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1054,6 +1054,8 @@ private void sendPing() {
 private boolean saslLoginFailed = false;
 
 private void startConnect() throws IOException {
+// initializing it for new connection
+saslLoginFailed = false;
--- End diff --

How about setting it to false only after the client has successfully logged 
in, instead of at the beginning?
```
zooKeeperSaslClient = new ZooKeeperSaslClient(getServerPrincipal(addr), clientConfig);
saslLoginFailed = false;
```
I'm not sure whether there are any concurrency cases in the narrow window 
between resetting the flag to false and the client attempting the login. Since 
the previous login failed, the client should still reflect that status until 
the next login succeeds, right?



> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of our clusters, we observed 
> "Xid out of order" and "Nothing in the queue" errors continuously, and the 
> ZK client was finally unable to connect successfully to the ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>   
> *Analysis:* 
> 1) The first time, the client fails to do the SASL login because the network 
> is unreachable.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>   Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client is able 
> to log in successfully, but the boolean saslLoginFailed is not reset to false. 
> 3) SASL negotiation between client and server now starts, and during this 
> time no user request is sent (the socket channel is closed for writes until 
> SASL negotiation completes).
> 4) The server's response to the SASL packet is then processed by the client. 
> The client assumes that negotiation has finished (tunnelAuthInProgress() 
> checks the saslLoginFailed boolean and, since it is true, reports that 
> negotiation is done), so it tries to process the packet as an ordinary 
> packet, which results in the above errors. 
> *Solution:*  Reset the saslLoginFailed boolean every time before the client 
> logs in.
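
The analysis above can be reproduced with a simplified state model. This is a sketch under assumptions, not ClientCnxn itself: SaslFlagSketch, its negotiating field and the resetOnConnect switch are hypothetical, and tunnelAuthInProgress() here is reduced to the single check the analysis mentions.

```java
// Simplified model of the stale-flag problem; not the real ClientCnxn logic.
final class SaslFlagSketch {
    private boolean saslLoginFailed = false;
    private boolean negotiating = false;  // stands in for the SASL client state
    private final boolean resetOnConnect; // true models the proposed fix

    SaslFlagSketch(boolean resetOnConnect) {
        this.resetOnConnect = resetOnConnect;
    }

    // Models one connection attempt, including the SASL login outcome.
    void startConnect(boolean loginSucceeds) {
        if (resetOnConnect) {
            saslLoginFailed = false; // initialize the flag for the new connection
        }
        if (loginSucceeds) {
            negotiating = true;
        } else {
            saslLoginFailed = true;
            negotiating = false;
        }
    }

    // A stale saslLoginFailed makes the client believe negotiation is over,
    // so SASL responses would be handled as ordinary replies.
    boolean tunnelAuthInProgress() {
        if (saslLoginFailed) {
            return false;
        }
        return negotiating;
    }
}
```

After one failed and one successful connection attempt, the model without the reset still reports that no negotiation is in progress, while the model with the reset reports that it is.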





[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014110#comment-16014110
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r117014954
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1054,6 +1054,8 @@ private void sendPing() {
 private boolean saslLoginFailed = false;
 
 private void startConnect() throws IOException {
+// initializing it for new connection
+saslLoginFailed = false;
--- End diff --

Only one thread is involved in creating a new connection, so there is no 
chance of another thread setting the flag in between.





[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014119#comment-16014119
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r117016986
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1054,6 +1054,8 @@ private void sendPing() {
 private boolean saslLoginFailed = false;
 
 private void startConnect() throws IOException {
+// initializing it for new connection
+saslLoginFailed = false;
--- End diff --

A successful login is not necessary to connect to the ZK server successfully. 
The ZK server may allow a connection without a successful client login, 
depending on configuration. So it is better to set the flag at the start of 
the new connection. 
This change affects tunnelAuthInProgress; the same thing is expected in that 
method too. There is a scenario in it where zooKeeperSaslClient can be null.




[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014137#comment-16014137
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user arshadmohammad commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r117021379
  
--- Diff: src/java/test/org/apache/zookeeper/SaslAuthTest.java ---
@@ -16,54 +16,58 @@
  * limitations under the License.
  */
 
-package org.apache.zookeeper.test;
+package org.apache.zookeeper;
+
+import static org.junit.Assert.assertTrue;
 
 import java.io.File;
 import java.io.FileWriter;
 import java.io.IOException;
+import java.lang.reflect.Field;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.concurrent.atomic.AtomicInteger;
 
-import org.apache.zookeeper.CreateMode;
-import org.apache.zookeeper.KeeperException;
-import org.apache.zookeeper.TestableZooKeeper;
-import org.apache.zookeeper.WatchedEvent;
-import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ClientCnxn.SendThread;
 import org.apache.zookeeper.Watcher.Event.KeeperState;
 import org.apache.zookeeper.ZooDefs.Ids;
 import org.apache.zookeeper.data.ACL;
 import org.apache.zookeeper.data.Id;
+import org.apache.zookeeper.test.ClientBase;
+import org.junit.AfterClass;
 import org.junit.Assert;
+import org.junit.BeforeClass;
 import org.junit.Test;
 
 public class SaslAuthTest extends ClientBase {
-static {
-
System.setProperty("zookeeper.authProvider.1","org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
-
+@BeforeClass
+public static void init() {
+System.setProperty("zookeeper.authProvider.1",
+
"org.apache.zookeeper.server.auth.SASLAuthenticationProvider");
 try {
 File tmpDir = createTmpDir();
 File saslConfFile = new File(tmpDir, "jaas.conf");
 FileWriter fwriter = new FileWriter(saslConfFile);
 
-fwriter.write("" +
-"Server {\n" +
-"  
org.apache.zookeeper.server.auth.DigestLoginModule required\n" +
-"  user_super=\"test\";\n" +
-"};\n" +
-"Client {\n" +
-"   
org.apache.zookeeper.server.auth.DigestLoginModule required\n" +
-"   username=\"super\"\n" +
-"   password=\"test\";\n" +
-"};" + "\n");
+fwriter.write("" + "Server {\n"
--- End diff --

The same content was already present; I tried to reformat it, but that did 
not make it any cleaner.
I have now moved the JAAS config file content generation to a new method.


> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of the clusters, we observed the 
> Xid out of order and Nothing in the queue errors continuously, and the ZK 
> client was finally not able to connect successfully to the ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)

[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014938#comment-16014938
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/254
  
lgtm.


> ZK Client not able to connect with Xid out of order error 
> --
>
> Key: ZOOKEEPER-2775
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.10, 3.5.3, 3.6.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Mohammad Arshad
>Priority: Critical
> Attachments: ZOOKEEPER-2775-01.patch
>
>
> During a network-unreachable scenario in one of the clusters, we observed the 
> Xid out of order and Nothing in the queue errors continuously, and the ZK 
> client was finally not able to connect successfully to the ZK server. 
> *Logs:*
> unexpected error, closing socket connection and attempting reconnect | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) 
> java.io.IOException: Xid out of order. Got Xid 52 with err 0 expected Xid 53 
> for a packet with details: clientPath:null serverPath:null finished:false 
> header:: 53,101  replyHeader:: 0,0,-4  request:: 
> 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes}
>   response:: null
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
> unexpected error, closing socket connection and attempting reconnect 
> java.io.IOException: Nothing in the queue, but got 1
>   at 
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426)
>   
> *Analysis:* 
> 1) The first time, the client fails to do the SASL login due to the network 
> unreachable problem.
> 2017-03-29 10:03:59,377 | WARN  | [main-SendThread(192.168.130.8:24002)] | 
> SASL configuration failed: javax.security.auth.login.LoginException: Network 
> is unreachable (sendto failed) Will continue connection to Zookeeper server 
> without SASL authentication, if Zookeeper server allows it. | 
> org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) 
>   Here the boolean saslLoginFailed becomes true.
> 2) After some time the network connection recovers and the client is able to 
> log in successfully, but the boolean saslLoginFailed is still not reset to false. 
> 3) Now SASL negotiation between client and server starts, and during this 
> time no user request will be sent (the socket channel is closed for writes 
> until the SASL negotiation completes).
> 4) Now the server's response to the SASL packet is processed by the client. 
> The client assumes that the tunnelled authentication is finished 
> (tunnelAuthInProgress() checks the saslLoginFailed boolean and, since it is 
> true, assumes authentication is done), tries to process the packet as an 
> ordinary packet, and this results in the errors above. 
> *Solution:* Reset the saslLoginFailed boolean every time before client login.
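
The sequence in the analysis above can be sketched with a small, self-contained Java model. This is only an illustration under stated assumptions: the class `SaslStateSketch`, its fields other than `saslLoginFailed`, and the `resetFlag` / `loginSucceeds` parameters are hypothetical stand-ins, not the actual `ClientCnxn` code.

```java
// Hypothetical, simplified model of the client-side state described above.
// It is NOT the real org.apache.zookeeper.ClientCnxn implementation.
class SaslStateSketch {
    // Set when a SASL login attempt fails.
    private boolean saslLoginFailed = false;
    // Stand-in for the SASL client object; null until a login succeeds.
    private Object saslClient = null;
    // Stand-in for "the SASL exchange with the server has completed".
    private boolean negotiationDone = false;

    // Called at the start of every connection attempt.
    // resetFlag models the proposed fix; loginSucceeds models the network state.
    void startConnect(boolean resetFlag, boolean loginSucceeds) {
        if (resetFlag) {
            saslLoginFailed = false; // the proposed fix: start each attempt clean
        }
        saslClient = null;
        negotiationDone = false;
        if (loginSucceeds) {
            saslClient = new Object();
        } else {
            saslLoginFailed = true;
        }
    }

    void finishNegotiation() {
        negotiationDone = true;
    }

    // While this returns true, replies are treated as SASL packets and
    // user requests are held back.
    boolean tunnelAuthInProgress() {
        if (saslLoginFailed) {
            return false; // a failed login is treated as "nothing to wait for"
        }
        if (saslClient == null) {
            return true; // login has not produced a client yet
        }
        return !negotiationDone;
    }
}
```

Without the reset, a failed first attempt followed by a successful reconnect leaves `tunnelAuthInProgress()` returning false while the negotiation is still running, which is the state in which a SASL reply gets handled as an ordinary packet.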



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015076#comment-16015076
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
Code has been changed according to your advice, @hanm.


> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
>Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> Then the ephemeral znode will exist for a long time, even after the session 
> times out.
> I have read the ZooKeeper source code and found the following code in 
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than 
> System.currentTimeMillis(), because the latter can be changed manually or 
> automatically by an NTP client. 
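
As a minimal sketch of the suggestion above, a deadline can be tracked against `System.nanoTime()`, which the Javadoc describes as usable only for measuring elapsed time and unrelated to wall-clock time. `MonotonicDeadline` is a hypothetical helper invented for this illustration; it is not part of ZooKeeper and not necessarily how the patch is written.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a deadline measured on the JVM's elapsed-time clock,
// so that changing the system (wall-clock) time does not move it.
class MonotonicDeadline {
    private final long deadlineNanos;

    MonotonicDeadline(long timeout, TimeUnit unit) {
        this.deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
    }

    // Compare by subtraction: nanoTime() values may be negative or wrap,
    // but differences between two readings stay meaningful.
    boolean expired() {
        return deadlineNanos - System.nanoTime() <= 0;
    }

    // Time left in milliseconds, never negative; suitable as a wait() timeout
    // only when it is greater than zero.
    long remainingMillis() {
        long remaining = deadlineNanos - System.nanoTime();
        return remaining <= 0 ? 0 : TimeUnit.NANOSECONDS.toMillis(remaining);
    }
}
```

A loop like the one quoted above could then wait for `remainingMillis()` instead of subtracting two `currentTimeMillis()` readings, assuming the rest of the expiry bookkeeping is moved to the same clock.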





[jira] [Commented] (ZOOKEEPER-2785) Server inappropriately throttles connections under load before SASL completes

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015085#comment-16015085
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2785:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/256
  
lgtm, I'll merge this.


> Server inappropriately throttles connections under load before SASL completes
> -
>
> Key: ZOOKEEPER-2785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2785
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.10
>Reporter: Abhishek Singh Chouhan
>Priority: Critical
>  Labels: sasl
> Fix For: 3.4.11
>
>
> When a ZK server is running close to its outstanding requests limit, the 
> server incorrectly throttles the SASL request. This leads to the client 
> waiting for the final SASL packet (the session is already established) and 
> deferring all non-priming packets until then, including the ping 
> packets. The client waits for the final packet but never gets it, and 
> times out saying it hasn't heard from the server. This is fatal for services 
> such as HBase, which retry a finite number of times and then exit.
> The issue is that in ZooKeeperServer.processPacket(..), in the SASL case, we 
> send the response and incorrectly also call cnxn.incrOutstandingRequests(h), 
> which throttles the connection if we're running over the outstanding requests 
> limit. As a result, the server does not process the subsequent packet from the 
> client. Also, we do not have any pending request to send for the connection 
> and hence never call enableRecv(). We should return after sending the response 
> to the SASL request.
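
The stall described above can be sketched with a toy model. Everything here is an assumption made for illustration: `ThrottleSketch`, its fields, and the opcode constants are hypothetical and much simpler than the real `ZooKeeperServer` and `ServerCnxn` code.

```java
// Hypothetical, simplified model of the throttling problem described above.
// None of these names are the real ZooKeeperServer or ServerCnxn API.
class ThrottleSketch {
    static final int SASL = 102;   // illustrative opcode values only
    static final int GET_DATA = 4;

    private final boolean serverOverLimit; // server is past its outstanding requests limit
    private boolean recvEnabled = true;    // false means the connection is throttled
    private int pendingRequests = 0;       // requests queued for the processing pipeline

    ThrottleSketch(boolean serverOverLimit) {
        this.serverOverLimit = serverOverLimit;
    }

    // returnAfterSasl = true models the proposed fix.
    void processPacket(int type, boolean returnAfterSasl) {
        if (type == SASL) {
            // The SASL reply is sent directly; nothing is queued, so no
            // later completion will re-enable reads for this connection.
            if (returnAfterSasl) {
                return;
            }
        } else {
            pendingRequests++;
        }
        // Throttle the connection if the server is over its limit.
        if (serverOverLimit) {
            recvEnabled = false;
        }
    }

    // Called when a queued request completes; in this model it is the only
    // path that turns reads back on.
    void requestFinished() {
        pendingRequests--;
        recvEnabled = true;
    }

    // Throttled with nothing pending: no event will ever un-throttle it.
    boolean isStuck() {
        return !recvEnabled && pendingRequests == 0;
    }
}
```

In this model a throttled ordinary request recovers when it completes, while a throttled SASL packet has nothing pending and stays stuck unless the method returns right after the response.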





[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015091#comment-16015091
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
Patch lgtm, almost there!

There is one test failure in pre-commit build (see 'All checks have failed' 
- Details)

https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/698/testReport/
org.apache.zookeeper.test.ReadOnlyModeTest.testConnectionEvents
This test was not failing without this patch, and it is (consistently) failing 
with the patch on Apache Jenkins. Please investigate. Generally we'd like to 
have a green pre-commit build before merging a PR.




[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015142#comment-16015142
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user JiangJiafu commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
It seems the unit tests are OK now.
I don't know what the problem is now, @hanm.




[jira] [Commented] (ZOOKEEPER-2784) Add some limitations on code level for `SID` to avoid configuration problem

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015156#comment-16015156
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2784:
---

GitHub user asdf2014 opened a pull request:

https://github.com/apache/zookeeper/pull/257

ZOOKEEPER-2784: Add same `sid` config problem check

Add some limitations on code level for `SID` to avoid configuration problem

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/asdf2014/zookeeper quorum_sid

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/257.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #257


commit 4447631b3d2b7cd58f0fd4d92923ff87482330a9
Author: asdf2014 <1571805...@qq.com>
Date:   2017-05-18T03:47:00Z

ZOOKEEPER-2784: Add same `sid` config problem check




> Add some limitations on code level for `SID` to avoid configuration problem
> ---
>
> Key: ZOOKEEPER-2784
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2784
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: quorum
>Affects Versions: 3.5.2
>Reporter: Benedict Jin
> Fix For: 3.5.4
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> So far, `QuorumCnxManager#receiveConnection` cannot detect the duplicate 
> `SID` problem, so the ZooKeeper cluster will start successfully. But the 
> cluster is not healthy, and it will throw errors like `not 
> synchronized`. So I think we should add some limitations at the code level for 
> `SID` to find such configuration problems earlier.
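
One way to picture the kind of check proposed above is a registry that refuses a second peer claiming an already-used SID. This is a rough sketch only: `SidRegistrySketch` and its `register` method are hypothetical and are not the code in the pull request or in `QuorumCnxManager`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of a duplicate-SID check; not QuorumCnxManager code.
class SidRegistrySketch {
    // SID -> address of the peer that first connected with it.
    private final Map<Long, String> peers = new ConcurrentHashMap<>();

    // Returns true if the SID is new or re-registered by the same address,
    // false if a different address already claimed it (a configuration error).
    boolean register(long sid, String address) {
        String existing = peers.putIfAbsent(sid, address);
        return existing == null || existing.equals(address);
    }
}
```

A caller could log an error and drop the connection when `register` returns false, so the misconfiguration surfaces at connection time rather than as later synchronization errors.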





[jira] [Commented] (ZOOKEEPER-2784) Add some limitations on code level for `SID` to avoid configuration problem

2017-05-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015241#comment-16015241
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2784:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/257
  
It's totally fine in my local IDE...

![image](https://cloud.githubusercontent.com/assets/8108788/26188075/ba301912-3bcf-11e7-92c8-79497eea1a9b.png)





[jira] [Commented] (ZOOKEEPER-2784) Add some limitations on code level for `SID` to avoid configuration problem

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015514#comment-16015514
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2784:
---

Github user asdf2014 commented on the issue:

https://github.com/apache/zookeeper/pull/257
  
Finally, it works :smile:
```bash
 [exec] [junit] 2017-05-18 09:29:00,770 [myid:] - INFO  [main:ZKTestCase$1@58] - STARTING testSameSID
 [exec] [junit] 2017-05-18 09:29:00,770 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24753 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,770 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24754 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,771 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24755 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,771 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24756 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,771 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24757 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,772 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24758 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,772 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24759 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,772 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24760 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,773 [myid:] - INFO  [main:PortAssignment@85] - Assigned port 24761 from range 24686 - 27378.
 [exec] [junit] 2017-05-18 09:29:00,773 [myid:] - INFO  [main:JUnit4ZKTestRunner$LoggedInvokeMethod@77] - RUNNING TEST METHOD testSameSID
 [exec] [junit] 2017-05-18 09:29:00,773 [myid:] - INFO  [main:ServerCnxnFactory@134] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
 [exec] [junit] 2017-05-18 09:29:00,773 [myid:] - INFO  [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 3 selector thread(s), 48 worker threads, and 64 kB direct buffers.
 [exec] [junit] 2017-05-18 09:29:00,774 [myid:] - INFO  [main:NIOServerCnxnFactory@686] - binding to port /127.0.0.1:24760
 [exec] [junit] 2017-05-18 09:29:00,775 [myid:] - INFO  [main:CnxManagerTest@392] - Creating socket connection, host: 127.0.0.1, port: 24761
 [exec] [junit] 2017-05-18 09:29:00,775 [myid:] - INFO  [QuorumPeerListener:QuorumCnxManager$Listener@636] - My election bind port: /127.0.0.1:24761
 [exec] [junit] 2017-05-18 09:29:00,775 [myid:] - INFO  [/127.0.0.1:24761:QuorumCnxManager$Listener@642] - Received connection request /127.0.0.1:54430
 [exec] [junit] 2017-05-18 09:29:00,775 [myid:] - WARN  [main:QuorumCnxManager@342] - Exception reading or writing challenge: java.io.EOFException
 [exec] [junit] 2017-05-18 09:29:00,776 [myid:] - ERROR [/127.0.0.1:24761:QuorumCnxManager$Listener@664] - Appearing duplicate SID: 2
```




[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015523#comment-16015523
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2775:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/254#discussion_r117207576
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -1054,6 +1054,8 @@ private void sendPing() {
 private boolean saslLoginFailed = false;
 
 private void startConnect() throws IOException {
+// initializing it for new connection
+saslLoginFailed = false;
--- End diff --

Thanks @arshadmohammad  for the details.

Yes, only `SendThread` is updating the flag. But during the SASL login retry 
period, the flag status will be checked via `tunnelAuthInProgress()` by the 
packet processing thread, so multiple threads are accessing the flag. The code 
looks a little tricky, and a null `zooKeeperSaslClient` value represents auth 
in progress. I'm almost OK with the change and am making another attempt to 
avoid any compatibility issues for users, as this would go to stable 
branches :-). 

Earlier the behavior was that, once the flag was updated to false, the 
`tunnelAuthInProgress` function would always return false. Now, with the 
proposed fix, sometimes it would return false and sometimes it would return 
true, right? Will this result in any consistency issues later?

Assume a case where a successful login takes several retries.
(1) Immediately after the login failure the flag will be false. During this 
time the `tunnelAuthInProgress()` function returns false to the callers.
(2) Assume `startConnect()` retries have started. During this time, the 
`tunnelAuthInProgress()` function returns true to the callers.

My previous suggestion was to avoid this situation and have the 
`tunnelAuthInProgress()` function consistently return false until the next 
successful login. Does this make sense to you?

@hanm, welcome your thoughts. Thanks!
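
On the point about multiple threads reading the flag: as a general Java matter, a plain boolean written by one thread is not guaranteed to become visible to another, while a `volatile` one is. The sketch below only illustrates that language rule; `SharedFlagSketch` is a hypothetical class, and whether the real field needs this treatment depends on how `ClientCnxn` actually shares it, which is an assumption here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of publishing a flag between two threads.
class SharedFlagSketch {
    // volatile: a write by one thread is visible to subsequent reads by others.
    private volatile boolean loginFailed = false;

    void markFailed() { loginFailed = true; }
    void reset() { loginFailed = false; }
    boolean hasFailed() { return loginFailed; }

    // Starts a reader thread that spins until it observes the flag set by the
    // calling thread; returns true if that happened within five seconds.
    static boolean readerObservesWrite() {
        SharedFlagSketch state = new SharedFlagSketch();
        CountDownLatch observed = new CountDownLatch(1);
        Thread reader = new Thread(() -> {
            while (!state.hasFailed()) {
                Thread.yield();
            }
            observed.countDown();
        });
        reader.setDaemon(true);
        reader.start();
        state.markFailed();
        try {
            return observed.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```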



[jira] [Commented] (ZOOKEEPER-2762) Multithreaded correctness Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015537#comment-16015537
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2762:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/239#discussion_r117211921
  
--- Diff: ivy.xml ---
@@ -49,6 +49,8 @@
 
 
 
+
--- End diff --

Is there any alternative to introducing a dependency on Google libraries? 
Presently, the ZK code doesn't have any dependency on Google libraries.


> Multithreaded correctness Warnings
> --
>
> Key: ZOOKEEPER-2762
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2762
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Abraham Fine
>Assignee: Abraham Fine
> Fix For: 3.4.11
>
>






[jira] [Commented] (ZOOKEEPER-2731) Cleanup findbug warnings in branch-3.4: Malicious code vulnerability Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015642#comment-16015642
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2731:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/232#discussion_r117228583
  
--- Diff: src/java/main/org/apache/jute/compiler/JType.java ---
@@ -27,7 +27,7 @@
private String mCName;
 private String mCppName;
 private String mCsharpName;
-private String mJavaName;
+protected String mJavaName;
--- End diff --

Thanks a lot @afine for pointing out the performance gains. I'd suggest 
separating the changes that improve performance from the findbugs fix, 
because that would make the findbugs fix and its review simpler. Also, iiuc the 
perf-related changes are applicable to all the branches, and a separate task 
would help us to track and merge the changes easily, rather than clubbing 
multiple changes together in one commit.



> Cleanup findbug warnings in branch-3.4: Malicious code vulnerability Warnings
> -
>
> Key: ZOOKEEPER-2731
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2731
> Project: ZooKeeper
>  Issue Type: Sub-task
>Affects Versions: 3.4.9
>Reporter: Rakesh R
>Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details 
> of the findbugs warnings.
> {code}
> MS  org.apache.zookeeper.Environment.JAAS_CONF_KEY isn't final but should be
> Bug type MS_SHOULD_BE_FINAL
> In class org.apache.zookeeper.Environment
> Field org.apache.zookeeper.Environment.JAAS_CONF_KEY
> At Environment.java:[line 34]
> MS  org.apache.zookeeper.server.ServerCnxn.cmd2String is a mutable collection which should be package protected
> Bug type MS_MUTABLE_COLLECTION_PKGPROTECT
> In class org.apache.zookeeper.server.ServerCnxn
> Field org.apache.zookeeper.server.ServerCnxn.cmd2String
> At ServerCnxn.java:[line 230]
> MS  org.apache.zookeeper.ZooDefs$Ids.OPEN_ACL_UNSAFE is a mutable collection
> Bug type MS_MUTABLE_COLLECTION
> In class org.apache.zookeeper.ZooDefs$Ids
> Field org.apache.zookeeper.ZooDefs$Ids.OPEN_ACL_UNSAFE
> At ZooDefs.java:[line 100]
> MS  org.apache.zookeeper.ZooKeeperMain.commandMap is a mutable collection which should be package protected
> Bug type MS_MUTABLE_COLLECTION_PKGPROTECT
> In class org.apache.zookeeper.ZooKeeperMain
> Field org.apache.zookeeper.ZooKeeperMain.commandMap
> At ZooKeeperMain.java:[line 53]
> {code}
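
For readers unfamiliar with these warning types, the usual remedies look roughly like the sketch below. The class `MutableStaticSketch` and its field names and values are hypothetical examples, not the ZooKeeper fields listed above, and this is not necessarily how the patch resolves them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical examples of addressing the three warning types named above.
class MutableStaticSketch {
    // MS_SHOULD_BE_FINAL: a public static field that is never reassigned
    // is declared final so other code cannot replace it.
    public static final String CONF_KEY = "example.conf.key";

    // MS_MUTABLE_COLLECTION_PKGPROTECT: reduce visibility to package-private
    // and expose only a read-only view.
    static final Map<Integer, String> COMMANDS;
    static {
        Map<Integer, String> m = new HashMap<>();
        m.put(1, "conf");
        m.put(2, "stat");
        COMMANDS = Collections.unmodifiableMap(m);
    }

    // MS_MUTABLE_COLLECTION: a public static collection is wrapped so that
    // callers cannot add or remove elements.
    public static final List<String> OPEN_IDS =
            Collections.unmodifiableList(Arrays.asList("world:anyone"));
}
```

Wrapping a public collection this way changes behavior for any caller that used to mutate it, so for a published API it is a compatibility decision as much as a cleanup.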





[jira] [Commented] (ZOOKEEPER-2732) Cleanup findbug warnings in branch-3.4: Performance Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015659#comment-16015659
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2732:
---

Github user rakeshadr commented on the issue:

https://github.com/apache/zookeeper/pull/231
  
+1 LGTM

Could you please rebase the PR, as it has the following conflict with the 
latest code (ZOOKEEPER-2759):
`error: patch failed: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java:182`


> Cleanup findbug warnings in branch-3.4: Performance Warnings
> 
>
> Key: ZOOKEEPER-2732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2732
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details 
> of the FindBugs warnings.
> {code}
> Bx    Boxing/unboxing to parse a primitive new 
>       org.apache.zookeeper.server.quorum.QuorumCnxManager(long, Map, 
>       QuorumAuthServer, QuorumAuthLearner, int, boolean, int, boolean)
> Bx    new org.apache.zookeeper.server.quorum.QuorumCnxManager(long, Map, 
>       QuorumAuthServer, QuorumAuthLearner, int, boolean, int, boolean) 
>       invokes inefficient new Integer(String) constructor; use 
>       Integer.valueOf(String) instead
> Dm    org.apache.zookeeper.server.quorum.FastLeaderElection$Notification.toString() 
>       invokes inefficient new String(String) constructor
> WMI   org.apache.zookeeper.server.DataTree.dumpEphemerals(PrintWriter) makes 
>       inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.server.quorum.flexible.QuorumHierarchical.computeGroupWeight() 
>       makes inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.server.quorum.flexible.QuorumHierarchical.containsQuorum(HashSet) 
>       makes inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.ZooKeeperMain.usage() makes inefficient use of 
>       keySet iterator instead of entrySet iterator
> {code}
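
The two most common warning kinds listed above (Bx and WMI) can be illustrated with a minimal sketch. The class and method names here are hypothetical and do not come from the ZooKeeper code base; the sketch only shows the general shape of the fix, assuming the parsed value is needed as a primitive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; none of these names come from ZooKeeper.
final class PerfWarningSketch {

    // Bx: parse straight to a primitive instead of allocating a wrapper
    // with new Integer(String) and unboxing it afterwards.
    static int parsePort(String text) {
        return Integer.parseInt(text);
    }

    // WMI: iterate entrySet() so each value arrives together with its key,
    // instead of walking keySet() and calling get(key) on every step.
    static long totalWeight(Map<Long, Long> weights) {
        long total = 0;
        for (Map.Entry<Long, Long> entry : weights.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Long, Long> weights = new LinkedHashMap<>();
        weights.put(1L, 2L);
        weights.put(2L, 3L);
        System.out.println(parsePort("2181") + " " + totalWeight(weights));
    }
}
```

When a boxed value is actually required, `Integer.valueOf(String)` is the replacement the warning itself recommends.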



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015986#comment-16015986
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117279935
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -628,14 +628,7 @@ protected void pRequest(Request request) throws 
RequestProcessorException {
 break;
  
 //All the rest don't need to create a Txn - just verify session
-case OpCode.sync:
-case OpCode.exists:
-case OpCode.getData:
-case OpCode.getACL:
-case OpCode.getChildren:
-case OpCode.getChildren2:
-case OpCode.ping:
-case OpCode.setWatches:
+default:
--- End diff --

Won't this also execute for any `unknown type`, which is not expected?
We could keep the existing case checks and add a default, like:
```
    zks.sessionTracker.checkSession(request.sessionId,
            request.getOwner());
    break;
default:
    LOG.warn("unknown type " + request.type);
    break;
```
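
The difference between a catch-all `default` and an explicit case list plus a logging `default` can be sketched as follows. This is a self-contained illustration only: the op code constants, the counter, and the warning list are hypothetical stand-ins and are not ZooKeeper's real `OpCode` values, session tracker, or logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the request dispatch; op codes and names are
// illustrative and are not ZooKeeper's real OpCode values.
final class DispatchSketch {
    static final int EXISTS = 3;
    static final int GET_DATA = 4;
    static final int PING = 11;

    final List<String> warnings = new ArrayList<>();
    int sessionChecks = 0;

    void handle(int type) {
        switch (type) {
        // Known read-only operations: no transaction, just verify the session.
        case EXISTS:
        case GET_DATA:
        case PING:
            sessionChecks++;
            break;
        // Anything else is unexpected: record it instead of silently
        // treating it as a valid read-only request.
        default:
            warnings.add("unknown type " + type);
            break;
        }
    }
}
```

With this shape the static-analysis warning about the missing `default` is still resolved, while an unrecognized type no longer takes the session-check path.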


> Cleanup findbug warnings in branch-3.4: Dodgy code Warnings
> ---
>
> Key: ZOOKEEPER-2733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2733
> Project: ZooKeeper
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details 
> of the FindBugs warnings.
> {code}
> DB    org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream, 
>       byte[]) uses the same code for two branches
> DLS   Dead store to txn in 
>       org.apache.zookeeper.server.quorum.LearnerHandler.packetToString(QuorumPacket)
> NP    Load of known null value in 
>       org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request)
> NP    Possible null pointer dereference in 
>       org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog, 
>       File) due to return value of called method
> NP    Possible null pointer dereference in 
>       org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog, 
>       File) due to return value of called method
> NP    Load of known null value in 
>       org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream, 
>       byte[])
> NP    Load of known null value in 
>       org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthServer.send(DataOutputStream, 
>       byte[], QuorumAuth$Status)
> NP    Possible null pointer dereference in 
>       org.apache.zookeeper.server.upgrade.UpgradeMain.copyFiles(File, File, String) 
>       due to return value of called method
> RCN   Redundant nullcheck of bytes, which is known to be non-null in 
>       org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next()
> SF    Switch statement found in 
>       org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request) where 
>       default case is missing
> SF    Switch statement found in 
>       org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(int, long, 
>       Request, Record, boolean) where default case is missing
> SF    Switch statement found in 
>       org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerReceiver.run() 
>       where default case is missing
> SF    Switch statement found in 
>       org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerSender.process(AuthFastLeaderElection$ToSend) 
>       where default case is missing
> SF    Switch statement found in 
>       org.apache.zookeeper.server.quorum.Follower.processPacket(QuorumPacket) where 
>       default case is missing
> SF    Switch statement found in 
>       org.apache.zookeeper.server.quorum.Observer.processPacket(QuorumPacket) where 
>       default case is missing
> ST    Write to static field 
>       org.apache.zookeeper.server.SyncRequestProcessor.randRoll from instance 
>       method org.apache.zookeeper.server.SyncRequestProcessor.run()
> UrF   Unread public/protected field: 
>       org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.err
> UrF   Unread public/protected field: 
>       org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.path
> UrF   Unread public/protected field: 
>       org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.stat
> UrF   Unread public/protected field: 
>       org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.type
> {code}





[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015985#comment-16015985
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117232748
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -504,6 +504,8 @@ protected void pRequest2Txn(int type, long zxid, 
Request request, Record record,
 version = currentVersion + 1;
 request.txn = new CheckVersionTxn(path, version);
 break;
+default:
+LOG.error("Invalid OpCode received by 
PrepRequestProcessor: " + type);
--- End diff --

Just a suggestion to make this more readable.

`LOG.error("Invalid OpCode: {} received by PrepRequestProcessor", type);`




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015982#comment-16015982
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117285488
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/upgrade/UpgradeMain.java ---
@@ -113,16 +113,18 @@ private void createAllDirs() throws IOException {
  * @throws IOException
  */
 void copyFiles(File srcDir, File dstDir, String filter) throws 
IOException {
-File[] list = srcDir.listFiles();
-for (File file: list) {
-String name = file.getName();
-if (name.startsWith(filter)) {
-// we need to copy this file
-File dest = new File(dstDir, name);
-LOG.info("Renaming " + file + " to " + dest);
-if (!file.renameTo(dest)) {
-throw new IOException("Unable to rename " 
-+ file + " to " +  dest);
+File[] list;
+if ((list = srcDir.listFiles()) != null) {
--- End diff --

Please keep the assignment `(list = srcDir.listFiles())` together with the 
declaration instead of embedding it in the `if` condition:

`File[] list = srcDir.listFiles();`
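
A minimal sketch of the suggested shape, assuming the goal is only to guard against `File.listFiles()` returning `null` (which it does when the path is not a directory or an I/O error occurs). The class and method names are hypothetical, and the body collects names rather than renaming files so that the example stays free of side effects.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of UpgradeMain.
final class ListFilesSketch {
    // Returns the names in srcDir that start with prefix. A missing or
    // unreadable directory yields an empty list instead of a
    // NullPointerException.
    static List<String> namesWithPrefix(File srcDir, String prefix) {
        List<String> names = new ArrayList<>();
        // Declaration and assignment stay together, outside the condition.
        File[] list = srcDir.listFiles();
        if (list != null) {
            for (File file : list) {
                if (file.getName().startsWith(prefix)) {
                    names.add(file.getName());
                }
            }
        }
        return names;
    }
}
```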




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015988#comment-16015988
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117280664
  
--- Diff: src/java/main/org/apache/zookeeper/server/PurgeTxnLog.java ---
@@ -133,11 +133,17 @@ public boolean accept(File f){
 }
 }
 // add all non-excluded log files
-List files = new ArrayList(Arrays.asList(txnLog
-.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))));
+List files = new ArrayList();
+File[] fileArray;
+if ((fileArray = txnLog.getDataDir().listFiles(new 
MyFileFilter(PREFIX_LOG))) != null) {
--- End diff --

Can we keep the assignment to `File[] fileArray` outside, together with the 
declaration, instead of combining it with the `if` check?

 File[] fileArray = txnLog.getDataDir().listFiles(new 
MyFileFilter(PREFIX_LOG));




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015987#comment-16015987
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117284521
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/Observer.java ---
@@ -125,6 +125,8 @@ protected void processPacket(QuorumPacket qp) throws 
IOException{
 ObserverZooKeeperServer obs = (ObserverZooKeeperServer)zk;
 obs.commitRequest(request);
 break;
+default:
+LOG.error("Invalid packet type received by Observer: " + 
qp.getType());
--- End diff --

Can we change the log message as below to improve readability?

`LOG.error("Invalid packet type: {} received by Observer", qp.getType());`




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015989#comment-16015989
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117283048
  
--- Diff: src/java/main/org/apache/zookeeper/server/PurgeTxnLog.java ---
@@ -133,11 +133,17 @@ public boolean accept(File f){
 }
 }
 // add all non-excluded log files
-List files = new ArrayList(Arrays.asList(txnLog
-.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))));
+List files = new ArrayList();
+File[] fileArray;
+if ((fileArray = txnLog.getDataDir().listFiles(new 
MyFileFilter(PREFIX_LOG))) != null) {
+files.addAll(Arrays.asList(fileArray));
+}
+
 // add all non-excluded snapshot files to the deletion list
-files.addAll(Arrays.asList(txnLog.getSnapDir().listFiles(
-new MyFileFilter(PREFIX_SNAPSHOT))));
+if ((fileArray = txnLog.getSnapDir().listFiles(new 
MyFileFilter(PREFIX_SNAPSHOT))) != null) {
--- End diff --

Same as above: it's good to keep the assignment outside the `if` check for 
better readability.
`fileArray = txnLog.getSnapDir().listFiles(new 
MyFileFilter(PREFIX_SNAPSHOT));`
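
One possible way to keep both listings readable is to move the null guard into a small helper, so that no variable has to be reassigned inside a condition. This is only a sketch of an alternative, not what the pull request does: the `addMatching` and `collect` names are hypothetical, and plain prefix lambdas stand in for `MyFileFilter`.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of PurgeTxnLog.
final class PurgeListSketch {
    // Appends the files in dir accepted by filter to files. A directory
    // that cannot be listed (listFiles returns null) contributes nothing.
    static void addMatching(List<File> files, File dir, FileFilter filter) {
        File[] fileArray = dir.listFiles(filter);
        if (fileArray != null) {
            files.addAll(Arrays.asList(fileArray));
        }
    }

    // Collects log files from dataDir and snapshot files from snapDir.
    // The "log." and "snapshot." prefixes are assumptions for illustration.
    static List<File> collect(File dataDir, File snapDir) {
        List<File> files = new ArrayList<>();
        addMatching(files, dataDir, f -> f.getName().startsWith("log."));
        addMatching(files, snapDir, f -> f.getName().startsWith("snapshot."));
        return files;
    }
}
```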




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015984#comment-16015984
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117284290
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/Follower.java ---
@@ -135,6 +135,8 @@ protected void processPacket(QuorumPacket qp) throws 
IOException{
 case Leader.SYNC:
 fzk.sync();
 break;
+default:
+LOG.error("Invalid packet type received by Observer: " + 
qp.getType());
--- End diff --

Can we change the log message as below to improve readability?
`LOG.error("Invalid packet type: {} received by Observer", qp.getType());`




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015983#comment-16015983
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user rakeshadr commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117287341
  
--- Diff: ivy.xml ---
@@ -49,6 +49,8 @@
 
 
 
+
--- End diff --

I'd prefer to exclude `AuthFastLeaderElection.java`, as it is deprecated, 
rather than introducing a Google library dependency. Does this sound good to 
you? Thanks!


> Cleanup findbug warnings in branch-3.4: Dodgy code Warnings
> ---
>
> Key: ZOOKEEPER-2733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2733
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details of
> the FindBugs warnings.
> {code}
> DB
> org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream,
>  byte[]) uses the same code for two branches
> DLS   Dead store to txn in
> org.apache.zookeeper.server.quorum.LearnerHandler.packetToString(QuorumPacket)
> NP    Load of known null value in
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request)
> NP    Possible null pointer dereference in
> org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog,
> File) due to return value of called method
> NP    Possible null pointer dereference in
> org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog,
> File) due to return value of called method
> NP    Load of known null value in
> org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream,
>  byte[])
> NP    Load of known null value in
> org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthServer.send(DataOutputStream,
>  byte[], QuorumAuth$Status)
> NP    Possible null pointer dereference in
> org.apache.zookeeper.server.upgrade.UpgradeMain.copyFiles(File, File, String)
> due to return value of called method
> RCN   Redundant nullcheck of bytes, which is known to be non-null in
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next()
> SF    Switch statement found in
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request) where
> default case is missing
> SF    Switch statement found in
> org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(int, long,
> Request, Record, boolean) where default case is missing
> SF    Switch statement found in
> org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerReceiver.run()
>  where default case is missing
> SF    Switch statement found in
> org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerSender.process(AuthFastLeaderElection$ToSend)
>  where default case is missing
> SF    Switch statement found in
> org.apache.zookeeper.server.quorum.Follower.processPacket(QuorumPacket) where
> default case is missing
> SF    Switch statement found in
> org.apache.zookeeper.server.quorum.Observer.processPacket(QuorumPacket) where
> default case is missing
> ST    Write to static field
> org.apache.zookeeper.server.SyncRequestProcessor.randRoll from instance
> method org.apache.zookeeper.server.SyncRequestProcessor.run()
> UrF   Unread public/protected field:
> org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.err
> UrF   Unread public/protected field:
> org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.path
> UrF   Unread public/protected field:
> org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.stat
> UrF   Unread public/protected field:
> org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.type
> {code}





[jira] [Commented] (ZOOKEEPER-2774) Ephemeral znode will not be removed when sesstion timeout, if the system time of ZooKeeper node changes unexpectedly.

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016193#comment-16016193
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2774:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/253
  
LGTM - all unit tests passed. The Jenkins check failed because of 
FindBugs warnings, which is a known issue for branch-3.4. I will merge this to 
branch-3.4 soon.

One suggestion: in future pull requests, please provide a descriptive title 
and a brief description of what the pull request does.


> Ephemeral znode will not be removed when sesstion timeout, if the system time 
> of ZooKeeper node changes unexpectedly.
> -
>
> Key: ZOOKEEPER-2774
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2774
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.8, 3.4.9, 3.4.10
> Environment: Centos6.5
> Reporter: JiangJiafu
>
> 1. Deploy a ZooKeeper cluster with one node.
> 2. Create an ephemeral znode.
> 3. Change the system time of the ZooKeeper node to an earlier point.
> 4. Disconnect the client from the ZooKeeper server.
> Then the ephemeral znode will continue to exist for a long time, even after
> the session times out.
> I have read the ZooKeeper source code and found the following code in
> SessionTrackerImpl.java:
> {code:title=SessionTrackerImpl.java|borderStyle=solid}
> @Override
> synchronized public void run() {
>     try {
>         while (running) {
>             currentTime = System.currentTimeMillis();
>             if (nextExpirationTime > currentTime) {
>                 this.wait(nextExpirationTime - currentTime);
>                 continue;
>             }
>             SessionSet set;
>             set = sessionSets.remove(nextExpirationTime);
>             if (set != null) {
>                 for (SessionImpl s : set.sessions) {
>                     setSessionClosing(s.sessionId);
>                     expirer.expire(s);
>                 }
>             }
>             nextExpirationTime += expirationInterval;
>         }
>     } catch (InterruptedException e) {
>         handleException(this.getName(), e);
>     }
>     LOG.info("SessionTrackerImpl exited loop!");
> }
> {code}
> I think it may be better to use System.nanoTime() rather than
> System.currentTimeMillis(), because the latter can be changed manually or
> automatically by an NTP client.
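
The suggestion above can be sketched as follows. When the wall clock is set back, `nextExpirationTime > currentTime` stays true until the clock catches up again, so the loop keeps waiting; a monotonic time source does not have that problem. This is only a minimal sketch under stated assumptions, not the change that was merged: `ElapsedClock`, `elapsedMillis` and `millisUntil` are hypothetical names, and only `System.nanoTime()` and `TimeUnit` are standard JDK APIs.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of the ZooKeeper code quoted above. */
final class ElapsedClock {
    private ElapsedClock() {}

    /**
     * Milliseconds measured from an arbitrary fixed origin. The value itself is
     * meaningless (it may even be negative); only differences between two calls
     * are useful. Unlike System.currentTimeMillis(), it is not affected by
     * manual or NTP adjustments of the system clock.
     */
    static long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    /**
     * Remaining time until a deadline expressed on the same elapsed-time scale.
     * Never negative. Callers should skip Object.wait() when this returns 0,
     * because wait(0) blocks until notified.
     */
    static long millisUntil(long deadlineMillis, long nowMillis) {
        return Math.max(0L, deadlineMillis - nowMillis);
    }
}
```

A loop like the one quoted above would then compute its deadlines from `ElapsedClock.elapsedMillis()` instead of wall-clock time, on the assumption that every stored expiration time uses the same scale.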





[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016382#comment-16016382
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117341551
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -628,14 +628,7 @@ protected void pRequest(Request request) throws 
RequestProcessorException {
 break;
  
 //All the rest don't need to create a Txn - just verify session
-case OpCode.sync:
-case OpCode.exists:
-case OpCode.getData:
-case OpCode.getACL:
-case OpCode.getChildren:
-case OpCode.getChildren2:
-case OpCode.ping:
-case OpCode.setWatches:
+default:
--- End diff --

I get your point. I would argue that we already perform the check to make sure we 
don't get a bad OpCode here: 
https://github.com/apache/zookeeper/blob/branch-3.4/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java#L742
 so I think the `default` case is cleaner. I would be willing to change this if you feel 
strongly about it.
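
The trade-off discussed here can be illustrated with a small sketch. It assumes a hypothetical upstream gate (`isValid`) that rejects unknown op codes before dispatch, which is what makes a catch-all `default` branch safe downstream. The op code constants and both method names are made up for illustration and are not ZooKeeper's `OpCode` values or API.

```java
/** Illustrative only; constants and method names are hypothetical. */
final class OpDispatch {
    static final int CREATE = 1;
    static final int DELETE = 2;
    static final int EXISTS = 3;
    static final int GET_DATA = 4;
    static final int PING = 11;

    private OpDispatch() {}

    /** Upstream gate: accepts only the op codes this sketch knows about. */
    static boolean isValid(int op) {
        switch (op) {
            case CREATE:
            case DELETE:
            case EXISTS:
            case GET_DATA:
            case PING:
                return true;
            default:
                return false;
        }
    }

    /**
     * Downstream dispatch: write ops need a transaction; everything else that
     * passed the gate only needs a session check, so `default` covers it.
     */
    static String dispatch(int op) {
        if (!isValid(op)) {
            throw new IllegalArgumentException("bad op code: " + op);
        }
        switch (op) {
            case CREATE:
            case DELETE:
                return "txn";
            default:
                return "session-check";
        }
    }
}
```

The explicit case list that the diff removes has the opposite property: it needs no upstream gate, but every new read-only op code must be added to it by hand.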




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016415#comment-16016415
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345541
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/Follower.java ---
@@ -135,6 +135,8 @@ protected void processPacket(QuorumPacket qp) throws 
IOException{
 case Leader.SYNC:
 fzk.sync();
 break;
+default:
+LOG.error("Invalid packet type received by Observer: " + 
qp.getType());
--- End diff --

fixed
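
The log message in the diff above names Observer although the file is Follower.java. One possible way to avoid that kind of copy-paste slip is to derive the role name from the runtime class instead of hard-coding it. The following is a minimal sketch with hypothetical stand-in classes (`DemoLearner`, `DemoFollower`, `DemoObserver`) and a made-up packet type constant. It returns the message so the example stays self-contained, where real code would pass it to a logger.

```java
/** Hypothetical stand-ins for illustration; not the ZooKeeper classes. */
abstract class DemoLearner {
    static final int SYNC = 7; // made-up packet type value

    /** Builds the "unknown packet" message using the concrete class name. */
    String describeUnknown(int type) {
        return "Invalid packet type received by "
                + getClass().getSimpleName() + ": " + type;
    }

    /** Handles known packet types; anything else falls through to default. */
    String processPacket(int type) {
        switch (type) {
            case SYNC:
                return "sync";
            default:
                return describeUnknown(type);
        }
    }
}

final class DemoFollower extends DemoLearner {}

final class DemoObserver extends DemoLearner {}
```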




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016417#comment-16016417
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345587
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/upgrade/UpgradeMain.java ---
@@ -113,16 +113,18 @@ private void createAllDirs() throws IOException {
  * @throws IOException
  */
 void copyFiles(File srcDir, File dstDir, String filter) throws 
IOException {
-File[] list = srcDir.listFiles();
-for (File file: list) {
-String name = file.getName();
-if (name.startsWith(filter)) {
-// we need to copy this file
-File dest = new File(dstDir, name);
-LOG.info("Renaming " + file + " to " + dest);
-if (!file.renameTo(dest)) {
-throw new IOException("Unable to rename " 
-+ file + " to " +  dest);
+File[] list;
+if ((list = srcDir.listFiles()) != null) {
--- End diff --

fixed
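
For context on the null check in the diff above: `File.listFiles()` returns `null`, not an empty array, when the path is not a directory or an I/O error occurs, which is what the "possible null pointer dereference" warning is about. A minimal sketch of one way to centralise that check follows. `SafeListing` and `listOrEmpty` are hypothetical names and not part of the pull request.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the pull request. */
final class SafeListing {
    private SafeListing() {}

    /** Returns the directory entries, or an empty list if listing fails. */
    static List<File> listOrEmpty(File dir) {
        // listFiles() yields null if dir is not a directory or on I/O error.
        File[] entries = dir.listFiles();
        List<File> result = new ArrayList<File>();
        if (entries != null) {
            result.addAll(Arrays.asList(entries));
        }
        return result;
    }
}
```

Callers can then iterate over the result without a null check of their own. Whether silently treating a failed listing as empty is acceptable depends on the caller; throwing an `IOException` instead is an equally reasonable design.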




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016414#comment-16016414
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345530
  
--- Diff: src/java/main/org/apache/zookeeper/server/PurgeTxnLog.java ---
@@ -133,11 +133,17 @@ public boolean accept(File f){
 }
 }
 // add all non-excluded log files
-List<File> files = new ArrayList<File>(Arrays.asList(txnLog
-.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))));
+List<File> files = new ArrayList<File>();
+File[] fileArray;
+if ((fileArray = txnLog.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))) != null) {
+files.addAll(Arrays.asList(fileArray));
+}
+
 // add all non-excluded snapshot files to the deletion list
-files.addAll(Arrays.asList(txnLog.getSnapDir().listFiles(
-new MyFileFilter(PREFIX_SNAPSHOT))));
+if ((fileArray = txnLog.getSnapDir().listFiles(new MyFileFilter(PREFIX_SNAPSHOT))) != null) {
--- End diff --

fixed




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016413#comment-16016413
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345515
  
--- Diff: src/java/main/org/apache/zookeeper/server/PurgeTxnLog.java ---
@@ -133,11 +133,17 @@ public boolean accept(File f){
 }
 }
 // add all non-excluded log files
-List<File> files = new ArrayList<File>(Arrays.asList(txnLog
-.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))));
+List<File> files = new ArrayList<File>();
+File[] fileArray;
+if ((fileArray = txnLog.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG))) != null) {
--- End diff --

fixed




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016416#comment-16016416
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345561
  
--- Diff: src/java/main/org/apache/zookeeper/server/quorum/Observer.java ---
@@ -125,6 +125,8 @@ protected void processPacket(QuorumPacket qp) throws 
IOException{
 ObserverZooKeeperServer obs = (ObserverZooKeeperServer)zk;
 obs.commitRequest(request);
 break;
+default:
+LOG.error("Invalid packet type received by Observer: " + 
qp.getType());
--- End diff --

fixed




[jira] [Commented] (ZOOKEEPER-2733) Cleanup findbug warnings in branch-3.4: Dodgy code Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016411#comment-16016411
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2733:
---

Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/236#discussion_r117345471
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java ---
@@ -504,6 +504,8 @@ protected void pRequest2Txn(int type, long zxid, 
Request request, Record record,
 version = currentVersion + 1;
 request.txn = new CheckVersionTxn(path, version);
 break;
+default:
+LOG.error("Invalid OpCode received by 
PrepRequestProcessor: " + type);
--- End diff --

fixed


> Cleanup findbug warnings in branch-3.4: Dodgy code Warnings
> ---
>
> Key: ZOOKEEPER-2733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2733
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details of 
> the FindBugs warnings.
> {code}
> DB    org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream, byte[]) uses the same code for two branches
> DLS   Dead store to txn in org.apache.zookeeper.server.quorum.LearnerHandler.packetToString(QuorumPacket)
> NP    Load of known null value in org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request)
> NP    Possible null pointer dereference in org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog, File) due to return value of called method
> NP    Possible null pointer dereference in org.apache.zookeeper.server.PurgeTxnLog.purgeOlderSnapshots(FileTxnSnapLog, File) due to return value of called method
> NP    Load of known null value in org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthLearner.send(DataOutputStream, byte[])
> NP    Load of known null value in org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthServer.send(DataOutputStream, byte[], QuorumAuth$Status)
> NP    Possible null pointer dereference in org.apache.zookeeper.server.upgrade.UpgradeMain.copyFiles(File, File, String) due to return value of called method
> RCN   Redundant nullcheck of bytes, which is known to be non-null in org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next()
> SF    Switch statement found in org.apache.zookeeper.server.PrepRequestProcessor.pRequest(Request) where default case is missing
> SF    Switch statement found in org.apache.zookeeper.server.PrepRequestProcessor.pRequest2Txn(int, long, Request, Record, boolean) where default case is missing
> SF    Switch statement found in org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerReceiver.run() where default case is missing
> SF    Switch statement found in org.apache.zookeeper.server.quorum.AuthFastLeaderElection$Messenger$WorkerSender.process(AuthFastLeaderElection$ToSend) where default case is missing
> SF    Switch statement found in org.apache.zookeeper.server.quorum.Follower.processPacket(QuorumPacket) where default case is missing
> SF    Switch statement found in org.apache.zookeeper.server.quorum.Observer.processPacket(QuorumPacket) where default case is missing
> ST    Write to static field org.apache.zookeeper.server.SyncRequestProcessor.randRoll from instance method org.apache.zookeeper.server.SyncRequestProcessor.run()
> UrF   Unread public/protected field: org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.err
> UrF   Unread public/protected field: org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.path
> UrF   Unread public/protected field: org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.stat
> UrF   Unread public/protected field: org.apache.zookeeper.server.upgrade.DataTreeV1$ProcessTxnResult.type
> {code}
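
For illustration, the SF entries in the list above (switch statement where the default case is missing) are typically addressed by adding a default branch that reports the unexpected value, much like the PrepRequestProcessor diff quoted in this thread. The following is only a minimal sketch under stated assumptions: the opcode constants, the describe() helper, and the list standing in for a logger are hypothetical and are not taken from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

public class SwitchDefaultSketch {
    // Hypothetical opcodes for illustration; not ZooKeeper's real OpCode values.
    static final int OP_CREATE = 1;
    static final int OP_DELETE = 2;

    // Stand-in for a logger: collects the messages real code would log.
    static final List<String> errors = new ArrayList<>();

    static String describe(int type) {
        switch (type) {
            case OP_CREATE:
                return "create";
            case OP_DELETE:
                return "delete";
            default:
                // An explicit default makes an unexpected value visible
                // instead of silently falling out of the switch.
                errors.add("Invalid OpCode received: " + type);
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(OP_CREATE));
        System.out.println(describe(99));
        System.out.println(errors);
    }
}
```

Whether the default branch should log, throw, or do nothing depends on the call site; logging is simply the choice shown in the quoted diff.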





[jira] [Commented] (ZOOKEEPER-2732) Cleanup findbug warnings in branch-3.4: Performance Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016430#comment-16016430
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2732:
---

Github user afine closed the pull request at:

https://github.com/apache/zookeeper/pull/231


> Cleanup findbug warnings in branch-3.4: Performance Warnings
> 
>
> Key: ZOOKEEPER-2732
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2732
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Rakesh R
> Assignee: Abraham Fine
> Fix For: 3.4.11
>
>
> Please refer to the attached sheet in the parent JIRA. Below are the details of 
> the FindBugs warnings.
> {code}
> Bx    Boxing/unboxing to parse a primitive new org.apache.zookeeper.server.quorum.QuorumCnxManager(long, Map, QuorumAuthServer, QuorumAuthLearner, int, boolean, int, boolean)
> Bx    new org.apache.zookeeper.server.quorum.QuorumCnxManager(long, Map, QuorumAuthServer, QuorumAuthLearner, int, boolean, int, boolean) invokes inefficient new Integer(String) constructor; use Integer.valueOf(String) instead
> Dm    org.apache.zookeeper.server.quorum.FastLeaderElection$Notification.toString() invokes inefficient new String(String) constructor
> WMI   org.apache.zookeeper.server.DataTree.dumpEphemerals(PrintWriter) makes inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.server.quorum.flexible.QuorumHierarchical.computeGroupWeight() makes inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.server.quorum.flexible.QuorumHierarchical.containsQuorum(HashSet) makes inefficient use of keySet iterator instead of entrySet iterator
> WMI   org.apache.zookeeper.ZooKeeperMain.usage() makes inefficient use of keySet iterator instead of entrySet iterator
> {code}
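
As a rough illustration of the three warning kinds listed above (Bx, Dm, WMI), the sketch below shows the usual shape of each fix using only standard JDK calls. It is not the code from the pull request; the class name, method names, and sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FindbugsPerformanceSketch {
    // Bx: parse directly to a primitive rather than allocating with
    // new Integer(text) and unboxing the result.
    static int parsePort(String text) {
        return Integer.parseInt(text);
    }

    // Dm: String is immutable, so new String(state) would only create a
    // needless copy; the original reference can be used as is.
    static String describe(String state) {
        return "state=" + state;
    }

    // WMI: when both key and value are needed, iterate entrySet() instead
    // of iterating keySet() and calling map.get(key) for every key.
    static String dump(Map<Long, Long> weights) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Long, Long> e : weights.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Long, Long> weights = new LinkedHashMap<>();
        weights.put(1L, 2L);
        weights.put(2L, 3L);
        System.out.println(parsePort("2888"));
        System.out.println(describe("LOOKING"));
        System.out.println(dump(weights));
    }
}
```

Integer.valueOf(String), which the warning text recommends, is the equivalent choice when an Integer object rather than an int is actually required.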





[jira] [Commented] (ZOOKEEPER-2732) Cleanup findbug warnings in branch-3.4: Performance Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016431#comment-16016431
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2732:
---

GitHub user afine reopened a pull request:

https://github.com/apache/zookeeper/pull/231

ZOOKEEPER-2732: Cleanup findbug warnings in branch-3.4: Performance Warnings



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afine/zookeeper ZOOKEEPER-2732

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #231








[jira] [Commented] (ZOOKEEPER-2732) Cleanup findbug warnings in branch-3.4: Performance Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016440#comment-16016440
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2732:
---

Github user afine closed the pull request at:

https://github.com/apache/zookeeper/pull/231




[jira] [Commented] (ZOOKEEPER-2732) Cleanup findbug warnings in branch-3.4: Performance Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016451#comment-16016451
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2732:
---

GitHub user afine opened a pull request:

https://github.com/apache/zookeeper/pull/258

ZOOKEEPER-2732: Cleanup findbug warnings in branch-3.4: Performance Warnings

@rakeshadr Apologies for recreating this. I accidentally pushed a bad 
branch in https://github.com/apache/zookeeper/pull/231, and GitHub will not let 
me reset the head for that PR. This PR should be rebased and ready to be merged.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afine/zookeeper ZOOKEEPER-2732

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/258.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #258


commit 7509f69e95749b30ba06fb224750bbb5cd487547
Author: Abraham Fine 
Date:   2017-04-18T18:40:12Z

ZOOKEEPER-2732: Cleanup findbug warnings in branch-3.4: Performance Warnings






[jira] [Commented] (ZOOKEEPER-2762) Multithreaded correctness Warnings

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016455#comment-16016455
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2762:
---

Github user afine commented on the issue:

https://github.com/apache/zookeeper/pull/239
  
Done, thanks for the review @rakeshadr 


> Multithreaded correctness Warnings
> --
>
> Key: ZOOKEEPER-2762
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2762
> Project: ZooKeeper
> Issue Type: Sub-task
> Reporter: Abraham Fine
> Assignee: Abraham Fine
> Fix For: 3.4.11
>
>






[jira] [Commented] (ZOOKEEPER-2785) Server inappropriately throttles connections under load before SASL completes

2017-05-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016473#comment-16016473
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2785:
---

Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/256


> Server inappropriately throttles connections under load before SASL completes
> -
>
> Key: ZOOKEEPER-2785
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2785
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.10
> Reporter: Abhishek Singh Chouhan
> Priority: Critical
> Labels: sasl
> Fix For: 3.5.4, 3.6.0, 3.4.11
>
>
> When a ZooKeeper server is running close to its outstanding requests limit, 
> the server incorrectly throttles the SASL request. The client is then left 
> waiting for the final SASL packet (the session is already established) and 
> defers all non-priming packets, including ping packets, until it arrives. 
> The final packet never arrives, so the client times out, reporting that it 
> has not heard from the server. This is fatal for services such as HBase, 
> which retry a finite number of times and then exit.
> The root cause is in ZooKeeperServer.processPacket(..): for a SASL request we 
> send the response and then incorrectly also call 
> cnxn.incrOutstandingRequests(h), which throttles the connection if we are 
> over the outstanding requests limit, so the server does not process the 
> subsequent packet from the client. In addition, there is no pending request 
> to send for the connection, so enableRecv() is never called. We should return 
> after sending the response to the SASL request.
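
The control flow described above can be modelled with a small, self-contained sketch. This is not ZooKeeper's code: the Connection class, its fields, and the returnAfterSasl flag are hypothetical simplifications that only mimic the idea of counting outstanding requests and disabling reads once a limit is exceeded.

```java
public class SaslThrottleSketch {
    // Hypothetical stand-in for a server connection; not ZooKeeper's ServerCnxn.
    static class Connection {
        final int limit;          // outstanding requests limit
        int outstanding;          // requests queued and not yet answered
        boolean recvEnabled = true;
        int responsesSent;

        Connection(int limit, int outstanding) {
            this.limit = limit;
            this.outstanding = outstanding;
        }

        void sendResponse() {
            responsesSent++;
        }

        // Counts one more queued request and stops reading from the
        // connection once the limit is exceeded (the throttling step).
        void incrOutstandingRequests() {
            outstanding++;
            if (outstanding > limit) {
                recvEnabled = false;
            }
        }
    }

    // returnAfterSasl=false models the reported behaviour,
    // returnAfterSasl=true models the proposed early return.
    static void processPacket(Connection cnxn, boolean isSasl, boolean returnAfterSasl) {
        if (isSasl) {
            cnxn.sendResponse();
            if (returnAfterSasl) {
                return; // answered inline; nothing was queued, so nothing to count
            }
        }
        cnxn.incrOutstandingRequests();
    }

    public static void main(String[] args) {
        Connection reported = new Connection(1, 1);
        processPacket(reported, true, false);
        System.out.println("reported: recvEnabled=" + reported.recvEnabled);

        Connection proposed = new Connection(1, 1);
        processPacket(proposed, true, true);
        System.out.println("proposed: recvEnabled=" + proposed.recvEnabled);
    }
}
```

In the first case the connection stops reading even though no request is pending that could later re-enable it, which corresponds to the stall described in the report; with the early return the counter is left untouched.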




