[jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121101#comment-16121101 ] Michael Han commented on ZOOKEEPER-2867: [~junrao] I also did experiments on my side; when a session is closed, the {{CommitProcessor}} should log something like: {noformat}2017-08-09 22:41:49,824 [myid:2] - DEBUG [SyncThread:2:CommitProcessor@386] - Committing request:: sessionid:0x1134d2f type:closeSession cxid:0x1 zxid:0x20002 txntype:-11 reqpath:n/a{noformat} But this is for 3.5, which has changed a lot in terms of how commit works, and I realize you are using 3.4. Still, given what you just reported ({{didn't find any logging of closeSession in the log4j log}}), I think we can conclude that this specific session was not closed. To circle back to the original question: assuming this problematic session was not closed (which is what the existing evidence demonstrates), does this create any inconsistency or confusion at a higher level in your use case? Were you expecting this session to be closed based on what you observed on the client side? > an expired ZK session can be re-established > --- > > Key: ZOOKEEPER-2867 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.10 > Reporter: Jun Rao > Attachments: zk.0.formatted, zk.1.formatted > > > Not sure if this is a real bug, but I found an instance where a ZK client > seems to be able to renew a session already expired by the ZK server. > From the ZK server log, session 25cd1e82c110001 was expired at 22:04:39. > {code:java} > June 27th 2017, 22:04:39.000 INFO > org.apache.zookeeper.server.ZooKeeperServer Expiring session > 0x25cd1e82c110001, timeout of 12000ms exceeded > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.Leader Proposing:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:04:39.001 INFO > org.apache.zookeeper.server.PrepRequestProcessor Processed session > termination for sessionid: 0x25cd1e82c110001 > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.CommitProcessor Processing request:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.quorum.Learner Revalidating client: > 0x25cd1e82c110001 > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.ZooKeeperServer Client attempting to renew > session 0x25cd1e82c110001 at /100.96.5.6:47618 > June 27th 2017, 22:05:20.325 INFO > org.apache.zookeeper.server.ZooKeeperServer Established session > 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618 > {code} > From the ZK client's log, it was able to renew the expired session at 22:05:20. > {code:java} > June 27th 2017, 22:05:18.590 INFO org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001, closing socket connection and attempting reconnect 0 > June 27th 2017, 22:05:18.590 WARN org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001 0 > June 27th 2017, 22:05:19.325 WARN org.apache.zookeeper.ClientCnxn SASL > configuration failed: javax.security.auth.login.LoginException: No JAAS > configuration section named 'Client' was found in specified JAAS > configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'. Will > continue connection to Zookeeper server without SASL authentication, if > Zookeeper server allows it. 0 > June 27th 2017, 22:05:19.326 INFO org.apache.zookeeper.ClientCnxn Opening > socket connection to server 100.65.188.168/100.65.188.168:2181 0 > June 27th 2017, 22:05:20.324 INFO org.apache.zookeeper.ClientCnxn Socket > connection established to 100.65.188.168/100.65.188.168:2181, initiating > session 0 > June 27th 2017, 22:05:20.327 INFO org.apache.zookeeper.ClientCnxn Session > establishment complete on server 100.65.188.168/100.65.188.168:2181, > sessionid = 0x25cd1e82c110001, negotiated timeout = 12000 0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
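For background on the client-side contract at issue here: a Disconnected event is recoverable, and the client library may legitimately reconnect and renew the same session (as seen in the logs above), while Expired is terminal and the application must construct a brand-new handle. A minimal sketch of that pattern (connect string and timeout are illustrative values only):
{code:java}
import java.io.IOException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ExpiryAwareClient implements Watcher {
    // Illustrative values; not taken from the ticket.
    private static final String CONNECT = "100.65.188.168:2181";
    private static final int SESSION_TIMEOUT_MS = 12000;

    private volatile ZooKeeper zk;

    public ExpiryAwareClient() throws IOException {
        zk = new ZooKeeper(CONNECT, SESSION_TIMEOUT_MS, this);
    }

    @Override
    public void process(WatchedEvent event) {
        // Disconnected: the library keeps retrying and may renew the session.
        // Expired: the handle is dead; the only recovery is a new session.
        if (event.getState() == Event.KeeperState.Expired) {
            try {
                zk = new ZooKeeper(CONNECT, SESSION_TIMEOUT_MS, this);
            } catch (IOException e) {
                // retry/backoff elided in this sketch
            }
        }
    }
}
{code}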
Success: ZOOKEEPER- PreCommit Build #935
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 75.06 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 998f013fe2fa165935dd8a9f5b90023629b1eecc logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 17 minutes 55 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121059#comment-16121059 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
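A quick sanity check on the memory estimate quoted above: 100m znodes x 100k clients x 3 bits = 3x10^13 bits, or about 3.75x10^12 bytes, i.e. the stated 3.75TB. For readers arriving from a later release: the API that ultimately shipped for this feature (in ZooKeeper 3.6.0) is {{ZooKeeper.addWatch}} with {{AddWatchMode.PERSISTENT_RECURSIVE}}. The sketch below shows that usage and is illustrative only; the connect string, timeout, and path are made up:
{code:java}
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveWatchSketch {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Count down on the first event; good enough for a demo.
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 15000,
                e -> connected.countDown());
        connected.await();

        // One registration covers /shards and every descendant, and it
        // survives firing: no re-registration round trip per event.
        zk.addWatch("/shards",
                event -> System.out.println(event.getType() + " " + event.getPath()),
                AddWatchMode.PERSISTENT_RECURSIVE);

        Thread.sleep(Long.MAX_VALUE); // keep the session alive for the demo
    }
}
{code}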
[jira] [Commented] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121049#comment-16121049 ] Jordan Zimmerman commented on ZOOKEEPER-2871: - https://github.com/apache/zookeeper/pull/332 > Port ZOOKEEPER-1416 to 3.5.x > > > Key: ZOOKEEPER-2871 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 > Project: ZooKeeper > Issue Type: Sub-task > Components: c client, documentation, java client, server > Affects Versions: 3.5.3 > Reporter: Jordan Zimmerman > Assignee: Jordan Zimmerman > Fix For: 3.5.4 > > > Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121047#comment-16121047 ] ASF GitHub Bot commented on ZOOKEEPER-1416: --- GitHub user Randgalt opened a pull request: https://github.com/apache/zookeeper/pull/332 Port of ZOOKEEPER-1416 Persistent Recursive Watches You can merge this pull request into a Git repository by running: $ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2871 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/332.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #332 commit 6767b58a9eeba389cf4eb3c9b80066e17b514b13 Author: randgalt Date: 2017-08-10T04:38:23Z Port of ZOOKEEPER-1416 Persistent Recursive Watches > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #332: Port of ZOOKEEPER-1416 Persistent Recursive Wat...
GitHub user Randgalt opened a pull request: https://github.com/apache/zookeeper/pull/332 Port of ZOOKEEPER-1416 Persistent Recursive Watches You can merge this pull request into a Git repository by running: $ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2871 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/332.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #332 commit 6767b58a9eeba389cf4eb3c9b80066e17b514b13 Author: randgalt Date: 2017-08-10T04:38:23Z Port of ZOOKEEPER-1416 Persistent Recursive Watches --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120989#comment-16120989 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Shraer resolved ZOOKEEPER-2866. - Resolution: Not A Problem > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
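To make the reconfiguration step in the reported workload concrete, here is a hedged sketch against the 3.5.x dynamic reconfiguration client API ({{ZooKeeperAdmin.reconfigure}}). Every host, port, server id, and timeout below is invented for illustration, and the ensemble must have reconfig enabled with an authorized caller:
{code:java}
import org.apache.zookeeper.admin.ZooKeeperAdmin;
import org.apache.zookeeper.data.Stat;

public class ReconfigSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative connect string and timeout; watcher is a no-op.
        ZooKeeperAdmin admin = new ZooKeeperAdmin(
                "10.0.0.1:2181", 15000, event -> { });

        // Non-incremental reconfig: specify the full new membership. The
        // three joiners become participants and the two original servers
        // are demoted to observers, mirroring the reported workload.
        String newMembers = String.join(",",
                "server.1=10.0.0.1:2888:3888:observer;2181",
                "server.2=10.0.0.2:2888:3888:observer;2181",
                "server.3=10.0.0.3:2888:3888:participant;2181",
                "server.4=10.0.0.4:2888:3888:participant;2181",
                "server.5=10.0.0.5:2888:3888:participant;2181");

        Stat stat = new Stat();
        // fromConfig = -1 applies regardless of the current config version.
        byte[] config = admin.reconfigure(null, null, newMembers, -1, stat);
        System.out.println(new String(config));
    }
}
{code}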
[jira] [Commented] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120987#comment-16120987 ] Alexander Shraer commented on ZOOKEEPER-2866: - Discussed offline with [~castuardo], and we currently believe that there is no bug. Please reopen the Jira if needed. > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: ZOOKEEPER- PreCommit Build #934
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 74.57 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 2b8039bc8b950232a4044c3c1f9ca60d093140f0 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 19 minutes 0 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Assigned] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Shraer reassigned ZOOKEEPER-2866: --- Assignee: Alexander Shraer > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Assignee: Alexander Shraer > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan Zimmerman reassigned ZOOKEEPER-2871: --- Assignee: Jordan Zimmerman > Port ZOOKEEPER-1416 to 3.5.x > > > Key: ZOOKEEPER-2871 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 > Project: ZooKeeper > Issue Type: Sub-task > Components: c client, documentation, java client, server > Affects Versions: 3.5.3 > Reporter: Jordan Zimmerman > Assignee: Jordan Zimmerman > Fix For: 3.5.4 > > > Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
Jordan Zimmerman created ZOOKEEPER-2871: --- Summary: Port ZOOKEEPER-1416 to 3.5.x Key: ZOOKEEPER-2871 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 Project: ZooKeeper Issue Type: Sub-task Components: c client, documentation, java client, server Affects Versions: 3.5.3 Reporter: Jordan Zimmerman Fix For: 3.5.4 Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk - Build # 3492 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk/3492/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 68.54 MB...] [exec] Log Message Received: [2017-08-09 23:32:51,246:1624(0x7f6e3da14740):ZOO_INFO@zookeeper_init_internal@1151: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x459d00 sessionId=0 sessionPasswd= context=0x7ffeae7a3270 flags=0] [exec] Log Message Received: [2017-08-09 23:32:51,247:1624(0x7f6e3af08700):ZOO_INFO@check_events@2424: initiated connection to server [127.0.0.1:22181]] [exec] Log Message Received: [2017-08-09 23:32:51,249:1624(0x7f6e3af08700):ZOO_INFO@check_events@2476: session establishment complete on server [127.0.0.1:22181], sessionId=0x105f3075acb000f, negotiated timeout=1 ] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 11149 : OK [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1000 : OK [exec] Zookeeper_simpleSystem::testNullData : elapsed 4012 : OK [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1519 : OK [exec] Zookeeper_simpleSystem::testCreate : elapsed 4801 : OK [exec] Zookeeper_simpleSystem::testPath : elapsed 8499 : OK [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1052 : OK [exec] Zookeeper_simpleSystem::testPing : elapsed 17277 : OK [exec] Zookeeper_simpleSystem::testAcl : elapsed 1019 : OK [exec] Zookeeper_simpleSystem::testChroot : elapsed 3028 : OK [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper server started : elapsed 34404 : OK [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1021 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 17447 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 24269 : OK [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1024 : OK [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 6142 : OK [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 11988 : OK [exec] Zookeeper_readOnly::testReadOnly : assertion : elapsed 6659 [exec] /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/src/c/tests/TestReadOnlyClient.cc:99: Assertion: equality assertion failed [Expected: 0, Actual : -4] [exec] Failures !!! 
[exec] Run: 74 Failure total: 1 Failures: 1 Errors: 0 [exec] FAIL: zktest-mt [exec] == [exec] 1 of 2 tests failed [exec] Please report to u...@zookeeper.apache.org [exec] == [exec] Makefile:1744: recipe for target 'check-TESTS' failed [exec] make[1]: Leaving directory '/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit' [exec] Makefile:2000: recipe for target 'check-am' failed [exec] make[1]: *** [check-TESTS] Error 1 [exec] make: *** [check-am] Error 2 BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1339: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1299: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1309: exec returned: 2 Total time: 18 minutes 59 seconds Build step 'Execute shell' marked build as failure [FINDBUGS] Skipping publisher since build result is FAILURE [WARNINGS] Skipping publisher since build result is FAILURE Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording fingerprints Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Publishing Javadoc Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120800#comment-16120800 ] Jun Rao commented on ZOOKEEPER-2867: [~hanm], I tried it locally. It seems that every time a session is closed, the log4j log records the deletion of the session's ephemeral nodes and a type:closeSession entry in FinalRequestProcessor. {code:java} [2017-08-09 16:00:46,325] DEBUG Processing request:: sessionid:0x15dc93a4f1a type:closeSession cxid:0x3f zxid:0x39 txntype:-11 reqpath:n/a (org.apache.zookeeper.server.FinalRequestProcessor) [2017-08-09 16:00:46,325] DEBUG Deleting ephemeral node /brokers/ids/0 for session 0x15dc93a4f1a (org.apache.zookeeper.server.DataTree) [2017-08-09 16:00:46,326] DEBUG sessionid:0x15dc93a4f1a type:closeSession cxid:0x3f zxid:0x39 txntype:-11 reqpath:n/a (org.apache.zookeeper.server.FinalRequestProcessor) {code} However, for the above incident, I didn't find any logging of closeSession in the log4j log. > an expired ZK session can be re-established > --- > > Key: ZOOKEEPER-2867 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.10 > Reporter: Jun Rao > Attachments: zk.0.formatted, zk.1.formatted > > > Not sure if this is a real bug, but I found an instance where a ZK client > seems to be able to renew a session already expired by the ZK server. > From the ZK server log, session 25cd1e82c110001 was expired at 22:04:39. > {code:java} > June 27th 2017, 22:04:39.000 INFO > org.apache.zookeeper.server.ZooKeeperServer Expiring session > 0x25cd1e82c110001, timeout of 12000ms exceeded > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.Leader Proposing:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:04:39.001 INFO > org.apache.zookeeper.server.PrepRequestProcessor Processed session > termination for sessionid: 0x25cd1e82c110001 > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.CommitProcessor Processing request:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.quorum.Learner Revalidating client: > 0x25cd1e82c110001 > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.ZooKeeperServer Client attempting to renew > session 0x25cd1e82c110001 at /100.96.5.6:47618 > June 27th 2017, 22:05:20.325 INFO > org.apache.zookeeper.server.ZooKeeperServer Established session > 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618 > {code} > From the ZK client's log, it was able to renew the expired session at 22:05:20. > {code:java} > June 27th 2017, 22:05:18.590 INFO org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001, closing socket connection and attempting reconnect 0 > June 27th 2017, 22:05:18.590 WARN org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001 0 > June 27th 2017, 22:05:19.325 WARN org.apache.zookeeper.ClientCnxn SASL > configuration failed: javax.security.auth.login.LoginException: No JAAS > configuration section named 'Client' was found in specified JAAS > configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'. Will > continue connection to Zookeeper server without SASL authentication, if > Zookeeper server allows it. 0 > June 27th 2017, 22:05:19.326 INFO org.apache.zookeeper.ClientCnxn Opening > socket connection to server 100.65.188.168/100.65.188.168:2181 0 > June 27th 2017, 22:05:20.324 INFO org.apache.zookeeper.ClientCnxn Socket > connection established to 100.65.188.168/100.65.188.168:2181, initiating > session 0 > June 27th 2017, 22:05:20.327 INFO org.apache.zookeeper.ClientCnxn Session > establishment complete on server 100.65.188.168/100.65.188.168:2181, > sessionid = 0x25cd1e82c110001, negotiated timeout = 12000 0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk-openjdk7 - Build # 1572 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/1572/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 60.70 MB...] [junit] at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 20:34:13,498 [myid:] - INFO [New I/O worker #11598:ClientCnxnSocketNetty$ZKClientHandler@384] - channel is disconnected: [id: 0xdc8c6010, /127.0.0.1:41994 :> 127.0.0.1/127.0.0.1:19479] [junit] 2017-08-09 20:34:13,498 [myid:] - INFO [New I/O worker #11598:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 20:34:13,498 [myid:127.0.0.1:19479] - INFO [main-SendThread(127.0.0.1:19479):ClientCnxn$SendThread@1231] - channel for sessionid 0x205f39e94030001 is lost, closing socket connection and attempting reconnect [junit] 2017-08-09 20:34:13,770 [myid:127.0.0.1:19479] - INFO [main-SendThread(127.0.0.1:19479):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:19479. 
Will not attempt to authenticate using SASL (unknown error) [junit] 2017-08-09 20:34:13,771 [myid:] - INFO [New I/O worker #11445:ClientCnxn$SendThread@946] - Socket connection established, initiating session, client: /127.0.0.1:42008, server: 127.0.0.1/127.0.0.1:19479 [junit] 2017-08-09 20:34:13,772 [myid:] - INFO [New I/O worker #11445:ClientCnxnSocketNetty$1@153] - channel is connected: [id: 0x84cf4658, /127.0.0.1:42008 => 127.0.0.1/127.0.0.1:19479] [junit] 2017-08-09 20:34:13,772 [myid:] - WARN [New I/O worker #11285:NettyServerCnxn@426] - Closing connection to /127.0.0.1:42008 [junit] java.io.IOException: ZK down [junit] at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:363) [junit] at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.processMessage(NettyServerCnxnFactory.java:244) [junit] at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.messageReceived(NettyServerCnxnFactory.java:166) [junit] at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at
Success: ZOOKEEPER- PreCommit Build #933
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 68.99 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +0 tests included. The patch appears to be a documentation patch that doesn't require tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 98b0c69ab65abd659f8cf3fac20ebf3b31bce241 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 18 minutes 24 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2870 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120593#comment-16120593 ] Hadoop QA commented on ZOOKEEPER-2870: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//console This message is automatically generated. > Improve the efficiency of AtomicFileOutputStream > > > Key: ZOOKEEPER-2870 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 > Project: ZooKeeper > Issue Type: Improvement > Components: server > Affects Versions: 3.4.10, 3.5.3, 3.6.0 > Reporter: Fangmin Lv > Assignee: Fangmin Lv > > AtomicFileOutputStream extends FilterOutputStream, whose write function > writes data to the underlying stream byte by byte: > https://searchcode.com/codesearch/view/17990706/, which is very inefficient. > Currently, we only use this class to write the dynamic config; because that file is > quite small, this hasn't been a big problem. But in the future we may want to use > this class to write the snapshot file, which takes much longer: tested internally, > writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream > directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
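Some context on why the byte-by-byte behavior occurs: {{FilterOutputStream.write(byte[], int, int)}} defaults to looping over the single-byte {{write(int)}}. Below is a minimal sketch of the kind of fix implied by the ticket, not the actual patch; the class and names are made up:
{code:java}
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// FilterOutputStream's default write(byte[], int, int) calls write(int)
// once per byte, so a subclass wanting bulk throughput must override it
// and delegate the whole range to the wrapped stream.
class BulkDelegatingOutputStream extends FilterOutputStream {
    BulkDelegatingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // One bulk call instead of len single-byte calls.
        out.write(b, off, len);
    }
}
{code}
Delegating the bulk overload replaces len single-byte calls on the underlying stream with one call, which is consistent with the 10-minutes-versus-6-seconds gap described in the ticket.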
[jira] [Commented] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120565#comment-16120565 ] ASF GitHub Bot commented on ZOOKEEPER-2870: --- GitHub user lvfangmin opened a pull request: https://github.com/apache/zookeeper/pull/331 [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream You can merge this pull request into a Git repository by running: $ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2870 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #331 commit 7154ccd9c0fa489b070645ba3bcfc4c9e25d4683 Author: Fangmin Lyu Date: 2017-08-09T20:01:02Z [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream > Improve the efficiency of AtomicFileOutputStream > > > Key: ZOOKEEPER-2870 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 > Project: ZooKeeper > Issue Type: Improvement > Components: server > Affects Versions: 3.4.10, 3.5.3, 3.6.0 > Reporter: Fangmin Lv > Assignee: Fangmin Lv > > AtomicFileOutputStream extends FilterOutputStream, whose write function > writes data to the underlying stream byte by byte: > https://searchcode.com/codesearch/view/17990706/, which is very inefficient. > Currently, we only use this class to write the dynamic config; because that file is > quite small, this hasn't been a big problem. But in the future we may want to use > this class to write the snapshot file, which takes much longer: tested internally, > writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream > directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #331: [ZOOKEEPER-2870] Improve the efficiency of Atom...
GitHub user lvfangmin opened a pull request: https://github.com/apache/zookeeper/pull/331 [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream You can merge this pull request into a Git repository by running: $ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2870 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #331 commit 7154ccd9c0fa489b070645ba3bcfc4c9e25d4683 Author: Fangmin Lyu Date: 2017-08-09T20:01:02Z [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
Fangmin Lv created ZOOKEEPER-2870: - Summary: Improve the efficiency of AtomicFileOutputStream Key: ZOOKEEPER-2870 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.5.3, 3.4.10, 3.6.0 Reporter: Fangmin Lv Assignee: Fangmin Lv AtomicFileOutputStream extends FilterOutputStream, whose write function writes data to the underlying stream byte by byte: https://searchcode.com/codesearch/view/17990706/, which is very inefficient. Currently, we only use this class to write the dynamic config; because that file is quite small, this hasn't been a big problem. But in the future we may want to use this class to write the snapshot file, which takes much longer: tested internally, writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: ZOOKEEPER- PreCommit Build #932
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 72.20 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +0 tests included. The patch appears to be a documentation patch that doesn't require tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] ef401405e05f7080cdbac069083dc45433bfe011 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build@2/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build@2/patchprocess' are the same file BUILD SUCCESSFUL Total time: 19 minutes 23 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2471 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120512#comment-16120512 ] Hadoop QA commented on ZOOKEEPER-2471: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//console This message is automatically generated. > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.3 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. 
The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
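To make the fragile contract concrete, here is a minimal, self-contained sketch of the bug shape described above. It is illustrative only: the class and helper names are invented, and this is not the actual ClientCnxnSocket code.

{code:java}
// Sketch of a cached-clock contract like the one described above.
// Invented names; NOT the real ClientCnxnSocket implementation.
public class CachedClockSketch {
    private long now;       // only valid immediately after updateNow()
    private long lastHeard;

    void updateNow()       { now = System.currentTimeMillis(); }
    void updateLastHeard() { lastHeard = now; } // trusts the cached value

    long idleMillis()      { return now - lastHeard; }

    public static void main(String[] args) throws InterruptedException {
        CachedClockSketch c = new CachedClockSketch();
        c.updateNow();
        c.updateLastHeard();          // idleMillis() == 0 here

        Thread.sleep(2000);           // stands in for the backoff sleep
                                      // inside HostProvider.next()

        c.updateLastHeard();          // bug shape: no updateNow() first, so
                                      // lastHeard is reset to a stale time
        c.updateNow();                // the next refresh (e.g. after select())
        System.out.println(c.idleMillis()); // ~2000ms of "idle" that was sleep
    }
}
{code}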
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120510#comment-16120510 ] Hudson commented on ZOOKEEPER-2786: --- FAILURE: Integrated in Jenkins build ZooKeeper-trunk #3491 (See [https://builds.apache.org/job/ZooKeeper-trunk/3491/]) ZOOKEEPER-2786: Flaky test: (hanm: rev e104175bb47baeb800354078c015e78bfcb7c953) * (edit) src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk - Build # 3491 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk/3491/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 66.43 MB...] [junit] 2017-08-09 19:26:35,648 [myid:] - INFO [New I/O boss #10731:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 19:26:35,648 [myid:127.0.0.1:13979] - INFO [main-SendThread(127.0.0.1:13979):ClientCnxn$SendThread@1231] - channel for sessionid 0x105f2210902 is lost, closing socket connection and attempting reconnect [junit] 2017-08-09 19:26:35,711 [myid:127.0.0.1:14044] - INFO [main-SendThread(127.0.0.1:14044):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:14044. Will not attempt to authenticate using SASL (unknown error) [junit] 2017-08-09 19:26:35,712 [myid:] - INFO [New I/O boss #15141:ClientCnxnSocketNetty$1@127] - future isn't success, cause: {} [junit] java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 19:26:35,712 [myid:] - WARN [New I/O boss #15141:ClientCnxnSocketNetty$ZKClientHandler@439] - Exception caught: [id: 0xf1c0c84e] EXCEPTION: java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 19:26:35,713 [myid:] - INFO [New I/O boss #15141:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 19:26:35,713 [myid:127.0.0.1:14044] - INFO 
[main-SendThread(127.0.0.1:14044):ClientCnxn$SendThread@1231] - channel for sessionid 0x305f2235730 is lost, closing socket connection and attempting reconnect fail.build.on.test.failure: BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1339: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1220: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1224: Tests failed! Total time: 9 minutes 59 seconds Build step 'Execute shell' marked build as failure [FINDBUGS] Skipping publisher since build result is FAILURE [WARNINGS] Skipping publisher since build result is FAILURE Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording fingerprints Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [JIRA] Updating issue ZOOKEEPER-2786 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Publishing Javadoc Setting
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120481#comment-16120481 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. 
At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
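The 3.75TB estimate above checks out; spelling out the arithmetic (decimal units assumed):

{noformat}
100,000,000 znodes x 100,000 clients x 3 bits = 3.0e13 bits
3.0e13 bits / 8 bits per byte                 = 3.75e12 bytes ~= 3.75 TB
{noformat}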
Success: ZOOKEEPER- PreCommit Build #931
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 72.85 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 3ca8decd53363e2c5b4978cddf0012ee01072d9f logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 20 minutes 11 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
Success: ZOOKEEPER- PreCommit Build #930
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 70.97 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] fa7027d6dfe25fe116f23b3c25df194b836e66a6 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 18 minutes 37 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120479#comment-16120479 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. 
At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120448#comment-16120448 ] ASF GitHub Bot commented on ZOOKEEPER-1416: --- Github user Randgalt commented on the issue: https://github.com/apache/zookeeper/pull/136 @afine issues addressed > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. 
> Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
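To make "the Watch applies to the node and descendant nodes" concrete, here is a minimal sketch of the path test a recursive watch implies. The class and method names are invented for illustration and are not part of the proposed API or the patch under review.

{code:java}
// Sketch: should an event at eventPath fire a recursive watch set at watchPath?
// Invented names; not code from the ZOOKEEPER-1416 patch.
public final class RecursiveWatchMatch {
    static boolean applies(String watchPath, String eventPath) {
        if (eventPath.equals(watchPath)) {
            return true;                         // the watched node itself
        }
        String prefix = watchPath.equals("/") ? "/" : watchPath + "/";
        return eventPath.startsWith(prefix);     // any descendant znode
    }

    public static void main(String[] args) {
        System.out.println(applies("/a", "/a"));     // true
        System.out.println(applies("/a", "/a/b/c")); // true: descendant
        System.out.println(applies("/a", "/ab"));    // false: sibling, not a child
    }
}
{code}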
[GitHub] zookeeper issue #136: [ZOOKEEPER-1416] Persistent Recursive Watch
Github user Randgalt commented on the issue: https://github.com/apache/zookeeper/pull/136 @afine issues addressed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: JIRA permissions
done and done. On Wed, Aug 9, 2017 at 6:24 AM, Mark Fenes wrote: > Hi All, > > I'm a new dev and would like to contribute. > Could you please provide me permissions to assign and edit issues in JIRA? > My username is "mfenes". > > Thanks, > Mark Fenes > -- Cheers Michael.
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120426#comment-16120426 ] ASF GitHub Bot commented on ZOOKEEPER-2786: --- Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/327 The following up Netty fix to this flaky test is committed to master: e104175bb47baeb800354078c015e78bfcb7c953 and 3.5 23962f12395ada67e689b8ff57573fc1398a54eb > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper issue #327: ZOOKEEPER-2786 Flaky test: org.apache.zookeeper.test.C...
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/327 The following up Netty fix to this flaky test is committed to master: e104175bb47baeb800354078c015e78bfcb7c953 and 3.5 23962f12395ada67e689b8ff57573fc1398a54eb --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-2786: --- Fix Version/s: 3.4.11 > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120419#comment-16120419 ] ASF GitHub Bot commented on ZOOKEEPER-2786: --- Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/327 > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-2786. Resolution: Fixed Issue resolved by pull request 327 [https://github.com/apache/zookeeper/pull/327] > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.6.0, 3.5.4 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #327: ZOOKEEPER-2786 Flaky test: org.apache.zookeeper...
Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/327 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-2864) Add script to run a java api compatibility tool
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120397#comment-16120397 ] ASF GitHub Bot commented on ZOOKEEPER-2864: --- Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/329 I suggest we put this script under zookeeper/src/java/test/bin/ where other scripts are currently located, so it's consistent. A better solution is to consolidate all scripts and put them in a folder with a name that makes more sense, like other projects do (e.g. HBase, which puts scripts under root/dev-support), but that should be done separately as moving scripts will break a couple of workflows and require coordination. > Add script to run a java api compatibility tool > --- > > Key: ZOOKEEPER-2864 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2864 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > > We should use the annotations added in ZOOKEEPER-2829 to run a script to > verify api compatibility. See KUDU-1265 for an example. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper issue #329: ZOOKEEPER-2864: Add script to run a java api compatibi...
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/329 I suggest we put this script under zookeeper/src/java/test/bin/ where other scripts are currently located, so it's consistent. A better solution is to consolidate all scripts and put them in a folder with a name that makes more sense, like other projects do (e.g. HBase, which puts scripts under root/dev-support), but that should be done separately as moving scripts will break a couple of workflows and require coordination. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120331#comment-16120331 ] Dan Benediktson commented on ZOOKEEPER-2471: Core tests that I see failed in the log both had already passed on my local run: ChrootClientTest.testNonExistingOpCode WatchEventWhenAutoResetTest.testNodeDataChanged The first one is clearly suspicious because my patch allegedly "fixed" the corresponding ClientTest.testNonExistingOpCode, which it shouldn't have done anything about, so I'm pretty certain that's just a flaky test. I've tried running both of those test cases about 10 times on my MBP to no avail; they pass every time. I also already ran the full "ant test" suite before submitting the patch in the first place, and it all succeeded. Any suggestions here? > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. 
Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120297#comment-16120297 ] Hadoop QA commented on ZOOKEEPER-2471: -- -1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//console This message is automatically generated. > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. 
The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
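The failure sequence quoted above is easier to follow as a timeline; this is only a paraphrase of the reporter's own description, not additional analysis:

{noformat}
t0  startConnect() -> HostProvider.next() sleeps; the cached "now" stays at t0
t1  updateLastSendAndHeard() runs against the stale "now" (still t0)
t2  select() harvests leftover IO from the previously closed socket and
    calls updateNow(), jumping "now" forward to real time
t3  the next pass through SendThread.run() sees (now - lastHeard) >= timeout
    and declares a spurious connection loss; the loop repeats
{noformat}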
Failed: ZOOKEEPER- PreCommit Build #929
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 71.34 MB...] [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 4ab077b212193aacc964978028662d7df1eeacbc logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1643: exec returned: 1 Total time: 13 minutes 19 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [Fast Archiver] Compressed 574.71 KB of artifacts by 50.1% relative to #927 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2471 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## 3 tests failed. 
FAILED: org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff Error Message: expected:<4294967298> but was:<0> Stack Trace: junit.framework.AssertionFailedError: expected:<4294967298> but was:<0> at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:869) at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:517) at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:784) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) FAILED: org.apache.zookeeper.test.WatchEventWhenAutoResetTest.testNodeDataChanged Error Message: expected: but was: Stack Trace: junit.framework.AssertionFailedError: expected: but was: at org.apache.zookeeper.test.WatchEventWhenAutoResetTest$EventsWatcher.assertEvent(WatchEventWhenAutoResetTest.java:67) at org.apache.zookeeper.test.WatchEventWhenAutoResetTest.testNodeDataChanged(WatchEventWhenAutoResetTest.java:126) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) FAILED: org.apache.zookeeper.test.ChrootClientTest.testNonExistingOpCode Error Message: expected:<-4> but was:<-6> Stack Trace:
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120259#comment-16120259 ] ASF GitHub Bot commented on ZOOKEEPER-2471: --- GitHub user DanBenediktson opened a pull request: https://github.com/apache/zookeeper/pull/330 ZOOKEEPER-2471: ZK Java client should not count sleep time as connect time ClientCnxnSocket uses a member variable "now" to track the current time, but does not update it at all potentially-blocking times: in particular, it does not update it after the random sleep introduced if an initial connect attempt fails. This results in the random sleep time being counted towards connect time, resulting in incorrect application of connection timeout currently, and if ZOOKEEPER-2869 is taken, a very real possibility (we have seen it in production) of wedging the Zookeeper client so that it can never successfully reconnect, because its sleep time may grow beyond its connection timeout, especially in scenarios where there is a big gap between negotiated session timeout and client-requested session timeout. Rather than fixing the bug by adding another "updateNow()" call, keeping the brittle "updateNow()" implementation which led to the bug in the first place, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp whenever the implementation needs to know the current time. Regarding unit testing, this is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, the random sleep time is no longer counted towards a time budget. I can throw a lot of mocks at this, like ClientReconnectTest, but I'm still going to be stuck depending on the behavior of that randomly-generated sleep time, which is going to be inherently unreliable. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, since I will then be able to inject a different backoff sleep behavior, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this bug fix at that time? You can merge this pull request into a Git repository by running: $ git pull https://github.com/DanBenediktson/zookeeper ZOOKEEPER-2471 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/330.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #330 commit 60f38726e7f07b4bb970cc8fb089363ff48eb3df Author: Dan Benediktson Date: 2017-08-09T16:41:42Z ZOOKEEPER-2471: Zookeeper Java client should not count time spent sleeping as time spent connecting Rather than keep the brittle "updateNow()" implementation which led to the bug and fixing the bug by adding another "updateNow()" call, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp. This is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, a random sleep time is no longer counted towards a time budget. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this patch at that time? 
> Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect
[GitHub] zookeeper pull request #330: ZOOKEEPER-2471: ZK Java client should not count...
GitHub user DanBenediktson opened a pull request: https://github.com/apache/zookeeper/pull/330 ZOOKEEPER-2471: ZK Java client should not count sleep time as connect time ClientCnxnSocket uses a member variable "now" to track the current time, but does not update it at all potentially-blocking times: in particular, it does not update it after the random sleep introduced if an initial connect attempt fails. This results in the random sleep time being counted towards connect time, resulting in incorrect application of connection timeout currently, and if ZOOKEEPER-2869 is taken, a very real possibility (we have seen it in production) of wedging the Zookeeper client so that it can never successfully reconnect, because its sleep time may grow beyond its connection timeout, especially in scenarios where there is a big gap between negotiated session timeout and client-requested session timeout. Rather than fixing the bug by adding another "updateNow()" call, keeping the brittle "updateNow()" implementation which led to the bug in the first place, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp whenever the implementation needs to know the current time. Regarding unit testing, this is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, the random sleep time is no longer counted towards a time budget. I can throw a lot of mocks at this, like ClientReconnectTest, but I'm still going to be stuck depending on the behavior of that randomly-generated sleep time, which is going to be inherently unreliable. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, since I will then be able to inject a different backoff sleep behavior, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this bug fix at that time? You can merge this pull request into a Git repository by running: $ git pull https://github.com/DanBenediktson/zookeeper ZOOKEEPER-2471 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/330.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #330 commit 60f38726e7f07b4bb970cc8fb089363ff48eb3df Author: Dan Benediktson Date: 2017-08-09T16:41:42Z ZOOKEEPER-2471: Zookeeper Java client should not count time spent sleeping as time spent connecting Rather than keep the brittle "updateNow()" implementation which led to the bug and fixing the bug by adding another "updateNow()" call, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp. This is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, a random sleep time is no longer counted towards a time budget. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this patch at that time? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
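A minimal sketch of the direction the pull request describes: delete the cached "now" and updateNow(), and read the clock wherever time is evaluated. This is an assumed shape for illustration only, with invented names; it is not the actual patch.

{code:java}
// Sketch of "read the clock at each use"; invented names, not the real patch.
public class DirectClockSketch {
    private long lastSend  = System.currentTimeMillis();
    private long lastHeard = System.currentTimeMillis();

    // No cached "now" and no updateNow(): every check reads the real clock,
    // so time spent sleeping between connect attempts cannot hide behind a
    // stale cached timestamp.
    long idleRecv() { return System.currentTimeMillis() - lastHeard; }
    long idleSend() { return System.currentTimeMillis() - lastSend;  }

    void onPacketSent()     { lastSend  = System.currentTimeMillis(); }
    void onPacketReceived() { lastHeard = System.currentTimeMillis(); }

    public static void main(String[] args) {
        DirectClockSketch c = new DirectClockSketch();
        c.onPacketReceived();
        System.out.println("idle recv ms: " + c.idleRecv()); // ~0
    }
}
{code}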
ZooKeeper_branch34_openjdk7 - Build # 1603 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1603/ ### ## LAST 60 LINES OF THE CONSOLE ### Started by timer [EnvInject] - Loading node environment variables. Building remotely on H27 (ubuntu xenial) in workspace /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10 Cleaning workspace > git rev-parse --verify HEAD # timeout=10 Resetting working tree > git reset --hard # timeout=10 > git clean -fdx # timeout=10 Fetching upstream changes from git://git.apache.org/zookeeper.git > git --version # timeout=10 > git fetch --tags --progress git://git.apache.org/zookeeper.git > +refs/heads/*:refs/remotes/origin/* > git rev-parse refs/remotes/origin/branch-3.4^{commit} # timeout=10 > git rev-parse refs/remotes/origin/origin/branch-3.4^{commit} # timeout=10 Checking out Revision e4303a37a813c9f1bd4cdefd9c754267b12c32b4 (refs/remotes/origin/branch-3.4) Commit message: "ZOOKEEPER-2853: The lastZxidSeen in FileTxnLog.java is never being assigned. This is a port of the same patch committed to master and branch-3.5, after resolving merge conflicts." > git config core.sparsecheckout # timeout=10 > git checkout -f e4303a37a813c9f1bd4cdefd9c754267b12c32b4 > git rev-list e4303a37a813c9f1bd4cdefd9c754267b12c32b4 # timeout=10 No emails were triggered. [ZooKeeper_branch34_openjdk7] $ /home/jenkins/tools/ant/apache-ant-1.9.9/bin/ant -Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml -Djavac.target=1.7 clean test-core-java Error: JAVA_HOME is not defined correctly. We cannot execute /usr/lib/jvm/java-7-openjdk-amd64//bin/java Build step 'Invoke Ant' marked build as failure Recording test results ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error? Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## No tests ran.
JIRA permissions
Hi All,

I'm a new dev and would like to contribute. Could you please grant me permissions to assign and edit issues in JIRA? My username is "mfenes".

Thanks,
Mark Fenes
JIRA permissions
Hi All,

I'm a new dev and would like to contribute. Could you please grant me the rights to assign and edit issues in JIRA? My username is "tamaas".

Thanks,
Tamaas
ZooKeeper_branch35_jdk8 - Build # 627 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/627/

### LAST 60 LINES OF THE CONSOLE ###

[...truncated 69.05 MB...]
[junit] at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:182)
[junit] at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ZooKeeper@1334] - Session: 0x1070cc2ce77 closed
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 147756
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 471
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ClientBase@586] - tearDown starting
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ClientBase@556] - STOPPING server
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:22240
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x1070cc2ce77
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:ZooKeeperServer@541] - shutting down
[junit] 2017-08-09 12:24:07,822 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:SessionTrackerImpl@232] - Shutting down
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:PrepRequestProcessor@1005] - Shutting down
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [main:SyncRequestProcessor@191] - Shutting down
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@155] - PrepRequestProcessor exited loop!
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:FinalRequestProcessor@481] - shutdown of request processor complete
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=InMemoryDataTree]
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240]
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:FourLetterWordMain@87] - connecting to 127.0.0.1 22240
[junit] 2017-08-09 12:24:07,825 [myid:] - INFO [main:JMXEnv@146] - ensureOnly:[]
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ClientBase@611] - fdcount after test is: 1414 at start it was 1414
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ZKTestCase$1@68] - SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ZKTestCase$1@63] - FINISHED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,847 [myid:127.0.0.1:22058] - INFO [main-SendThread(127.0.0.1:22058):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22058. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:24:07,847 [myid:127.0.0.1:22058] - WARN [main-SendThread(127.0.0.1:22058):ClientCnxn$SendThread@1235] - Session 0x1070cbd69c5 for server 127.0.0.1/127.0.0.1:22058, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] Tests run: 103, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 422.237 sec, Thread: 5, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2017-08-09 12:24:07,963 [myid:127.0.0.1:22120] - INFO [main-SendThread(127.0.0.1:22120):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22120. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:24:07,964 [myid:127.0.0.1:22120] - WARN [main-SendThread(127.0.0.1:22120):ClientCnxn$SendThread@1235] - Session 0x2070cbf80d3 for server 127.0.0.1/127.0.0.1:22120, unexpected error, closing socket connection and attempting reconnect
[junit]
ZooKeeper-trunk-jdk8 - Build # 1155 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-jdk8/1155/

### LAST 60 LINES OF THE CONSOLE ###

[...truncated 65.31 MB...]
[junit] 2017-08-09 12:12:11,029 [myid:127.0.0.1:22123] - WARN [main-SendThread(127.0.0.1:22123):ClientCnxn$SendThread@1235] - Session 0x3054056f76d for server 127.0.0.1/127.0.0.1:22123, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2017-08-09 12:12:11,288 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@614] - Processed session termination for sessionid: 0x105405a3ecd
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [SyncThread:0:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=Connections,name2=127.0.0.1,name3=0x105405a3ecd]
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main:ZooKeeper@1332] - Session: 0x105405a3ecd closed
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x105405a3ecd
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 120349
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 863
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:ClientBase@601] - tearDown starting
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:ClientBase@571] - STOPPING server
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:22240
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:ZooKeeperServer@541] - shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:SessionTrackerImpl@232] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:PrepRequestProcessor@1008] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:SyncRequestProcessor@191] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@155] - PrepRequestProcessor exited loop!
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:FinalRequestProcessor@481] - shutdown of request processor complete
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=InMemoryDataTree]
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240]
[junit] 2017-08-09 12:12:11,296 [myid:] - INFO [main:FourLetterWordMain@87] - connecting to 127.0.0.1 22240
[junit] 2017-08-09 12:12:11,296 [myid:] - INFO [main:JMXEnv@146] - ensureOnly:[]
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ClientBase@626] - fdcount after test is: 2545 at start it was 2545
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ZKTestCase$1@68] - SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ZKTestCase$1@63] - FINISHED testWatcherAutoResetWithLocal
[junit] Tests run: 103, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 419.184 sec, Thread: 5, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2017-08-09 12:12:11,458 [myid:127.0.0.1:22043] - INFO [main-SendThread(127.0.0.1:22043):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22043. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:12:11,458 [myid:127.0.0.1:22043] - WARN [main-SendThread(127.0.0.1:22043):ClientCnxn$SendThread@1235] - Session 0x10540545aeb0001 for server 127.0.0.1/127.0.0.1:22043, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at