[jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121101#comment-16121101 ] Michael Han commented on ZOOKEEPER-2867: [~junrao] I also did experiments on my side; when a session is closed, the {{CommitProcessor}} should log something like: {noformat}2017-08-09 22:41:49,824 [myid:2] - DEBUG [SyncThread:2:CommitProcessor@386] - Committing request:: sessionid:0x1134d2f type:closeSession cxid:0x1 zxid:0x20002 txntype:-11 reqpath:n/a{noformat} But this is for 3.5, which has changed a lot in terms of how commit works, and I realize you are using 3.4. Still, given what you just reported ({{didn't find any logging of closeSession in the log4j log}}), I think we can conclude that this specific session was not closed. To circle back to the original question: assuming this problematic session was not closed (which is what the existing evidence demonstrates), does this create any inconsistency or confusion at a higher level in your use case? Were you expecting this session to be closed based on what you observed on the client side? > an expired ZK session can be re-established > --- > > Key: ZOOKEEPER-2867 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.10 > Reporter: Jun Rao > Attachments: zk.0.formatted, zk.1.formatted > > > Not sure if this is a real bug, but I found an instance where a ZK client > seems to be able to renew a session already expired by the ZK server. > From the ZK server log, session 25cd1e82c110001 was expired at 22:04:39. > {code:java} > June 27th 2017, 22:04:39.000 INFO > org.apache.zookeeper.server.ZooKeeperServer Expiring session > 0x25cd1e82c110001, timeout of 12000ms exceeded > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.Leader Proposing:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:04:39.001 INFO > org.apache.zookeeper.server.PrepRequestProcessor Processed session > termination for sessionid: 0x25cd1e82c110001 > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.CommitProcessor Processing request:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.quorum.Learner Revalidating client: > 0x25cd1e82c110001 > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.ZooKeeperServer Client attempting to renew > session 0x25cd1e82c110001 at /100.96.5.6:47618 > June 27th 2017, 22:05:20.325 INFO > org.apache.zookeeper.server.ZooKeeperServer Established session > 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618 > {code} > From the ZK client's log, it was able to renew the expired session at 22:05:20. > {code:java} > June 27th 2017, 22:05:18.590 INFO org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001, closing socket connection and attempting reconnect 0 > June 27th 2017, 22:05:18.590 WARN org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001 0 > June 27th 2017, 22:05:19.325 WARN org.apache.zookeeper.ClientCnxn SASL > configuration failed: javax.security.auth.login.LoginException: No JAAS > configuration section named 'Client' was found in specified JAAS > configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'. Will > continue connection to Zookeeper server without SASL authentication, if > Zookeeper server allows it. 0 > June 27th 2017, 22:05:19.326 INFO org.apache.zookeeper.ClientCnxn Opening > socket connection to server 100.65.188.168/100.65.188.168:2181 0 > June 27th 2017, 22:05:20.324 INFO org.apache.zookeeper.ClientCnxn Socket > connection established to 100.65.188.168/100.65.188.168:2181, initiating > session 0 > June 27th 2017, 22:05:20.327 INFO org.apache.zookeeper.ClientCnxn Session > establishment complete on server 100.65.188.168/100.65.188.168:2181, > sessionid = 0x25cd1e82c110001, negotiated timeout = 12000 0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
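For background on the client-side contract at issue here: a Disconnected event is recoverable, and the client library may legitimately reconnect and renew the same session (as seen in the logs above), while Expired is terminal and the application must construct a brand-new handle. A minimal sketch of that pattern (connect string and timeout are illustrative values only):
{code:java}
import java.io.IOException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ExpiryAwareClient implements Watcher {
    // Illustrative values; not taken from the ticket.
    private static final String CONNECT = "100.65.188.168:2181";
    private static final int SESSION_TIMEOUT_MS = 12000;

    private volatile ZooKeeper zk;

    public ExpiryAwareClient() throws IOException {
        zk = new ZooKeeper(CONNECT, SESSION_TIMEOUT_MS, this);
    }

    @Override
    public void process(WatchedEvent event) {
        // Disconnected: the library keeps retrying and may renew the session.
        // Expired: the handle is dead; the only recovery is a new session.
        if (event.getState() == Event.KeeperState.Expired) {
            try {
                zk = new ZooKeeper(CONNECT, SESSION_TIMEOUT_MS, this);
            } catch (IOException e) {
                // retry/backoff elided in this sketch
            }
        }
    }
}
{code}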
Success: ZOOKEEPER- PreCommit Build #935
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 75.06 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 998f013fe2fa165935dd8a9f5b90023629b1eecc logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 17 minutes 55 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121059#comment-16121059 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/935//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
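A quick sanity check on the memory estimate quoted above: 100m znodes x 100k clients x 3 bits = 3x10^13 bits, or about 3.75x10^12 bytes, i.e. the stated 3.75TB. For readers arriving from a later release: the API that ultimately shipped for this feature (in ZooKeeper 3.6.0) is {{ZooKeeper.addWatch}} with {{AddWatchMode.PERSISTENT_RECURSIVE}}. The sketch below shows that usage and is illustrative only; the connect string, timeout, and path are made up:
{code:java}
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.AddWatchMode;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveWatchSketch {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Count down on the first event; good enough for a demo.
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 15000,
                e -> connected.countDown());
        connected.await();

        // One registration covers /shards and every descendant, and it
        // survives firing: no re-registration round trip per event.
        zk.addWatch("/shards",
                event -> System.out.println(event.getType() + " " + event.getPath()),
                AddWatchMode.PERSISTENT_RECURSIVE);

        Thread.sleep(Long.MAX_VALUE); // keep the session alive for the demo
    }
}
{code}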
[jira] [Commented] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121049#comment-16121049 ] Jordan Zimmerman commented on ZOOKEEPER-2871: - https://github.com/apache/zookeeper/pull/332 > Port ZOOKEEPER-1416 to 3.5.x > > > Key: ZOOKEEPER-2871 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 > Project: ZooKeeper > Issue Type: Sub-task > Components: c client, documentation, java client, server > Affects Versions: 3.5.3 > Reporter: Jordan Zimmerman > Assignee: Jordan Zimmerman > Fix For: 3.5.4 > > > Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121047#comment-16121047 ] ASF GitHub Bot commented on ZOOKEEPER-1416: --- GitHub user Randgalt opened a pull request: https://github.com/apache/zookeeper/pull/332 Port of ZOOKEEPER-1416 Persistent Recursive Watches You can merge this pull request into a Git repository by running: $ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2871 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/332.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #332 commit 6767b58a9eeba389cf4eb3c9b80066e17b514b13 Author: randgalt Date: 2017-08-10T04:38:23Z Port of ZOOKEEPER-1416 Persistent Recursive Watches > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #332: Port of ZOOKEEPER-1416 Persistent Recursive Wat...
GitHub user Randgalt opened a pull request: https://github.com/apache/zookeeper/pull/332 Port of ZOOKEEPER-1416 Persistent Recursive Watches You can merge this pull request into a Git repository by running: $ git pull https://github.com/Randgalt/zookeeper ZOOKEEPER-2871 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/332.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #332 commit 6767b58a9eeba389cf4eb3c9b80066e17b514b13 Author: randgalt Date: 2017-08-10T04:38:23Z Port of ZOOKEEPER-1416 Persistent Recursive Watches --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120989#comment-16120989 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server > Reporter: Phillip Liu > Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode, and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, then when a client (re)connects it has to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently, a naming service that consumes the db shard > definitions issues thousands of watch requests each time the service starts > or changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch fires. Recursive means the > Watch applies to the node and its descendant nodes. A Persistent Recursive Watch > behaves as follows: > # A Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means setting watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called; then the Recursive Watch > automatically re-applies the watch on the znode. This maintains the existing Watch > semantics on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch's Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is that we automatically re-add the watch after a read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes, and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths for which no Watch events are sent, regardless > of Watch settings, can be updated each time a watch event from a Recursive Watch is > fired. The memory utilization is then relative to the number of outstanding reads, > and in the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Shraer resolved ZOOKEEPER-2866. - Resolution: Not A Problem > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
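To make the reconfiguration step in the reported workload concrete, here is a hedged sketch against the 3.5.x dynamic reconfiguration client API ({{ZooKeeperAdmin.reconfigure}}). Every host, port, server id, and timeout below is invented for illustration, and the ensemble must have reconfig enabled with an authorized caller:
{code:java}
import org.apache.zookeeper.admin.ZooKeeperAdmin;
import org.apache.zookeeper.data.Stat;

public class ReconfigSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative connect string and timeout; watcher is a no-op.
        ZooKeeperAdmin admin = new ZooKeeperAdmin(
                "10.0.0.1:2181", 15000, event -> { });

        // Non-incremental reconfig: specify the full new membership. The
        // three joiners become participants and the two original servers
        // are demoted to observers, mirroring the reported workload.
        String newMembers = String.join(",",
                "server.1=10.0.0.1:2888:3888:observer;2181",
                "server.2=10.0.0.2:2888:3888:observer;2181",
                "server.3=10.0.0.3:2888:3888:participant;2181",
                "server.4=10.0.0.4:2888:3888:participant;2181",
                "server.5=10.0.0.5:2888:3888:participant;2181");

        Stat stat = new Stat();
        // fromConfig = -1 applies regardless of the current config version.
        byte[] config = admin.reconfigure(null, null, newMembers, -1, stat);
        System.out.println(new String(config));
    }
}
{code}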
[jira] [Commented] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120987#comment-16120987 ] Alexander Shraer commented on ZOOKEEPER-2866: - Discussed offline with [~castuardo], and we currently believe that there is no bug. Please reopen the Jira if needed. > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: ZOOKEEPER- PreCommit Build #934
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 74.57 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/934//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 2b8039bc8b950232a4044c3c1f9ca60d093140f0 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 19 minutes 0 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Assigned] (ZOOKEEPER-2866) Reconfig Causes Newly Joined Node to Crash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Shraer reassigned ZOOKEEPER-2866: --- Assignee: Alexander Shraer > Reconfig Causes Newly Joined Node to Crash > -- > > Key: ZOOKEEPER-2866 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2866 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server > Affects Versions: 3.5.3 > Reporter: Jeffrey F. Lukman > Assignee: Alexander Shraer > Attachments: ZK-2866.pdf > > > When we run our Distributed System Model Checking (DMCK) tool against ZooKeeper v3.5.3, > following the workload in ZK-2778: > * initially start 2 ZooKeeper nodes > * start 3 new nodes and let them join the cluster > * do a reconfiguration where the newly joined nodes become PARTICIPANTS, > while the previous 2 nodes change to be OBSERVERS > We think our DMCK found the following bug: > * one of the newly joined nodes crashes because > it receives an *unexpected* PROPOSAL message > from the new leader in the cluster. > For complete information about the bug, please see the attached document. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan Zimmerman reassigned ZOOKEEPER-2871: --- Assignee: Jordan Zimmerman > Port ZOOKEEPER-1416 to 3.5.x > > > Key: ZOOKEEPER-2871 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 > Project: ZooKeeper > Issue Type: Sub-task > Components: c client, documentation, java client, server > Affects Versions: 3.5.3 > Reporter: Jordan Zimmerman > Assignee: Jordan Zimmerman > Fix For: 3.5.4 > > > Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (ZOOKEEPER-2871) Port ZOOKEEPER-1416 to 3.5.x
Jordan Zimmerman created ZOOKEEPER-2871: --- Summary: Port ZOOKEEPER-1416 to 3.5.x Key: ZOOKEEPER-2871 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2871 Project: ZooKeeper Issue Type: Sub-task Components: c client, documentation, java client, server Affects Versions: 3.5.3 Reporter: Jordan Zimmerman Fix For: 3.5.4 Port the work of Persistent Recursive Watchers (ZOOKEEPER-1416) to 3.5.x -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk - Build # 3492 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk/3492/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 68.54 MB...] [exec] Log Message Received: [2017-08-09 23:32:51,246:1624(0x7f6e3da14740):ZOO_INFO@zookeeper_init_internal@1151: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x459d00 sessionId=0 sessionPasswd= context=0x7ffeae7a3270 flags=0] [exec] Log Message Received: [2017-08-09 23:32:51,247:1624(0x7f6e3af08700):ZOO_INFO@check_events@2424: initiated connection to server [127.0.0.1:22181]] [exec] Log Message Received: [2017-08-09 23:32:51,249:1624(0x7f6e3af08700):ZOO_INFO@check_events@2476: session establishment complete on server [127.0.0.1:22181], sessionId=0x105f3075acb000f, negotiated timeout=1 ] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 11149 : OK [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1000 : OK [exec] Zookeeper_simpleSystem::testNullData : elapsed 4012 : OK [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1519 : OK [exec] Zookeeper_simpleSystem::testCreate : elapsed 4801 : OK [exec] Zookeeper_simpleSystem::testPath : elapsed 8499 : OK [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1052 : OK [exec] Zookeeper_simpleSystem::testPing : elapsed 17277 : OK [exec] Zookeeper_simpleSystem::testAcl : elapsed 1019 : OK [exec] Zookeeper_simpleSystem::testChroot : elapsed 3028 : OK [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper server started : elapsed 34404 : OK [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1021 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 17447 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 24269 : OK [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1024 : OK [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 6142 : OK [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 11988 : OK [exec] Zookeeper_readOnly::testReadOnly : assertion : elapsed 6659 [exec] /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/src/c/tests/TestReadOnlyClient.cc:99: Assertion: equality assertion failed [Expected: 0, Actual : -4] [exec] Failures !!! 
[exec] Run: 74 Failure total: 1 Failures: 1 Errors: 0 [exec] FAIL: zktest-mt [exec] == [exec] 1 of 2 tests failed [exec] Please report to u...@zookeeper.apache.org [exec] == [exec] Makefile:1744: recipe for target 'check-TESTS' failed [exec] make[1]: Leaving directory '/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit' [exec] Makefile:2000: recipe for target 'check-am' failed [exec] make[1]: *** [check-TESTS] Error 1 [exec] make: *** [check-am] Error 2 BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1339: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1299: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1309: exec returned: 2 Total time: 18 minutes 59 seconds Build step 'Execute shell' marked build as failure [FINDBUGS] Skipping publisher since build result is FAILURE [WARNINGS] Skipping publisher since build result is FAILURE Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording fingerprints Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Publishing Javadoc Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2867) an expired ZK session can be re-established
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120800#comment-16120800 ] Jun Rao commented on ZOOKEEPER-2867: [~hanm], I tried it locally. It seems that every time a session is closed, the log4j log records the deletion of the session's ephemeral nodes and a type:closeSession entry in FinalRequestProcessor. {code:java} [2017-08-09 16:00:46,325] DEBUG Processing request:: sessionid:0x15dc93a4f1a type:closeSession cxid:0x3f zxid:0x39 txntype:-11 reqpath:n/a (org.apache.zookeeper.server.FinalRequestProcessor) [2017-08-09 16:00:46,325] DEBUG Deleting ephemeral node /brokers/ids/0 for session 0x15dc93a4f1a (org.apache.zookeeper.server.DataTree) [2017-08-09 16:00:46,326] DEBUG sessionid:0x15dc93a4f1a type:closeSession cxid:0x3f zxid:0x39 txntype:-11 reqpath:n/a (org.apache.zookeeper.server.FinalRequestProcessor) {code} However, for the above incident, I didn't find any logging of closeSession in the log4j log. > an expired ZK session can be re-established > --- > > Key: ZOOKEEPER-2867 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2867 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.4.10 > Reporter: Jun Rao > Attachments: zk.0.formatted, zk.1.formatted > > > Not sure if this is a real bug, but I found an instance where a ZK client > seems to be able to renew a session already expired by the ZK server. > From the ZK server log, session 25cd1e82c110001 was expired at 22:04:39. > {code:java} > June 27th 2017, 22:04:39.000 INFO > org.apache.zookeeper.server.ZooKeeperServer Expiring session > 0x25cd1e82c110001, timeout of 12000ms exceeded > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.Leader Proposing:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:04:39.001 INFO > org.apache.zookeeper.server.PrepRequestProcessor Processed session > termination for sessionid: 0x25cd1e82c110001 > June 27th 2017, 22:04:39.001 DEBUG > org.apache.zookeeper.server.quorum.CommitProcessor Processing request:: > sessionid:0x25cd1e82c110001 type:closeSession cxid:0x0 zxid:0x20fc4 > txntype:-11 reqpath:n/a > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.quorum.Learner Revalidating client: > 0x25cd1e82c110001 > June 27th 2017, 22:05:20.324 INFO > org.apache.zookeeper.server.ZooKeeperServer Client attempting to renew > session 0x25cd1e82c110001 at /100.96.5.6:47618 > June 27th 2017, 22:05:20.325 INFO > org.apache.zookeeper.server.ZooKeeperServer Established session > 0x25cd1e82c110001 with negotiated timeout 12000 for client /100.96.5.6:47618 > {code} > From the ZK client's log, it was able to renew the expired session at 22:05:20. > {code:java} > June 27th 2017, 22:05:18.590 INFO org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001, closing socket connection and attempting reconnect 0 > June 27th 2017, 22:05:18.590 WARN org.apache.zookeeper.ClientCnxn Client > session timed out, have not heard from server in 4485ms for sessionid > 0x25cd1e82c110001 0 > June 27th 2017, 22:05:19.325 WARN org.apache.zookeeper.ClientCnxn SASL > configuration failed: javax.security.auth.login.LoginException: No JAAS > configuration section named 'Client' was found in specified JAAS > configuration file: '/opt/confluent/etc/kafka/server_jaas.conf'. Will > continue connection to Zookeeper server without SASL authentication, if > Zookeeper server allows it. 0 > June 27th 2017, 22:05:19.326 INFO org.apache.zookeeper.ClientCnxn Opening > socket connection to server 100.65.188.168/100.65.188.168:2181 0 > June 27th 2017, 22:05:20.324 INFO org.apache.zookeeper.ClientCnxn Socket > connection established to 100.65.188.168/100.65.188.168:2181, initiating > session 0 > June 27th 2017, 22:05:20.327 INFO org.apache.zookeeper.ClientCnxn Session > establishment complete on server 100.65.188.168/100.65.188.168:2181, > sessionid = 0x25cd1e82c110001, negotiated timeout = 12000 0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk-openjdk7 - Build # 1572 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/1572/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 60.70 MB...] [junit] at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 20:34:13,498 [myid:] - INFO [New I/O worker #11598:ClientCnxnSocketNetty$ZKClientHandler@384] - channel is disconnected: [id: 0xdc8c6010, /127.0.0.1:41994 :> 127.0.0.1/127.0.0.1:19479] [junit] 2017-08-09 20:34:13,498 [myid:] - INFO [New I/O worker #11598:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 20:34:13,498 [myid:127.0.0.1:19479] - INFO [main-SendThread(127.0.0.1:19479):ClientCnxn$SendThread@1231] - channel for sessionid 0x205f39e94030001 is lost, closing socket connection and attempting reconnect [junit] 2017-08-09 20:34:13,770 [myid:127.0.0.1:19479] - INFO [main-SendThread(127.0.0.1:19479):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:19479. 
Will not attempt to authenticate using SASL (unknown error) [junit] 2017-08-09 20:34:13,771 [myid:] - INFO [New I/O worker #11445:ClientCnxn$SendThread@946] - Socket connection established, initiating session, client: /127.0.0.1:42008, server: 127.0.0.1/127.0.0.1:19479 [junit] 2017-08-09 20:34:13,772 [myid:] - INFO [New I/O worker #11445:ClientCnxnSocketNetty$1@153] - channel is connected: [id: 0x84cf4658, /127.0.0.1:42008 => 127.0.0.1/127.0.0.1:19479] [junit] 2017-08-09 20:34:13,772 [myid:] - WARN [New I/O worker #11285:NettyServerCnxn@426] - Closing connection to /127.0.0.1:42008 [junit] java.io.IOException: ZK down [junit] at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:363) [junit] at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.processMessage(NettyServerCnxnFactory.java:244) [junit] at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.messageReceived(NettyServerCnxnFactory.java:166) [junit] at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [junit] at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) [junit] at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [junit] at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at
Success: ZOOKEEPER- PreCommit Build #933
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 68.99 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +0 tests included. The patch appears to be a documentation patch that doesn't require tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 98b0c69ab65abd659f8cf3fac20ebf3b31bce241 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 18 minutes 24 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2870 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120593#comment-16120593 ] Hadoop QA commented on ZOOKEEPER-2870: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/933//console This message is automatically generated. > Improve the efficiency of AtomicFileOutputStream > > > Key: ZOOKEEPER-2870 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 > Project: ZooKeeper > Issue Type: Improvement > Components: server > Affects Versions: 3.4.10, 3.5.3, 3.6.0 > Reporter: Fangmin Lv > Assignee: Fangmin Lv > > AtomicFileOutputStream extends FilterOutputStream, whose write function > writes data to the underlying stream byte by byte: > https://searchcode.com/codesearch/view/17990706/, which is very inefficient. > Currently, we only use this class to write the dynamic config; because that file is > quite small, this hasn't been a big problem. But in the future we may want to use > this class to write the snapshot file, which takes much longer: tested internally, > writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream > directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
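Some context on why the byte-by-byte behavior occurs: {{FilterOutputStream.write(byte[], int, int)}} defaults to looping over the single-byte {{write(int)}}. Below is a minimal sketch of the kind of fix implied by the ticket, not the actual patch; the class and names are made up:
{code:java}
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// FilterOutputStream's default write(byte[], int, int) calls write(int)
// once per byte, so a subclass wanting bulk throughput must override it
// and delegate the whole range to the wrapped stream.
class BulkDelegatingOutputStream extends FilterOutputStream {
    BulkDelegatingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // One bulk call instead of len single-byte calls.
        out.write(b, off, len);
    }
}
{code}
Delegating the bulk overload replaces len single-byte calls on the underlying stream with one call, which is consistent with the 10-minutes-versus-6-seconds gap described in the ticket.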
[jira] [Commented] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120565#comment-16120565 ] ASF GitHub Bot commented on ZOOKEEPER-2870: --- GitHub user lvfangmin opened a pull request: https://github.com/apache/zookeeper/pull/331 [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream You can merge this pull request into a Git repository by running: $ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2870 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #331 commit 7154ccd9c0fa489b070645ba3bcfc4c9e25d4683 Author: Fangmin Lyu Date: 2017-08-09T20:01:02Z [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream > Improve the efficiency of AtomicFileOutputStream > > > Key: ZOOKEEPER-2870 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 > Project: ZooKeeper > Issue Type: Improvement > Components: server > Affects Versions: 3.4.10, 3.5.3, 3.6.0 > Reporter: Fangmin Lv > Assignee: Fangmin Lv > > AtomicFileOutputStream extends FilterOutputStream, whose write function > writes data to the underlying stream byte by byte: > https://searchcode.com/codesearch/view/17990706/, which is very inefficient. > Currently, we only use this class to write the dynamic config; because that file is > quite small, this hasn't been a big problem. But in the future we may want to use > this class to write the snapshot file, which takes much longer: tested internally, > writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream > directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #331: [ZOOKEEPER-2870] Improve the efficiency of Atom...
GitHub user lvfangmin opened a pull request: https://github.com/apache/zookeeper/pull/331 [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream You can merge this pull request into a Git repository by running: $ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2870 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/331.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #331 commit 7154ccd9c0fa489b070645ba3bcfc4c9e25d4683 Author: Fangmin Lyu Date: 2017-08-09T20:01:02Z [ZOOKEEPER-2870] Improve the efficiency of AtomicFileOutputStream --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (ZOOKEEPER-2870) Improve the efficiency of AtomicFileOutputStream
Fangmin Lv created ZOOKEEPER-2870: - Summary: Improve the efficiency of AtomicFileOutputStream Key: ZOOKEEPER-2870 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2870 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.5.3, 3.4.10, 3.6.0 Reporter: Fangmin Lv Assignee: Fangmin Lv AtomicFileOutputStream extends FilterOutputStream, whose write function writes data to the underlying stream byte by byte: https://searchcode.com/codesearch/view/17990706/, which is very inefficient. Currently, we only use this class to write the dynamic config; because that file is quite small, this hasn't been a big problem. But in the future we may want to use this class to write the snapshot file, which takes much longer: tested internally, writing a 600MB snapshot took more than 10 minutes, while using FileOutputStream directly took only 6s. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: ZOOKEEPER- PreCommit Build #932
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 72.20 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +0 tests included. The patch appears to be a documentation patch that doesn't require tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] ef401405e05f7080cdbac069083dc45433bfe011 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build@2/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build@2/patchprocess' are the same file BUILD SUCCESSFUL Total time: 19 minutes 23 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2471 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120512#comment-16120512 ] Hadoop QA commented on ZOOKEEPER-2471: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/932//console This message is automatically generated. > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.3 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. 
The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
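To make the fragile contract concrete, here is a minimal, self-contained sketch of the bug shape described above. It is illustrative only: the class and helper names are invented, and this is not the actual ClientCnxnSocket code.

{code:java}
// Sketch of a cached-clock contract like the one described above.
// Invented names; NOT the real ClientCnxnSocket implementation.
public class CachedClockSketch {
    private long now;       // only valid immediately after updateNow()
    private long lastHeard;

    void updateNow()       { now = System.currentTimeMillis(); }
    void updateLastHeard() { lastHeard = now; } // trusts the cached value

    long idleMillis()      { return now - lastHeard; }

    public static void main(String[] args) throws InterruptedException {
        CachedClockSketch c = new CachedClockSketch();
        c.updateNow();
        c.updateLastHeard();          // idleMillis() == 0 here

        Thread.sleep(2000);           // stands in for the backoff sleep
                                      // inside HostProvider.next()

        c.updateLastHeard();          // bug shape: no updateNow() first, so
                                      // lastHeard is reset to a stale time
        c.updateNow();                // the next refresh (e.g. after select())
        System.out.println(c.idleMillis()); // ~2000ms of "idle" that was sleep
    }
}
{code}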
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120510#comment-16120510 ] Hudson commented on ZOOKEEPER-2786: --- FAILURE: Integrated in Jenkins build ZooKeeper-trunk #3491 (See [https://builds.apache.org/job/ZooKeeper-trunk/3491/]) ZOOKEEPER-2786: Flaky test: (hanm: rev e104175bb47baeb800354078c015e78bfcb7c953) * (edit) src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
ZooKeeper-trunk - Build # 3491 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk/3491/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 66.43 MB...] [junit] 2017-08-09 19:26:35,648 [myid:] - INFO [New I/O boss #10731:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 19:26:35,648 [myid:127.0.0.1:13979] - INFO [main-SendThread(127.0.0.1:13979):ClientCnxn$SendThread@1231] - channel for sessionid 0x105f2210902 is lost, closing socket connection and attempting reconnect [junit] 2017-08-09 19:26:35,711 [myid:127.0.0.1:14044] - INFO [main-SendThread(127.0.0.1:14044):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:14044. Will not attempt to authenticate using SASL (unknown error) [junit] 2017-08-09 19:26:35,712 [myid:] - INFO [New I/O boss #15141:ClientCnxnSocketNetty$1@127] - future isn't success, cause: {} [junit] java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 19:26:35,712 [myid:] - WARN [New I/O boss #15141:ClientCnxnSocketNetty$ZKClientHandler@439] - Exception caught: [id: 0xf1c0c84e] EXCEPTION: java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] java.net.ConnectException: Connection refused: 127.0.0.1/127.0.0.1:14044 [junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) [junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79) [junit] at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [junit] at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42) [junit] at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [junit] at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [junit] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [junit] at java.lang.Thread.run(Thread.java:745) [junit] 2017-08-09 19:26:35,713 [myid:] - INFO [New I/O boss #15141:ClientCnxnSocketNetty@208] - channel is told closing [junit] 2017-08-09 19:26:35,713 [myid:127.0.0.1:14044] - INFO 
[main-SendThread(127.0.0.1:14044):ClientCnxn$SendThread@1231] - channel for sessionid 0x305f2235730 is lost, closing socket connection and attempting reconnect fail.build.on.test.failure: BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1339: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1220: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1224: Tests failed! Total time: 9 minutes 59 seconds Build step 'Execute shell' marked build as failure [FINDBUGS] Skipping publisher since build result is FAILURE [WARNINGS] Skipping publisher since build result is FAILURE Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording fingerprints Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [JIRA] Updating issue ZOOKEEPER-2786 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Publishing Javadoc Setting
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120481#comment-16120481 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. 
At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
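The 3.75TB estimate above checks out; spelling out the arithmetic (decimal units assumed):

{noformat}
100,000,000 znodes x 100,000 clients x 3 bits = 3.0e13 bits
3.0e13 bits / 8 bits per byte                 = 3.75e12 bytes ~= 3.75 TB
{noformat}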
Success: ZOOKEEPER- PreCommit Build #931
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 72.85 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/931//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 3ca8decd53363e2c5b4978cddf0012ee01072d9f logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 20 minutes 11 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
Success: ZOOKEEPER- PreCommit Build #930
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 70.97 MB...] [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] fa7027d6dfe25fe116f23b3c25df194b836e66a6 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD SUCCESSFUL Total time: 18 minutes 37 seconds Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-1416 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Success Sending email for trigger: Success Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120479#comment-16120479 ] Hadoop QA commented on ZOOKEEPER-1416: -- +1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/930//console This message is automatically generated. > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. 
At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. > Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-1416) Persistent Recursive Watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120448#comment-16120448 ] ASF GitHub Bot commented on ZOOKEEPER-1416: --- Github user Randgalt commented on the issue: https://github.com/apache/zookeeper/pull/136 @afine issues addressed > Persistent Recursive Watch > -- > > Key: ZOOKEEPER-1416 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416 > Project: ZooKeeper > Issue Type: Improvement > Components: c client, documentation, java client, server >Reporter: Phillip Liu >Assignee: Jordan Zimmerman > Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch > > Original Estimate: 504h > Remaining Estimate: 504h > > h4. The Problem > A ZooKeeper Watch can be placed on a single znode and when the znode changes > a Watch event is sent to the client. If there are thousands of znodes being > watched, when a client (re)connects, it would have to send thousands of watch > requests. At Facebook, we have this problem storing information for thousands > of db shards. Consequently a naming service that consumes the db shard > definition issues thousands of watch requests each time the service starts > and changes its client watcher. > h4. Proposed Solution > We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent > means no Watch reset is necessary after a watch-fire. Recursive means the > Watch applies to the node and descendant nodes. A Persistent Recursive Watch > behaves as follows: > # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS. > # CHILDREN and DATA Recursive Watches can be placed on any znode. > # EXISTS Recursive Watches can be placed on any path. > # A Recursive Watch behaves like an auto-watch registrar on the server side. > Setting a Recursive Watch means to set watches on all descendant znodes. > # When a watch on a descendant fires, no subsequent event is fired until a > corresponding getData(..) on the znode is called, then the Recursive Watch > automatically applies the watch on the znode. This maintains the existing Watch > semantic on an individual znode. > # A Recursive Watch overrides any watches placed on a descendant znode. > Practically this means the Recursive Watch Watcher callback is the one > receiving the event, and the event is delivered exactly once. > A goal here is to reduce the number of semantic changes. The guarantee of no > intermediate watch event until data is read will be maintained. The only > difference is we will automatically re-add the watch after read. At the same > time we add the convenience of reducing the need to add multiple watches for > sibling znodes and in turn reduce the number of watch messages sent from the > client to the server. > There are some implementation details that need to be hashed out. Initial > thinking is to have the Recursive Watch create per-node watches. This will > cause a lot of watches to be created on the server side. Currently, each > watch is stored as a single bit in a bit set relative to a session - up to 3 > bits per client per znode. If there are 100m znodes with 100k clients, each > watching all nodes, then this strategy will consume approximately 3.75TB of > RAM distributed across all Observers. Seems expensive. > Alternatively, a blacklist of paths to not send Watches regardless of Watch > setting can be set each time a watch event from a Recursive Watch is fired. > The memory utilization is relative to the number of outstanding reads and in > the worst case it's 1/3 * 3.75TB using the parameters given above. 
> Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee > is required. If the server can send watch events regardless of whether one has > already been fired without a corresponding read, then the server can simply > fire watch events without tracking. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
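To make "the Watch applies to the node and descendant nodes" concrete, here is a minimal sketch of the path test a recursive watch implies. The class and method names are invented for illustration and are not part of the proposed API or the patch under review.

{code:java}
// Sketch: should an event at eventPath fire a recursive watch set at watchPath?
// Invented names; not code from the ZOOKEEPER-1416 patch.
public final class RecursiveWatchMatch {
    static boolean applies(String watchPath, String eventPath) {
        if (eventPath.equals(watchPath)) {
            return true;                         // the watched node itself
        }
        String prefix = watchPath.equals("/") ? "/" : watchPath + "/";
        return eventPath.startsWith(prefix);     // any descendant znode
    }

    public static void main(String[] args) {
        System.out.println(applies("/a", "/a"));     // true
        System.out.println(applies("/a", "/a/b/c")); // true: descendant
        System.out.println(applies("/a", "/ab"));    // false: sibling, not a child
    }
}
{code}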
[GitHub] zookeeper issue #136: [ZOOKEEPER-1416] Persistent Recursive Watch
Github user Randgalt commented on the issue: https://github.com/apache/zookeeper/pull/136 @afine issues addressed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: JIRA permissions
done and done. On Wed, Aug 9, 2017 at 6:24 AM, Mark Fenes wrote: > Hi All, > > I'm a new dev and would like to contribute. > Could you please provide me permissions to assign and edit issues in JIRA? > My username is "mfenes". > > Thanks, > Mark Fenes > -- Cheers Michael.
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120426#comment-16120426 ] ASF GitHub Bot commented on ZOOKEEPER-2786: --- Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/327 The following up Netty fix to this flaky test is committed to master: e104175bb47baeb800354078c015e78bfcb7c953 and 3.5 23962f12395ada67e689b8ff57573fc1398a54eb > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper issue #327: ZOOKEEPER-2786 Flaky test: org.apache.zookeeper.test.C...
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/327 The following up Netty fix to this flaky test is committed to master: e104175bb47baeb800354078c015e78bfcb7c953 and 3.5 23962f12395ada67e689b8ff57573fc1398a54eb --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-2786: --- Fix Version/s: 3.4.11 > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0, 3.4.11 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120419#comment-16120419 ] ASF GitHub Bot commented on ZOOKEEPER-2786: --- Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/327 > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.5.4, 3.6.0 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (ZOOKEEPER-2786) Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-2786. Resolution: Fixed Issue resolved by pull request 327 [https://github.com/apache/zookeeper/pull/327] > Flaky test: org.apache.zookeeper.test.ClientTest.testNonExistingOpCode > -- > > Key: ZOOKEEPER-2786 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2786 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > Fix For: 3.6.0, 3.5.4 > > > This test is broken on 3.4 and 3.5, but is broken in "different" ways. Please > see the individual pull requests for detailed descriptions for the issues > faced in both branches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper pull request #327: ZOOKEEPER-2786 Flaky test: org.apache.zookeeper...
Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/327 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-2864) Add script to run a java api compatibility tool
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120397#comment-16120397 ] ASF GitHub Bot commented on ZOOKEEPER-2864: --- Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/329 I suggest we put this script under zookeeper/src/java/test/bin/ where other scripts are currently located, so it's consistent. A better solution is to consolidate all scripts and put them in a folder with a name that makes more sense, like other projects do (e.g. HBase, which puts scripts under root/dev-support), but that should be done separately as moving scripts will break a couple of workflows and require coordination. > Add script to run a java api compatibility tool > --- > > Key: ZOOKEEPER-2864 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2864 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.4.10, 3.5.3 >Reporter: Abraham Fine >Assignee: Abraham Fine > > We should use the annotations added in ZOOKEEPER-2829 to run a script to > verify api compatibility. See KUDU-1265 for an example. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] zookeeper issue #329: ZOOKEEPER-2864: Add script to run a java api compatibi...
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/329 I suggest we put this script under zookeeper/src/java/test/bin/ where other scripts are currently located, so it's consistent. A better solution is to consolidate all scripts and put them in a folder with a name that makes more sense, like other projects do (e.g. HBase, which puts scripts under root/dev-support), but that should be done separately as moving scripts will break a couple of workflows and require coordination. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120331#comment-16120331 ] Dan Benediktson commented on ZOOKEEPER-2471: Core tests that I see failed in the log both had already passed on my local run: ChrootClientTest.testNonExistingOpCode WatchEventWhenAutoResetTest.testNodeDataChanged The first one is clearly suspicious because my patch allegedly "fixed" the corresponding ClientTest.testNonExistingOpCode, which it shouldn't have done anything about, so I'm pretty certain that's just a flaky test. I've tried running both of those test cases about 10 times on my MBP to no avail; they pass every time. I also already ran the full "ant test" suite before submitting the patch in the first place, and it all succeeded. Any suggestions here? > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. 
Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120297#comment-16120297 ] Hadoop QA commented on ZOOKEEPER-2471: -- -1 overall. GitHub Pull Request Build +1 @author. The patch does not contain any @author tags. +0 tests included. The patch appears to be a documentation patch that doesn't require tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//console This message is automatically generated. > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. > If you are using very short session timeouts (one of our ZK ensembles has > many clients using a 1-second timeout), this is potentially disastrous, > because the sleep time may exceed the connection timeout itself, which can > potentially result in the Java client being stuck in a perpetual reconnect > loop. 
The exact code path it goes through in this case is complicated, > because there has to be a previously-closed socket still waiting in the > selector (otherwise, the first timeout evaluation will not fail because "now" > still hasn't been updated, and then the actual connect timeout will be > applied in ClientCnxnSocket.doTransport()) so that select() will harvest the > IO from the previous socket and updateNow(), resulting in the next loop > through ClientCnxnSocket.SendThread.run() observing the spurious timeout and > failing. In practice it does happen to us fairly frequently; we only got to > the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper > client object is rendered unusable: it's stuck in a perpetual reconnect loop > where it keeps sleeping, opening a socket, and immediately closing it. > I have a patch. Rather than calling updateNow() right after startConnect(), > my fix is to remove the "now" member variable and the updateNow() method > entirely, and to instead just call System.currentTimeMillis() whenever time > needs to be evaluated. I realize there is a benefit (aside from a trivial > micro-optimization not worth worrying about) to having the time be "fixed", > particularly for truth in the logging: if time is fixed by an updateNow() > call, then the log for a timeout will still show exactly the same value the > code reasoned about. However, this benefit is in my opinion not enough to > merit the fragility of the contract which led to this (for us) highly > impactful and difficult-to-find bug in the first place. > I'm currently running ant tests locally against my patch on trunk, and then > I'll upload it here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
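The failure sequence quoted above is easier to follow as a timeline; this is only a paraphrase of the reporter's own description, not additional analysis:

{noformat}
t0  startConnect() -> HostProvider.next() sleeps; the cached "now" stays at t0
t1  updateLastSendAndHeard() runs against the stale "now" (still t0)
t2  select() harvests leftover IO from the previously closed socket and
    calls updateNow(), jumping "now" forward to real time
t3  the next pass through SendThread.run() sees (now - lastHeard) >= timeout
    and declares a spurious connection loss; the loop repeats
{noformat}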
Failed: ZOOKEEPER- PreCommit Build #929
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 71.34 MB...] [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/929//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 4ab077b212193aacc964978028662d7df1eeacbc logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1643: exec returned: 1 Total time: 13 minutes 19 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [Fast Archiver] Compressed 574.71 KB of artifacts by 50.1% relative to #927 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2471 Putting comment on the pull request Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## 3 tests failed. 
FAILED: org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff Error Message: expected:<4294967298> but was:<0> Stack Trace: junit.framework.AssertionFailedError: expected:<4294967298> but was:<0> at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:869) at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:517) at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:784) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) FAILED: org.apache.zookeeper.test.WatchEventWhenAutoResetTest.testNodeDataChanged Error Message: expected: but was: Stack Trace: junit.framework.AssertionFailedError: expected: but was: at org.apache.zookeeper.test.WatchEventWhenAutoResetTest$EventsWatcher.assertEvent(WatchEventWhenAutoResetTest.java:67) at org.apache.zookeeper.test.WatchEventWhenAutoResetTest.testNodeDataChanged(WatchEventWhenAutoResetTest.java:126) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) FAILED: org.apache.zookeeper.test.ChrootClientTest.testNonExistingOpCode Error Message: expected:<-4> but was:<-6> Stack Trace:
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120259#comment-16120259 ] ASF GitHub Bot commented on ZOOKEEPER-2471: --- GitHub user DanBenediktson opened a pull request: https://github.com/apache/zookeeper/pull/330 ZOOKEEPER-2471: ZK Java client should not count sleep time as connect time ClientCnxnSocket uses a member variable "now" to track the current time, but does not update it at all potentially-blocking times: in particular, it does not update it after the random sleep introduced if an initial connect attempt fails. This results in the random sleep time being counted towards connect time, resulting in incorrect application of connection timeout currently, and if ZOOKEEPER-2869 is taken, a very real possibility (we have seen it in production) of wedging the Zookeeper client so that it can never successfully reconnect, because its sleep time may grow beyond its connection timeout, especially in scenarios where there is a big gap between negotiated session timeout and client-requested session timeout. Rather than fixing the bug by adding another "updateNow()" call, keeping the brittle "updateNow()" implementation which led to the bug in the first place, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp whenever the implementation needs to know the current time. Regarding unit testing, this is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, the random sleep time is no longer counted towards a time budget. I can throw a lot of mocks at this, like ClientReconnectTest, but I'm still going to be stuck depending on the behavior of that randomly-generated sleep time, which is going to be inherently unreliable. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, since I will then be able to inject a different backoff sleep behavior, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this bug fix at that time? You can merge this pull request into a Git repository by running: $ git pull https://github.com/DanBenediktson/zookeeper ZOOKEEPER-2471 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/330.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #330 commit 60f38726e7f07b4bb970cc8fb089363ff48eb3df Author: Dan Benediktson Date: 2017-08-09T16:41:42Z ZOOKEEPER-2471: Zookeeper Java client should not count time spent sleeping as time spent connecting Rather than keep the brittle "updateNow()" implementation which led to the bug and fixing the bug by adding another "updateNow()" call, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp. This is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, a random sleep time is no longer counted towards a time budget. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this patch at that time? 
> Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson >Assignee: Dan Benediktson > Attachments: ZOOKEEPER-2471.patch > > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect
[GitHub] zookeeper pull request #330: ZOOKEEPER-2471: ZK Java client should not count...
GitHub user DanBenediktson opened a pull request: https://github.com/apache/zookeeper/pull/330 ZOOKEEPER-2471: ZK Java client should not count sleep time as connect time ClientCnxnSocket uses a member variable "now" to track the current time, but does not update it at all potentially-blocking times: in particular, it does not update it after the random sleep introduced if an initial connect attempt fails. This results in the random sleep time being counted towards connect time, resulting in incorrect application of connection timeout currently, and if ZOOKEEPER-2869 is taken, a very real possibility (we have seen it in production) of wedging the Zookeeper client so that it can never successfully reconnect, because its sleep time may grow beyond its connection timeout, especially in scenarios where there is a big gap between negotiated session timeout and client-requested session timeout. Rather than fixing the bug by adding another "updateNow()" call, keeping the brittle "updateNow()" implementation which led to the bug in the first place, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp whenever the implementation needs to know the current time. Regarding unit testing, this is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, the random sleep time is no longer counted towards a time budget. I can throw a lot of mocks at this, like ClientReconnectTest, but I'm still going to be stuck depending on the behavior of that randomly-generated sleep time, which is going to be inherently unreliable. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, since I will then be able to inject a different backoff sleep behavior, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this bug fix at that time? You can merge this pull request into a Git repository by running: $ git pull https://github.com/DanBenediktson/zookeeper ZOOKEEPER-2471 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/330.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #330 commit 60f38726e7f07b4bb970cc8fb089363ff48eb3df Author: Dan Benediktson Date: 2017-08-09T16:41:42Z ZOOKEEPER-2471: Zookeeper Java client should not count time spent sleeping as time spent connecting Rather than keep the brittle "updateNow()" implementation which led to the bug and fixing the bug by adding another "updateNow()" call, I have deleted updateNow() and replaced usage of that member variable with actually getting the current system timestamp. This is, IMO, too difficult to test without introducing a lot of invasive changes to ClientCnxn.java, seeing as the only effective change is that, on connection retry, a random sleep time is no longer counted towards a time budget. If a fix is taken for ZOOKEEPER-2869, this should become much easier to test, and since I'm planning to submit a pull request for that ticket as well, maybe as a compromise I can submit a test for this patch at that time? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
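A minimal sketch of the direction the pull request describes: delete the cached "now" and updateNow(), and read the clock wherever time is evaluated. This is an assumed shape for illustration only, with invented names; it is not the actual patch.

{code:java}
// Sketch of "read the clock at each use"; invented names, not the real patch.
public class DirectClockSketch {
    private long lastSend  = System.currentTimeMillis();
    private long lastHeard = System.currentTimeMillis();

    // No cached "now" and no updateNow(): every check reads the real clock,
    // so time spent sleeping between connect attempts cannot hide behind a
    // stale cached timestamp.
    long idleRecv() { return System.currentTimeMillis() - lastHeard; }
    long idleSend() { return System.currentTimeMillis() - lastSend;  }

    void onPacketSent()     { lastSend  = System.currentTimeMillis(); }
    void onPacketReceived() { lastHeard = System.currentTimeMillis(); }

    public static void main(String[] args) {
        DirectClockSketch c = new DirectClockSketch();
        c.onPacketReceived();
        System.out.println("idle recv ms: " + c.idleRecv()); // ~0
    }
}
{code}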
ZooKeeper_branch34_openjdk7 - Build # 1603 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1603/ ### ## LAST 60 LINES OF THE CONSOLE ### Started by timer [EnvInject] - Loading node environment variables. Building remotely on H27 (ubuntu xenial) in workspace /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7 > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10 Cleaning workspace > git rev-parse --verify HEAD # timeout=10 Resetting working tree > git reset --hard # timeout=10 > git clean -fdx # timeout=10 Fetching upstream changes from git://git.apache.org/zookeeper.git > git --version # timeout=10 > git fetch --tags --progress git://git.apache.org/zookeeper.git > +refs/heads/*:refs/remotes/origin/* > git rev-parse refs/remotes/origin/branch-3.4^{commit} # timeout=10 > git rev-parse refs/remotes/origin/origin/branch-3.4^{commit} # timeout=10 Checking out Revision e4303a37a813c9f1bd4cdefd9c754267b12c32b4 (refs/remotes/origin/branch-3.4) Commit message: "ZOOKEEPER-2853: The lastZxidSeen in FileTxnLog.java is never being assigned. This is a port of the same patch committed to master and branch-3.5, after resolving merge conflicts." > git config core.sparsecheckout # timeout=10 > git checkout -f e4303a37a813c9f1bd4cdefd9c754267b12c32b4 > git rev-list e4303a37a813c9f1bd4cdefd9c754267b12c32b4 # timeout=10 No emails were triggered. [ZooKeeper_branch34_openjdk7] $ /home/jenkins/tools/ant/apache-ant-1.9.9/bin/ant -Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml -Djavac.target=1.7 clean test-core-java Error: JAVA_HOME is not defined correctly. We cannot execute /usr/lib/jvm/java-7-openjdk-amd64//bin/java Build step 'Invoke Ant' marked build as failure Recording test results ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error? Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## No tests ran.
JIRA permissions
Hi All,

I'm a new dev and would like to contribute. Could you please grant me permissions to assign and edit issues in JIRA? My username is "mfenes".

Thanks,
Mark Fenes
JIRA permissions
Hi All,

I'm a new dev and would like to contribute. Could you please grant me the rights to assign and edit issues in JIRA? My username is "tamaas".

Thanks,
Tamaas
ZooKeeper_branch35_jdk8 - Build # 627 - Still Failing
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/627/

### LAST 60 LINES OF THE CONSOLE ###

[...truncated 69.05 MB...]
[junit] at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:182)
[junit] at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113)
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ZooKeeper@1334] - Session: 0x1070cc2ce77 closed
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 147756
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 471
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ClientBase@586] - tearDown starting
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:ClientBase@556] - STOPPING server
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:22240
[junit] 2017-08-09 12:24:07,821 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x1070cc2ce77
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:ZooKeeperServer@541] - shutting down
[junit] 2017-08-09 12:24:07,822 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:SessionTrackerImpl@232] - Shutting down
[junit] 2017-08-09 12:24:07,822 [myid:] - INFO [main:PrepRequestProcessor@1005] - Shutting down
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [main:SyncRequestProcessor@191] - Shutting down
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@155] - PrepRequestProcessor exited loop!
[junit] 2017-08-09 12:24:07,823 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:FinalRequestProcessor@481] - shutdown of request processor complete
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=InMemoryDataTree]
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240]
[junit] 2017-08-09 12:24:07,824 [myid:] - INFO [main:FourLetterWordMain@87] - connecting to 127.0.0.1 22240
[junit] 2017-08-09 12:24:07,825 [myid:] - INFO [main:JMXEnv@146] - ensureOnly:[]
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ClientBase@611] - fdcount after test is: 1414 at start it was 1414
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ZKTestCase$1@68] - SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,828 [myid:] - INFO [main:ZKTestCase$1@63] - FINISHED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:24:07,847 [myid:127.0.0.1:22058] - INFO [main-SendThread(127.0.0.1:22058):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22058. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:24:07,847 [myid:127.0.0.1:22058] - WARN [main-SendThread(127.0.0.1:22058):ClientCnxn$SendThread@1235] - Session 0x1070cbd69c5 for server 127.0.0.1/127.0.0.1:22058, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] Tests run: 103, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 422.237 sec, Thread: 5, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2017-08-09 12:24:07,963 [myid:127.0.0.1:22120] - INFO [main-SendThread(127.0.0.1:22120):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22120. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:24:07,964 [myid:127.0.0.1:22120] - WARN [main-SendThread(127.0.0.1:22120):ClientCnxn$SendThread@1235] - Session 0x2070cbf80d3 for server 127.0.0.1/127.0.0.1:22120, unexpected error, closing socket connection and attempting reconnect
[junit]
ZooKeeper-trunk-jdk8 - Build # 1155 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-jdk8/1155/

### LAST 60 LINES OF THE CONSOLE ###

[...truncated 65.31 MB...]
[junit] 2017-08-09 12:12:11,029 [myid:127.0.0.1:22123] - WARN [main-SendThread(127.0.0.1:22123):ClientCnxn$SendThread@1235] - Session 0x3054056f76d for server 127.0.0.1/127.0.0.1:22123, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2017-08-09 12:12:11,288 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@614] - Processed session termination for sessionid: 0x105405a3ecd
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [SyncThread:0:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=Connections,name2=127.0.0.1,name3=0x105405a3ecd]
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main:ZooKeeper@1332] - Session: 0x105405a3ecd closed
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for session: 0x105405a3ecd
[junit] 2017-08-09 12:12:11,291 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 120349
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 863
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:ClientBase@601] - tearDown starting
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:ClientBase@571] - STOPPING server
[junit] 2017-08-09 12:12:11,292 [myid:] - INFO [main:NettyServerCnxnFactory@464] - shutdown called 0.0.0.0/0.0.0.0:22240
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:ZooKeeperServer@541] - shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - ERROR [main:ZooKeeperServer@505] - ZKShutdownHandler is not registered, so ZooKeeper server won't take any action on ERROR or SHUTDOWN server state changes
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:SessionTrackerImpl@232] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:PrepRequestProcessor@1008] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [main:SyncRequestProcessor@191] - Shutting down
[junit] 2017-08-09 12:12:11,294 [myid:] - INFO [ProcessThread(sid:0 cport:22240)::PrepRequestProcessor@155] - PrepRequestProcessor exited loop!
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@169] - SyncRequestProcessor exited!
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:FinalRequestProcessor@481] - shutdown of request processor complete
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240,name1=InMemoryDataTree]
[junit] 2017-08-09 12:12:11,295 [myid:] - INFO [main:MBeanRegistry@128] - Unregister MBean [org.apache.ZooKeeperService:name0=StandaloneServer_port22240]
[junit] 2017-08-09 12:12:11,296 [myid:] - INFO [main:FourLetterWordMain@87] - connecting to 127.0.0.1 22240
[junit] 2017-08-09 12:12:11,296 [myid:] - INFO [main:JMXEnv@146] - ensureOnly:[]
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ClientBase@626] - fdcount after test is: 2545 at start it was 2545
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ZKTestCase$1@68] - SUCCEEDED testWatcherAutoResetWithLocal
[junit] 2017-08-09 12:12:11,301 [myid:] - INFO [main:ZKTestCase$1@63] - FINISHED testWatcherAutoResetWithLocal
[junit] Tests run: 103, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 419.184 sec, Thread: 5, Class: org.apache.zookeeper.test.NioNettySuiteTest
[junit] 2017-08-09 12:12:11,458 [myid:127.0.0.1:22043] - INFO [main-SendThread(127.0.0.1:22043):ClientCnxn$SendThread@1113] - Opening socket connection to server 127.0.0.1/127.0.0.1:22043. Will not attempt to authenticate using SASL (unknown error)
[junit] 2017-08-09 12:12:11,458 [myid:127.0.0.1:22043] - WARN [main-SendThread(127.0.0.1:22043):ClientCnxn$SendThread@1235] - Session 0x10540545aeb0001 for server 127.0.0.1/127.0.0.1:22043, unexpected error, closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at