ZooKeeper-trunk-solaris - Build # 912 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/912/

### LAST 60 LINES OF THE CONSOLE ###
Started by timer
Building remotely on solaris1 (Solaris) in workspace /export/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris
Updating http://svn.apache.org/repos/asf/zookeeper/trunk at revision '2015-01-16T08:31:05.200 +'
At revision 1652359
Updating http://svn.apache.org/repos/asf/hadoop/nightly at revision '2015-01-16T08:31:05.200 +'
At revision 1652359
no change for http://svn.apache.org/repos/asf/zookeeper/trunk since the previous build
no change for http://svn.apache.org/repos/asf/hadoop/nightly since the previous build
No emails were triggered.
[locks-and-latches] Checking to see if we really have the locks
[locks-and-latches] Have all the locks, build can start
[ZooKeeper-trunk-solaris] $ /bin/bash /var/tmp/hudson753059925298406493.sh
[trunk] $ /export/home/hudson/hudson-slave/tools/hudson.tasks.Ant_AntInstallation/ant-1.8.2/bin/ant -DBUILD_ARGS=-Dfindbugs.home=${FINDBUGS_HOME} -Dforrest.home=${FORREST_HOME} -Djava5.home=${JAVA5_HOME} -DBUILD_TARGETS=hudson-test-trunk -DANALYSIS_TARGETS=test -DBUILD_FLAGS=-Dtest.junit.output.format=xml -Dtest.output=yes -Dtest.output=yes -Dtest.junit.output.format=xml clean test-core-java
Error: JAVA_HOME is not defined correctly.
We cannot execute /home/jenkins/tools/java/latest1.7/bin/java
Build step 'Invoke Ant' marked build as failure
[locks-and-latches] Releasing all the locks
[locks-and-latches] All the locks released
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

### FAILED TESTS (if any) ###
No tests ran.
[jira] [Updated] (ZOOKEEPER-2101) Transaction larger than max buffer of jute makes zookeeper unavailable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liu Shaohui updated ZOOKEEPER-2101:
---
Attachment: test.diff

[~rakeshr] Added a log in ZKDatabase to validate that the size of a Proposal may be larger than the request size.
{code}
2015-01-16 17:56:07,469 [myid:] - INFO [SyncThread:0:ZKDatabase@261] - Request type 14 size: 5499 zxid: 2, Proposal size:5526
{code}

Transaction larger than max buffer of jute makes zookeeper unavailable
--
Key: ZOOKEEPER-2101
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2101
Project: ZooKeeper
Issue Type: Bug
Components: jute
Affects Versions: 3.4.4
Reporter: Liu Shaohui
Attachments: ZOOKEEPER-2101-v1.diff, test.diff

*Problem*
For a multi operation, PrepRequestProcessor may produce a transaction that is larger than the max buffer size of jute. The readBuffer method of BinaryInputArchive checks the buffer size, but the writeBuffer method of BinaryOutputArchive does not, which causes the following:

1, The leader can sync the transaction to the txn log and send it to the followers, but the followers fail to read the transaction and can't sync with the leader.
{code}
2015-01-04,12:42:26,474 WARN org.apache.zookeeper.server.quorum.Learner: [myid:2] Exception when following the leader
java.io.IOException: Unreasonable length = 2054758
    at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:740)
2015-01-04,12:42:26,475 INFO org.apache.zookeeper.server.quorum.Learner: [myid:2] shutdown called
java.lang.Exception: shutdown Follower
    at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:744)
{code}

2, The leader loses all its followers, which triggers a leader election. The old leader becomes leader again because it has the most up-to-date data.
{code}
2015-01-04,12:42:28,502 INFO org.apache.zookeeper.server.quorum.Leader: [myid:3] Shutting down
2015-01-04,12:42:28,502 INFO org.apache.zookeeper.server.quorum.Leader: [myid:3] Shutdown called
java.lang.Exception: shutdown Leader! reason: Only 1 followers, need 2
    at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:496)
    at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:471)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:753)
{code}

3, The leader cannot load the transaction from the txn log because the length of the data is larger than the max buffer of jute.
{code}
2015-01-04,12:42:31,282 ERROR org.apache.zookeeper.server.quorum.QuorumPeer: [myid:3] Unable to load database on disk
java.io.IOException: Unreasonable length = 2054758
    at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:100)
    at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:602)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:157)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
    at org.apache.zookeeper.server.quorum.QuorumPeer.getLastLoggedZxid(QuorumPeer.java:546)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.getInitLastLoggedZxid(FastLeaderElection.java:690)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:737)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:716)
{code}

The zookeeper service will be unavailable until we enlarge jute.maxbuffer and restart the zookeeper hbase cluster.

*Solution*
Add a buffer size check in BinaryOutputArchive to prevent oversized transactions from being written to the log and sent to followers. But I am not sure whether throwing an IOException in BinaryOutputArchive and the RequestProcessors has side effects.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
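The proposed write-side check could look roughly like the sketch below. It is only an illustration of the idea in the *Solution* paragraph: `BoundedBufferWriter` is a hypothetical stand-in class, not the real `BinaryOutputArchive`, and the fallback value for `jute.maxbuffer` is an assumption. The error text mirrors the "Unreasonable length" message seen in the follower logs above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical stand-in for an output archive; NOT the real jute BinaryOutputArchive. */
final class BoundedBufferWriter {
    // Reads the same system property the report mentions; the fallback is an assumption.
    static final int MAX_BUFFER = Integer.getInteger("jute.maxbuffer", 0xfffff);

    private final DataOutputStream out;
    private final int maxBuffer;

    BoundedBufferWriter(DataOutputStream out, int maxBuffer) {
        this.out = out;
        this.maxBuffer = maxBuffer;
    }

    /** Writes a length-prefixed buffer, rejecting anything a size-checking reader would refuse. */
    void writeBuffer(byte[] buf) throws IOException {
        if (buf == null) {
            out.writeInt(-1); // length -1 marks a null buffer in this sketch
            return;
        }
        if (buf.length > maxBuffer) {
            // Fail before any byte is written, so nothing oversized reaches the log or the wire.
            throw new IOException("Unreasonable length = " + buf.length);
        }
        out.writeInt(buf.length);
        out.write(buf);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BoundedBufferWriter writer = new BoundedBufferWriter(new DataOutputStream(sink), MAX_BUFFER);
        writer.writeBuffer(new byte[16]);
        System.out.println("bytes written: " + sink.size());
    }
}
```

Rejecting the buffer before the length prefix is written keeps the output stream consistent, but callers would still have to handle the exception, which is the side-effect question the reporter leaves open.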
[jira] [Updated] (ZOOKEEPER-1525) Plumb ZooKeeperServer object into auth plugins
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Crowder updated ZOOKEEPER-1525:
---
Attachment: ZOOKEEPER-1525.patch

Updated patch

Plumb ZooKeeperServer object into auth plugins
--
Key: ZOOKEEPER-1525
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1525
Project: ZooKeeper
Issue Type: Improvement
Affects Versions: 3.5.0
Reporter: Warren Turkal
Assignee: Warren Turkal
Fix For: 3.5.1
Attachments: ZOOKEEPER-1525.patch, ZOOKEEPER-1525.patch

I want to plumb the ZooKeeperServer object into the auth plugins so that I can store authentication data in zookeeper itself. With access to the ZooKeeperServer object, I also have access to the ZKDatabase and can look up entries in the local copy of the zookeeper data.

In order to implement this, I make sure that a ZooKeeperServer instance is passed in to the ProviderRegistry.initialize() method. Then initialize() will try to find a constructor for the AuthenticationProvider that takes a ZooKeeperServer instance. If the constructor is found, it will be used. Otherwise, initialize() will look for a constructor that takes no arguments and use that instead.
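The constructor lookup described in the issue can be sketched with plain reflection. Everything below is illustrative only: `FakeServer`, `FakeAuthProvider`, and `ProviderLoader` are hypothetical stand-ins for ZooKeeperServer, AuthenticationProvider, and ProviderRegistry, and the attached patch may do this differently.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for ZooKeeperServer. */
final class FakeServer {}

/** Hypothetical stand-in for AuthenticationProvider. */
interface FakeAuthProvider {
    String scheme();
}

/** A provider that wants access to the server. */
final class ServerAwareProvider implements FakeAuthProvider {
    final FakeServer server;
    public ServerAwareProvider(FakeServer server) { this.server = server; }
    public String scheme() { return "server-aware"; }
}

/** A provider that only has a no-argument constructor. */
final class PlainProvider implements FakeAuthProvider {
    public PlainProvider() {}
    public String scheme() { return "plain"; }
}

final class ProviderLoader {
    /** Prefers a constructor taking the server; falls back to the no-argument one. */
    static FakeAuthProvider instantiate(Class<? extends FakeAuthProvider> cls, FakeServer server)
            throws ReflectiveOperationException {
        try {
            Constructor<? extends FakeAuthProvider> withServer =
                    cls.getDeclaredConstructor(FakeServer.class);
            return withServer.newInstance(server);
        } catch (NoSuchMethodException e) {
            // No server-taking constructor, so use the no-argument constructor instead.
            return cls.getDeclaredConstructor().newInstance();
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        FakeServer server = new FakeServer();
        System.out.println(instantiate(ServerAwareProvider.class, server).scheme());
        System.out.println(instantiate(PlainProvider.class, server).scheme());
    }
}
```

The fallback keeps existing providers working unchanged, since a class that only declares a no-argument constructor is still instantiated the way it was before.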
[jira] [Commented] (ZOOKEEPER-1865) Fix retry logic in Learner.connectToLeader()
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280916#comment-14280916 ]

Jared Cantwell commented on ZOOKEEPER-1865:
---

We just hit this in our internal testing today too. connect() throws a SocketTimeoutException if the specified timeout was reached. This isn't perfect, but could this be leveraged to assume connect was fast if that exception wasn't thrown, and that it was slow otherwise? Unfortunately, if connect() takes just under the timeout to throw a different error, then we'll lose that time. Probably not ideal, but I wanted to suggest it as an option.

Fix retry logic in Learner.connectToLeader()
-
Key: ZOOKEEPER-1865
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1865
Project: ZooKeeper
Issue Type: Bug
Components: server
Reporter: Thawan Kooburat
Assignee: Edward Carter
Fix For: 3.5.1
Attachments: ZOOKEEPER-1865.patch

We discovered a long leader election time today in one of our prod ensembles. Here is a description of the event.

Before the old leader went down, it was able to send out its notification message, so 3 out of 5 servers (including the old leader) elected the old leader as the new leader for the next epoch. While the old leader was being rebooted, the 2 other machines were trying to connect to it. So the quorum couldn't form until those 2 machines gave up and moved on to the next round of leader election.

This is because Learner.connectToLeader() uses simple retry logic. The contract for this method is that it should never spend longer than initLimit trying to connect to the leader. In our outage, each sock.connect() probably blocked for initLimit, and it was called 5 times.
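One way to honor the contract described in the issue is to give the whole retry loop a single time budget and hand each attempt only the time that remains. The sketch below is an assumption-laden illustration, not the attached patch: `BoundedRetry`, `Attempt`, and `connectWithin` are hypothetical names, and in real use the attempt would wrap something like `sock.connect(addr, timeoutMs)`.

```java
import java.io.IOException;

final class BoundedRetry {
    /** Hypothetical hook; a real caller would wrap a socket connect with this timeout. */
    interface Attempt {
        void connect(int timeoutMs) throws IOException;
    }

    /**
     * Retries until success, until maxAttempts is reached, or until budgetMs has elapsed.
     * Returns the number of attempts used; rethrows the last failure otherwise.
     */
    static int connectWithin(Attempt attempt, long budgetMs, int maxAttempts, long pauseMs)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + budgetMs * 1_000_000L;
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                break; // the overall budget is spent, so stop retrying
            }
            try {
                // Each attempt may block only for what is left of the shared budget.
                attempt.connect((int) Math.min(remainingMs, Integer.MAX_VALUE));
                return i;
            } catch (IOException e) {
                last = e;
            }
            if (i < maxAttempts) {
                long sleepMs = Math.min(pauseMs, (deadline - System.nanoTime()) / 1_000_000L);
                if (sleepMs > 0) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last != null ? last : new IOException("connect budget exhausted");
    }
}
```

With the budget set to initLimit, five attempts that each block until they time out can no longer add up to five times initLimit. This also sidesteps the question of telling fast failures from slow ones, because elapsed time is measured directly rather than inferred from the exception type.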
Failed: ZOOKEEPER-1525 PreCommit Build #2481
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1525
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481/

### LAST 60 LINES OF THE CONSOLE ###
[...truncated 348699 lines...]
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no new tests are needed for this patch.
[exec] Also please list what manual steps were performed to verify this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] -1 javac. The applied patch generated 8 javac compiler warnings (more than the trunk's current 6 warnings).
[exec]
[exec] -1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec] -1 core tests. The patch failed core unit tests.
[exec]
[exec] +1 contrib tests. The patch passed contrib unit tests.
[exec]
[exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//testReport/
[exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
[exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//console
[exec]
[exec] This message is automatically generated.
[exec]
[exec]
[exec] ==
[exec] ==
[exec] Adding comment to Jira.
[exec] ==
[exec] ==
[exec]
[exec]
[exec] Comment added.
[exec] a47d28b320620a1f7f9c58b6f123b0b074bc9c4c logged out
[exec]
[exec]
[exec] ==
[exec] ==
[exec] Finished build.
[exec] ==
[exec] ==
[exec]
[exec]

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1714: exec returned: 4

Total time: 45 minutes 24 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2464
Archived 7 artifacts
Archive block size is 32768
Received 8 blocks and 298168 bytes
Compression is 46.8%
Took 0.91 sec
Recording test results
Description set: ZOOKEEPER-1525
Email was triggered for: Failure
Sending email for trigger: Failure

### FAILED TESTS (if any) ###
1 tests failed.
REGRESSION: org.apache.zookeeper.test.ReconfigTest.testPortChange

Error Message:
expected:test[1] but was:test[0]

Stack Trace:
junit.framework.AssertionFailedError: expected:test[1] but was:test[0]
    at org.apache.zookeeper.test.ReconfigTest.testNormalOperation(ReconfigTest.java:151)
    at org.apache.zookeeper.test.ReconfigTest.testPortChange(ReconfigTest.java:600)
    at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
[jira] [Commented] (ZOOKEEPER-1525) Plumb ZooKeeperServer object into auth plugins
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280960#comment-14280960 ]

Hadoop QA commented on ZOOKEEPER-1525:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12692848/ZOOKEEPER-1525.patch
against trunk revision 1646992.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

-1 javac. The applied patch generated 8 javac compiler warnings (more than the trunk's current 6 warnings).

-1 findbugs. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2481//console

This message is automatically generated.
[jira] [Commented] (ZOOKEEPER-1525) Plumb ZooKeeperServer object into auth plugins
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281027#comment-14281027 ]

Tim Crowder commented on ZOOKEEPER-1525:

It looks like the port-change test that failed here was already failing before this patch, e.g. in https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2477/