[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503568#comment-14503568 ]

Chris Nauroth commented on ZOOKEEPER-2124:

See below for an abridged list of the test failures, which I pulled from the Jenkins console log. I expect any test failures are unrelated to this patch, because it's a change in the rpm packaging only. Jenkins doesn't cover this in test runs.

{code}
[exec] [junit] 2015-04-20 19:48:07,327 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED testReadArrayOffsetLength_LengthTooLarge
[exec] [junit] java.lang.IndexOutOfBoundsException
[exec] [junit] 2015-04-20 19:50:46,555 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED testTooManySnapshotsNonessential
[exec] [junit] org.apache.zookeeper.server.quorum.SnapshotThrottleException: new snapshot would make 6 concurrently in progress; maximum is 5
[exec] [junit] 2015-04-20 19:50:46,575 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED testTryWithResourceThrottle
[exec] [junit] org.apache.zookeeper.server.quorum.SnapshotThrottleException: new snapshot would make 2 concurrently in progress; maximum is 1
[exec] [junit] 2015-04-20 19:50:53,062 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED connectionRetryTimeoutTest
[exec] [junit] java.io.IOException: Test injected Socket.connect() error.
[exec] [junit] 2015-04-20 20:25:20,473 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@62] - TEST METHOD FAILED testTwoInvalidHostAddresses
[exec] [junit] java.lang.IllegalArgumentException: A HostProvider may not be empty!
[exec] [exec] Zookeeper_readOnly::testReadOnly : elapsed 4081 : OK
[exec] [exec] /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/c/tests/TestReconfig.cc:183: Assertion: equality assertion failed [Expected: 1, Actual : 0]
{code}

Allow Zookeeper version string to have underscore '_'

Key: ZOOKEEPER-2124
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.4.6
Reporter: Jerry He
Assignee: Jerry He
Fix For: 3.5.1, 3.6.0
Attachments: ZOOKEEPER-2124.001.patch

Using Bigtop or other RPM build for Zookeeper, there is a problem with using the hyphen '-' character in the version string:

{noformat}
[bigdata@bdvs1166 bigtop]$ gradle zookeeper-rpm
:buildSrc:compileJava UP-TO-DATE
:buildSrc:compileGroovy UP-TO-DATE
:buildSrc:processResources UP-TO-DATE
:buildSrc:classes UP-TO-DATE
:buildSrc:jar UP-TO-DATE
:buildSrc:assemble UP-TO-DATE
:buildSrc:compileTestJava UP-TO-DATE
:buildSrc:compileTestGroovy UP-TO-DATE
:buildSrc:processTestResources UP-TO-DATE
:buildSrc:testClasses UP-TO-DATE
:buildSrc:test UP-TO-DATE
:buildSrc:check UP-TO-DATE
:buildSrc:build UP-TO-DATE
:zookeeper_vardefines
:zookeeper-download
:zookeeper-tar
Copy /home/bigdata/bigtop/dl/zookeeper-3.4.6-IBM-1.tar.gz to /home/bigdata/bigtop/build/zookeeper/tar/zookeeper-3.4.6-IBM-1.tar.gz
:zookeeper-srpm
error: line 64: Illegal char '-' in: Version: 3.4.6-IBM-1
:zookeeper-srpm FAILED

FAILURE: Build failed with an exception.

* Where:
Script '/home/bigdata/bigtop/packages.gradle' line: 462

* What went wrong:
Execution failed for task ':zookeeper-srpm'.
Process 'command 'rpmbuild'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED
{noformat}

Also, according to the [rpm-maven-plugin|http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html] documentation:

{noformat}
version
The version number to use for the RPM package. By default, this is the project version. This value cannot contain a dash (-) due to contraints in the RPM file naming convention. Any specified value will be truncated at the first dash

release
The release number of the RPM. Beginning with release 2.0-beta-2, this is an optional parameter. By default, the release will be generated from the modifier portion of the project version using the following rules:
If no modifier exists, the release will be 1.
If the modifier ends with SNAPSHOT, the timestamp (in UTC) of the build will be appended to end.
All instances of '-' in the modifier will be replaced with '_'.
If a modifier exists and does not end with SNAPSHOT, _1 will be appended to end.
{noformat}

We should allow underscore '_' as part of the version string. e.g. 3.4.6_abc_1

-- This message was sent by Atlassian JIRA
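The hyphen-to-underscore mapping described above can be sketched in a few lines of Java. This is only an illustration of the naming rule, not the contents of ZOOKEEPER-2124.001.patch; the class `RpmVersionSketch` and the method `toRpmVersion` are hypothetical names introduced here.

```java
/**
 * Hypothetical helper, not part of ZooKeeper or Bigtop: shows how a
 * project version containing '-' could be mapped to a string that is
 * acceptable in an RPM "Version:" tag.
 */
final class RpmVersionSketch {
    private RpmVersionSketch() {}

    /** Replaces every '-' with '_', mirroring the rule quoted from the plugin docs. */
    static String toRpmVersion(String projectVersion) {
        return projectVersion.replace('-', '_');
    }

    public static void main(String[] args) {
        // The version from the failing Bigtop build in the report above.
        System.out.println(toRpmVersion("3.4.6-IBM-1")); // prints 3.4.6_IBM_1
    }
}
```

A version built this way only helps if ZooKeeper's own build accepts the underscore, which is what the issue asks for.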
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503825#comment-14503825 ]

Hadoop QA commented on ZOOKEEPER-1506:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12726674/ZOOKEEPER-1506-fix.patch
against trunk revision 1672934.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//console

This message is automatically generated.

Re-try DNS hostname -> IP resolution if node connection fails

Key: ZOOKEEPER-1506
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
Project: ZooKeeper
Issue Type: Improvement
Components: server
Affects Versions: 3.4.5
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Assignee: Michi Mutsuzaki
Priority: Critical
Labels: patch
Fix For: 3.4.7, 3.5.1, 3.6.0
Attachments: ZOOKEEPER-1506-fix.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, zk-dns-caching-refresh.patch

In our zoo.cfg we use hostnames to identify the ZK servers that are part of an ensemble. These hostnames are configured with a low (<= 60s) TTL and the IP address they map to can and does change. Our procedure for replacing/upgrading a ZK node is to boot an entirely new instance and remap the hostname to the new instance's IP address. Our expectation is that when the original ZK node is terminated/shutdown, the remaining nodes in the ensemble would reconnect to the new instance.

However, what we are noticing is that the remaining ZK nodes do not attempt to re-resolve the hostname->IP mapping for the new server. Once the original ZK node is terminated, the existing servers continue to attempt contacting it at the old IP address. It would be great if the ZK servers could try to re-resolve the hostname when attempting to connect to a lost ZK server, instead of caching the lookup indefinitely. Currently we must do a rolling restart of the ZK ensemble after swapping a node -- which at three nodes means we periodically lose quorum.

The exact method we are following is to boot new instances in EC2 and attach one of a set of three Elastic IP addresses. External to EC2 this IP address remains the same and maps to whatever instance it is attached to. Internal to EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped to the internal (10.x.y.z) address of the instance it is attached to. Therefore, in our case we would like ZK to pick up the new 10.x.y.z address that the elastic IP hostname gets mapped to and reconnect appropriately.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
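The behaviour requested above can be illustrated with plain java.net classes. An InetSocketAddress built from a hostname resolves it once, at construction, and then keeps that IP; building a fresh instance from the same hostname forces a new lookup. The sketch below is an assumption about one way to do this, not the code in any of the attached patches; `ReResolveSketch` and `reResolve` are hypothetical names.

```java
import java.net.InetSocketAddress;

/** Hypothetical sketch, not ZooKeeper code: re-resolve a peer address before reconnecting. */
final class ReResolveSketch {
    private ReResolveSketch() {}

    /**
     * Builds a new address from the hostname and port of a possibly stale one.
     * The constructor performs a forward lookup (hostname -> IP), while
     * getHostString() returns the stored name without a reverse lookup.
     * Caveat: the JVM may still serve the answer from its own DNS cache
     * (controlled by the networkaddress.cache.ttl security property).
     */
    static InetSocketAddress reResolve(InetSocketAddress stale) {
        return new InetSocketAddress(stale.getHostString(), stale.getPort());
    }

    public static void main(String[] args) {
        // "localhost" stands in for a ZK server hostname from zoo.cfg.
        InetSocketAddress stale = InetSocketAddress.createUnresolved("localhost", 2888);
        InetSocketAddress fresh = reResolve(stale);
        System.out.println(fresh.getHostString() + ":" + fresh.getPort());
    }
}
```

Calling something like `reResolve` only when a connection attempt fails keeps the common path free of extra DNS traffic.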
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503871#comment-14503871 ]

Michi Mutsuzaki commented on ZOOKEEPER-1506:

Thanks Raul for testing this. I'd try replacing calls to getHostName() with getHostString(). For example, I found another one in QuorumCnxManager.java:

org/apache/zookeeper/server/quorum/QuorumCnxManager.java:String addr = self.getElectionAddress().getHostName() + ":" + self.getElectionAddress().getPort();
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503821#comment-14503821 ]

Raul Gutierrez Segales commented on ZOOKEEPER-1506:

actually, maybe it does. not sure my first try was clean. couldn't get a repro after the 2nd try.
[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504028#comment-14504028 ]

Michi Mutsuzaki commented on ZOOKEEPER-2171:

Thanks Raul. We should replace getHostName() with getHostString(), and also remove src/java/main/org/apache/zookeeper/common/HostNameUtils.java. I don't think the code relies on getHostName() performing a reverse DNS lookup, so replacing it with getHostString() shouldn't cause any correctness issues.

avoid reverse lookups in QuorumCnxManager

Key: ZOOKEEPER-2171
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171
Project: ZooKeeper
Issue Type: Bug
Components: quorum
Reporter: Raul Gutierrez Segales
Assignee: Raul Gutierrez Segales
Fix For: 3.5.1, 3.6.0

Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of getHostName() calls in QCM. Besides the overhead, these can cause problems when mixed with failing/mis-configured DNS servers. It would be nice to reduce them, if that doesn't affect operational correctness.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
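The difference between the two calls can be sketched with java.net.InetSocketAddress alone. getHostName() may trigger a reverse lookup (IP -> hostname) when the address was created from an IP, whereas getHostString(), available since Java 7, returns the stored hostname or the literal IP without querying DNS. The class `HostStringSketch` and the method `describe` below are hypothetical names used only for illustration; this is not the QuorumCnxManager code.

```java
import java.net.InetSocketAddress;

/** Hypothetical sketch, not ZooKeeper code: build "host:port" without a reverse lookup. */
final class HostStringSketch {
    private HostStringSketch() {}

    /**
     * Formats an address as host:port using getHostString(), which never
     * performs a reverse DNS lookup. Using getHostName() here instead could
     * block on, or be misled by, a misconfigured resolver.
     */
    static String describe(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // An address created from a literal IP: the IP is returned as-is.
        System.out.println(describe(new InetSocketAddress("10.0.0.5", 3888))); // prints 10.0.0.5:3888
        // An address created from a name: the name is returned as given.
        System.out.println(describe(InetSocketAddress.createUnresolved("zk2.example.com", 3888))); // prints zk2.example.com:3888
    }
}
```

The hostname zk2.example.com and the 10.0.0.5 address are placeholders, not values from the thread.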
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503671#comment-14503671 ]

Michi Mutsuzaki commented on ZOOKEEPER-1506:

The patch uses HostNameUtils.getHostString(), which supposedly avoids a reverse lookup. Maybe there is a bug in HostNameUtils.getHostString()? We can replace HostNameUtils with InetSocketAddress.getHostString() since we now require Java 7.
[jira] [Updated] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michi Mutsuzaki updated ZOOKEEPER-1506:

Attachment: ZOOKEEPER-1506-fix.patch

Raul, could you try ZOOKEEPER-1506-fix.patch and see if it fixes the problem? Thanks!
Re: Jenkins pre-commit build is bogus?
Look near the bottom of the console output. It failed for two reasons: no tests as part of the patch, and the C client tests failed. Jenkins doesn't know how to report the C client tests, just the Java ones.

Patrick

On Mon, Apr 20, 2015 at 2:04 PM, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote:

While looking at ZK-2124, I checked the report #2640 and it says no test failures (https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640/). The Jenkins summary on the jira correctly reports that there have been core test failures, though. Any clue of what needs to be fixed?

-Flavio
[jira] [Commented] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504019#comment-14504019 ]

Raul Gutierrez Segales commented on ZOOKEEPER-2171:

For background/reference see:
https://issues.apache.org/jira/browse/ZOOKEEPER-1666
https://issues.apache.org/jira/browse/ZOOKEEPER-1506
[jira] [Updated] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michi Mutsuzaki updated ZOOKEEPER-2171:

Fix Version/s: 3.6.0, 3.5.1
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname -> IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503799#comment-14503799 ]

Raul Gutierrez Segales commented on ZOOKEEPER-1506:

hmmm, it does not help [~michim]
Failed: ZOOKEEPER-1506 PreCommit Build #2641
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1506
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641/

###
## LAST 60 LINES OF THE CONSOLE
###
[...truncated 364318 lines...]
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no new tests are needed for this patch.
[exec] Also please list what manual steps were performed to verify this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec] -1 core tests. The patch failed core unit tests.
[exec]
[exec] +1 contrib tests. The patch passed contrib unit tests.
[exec]
[exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//testReport/
[exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
[exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2641//console
[exec]
[exec] This message is automatically generated.
[exec]
[exec]
[exec] ==
[exec] ==
[exec] Adding comment to Jira.
[exec] ==
[exec] ==
[exec]
[exec]
[exec] Comment added.
[exec] fb45919bc0f139c7eb57576d1424d0fab620257e logged out
[exec]
[exec]
[exec] ==
[exec] ==
[exec] Finished build.
[exec] ==
[exec] ==
[exec]
[exec]

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1721: exec returned: 2

Total time: 47 minutes 0 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2639
Archived 7 artifacts
Archive block size is 32768
Received 8 blocks and 301134 bytes
Compression is 46.5%
Took 7.4 sec
Recording test results
Description set: ZOOKEEPER-1506
Email was triggered for: Failure
Sending email for trigger: Failure

###
## FAILED TESTS (if any)
##
1 tests failed.
REGRESSION: org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig

Error Message:
waiting for server 2 being up

Stack Trace:
junit.framework.AssertionFailedError: waiting for server 2 being up
at org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentObserverIsParticipantInNewConfig(ReconfigRecoveryTest.java:529)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
Re: [VOTE] Apache ZooKeeper release 3.5.1-alpha candidate 0
On 20 April 2015 at 13:03, Raúl Gutiérrez Segalés r...@itevenworks.net wrote:

-1, alas. I think ZOOKEEPER-1506 could be problematic for some setups. After a couple of elections with a cluster of 5 participants and one observer, I end up with a participant that's unable to find the leader because it does a reverse lookup (IP -> hostname) and ends up with a bogus hostname that it can't resolve:

https://gist.github.com/rgs1/d11822799fdbbfa5d5f2

I don't think the reverse lookup from QuorumCnxManager was done before, nor that it should be done. So it could cause issues in places where reverse lookups aren't fully working. Surely, we could argue that it's a DNS setup issue but I think we should avoid the extra lookup if possible. I'll dig in a bit deeper and try to come up with a deterministic repro.

Commented on ZOOKEEPER-1506: turns out that my issue was with reverse lookup calls that were not introduced by that patch. They seem to have been introduced by ZOOKEEPER-107, so they have been around for a while. The tl;dr is that if your resolvers give you bad reverse names, you'll have issues. It would be nice to avoid these reverse lookups, so I created:

https://issues.apache.org/jira/browse/ZOOKEEPER-2171

After sorting this issue, I tested the following:

* many elections (which look quick)
* creating and deleting ephemerals in a loop (via zk-shell)
* phunt's smoke test scripts (comparable results to 3.5.0)
* partitioning and unpartitioning an attached observer
* use zktraffic's fle-dump and zab-dump to inspect if there were any bogus FLE votes or ZAB messages [0]

All of this looks good! So +1 now :-)

-rgs

p.s.: fwiw, here's my test setup: http://itevenworks.net/zk-releases

[0] https://github.com/twitter/zktraffic

-rgs

On 12 April 2015 at 14:58, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

This is a release candidate for 3.5.1-alpha. The full release notes are available at:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12326786

*** Please download, test and vote by April 25th 2015, 23:59 UTC+0. ***

Source files:
http://people.apache.org/~michim/zookeeper-3.5.1-alpha-candidate-0/

Maven staging repo:
https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.1-alpha/

The tag to be voted upon:
https://svn.apache.org/repos/asf/zookeeper/tags/release-3.5.1-rc0/

ZooKeeper's KEYS file containing PGP keys we use to sign the release:
http://www.apache.org/dist/zookeeper/KEYS

Should we release this candidate?

--Michi
[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503569#comment-14503569 ] Flavio Junqueira commented on ZOOKEEPER-2124: - I had a quick look and the patch seems ok to me. I want to build with ant rpm before I +1 it. I can't do it from this computer. Allow Zookeeper version string to have underscore '_' - Key: ZOOKEEPER-2124 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.6 Reporter: Jerry He Assignee: Jerry He Fix For: 3.5.1, 3.6.0 Attachments: ZOOKEEPER-2124.001.patch Using Bigtop or other RPM build for Zookeeper, there is a problem with using the hyphen '-' character in the version string: {noformat} [bigdata@bdvs1166 bigtop]$ gradle zookeeper-rpm :buildSrc:compileJava UP-TO-DATE :buildSrc:compileGroovy UP-TO-DATE :buildSrc:processResources UP-TO-DATE :buildSrc:classes UP-TO-DATE :buildSrc:jar UP-TO-DATE :buildSrc:assemble UP-TO-DATE :buildSrc:compileTestJava UP-TO-DATE :buildSrc:compileTestGroovy UP-TO-DATE :buildSrc:processTestResources UP-TO-DATE :buildSrc:testClasses UP-TO-DATE :buildSrc:test UP-TO-DATE :buildSrc:check UP-TO-DATE :buildSrc:build UP-TO-DATE :zookeeper_vardefines :zookeeper-download :zookeeper-tar Copy /home/bigdata/bigtop/dl/zookeeper-3.4.6-IBM-1.tar.gz to /home/bigdata/bigtop/build/zookeeper/tar/zookeeper-3.4.6-IBM-1.tar.gz :zookeeper-srpm error: line 64: Illegal char '-' in: Version: 3.4.6-IBM-1 :zookeeper-srpm FAILED FAILURE: Build failed with an exception. * Where: Script '/home/bigdata/bigtop/packages.gradle' line: 462 * What went wrong: Execution failed for task ':zookeeper-srpm'. Process 'command 'rpmbuild'' finished with non-zero exit value 1 * Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. 
BUILD FAILED {noformat} Also, according to the [rpm-maven-plugin|http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html] documentation: {noformat} version The version number to use for the RPM package. By default, this is the project version. This value cannot contain a dash (-) due to constraints in the RPM file naming convention. Any specified value will be truncated at the first dash. release The release number of the RPM. Beginning with release 2.0-beta-2, this is an optional parameter. By default, the release will be generated from the modifier portion of the project version using the following rules: If no modifier exists, the release will be 1. If the modifier ends with SNAPSHOT, the timestamp (in UTC) of the build will be appended to end. All instances of '-' in the modifier will be replaced with '_'. If a modifier exists and does not end with SNAPSHOT, _1 will be appended to end. {noformat} We should allow underscore '_' as part of the version string, e.g. 3.4.6_abc_1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
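To make the quoted rpm-maven-plugin rules concrete, here is a minimal Java sketch of how a project version might be split into an RPM Version and Release. It only mirrors the documentation text quoted in the issue and is not the plugin's actual code. The class name RpmIdent, the reading of "modifier" as everything after the first dash, and the yyyyMMddHHmmss timestamp layout are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative only: follows the documentation text quoted in the issue,
// not the rpm-maven-plugin source.
final class RpmIdent {
    // Assumption: the quoted text does not say how the timestamp is laid out.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    /** RPM version: the project version truncated at the first dash. */
    static String version(String projectVersion) {
        int dash = projectVersion.indexOf('-');
        return dash < 0 ? projectVersion : projectVersion.substring(0, dash);
    }

    /** RPM release: derived from the modifier (assumed: text after the first dash). */
    static String release(String projectVersion, Instant buildTime) {
        int dash = projectVersion.indexOf('-');
        if (dash < 0) {
            return "1"; // no modifier exists
        }
        // All '-' in the modifier are replaced with '_'.
        String modifier = projectVersion.substring(dash + 1).replace('-', '_');
        if (modifier.endsWith("SNAPSHOT")) {
            return modifier + STAMP.format(buildTime); // UTC build timestamp appended
        }
        return modifier + "_1";
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        for (String v : new String[] {"3.4.6", "3.4.6-IBM-1", "3.4.6_abc_1"}) {
            System.out.println(v + " -> Version: " + version(v)
                    + ", Release: " + release(v, now));
        }
    }
}
```

Under these assumptions a version such as 3.4.6-IBM-1 loses its suffix from the Version field, while 3.4.6_abc_1 passes through unchanged, which is the motivation for allowing '_' in the version string.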
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname - IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503846#comment-14503846 ] Raul Gutierrez Segales commented on ZOOKEEPER-1506: --- It doesn't. I got a consistent repro by first firewalling the participant with id 0, to force that code path. I'll try reverting the patch entirely and see if that helps. Re-try DNS hostname - IP resolution if node connection fails - Key: ZOOKEEPER-1506 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1506 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.4.5 Environment: Ubuntu 11.04 64-bit Reporter: Mike Heffner Assignee: Michi Mutsuzaki Priority: Critical Labels: patch Fix For: 3.4.7, 3.5.1, 3.6.0 Attachments: ZOOKEEPER-1506-fix.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, ZOOKEEPER-1506.patch, zk-dns-caching-refresh.patch In our zoo.cfg we use hostnames to identify the ZK servers that are part of an ensemble. These hostnames are configured with a low (= 60s) TTL and the IP address they map to can and does change. Our procedure for replacing/upgrading a ZK node is to boot an entirely new instance and remap the hostname to the new instance's IP address. Our expectation is that when the original ZK node is terminated/shutdown, the remaining nodes in the ensemble would reconnect to the new instance. However, what we are noticing is that the remaining ZK nodes do not attempt to re-resolve the hostname-IP mapping for the new server. Once the original ZK node is terminated, the existing servers continue to attempt contacting it at the old IP address. It would be great if the ZK servers could try to re-resolve the hostname when attempting to connect to a lost ZK server, instead of caching the lookup indefinitely. 
Currently we must do a rolling restart of the ZK ensemble after swapping a node -- which at three nodes means we periodically lose quorum. The exact method we are following is to boot new instances in EC2 and attach one of a set of three Elastic IP addresses. External to EC2 this IP address remains the same and maps to whatever instance it is attached to. Internal to EC2, the elastic IP hostname has a TTL of about 45-60 seconds and is remapped to the internal (10.x.y.z) address of the instance it is attached to. Therefore, in our case we would like ZK to pick up the new 10.x.y.z address that the elastic IP hostname gets mapped to and reconnect appropriately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
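The behaviour requested in this issue can be sketched in a few lines of Java. An InetSocketAddress built from a hostname resolves that name once, in its constructor, so an object that is kept around keeps pointing at the old IP. One possible approach, shown here as an assumption and not as the content of the attached patches, is to store the configured host string and port and build a fresh InetSocketAddress before each connection attempt. PeerAddress is a hypothetical name, not a ZooKeeper class.

```java
import java.net.InetSocketAddress;

// Hypothetical holder for one server entry from zoo.cfg; not ZooKeeper code.
final class PeerAddress {
    private final String host; // host exactly as written in the configuration
    private final int port;

    PeerAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /**
     * Builds a new InetSocketAddress on every call. The constructor performs
     * the hostname -> IP resolution, so each call asks the resolver again
     * instead of reusing an address resolved at startup.
     */
    InetSocketAddress resolve() {
        return new InetSocketAddress(host, port);
    }

    public static void main(String[] args) {
        PeerAddress peer = new PeerAddress("127.0.0.1", 2888);
        System.out.println("connecting to " + peer.resolve());
    }
}
```

Note that the JVM keeps its own cache of successful lookups (controlled by the networkaddress.cache.ttl security property), so a fresh object alone may not see a remapped name until that cache entry expires.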
Failed: ZOOKEEPER-2124 PreCommit Build #2640
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2124 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 365992 lines...] [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no new tests are needed for this patch. [exec] Also please list what manual steps were performed to verify this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] -1 core tests. The patch failed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 05b41ef08f52a6976d00c999343f27a0e3a7fbae logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1721: exec returned: 2 Total time: 51 minutes 4 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2639 Archived 24 artifacts Archive block size is 32768 Received 8 blocks and 33613697 bytes Compression is 0.8% Took 10 sec Recording test results Description set: ZOOKEEPER-2124 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
Re: Jenkins pre-commit build is bogus?
In addition to the C client test failures, there were also JUnit failures. One example is ByteBufferInputStreamTest. Here we can see that Jenkins reports it as Passed, but the console output shows that it failed. https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640/testReport/org.apache.zookeeper.server/ByteBufferInputStreamTest/testReadArrayOffsetLength_0Length/ As I recall, the Jenkins JUnit reporting mechanism works via string parsing of the console output. I wonder if something about an exception propagating out of JUnit4ZKTestRunner causes console output that the Jenkins reporting doesn't understand. Chris Nauroth Hortonworks http://hortonworks.com/ On 4/20/15, 2:18 PM, Patrick Hunt ph...@apache.org wrote: Look near the bottom of the console output. It failed for two reasons - no tests as part of the patch, and the C client tests failed. Jenkins doesn't know how to report the C client tests, just the Java ones. Patrick On Mon, Apr 20, 2015 at 2:04 PM, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: While looking at ZK-2124, I checked the report #2640 and it says no test failures (https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640/). The Jenkins summary on the jira correctly reports that there have been core test failures, though. Any clue of what needs to be fixed? -Flavio
[jira] [Created] (ZOOKEEPER-2171) avoid reverse lookups in QuorumCnxManager
Raul Gutierrez Segales created ZOOKEEPER-2171: - Summary: avoid reverse lookups in QuorumCnxManager Key: ZOOKEEPER-2171 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2171 Project: ZooKeeper Issue Type: Bug Components: quorum Reporter: Raul Gutierrez Segales Assignee: Raul Gutierrez Segales Apparently, ZOOKEEPER-107 (via a quick git-blame look) introduced a bunch of getHostName() calls in QCM. Besides the overhead, these can cause problems when mixed with failing/mis-configured DNS servers. It would be nice to reduce them, if that doesn't affect operational correctness. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
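As a rough illustration of what avoiding these lookups could look like, the sketch below formats a peer address with InetSocketAddress.getHostString() (available since Java 7), which returns the hostname if one is already known, or else the textual IP, without attempting a reverse lookup. getHostName() on an address created from an IP literal may go to the resolver. The class and method names are hypothetical and this is not the change made for this issue.

```java
import java.net.InetSocketAddress;

// Hypothetical helper; not part of QuorumCnxManager.
final class HostStrings {
    /**
     * Formats an address as host:port for logging or reconnecting.
     * getHostString() never triggers a reverse (IP -> hostname) lookup,
     * unlike getHostName() on an address built from an IP literal.
     */
    static String describe(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // An address given as an IP literal, as in a zoo.cfg with only IPs.
        InetSocketAddress addr = new InetSocketAddress("10.0.0.1", 3888);
        System.out.println(describe(addr));
    }
}
```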
Re: [VOTE] Apache ZooKeeper release 3.5.1-alpha candidate 0
Thanks Raul. I'd like to include the fix ( https://issues.apache.org/jira/browse/ZOOKEEPER-2171 ) in 3.5.1. I'll create another candidate once the issue is resolved. In the meantime, please let me know if you guys have any other feedback regarding this release candidate. On Mon, Apr 20, 2015 at 1:44 PM, Raúl Gutiérrez Segalés r...@itevenworks.net wrote: On 20 April 2015 at 13:18, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: Please reopen ZK-1506, Raul. Done - I'll post my (hopefully reproducible) setup in a bit. I guess that patch might be triggering reverse lookups as an (undesired) side effect. -rgs
[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503548#comment-14503548 ] Hadoop QA commented on ZOOKEEPER-2124: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726641/ZOOKEEPER-2124.001.patch against trunk revision 1672934. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2640//console This message is automatically generated. 
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname - IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504004#comment-14504004 ] Raul Gutierrez Segales commented on ZOOKEEPER-1506: --- So I created a build with ZOOKEEPER-1506 removed and I still get the problem. It's probably due to the getHostName() calls that you pointed out. These calls can actually generate reverse lookups according to: http://download.java.net/jdk7/archive/b123/docs/api/java/net/InetSocketAddress.html#getHostName%28%29 However, these calls were introduced by ZOOKEEPER-107 (according to git-blame). I think we should avoid them, though let's do that in another ticket. In conclusion, if you have a bad resolver or bogus reverse lookups (as is the case in my test scenario), you'll have issues because of these calls.
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname - IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504007#comment-14504007 ] Raul Gutierrez Segales commented on ZOOKEEPER-1506: --- I'll go ahead and close this again [~michim].
[jira] [Commented] (ZOOKEEPER-1604) remove rpm/deb/... packaging
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504098#comment-14504098 ] Ralph Tice commented on ZOOKEEPER-1604: --- There definitely wasn't a strong reason, and I've come around on this issue. I don't think I'm supposed to minus or plus things either, sorry. remove rpm/deb/... packaging Key: ZOOKEEPER-1604 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1604 Project: ZooKeeper Issue Type: Task Components: build Affects Versions: 3.3.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.5.2, 3.6.0 Remove rpm/deb/... packaging from our source repo. Now that BigTop is available and fully supporting ZK it's no longer necessary for us to attempt to include this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-1506) Re-try DNS hostname - IP resolution if node connection fails
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503564#comment-14503564 ] Raul Gutierrez Segales commented on ZOOKEEPER-1506: --- I am running elections (in a 5 participants + 1 observer cluster) as part of validating the 3.5.1 alpha rc proposed by [~michim]. I am getting this from time to time: https://gist.github.com/rgs1/d11822799fdbbfa5d5f2 I only have IP addresses in zoo.cfg and this patch seems to be triggering a reverse lookup (IP -> hostname). Given that in my current setup (a test setup, with systemd-nspawn containers) hostnames don't necessarily resolve back (i.e., hostname -> IP doesn't work), participants might end up unable to connect to the leader if it's initially unavailable. Is the reverse lookup (IP -> hostname) something expected with this patch or a side effect? I don't see why we'd ever want/need that reverse lookup given that it could be problematic in some setups. Thoughts? p.s.: will post my entire, reproducible, setup a bit later.
Re: [VOTE] Apache ZooKeeper release 3.5.1-alpha candidate 0
On 20 April 2015 at 13:18, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: Please reopen ZK-1506, Raul. Done - I'll post my (hopefully reproducible) setup in a bit. I guess that patch might be triggering reverse lookups as an (undesired) side effect. -rgs
[jira] [Commented] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503614#comment-14503614 ] Flavio Junqueira commented on ZOOKEEPER-2124: - There are multiple jiras open for TestReconfig.cc, I think the latest is ZOOKEEPER-2152. In general, I agree that the changes here shouldn't cause test failures.
Re: [VOTE] Apache ZooKeeper release 3.5.1-alpha candidate 0
Please reopen ZK-1506, Raul. -Flavio On 20 Apr 2015, at 21:03, Raúl Gutiérrez Segalés r...@itevenworks.net wrote: -1, alas. I think ZOOKEEPER-1506 could be problematic for some setups. After a couple of elections with a cluster of 5 participants and one observer, I end up with a participant that's unable to find the leader because it does a reverse lookup (IP -> hostname) and ends up with a bogus hostname that it can't resolve: https://gist.github.com/rgs1/d11822799fdbbfa5d5f2 I don't think the reverse lookup from QuorumCnxManager was done before, nor that it should be done. So it could cause issues in places where reverse lookups aren't fully working. Surely, we could argue that it's a DNS setup issue but I think we should avoid the extra lookup if possible. I'll dig in a bit deeper and try to come up with a deterministic repro. -rgs On 12 April 2015 at 14:58, Michi Mutsuzaki mi...@cs.stanford.edu wrote: This is a release candidate for 3.5.1-alpha. The full release notes are available at: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801version=12326786 *** Please download, test and vote by April 25th 2015, 23:59 UTC+0. *** Source files: http://people.apache.org/~michim/zookeeper-3.5.1-alpha-candidate-0/ Maven staging repo: https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.1-alpha/ The tag to be voted upon: https://svn.apache.org/repos/asf/zookeeper/tags/release-3.5.1-rc0/ ZooKeeper's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/zookeeper/KEYS Should we release this candidate? --Michi
[jira] [Commented] (ZOOKEEPER-2170) Zookeeper is not logging as per the configuraiton in log4j.properties
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502672#comment-14502672 ] Arshad Mohammad commented on ZOOKEEPER-2170: log4j gives preference to system properties over properties in log4j.properties. zookeeper.root.logger is configured in log4j.properties, and it is also passed as a system property from zkServer.sh: {code} nohup $JAVA $ZOO_DATADIR_AUTOCREATE -Dzookeeper.log.dir=${ZOO_LOG_DIR} -Dzookeeper.log.file=${ZOO_LOG_FILE} \ -Dzookeeper.root.logger=${ZOO_LOG4J_PROP} \ {code} This is the reason the {color:red}zookeeper.root.logger{color} property in {color:red}log4j.properties{color} is never used. Zookeeper is not logging as per the configuraiton in log4j.properties -- Key: ZOOKEEPER-2170 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2170 Project: ZooKeeper Issue Type: Bug Reporter: Arshad Mohammad Fix For: 3.6.0 In conf/log4j.properties the default root logger is {code} zookeeper.root.logger=INFO, CONSOLE {code} Changing the root logger to the value below, or to any other value, does not change the logging behavior: {code} zookeeper.root.logger=DEBUG, ROLLINGFILE {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
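The precedence described in this comment can be simulated without log4j on the classpath. The sketch below is plain Java that imitates the lookup order (system property first, then the value from the properties file) and is not log4j's own code. The class name SubstitutionOrder and the example values are illustrative assumptions.

```java
import java.util.Properties;

// Imitates the lookup order described in the comment; not log4j code.
final class SubstitutionOrder {
    /** Returns the system property if set, otherwise the value from the file. */
    static String lookup(String key, Properties fileProps) {
        String fromSystem = System.getProperty(key);
        return fromSystem != null ? fromSystem : fileProps.getProperty(key);
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("zookeeper.root.logger", "DEBUG, ROLLINGFILE"); // edited file value

        // Stand-in for -Dzookeeper.root.logger=... passed on the command line.
        System.setProperty("zookeeper.root.logger", "INFO, CONSOLE");
        System.out.println(lookup("zookeeper.root.logger", file));

        // Without the -D flag, the file value is used.
        System.clearProperty("zookeeper.root.logger");
        System.out.println(lookup("zookeeper.root.logger", file));
    }
}
```

Seen this way, editing log4j.properties has no effect as long as the start script always passes -Dzookeeper.root.logger, which matches the behaviour reported in the issue.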
[jira] [Created] (ZOOKEEPER-2170) Zookeeper is not logging as per the configuraiton in log4j.properties
Arshad Mohammad created ZOOKEEPER-2170: -- Summary: Zookeeper is not logging as per the configuraiton in log4j.properties Key: ZOOKEEPER-2170 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2170 Project: ZooKeeper Issue Type: Bug Reporter: Arshad Mohammad Fix For: 3.6.0 In conf/log4j.properties the default root logger is {code} zookeeper.root.logger=INFO, CONSOLE {code} Changing the root logger to the value below, or to any other value, does not change the logging behavior: {code} zookeeper.root.logger=DEBUG, ROLLINGFILE {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
ZooKeeper-trunk - Build # 2666 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk/2666/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 366758 lines...] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@log_env@988: Client environment:os.arch=3.13.0-36-lowlatency] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@log_env@989: Client environment:os.version=#63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@log_env@997: Client environment:user.name=jenkins] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@log_env@1005: Client environment:user.home=/home/jenkins] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@log_env@1017: Client environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/test-cppunit] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34daa1d540):ZOO_INFO@zookeeper_init_internal@1060: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x45d350 sessionId=0 sessionPasswd=null context=0x7fffe6fe6390 flags=0] [exec] Log Message Received: [2015-04-20 11:19:38,541:13675(0x2b34dd0ad700):ZOO_INFO@check_events@2298: initiated connection to server [127.0.0.1:22181]] [exec] Log Message Received: [2015-04-20 11:19:38,544:13675(0x2b34dd0ad700):ZOO_INFO@check_events@2350: session establishment complete on server [127.0.0.1:22181], sessionId=0x1000531a647000f, negotiated timeout=1 ] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 10086 : OK [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testNullData : elapsed 1020 : OK [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1003 : OK [exec] 
Zookeeper_simpleSystem::testCreate : elapsed 1007 : OK [exec] Zookeeper_simpleSystem::testPath : elapsed 1012 : OK [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1035 : OK [exec] Zookeeper_simpleSystem::testPing : elapsed 17155 : OK [exec] Zookeeper_simpleSystem::testAcl : elapsed 1012 : OK [exec] Zookeeper_simpleSystem::testChroot : elapsed 3032 : OK [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper server started : elapsed 30192 : OK [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1021 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 14352 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 14345 : OK [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1029 : OK [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4510 : OK [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 4166 : OK [exec] Zookeeper_readOnly::testReadOnly : elapsed 4080 : OK [exec] OK (72) [exec] PASS: zktest-mt [exec] == [exec] 1 of 2 tests failed [exec] Please report to u...@zookeeper.apache.org [exec] == [exec] make[1]: Leaving directory `/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/test-cppunit' [exec] make[1]: *** [check-TESTS] Error 1 [exec] make: *** [check-am] Error 2 BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1428: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1388: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1398: exec returned: 2 Total time: 49 minutes 46 seconds Build step 'Execute shell' marked build as failure [FINDBUGS] Skipping publisher since build result is FAILURE 
[WARNINGS] Skipping publisher since build result is FAILURE Archiving artifacts Sending artifact delta relative to ZooKeeper-trunk #2664 Archived 5 artifacts Archive block size is 32768 Received 2 blocks and 18211338 bytes Compression is 0.4% Took 7.7 sec Recording fingerprints Recording test results Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
ZooKeeper-trunk-solaris - Build # 1007 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/1007/ ### ## LAST 60 LINES OF THE CONSOLE ### Started by timer Building remotely on solaris1 (Solaris) in workspace /export/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris Updating http://svn.apache.org/repos/asf/zookeeper/trunk at revision '2015-04-20T08:31:10.580 +' At revision 1674756 Updating http://svn.apache.org/repos/asf/hadoop/nightly at revision '2015-04-20T08:31:10.580 +' At revision 1674756 no change for http://svn.apache.org/repos/asf/zookeeper/trunk since the previous build no change for http://svn.apache.org/repos/asf/hadoop/nightly since the previous build No emails were triggered. [locks-and-latches] Checking to see if we really have the locks [locks-and-latches] Have all the locks, build can start [ZooKeeper-trunk-solaris] $ /bin/bash /var/tmp/hudson9029692132330793769.sh /var/tmp/hudson9029692132330793769.sh: line 12: ant: command not found Build step 'Execute shell' marked build as failure [locks-and-latches] Releasing all the locks [locks-and-latches] All the locks released Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
ZooKeeper_branch35_jdk7 - Build # 270 - Failure
See https://builds.apache.org/job/ZooKeeper_branch35_jdk7/270/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 370863 lines...] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@987: Client environment:os.name=Linux] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@988: Client environment:os.arch=3.13.0-30-generic] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@989: Client environment:os.version=#54-Ubuntu SMP Mon Jun 9 22:45:01 UTC 2014] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@997: Client environment:user.name=jenkins] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@1005: Client environment:user.home=/home/jenkins] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@log_env@1017: Client environment:user.dir=/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build/test/test-cppunit] [exec] Log Message Received: [2015-04-20 10:52:15,951:1756(0x2af726158780):ZOO_INFO@zookeeper_init_internal@1060: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x45d350 sessionId=0 sessionPasswd=null context=0x77f73c10 flags=0] [exec] Log Message Received: [2015-04-20 10:52:15,952:1756(0x2af728815700):ZOO_INFO@check_events@2298: initiated connection to server [127.0.0.1:22181]] [exec] Log Message Received: [2015-04-20 10:52:15,954:1756(0x2af728815700):ZOO_INFO@check_events@2350: session establishment complete on server [127.0.0.1:22181], sessionId=0x1c1a56a000f, negotiated timeout=1 ] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 10575 : OK [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testNullData : 
elapsed 1019 : OK [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1004 : OK [exec] Zookeeper_simpleSystem::testCreate : elapsed 1008 : OK [exec] Zookeeper_simpleSystem::testPath : elapsed 1012 : OK [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1032 : OK [exec] Zookeeper_simpleSystem::testPing : elapsed 17190 : OK [exec] Zookeeper_simpleSystem::testAcl : elapsed 1012 : OK [exec] Zookeeper_simpleSystem::testChroot : elapsed 3029 : OK [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper server started : elapsed 31118 : OK [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1022 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 15720 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 15686 : OK [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1025 : OK [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4511 : OK [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 4720 : OK [exec] Zookeeper_readOnly::testReadOnly : elapsed 4340 : OK [exec] /jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/TestReconfig.cc:183: Assertion: equality assertion failed [Expected: 1, Actual : 0] [exec] /jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/src/c/tests/TestReconfig.cc:474: Assertion: assertion failed [Expression: found != string::npos, 10.10.10.4:2004 not in newComing list] [exec] Failures !!! 
[exec] Run: 72 Failure total: 2 Failures: 2 Errors: 0 [exec] FAIL: zktest-mt [exec] make[1]: *** [check-TESTS] Error 1 [exec] make: *** [check-am] Error 2 [exec] == [exec] 1 of 2 tests failed [exec] Please report to u...@zookeeper.apache.org [exec] == [exec] make[1]: Leaving directory `/jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build/test/test-cppunit' BUILD FAILED /jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build.xml:1428: The following error occurred while executing this line: /jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build.xml:1388: The following error occurred while executing this line: /jenkins/workspace/ZooKeeper_branch35_jdk7/branch-3.5/build.xml:1398: exec returned: 2 Total time: 51 minutes 49 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Sending artifact delta relative to ZooKeeper_branch35_jdk7 #269 Archived 2 artifacts Archive
[jira] [Commented] (ZOOKEEPER-1604) remove rpm/deb/... packaging
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503502#comment-14503502 ] Flavio Junqueira commented on ZOOKEEPER-1604: - I don't think there is a resolution. [~ralph.tice] isn't in favor of relying only on BigTop, but frankly I don't understand whether there is a strong reason behind it. If we are to maintain our own packaging, then I'd like to know who is interested in maintaining it. In any case, I'll have a look at ZOOKEEPER-2124. remove rpm/deb/... packaging Key: ZOOKEEPER-1604 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1604 Project: ZooKeeper Issue Type: Task Components: build Affects Versions: 3.3.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.5.2, 3.6.0 Remove rpm/deb/... packaging from our source repo. Now that BigTop is available and fully supporting ZK it's no longer necessary for us to attempt to include this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Apache ZooKeeper release 3.5.1-alpha candidate 0
-1, alas. I think ZOOKEEPER-1506 could be problematic for some setups. After a couple of elections with a cluster of 5 participants and one observer, I end up with a participant that's unable to find the leader because it does a reverse lookup (IP -> hostname) and ends up with a bogus hostname that it can't resolve: https://gist.github.com/rgs1/d11822799fdbbfa5d5f2 I don't think the reverse lookup from QuorumCnxManager was done before, nor that it should be done. So it could cause issues in places where reverse lookups aren't fully working. Surely, we could argue that it's a DNS setup issue, but I think we should avoid the extra lookup if possible. I'll dig in a bit deeper and try to come up with a deterministic repro. -rgs On 12 April 2015 at 14:58, Michi Mutsuzaki mi...@cs.stanford.edu wrote: This is a release candidate for 3.5.1-alpha. The full release notes are available at: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310801&version=12326786 *** Please download, test and vote by April 25th 2015, 23:59 UTC+0. *** Source files: http://people.apache.org/~michim/zookeeper-3.5.1-alpha-candidate-0/ Maven staging repo: https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.1-alpha/ The tag to be voted upon: https://svn.apache.org/repos/asf/zookeeper/tags/release-3.5.1-rc0/ ZooKeeper's KEYS file containing PGP keys we use to sign the release: http://www.apache.org/dist/zookeeper/KEYS Should we release this candidate? --Michi
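The failure mode described in this message can be illustrated with a small sketch. It is not ZooKeeper's QuorumCnxManager code; the function name `resolve_via_reverse_lookup` and the injectable `reverse`/`forward` parameters are hypothetical, introduced only to show why an extra IP -> hostname -> IP round trip can lose an address that was usable to begin with.

```python
import socket


def resolve_via_reverse_lookup(ip,
                               reverse=socket.gethostbyaddr,
                               forward=socket.gethostbyname):
    """Map ip -> hostname -> ip, returning None if either lookup fails.

    Hypothetical helper: the resolver functions are parameters so the
    behaviour can be exercised without touching real DNS.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        hostname = reverse(ip)[0]
        # The forward lookup fails if the reverse record is bogus.
        return forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return None


def bogus_reverse(ip):
    # Simulates a reverse DNS record pointing at an unresolvable name.
    return ("bogus.invalid", [], [ip])


def failing_forward(hostname):
    raise socket.gaierror("cannot resolve " + hostname)


if __name__ == "__main__":
    # The IP was directly usable, but the round trip discards it.
    print(resolve_via_reverse_lookup("10.0.0.5", bogus_reverse, failing_forward))
```

Connecting to the address as configured, without the reverse step, avoids depending on reverse DNS being set up correctly, which is the concern raised above.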
[jira] [Reopened] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth reopened ZOOKEEPER-2124: -- Allow Zookeeper version string to have underscore '_' - Key: ZOOKEEPER-2124 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.6 Reporter: Jerry He Assignee: Jerry He Fix For: 3.5.1, 3.6.0 Using Bigtop or other RPM build for Zookeeper, there is a problem with using the hyphen '-' character in the version string: {noformat} [bigdata@bdvs1166 bigtop]$ gradle zookeeper-rpm :buildSrc:compileJava UP-TO-DATE :buildSrc:compileGroovy UP-TO-DATE :buildSrc:processResources UP-TO-DATE :buildSrc:classes UP-TO-DATE :buildSrc:jar UP-TO-DATE :buildSrc:assemble UP-TO-DATE :buildSrc:compileTestJava UP-TO-DATE :buildSrc:compileTestGroovy UP-TO-DATE :buildSrc:processTestResources UP-TO-DATE :buildSrc:testClasses UP-TO-DATE :buildSrc:test UP-TO-DATE :buildSrc:check UP-TO-DATE :buildSrc:build UP-TO-DATE :zookeeper_vardefines :zookeeper-download :zookeeper-tar Copy /home/bigdata/bigtop/dl/zookeeper-3.4.6-IBM-1.tar.gz to /home/bigdata/bigtop/build/zookeeper/tar/zookeeper-3.4.6-IBM-1.tar.gz :zookeeper-srpm error: line 64: Illegal char '-' in: Version: 3.4.6-IBM-1 :zookeeper-srpm FAILED FAILURE: Build failed with an exception. * Where: Script '/home/bigdata/bigtop/packages.gradle' line: 462 * What went wrong: Execution failed for task ':zookeeper-srpm'. Process 'command 'rpmbuild'' finished with non-zero exit value 1 * Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. BUILD FAILED {noformat} Also, according to the [rpm-maven-plugin|http://mojo.codehaus.org/rpm-maven-plugin/ident-params.html] documentation: {noformat} version The version number to use for the RPM package. By default, this is the project version. This value cannot contain a dash (-) due to constraints in the RPM file naming convention. 
Any specified value will be truncated at the first dash release The release number of the RPM. Beginning with release 2.0-beta-2, this is an optional parameter. By default, the release will be generated from the modifier portion of the project version using the following rules: If no modifier exists, the release will be 1. If the modifier ends with SNAPSHOT, the timestamp (in UTC) of the build will be appended to end. All instances of '-' in the modifier will be replaced with '_'. If a modifier exists and does not end with SNAPSHOT, _1 will be appended to end. {noformat} We should allow underscore '_' as part of the version string. e.g. 3.4.6_abc_1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
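The version and release rules quoted above from the rpm-maven-plugin documentation, and the hyphen-to-underscore replacement this issue asks for, can be sketched as follows. This is a minimal illustration, not the plugin's implementation and not the attached patch; the function names and the timestamp format used for SNAPSHOT builds are assumptions.

```python
from datetime import datetime, timezone


def split_rpm_version(project_version, now=None):
    """Derive (version, release) per the rules quoted above.

    Hypothetical helper: the timestamp format below is an assumption,
    since the quoted documentation does not specify one.
    """
    # The version is truncated at the first dash; the rest is the modifier.
    version, dash, modifier = project_version.partition("-")
    if not dash:
        # No modifier: the release is 1.
        return version, "1"
    # All '-' in the modifier are replaced with '_'.
    release = modifier.replace("-", "_")
    if modifier.endswith("SNAPSHOT"):
        # SNAPSHOT: append the build timestamp in UTC.
        now = now or datetime.now(timezone.utc)
        release += now.strftime("%Y%m%d%H%M%S")
    else:
        # Any other modifier: append _1.
        release += "_1"
    return version, release


def underscore_version(project_version):
    """Keep the whole string as the version, swapping '-' for '_'."""
    return project_version.replace("-", "_")


if __name__ == "__main__":
    print(split_rpm_version("3.4.6-IBM-1"))
    print(underscore_version("3.4.6-IBM-1"))
```

The first function discards the vendor suffix from the version field, while the second keeps it in a form without hyphens, such as the 3.4.6_abc_1 example given in the issue.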
[jira] [Updated] (ZOOKEEPER-2124) Allow Zookeeper version string to have underscore '_'
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated ZOOKEEPER-2124: - Attachment: ZOOKEEPER-2124.001.patch I'm attaching a patch for consideration that replaces hyphen with underscore automatically in build.xml. While testing this, I also discovered an old email thread with some rpm fixes that never made it into the source tree. http://mail-archives.apache.org/mod_mbox/zookeeper-user/201212.mbox/%3c50d2d481.8010...@pt-consulting.eu%3E I'm not including those fixes in my patch file, because I think it would be more appropriate to track them separately and ensure they get credited to the original author if accepted. ZOOKEEPER-1604 was an old proposal to remove the packaging entirely from the project. I'm going to follow up on that to see if there has been a resolution to that discussion. Allow Zookeeper version string to have underscore '_' - Key: ZOOKEEPER-2124 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2124 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.6 Reporter: Jerry He Assignee: Jerry He Fix For: 3.5.1, 3.6.0 Attachments: ZOOKEEPER-2124.001.patch
[jira] [Commented] (ZOOKEEPER-1604) remove rpm/deb/... packaging
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503483#comment-14503483 ] Chris Nauroth commented on ZOOKEEPER-1604: -- I'm curious if there has been any resolution to this discussion. If there is a decision to maintain the packaging targets, then would someone please review ZOOKEEPER-2124? This would fix the rpm packaging for -alpha releases. (The hyphen character is illegal in an rpm version.) There were also some earlier rpm fixes discussed in an email thread that would need to be committed and credited to the original author. http://mail-archives.apache.org/mod_mbox/zookeeper-user/201212.mbox/%3c50d2d481.8010...@pt-consulting.eu%3E remove rpm/deb/... packaging Key: ZOOKEEPER-1604 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1604 Project: ZooKeeper Issue Type: Task Components: build Affects Versions: 3.3.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 3.5.2, 3.6.0 Remove rpm/deb/... packaging from our source repo. Now that BigTop is available and fully supporting ZK it's no longer necessary for us to attempt to include this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)