[jira] [Updated] (ZOOKEEPER-2995) ant docs fails when Java 1.9 is present on my system
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abraham Fine updated ZOOKEEPER-2995: Description: When attempting to compile the documentation (with JAVA_HOME set to 1.7) I see output like this: {code} $ ant clean docs -Dforrest.home=$(brew info apache-forrest | grep /Cellar | awk '{print $1;}') -d Apache Ant(TM) version 1.9.7 compiled on April 9 2016 Trying the default build file: build.xml Buildfile: REDACTED/zookeeper/build.xml Adding reference: ant.PropertyHelper Detected Java version: 1.7 in: /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre OTHER STUFF docs: Class org.apache.tools.ant.taskdefs.condition.Os loaded from parent loader (parentFirst) Condition false; setting forrest.exec to forrest Setting project property: forrest.exec -> forrest [exec] Current OS is Mac OS X [exec] Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' [exec] The ' characters around the executable and arguments are [exec] not part of the command. Execute:Java13CommandLauncher: Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' The ' characters around the executable and arguments are not part of the command. [exec] Apache Forrest. Run 'forrest -projecthelp' to list options [exec] [exec] Buildfile: /usr/local/Cellar/apache-forrest/0.9/libexec/main/forrest.build.xml [exec] [exec] check-java-version: [exec] This is apache-forrest-0.9 [exec] Using Java 1.6 from /Library/Java/JavaVirtualMachines/jdk-9.0.1.jdk/Contents/Home MORE STUFF [exec] [exec] BUILD FAILED [exec] /usr/local/Cellar/apache-forrest/0.9/libexec/main/targets/site.xml:180: Warning: Could not find file REDACTED/zookeeper/src/docs/build/tmp/brokenlinks.xml to copy. [exec] [exec] Total time: 3 seconds [exec] -Djava.endorsed.dirs=/usr/local/Cellar/apache-forrest/0.9/libexec/lib/endorsed:${java.endorsed.dirs} is not supported. 
Endorsed standards and standalone APIs [exec] Error: Could not create the Java Virtual Machine. [exec] in modular form will be supported via the concept of upgradeable modules. [exec] Error: A fatal exception has occurred. Program will exit. [exec] [exec] Copying broken links file to site root. [exec] BUILD FAILED REDACTED/zookeeper/build.xml:501: exec returned: 1 at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:644) at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:670) at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:496) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.Target.execute(Target.java:435) at org.apache.tools.ant.Target.performTasks(Target.java:456) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405) at org.apache.tools.ant.Project.executeTarget(Project.java:1376) at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41) at org.apache.tools.ant.Project.executeTargets(Project.java:1260) at org.apache.tools.ant.Main.runBuild(Main.java:854) at org.apache.tools.ant.Main.startAnt(Main.java:236) at org.apache.tools.ant.launch.Launcher.run(Launcher.java:285) at org.apache.tools.ant.launch.Launcher.main(Launcher.java:112) {code} The build succeeds when I uninstall java 9. 
was: When attempting to compile the documentation (with JAVA_HOME set to 1.7) I see output like this: {code} $ ant docs -Dforrest.home=$(brew info apache-forrest | grep /Cellar | awk '{print $1;}') -d Apache Ant(TM) version 1.9.7 compiled on April 9 2016 Trying the default build file: build.xml Buildfile: REDACTED/zookeeper/build.xml Adding reference: ant.PropertyHelper Detected Java version: 1.7 in: /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre OTHER STUFF docs: Class org.apache.tools.ant.taskdefs.condition.Os loaded from parent loader (parentFirst) Condition false; setting forrest.exec to forrest Setting project property: forrest.exec -> forrest [exec] Current OS is Mac OS X [exec] Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' [exec] The ' characters around the executable and arguments are [exec] not part of the command. Execute:Java13CommandLauncher: Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' The ' characters around the executable and arguments are not part of the
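The failure above comes from Forrest 0.9 unconditionally passing -Djava.endorsed.dirs, which the JVM rejects from Java 9 onward (the endorsed-standards override mechanism was removed; upgradeable modules replace it). A build could guard on the runtime version before invoking Forrest. A minimal sketch of such a guard — the class and method names here are hypothetical, not part of the ZooKeeper build:

```java
public class EndorsedDirsCheck {
    // Pre-9 spec versions look like "1.7"; from Java 9 on they are "9", "10", ...
    static boolean supportsEndorsedDirs(String specVersion) {
        int feature = specVersion.startsWith("1.")
                ? Integer.parseInt(specVersion.substring(2))
                : Integer.parseInt(specVersion.split("\\.")[0]);
        return feature < 9; // -Djava.endorsed.dirs support was removed in Java 9
    }

    public static void main(String[] args) {
        // Forrest picks up whatever JVM it resolves at launch, not necessarily
        // JAVA_HOME -- which is why an installed Java 9 broke a build pinned to 1.7.
        String spec = System.getProperty("java.specification.version");
        System.out.println(spec + " supports endorsed dirs: "
                + supportsEndorsedDirs(spec));
    }
}
```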
[jira] [Created] (ZOOKEEPER-2995) ant docs fails when Java 1.9 is present on my system
Abraham Fine created ZOOKEEPER-2995: --- Summary: ant docs fails when Java 1.9 is present on my system Key: ZOOKEEPER-2995 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2995 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.11, 3.5.3, 3.6.0 Reporter: Abraham Fine Assignee: Abraham Fine When attempting to compile the documentation (with JAVA_HOME set to 1.7) I see output like this: {code} $ ant docs -Dforrest.home=$(brew info apache-forrest | grep /Cellar | awk '{print $1;}') -d Apache Ant(TM) version 1.9.7 compiled on April 9 2016 Trying the default build file: build.xml Buildfile: REDACTED/zookeeper/build.xml Adding reference: ant.PropertyHelper Detected Java version: 1.7 in: /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre OTHER STUFF docs: Class org.apache.tools.ant.taskdefs.condition.Os loaded from parent loader (parentFirst) Condition false; setting forrest.exec to forrest Setting project property: forrest.exec -> forrest [exec] Current OS is Mac OS X [exec] Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' [exec] The ' characters around the executable and arguments are [exec] not part of the command. Execute:Java13CommandLauncher: Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' The ' characters around the executable and arguments are not part of the command. [exec] Apache Forrest. Run 'forrest -projecthelp' to list options [exec] [exec] Buildfile: /usr/local/Cellar/apache-forrest/0.9/libexec/main/forrest.build.xml [exec] [exec] check-java-version: [exec] This is apache-forrest-0.9 [exec] Using Java 1.6 from /Library/Java/JavaVirtualMachines/jdk-9.0.1.jdk/Contents/Home MORE STUFF [exec] [exec] BUILD FAILED [exec] /usr/local/Cellar/apache-forrest/0.9/libexec/main/targets/site.xml:180: Warning: Could not find file REDACTED/zookeeper/src/docs/build/tmp/brokenlinks.xml to copy. 
[exec] [exec] Total time: 3 seconds [exec] -Djava.endorsed.dirs=/usr/local/Cellar/apache-forrest/0.9/libexec/lib/endorsed:${java.endorsed.dirs} is not supported. Endorsed standards and standalone APIs [exec] Error: Could not create the Java Virtual Machine. [exec] in modular form will be supported via the concept of upgradeable modules. [exec] Error: A fatal exception has occurred. Program will exit. [exec] [exec] Copying broken links file to site root. [exec] BUILD FAILED REDACTED/zookeeper/build.xml:501: exec returned: 1 at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:644) at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:670) at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:496) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at org.apache.tools.ant.Target.execute(Target.java:435) at org.apache.tools.ant.Target.performTasks(Target.java:456) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405) at org.apache.tools.ant.Project.executeTarget(Project.java:1376) at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41) at org.apache.tools.ant.Project.executeTargets(Project.java:1260) at org.apache.tools.ant.Main.runBuild(Main.java:854) at org.apache.tools.ant.Main.startAnt(Main.java:236) at org.apache.tools.ant.launch.Launcher.run(Launcher.java:285) at org.apache.tools.ant.launch.Launcher.main(Launcher.java:112) {code} The build succeeds when I uninstall java 9. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-2995) ant docs fails when Java 1.9 is present on my system
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abraham Fine reassigned ZOOKEEPER-2995: --- Assignee: (was: Abraham Fine) > ant docs fails when Java 1.9 is present on my system > > > Key: ZOOKEEPER-2995 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2995 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.3, 3.4.11, 3.6.0 >Reporter: Abraham Fine >Priority: Major > > When attempting to compile the documentation (with JAVA_HOME set to 1.7) I > see output like this: > {code} > $ ant docs -Dforrest.home=$(brew info apache-forrest | grep /Cellar | awk > '{print $1;}') -d > Apache Ant(TM) version 1.9.7 compiled on April 9 2016 > Trying the default build file: build.xml > Buildfile: REDACTED/zookeeper/build.xml > Adding reference: ant.PropertyHelper > Detected Java version: 1.7 in: > /Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre > OTHER STUFF > docs: > Class org.apache.tools.ant.taskdefs.condition.Os loaded from parent loader > (parentFirst) > Condition false; setting forrest.exec to forrest > Setting project property: forrest.exec -> forrest > [exec] Current OS is Mac OS X > [exec] Executing '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' > [exec] The ' characters around the executable and arguments are > [exec] not part of the command. > Execute:Java13CommandLauncher: Executing > '/usr/local/Cellar/apache-forrest/0.9/bin/forrest' > The ' characters around the executable and arguments are > not part of the command. > [exec] Apache Forrest. 
Run 'forrest -projecthelp' to list options > [exec] > [exec] Buildfile: > /usr/local/Cellar/apache-forrest/0.9/libexec/main/forrest.build.xml > [exec] > [exec] check-java-version: > [exec] This is apache-forrest-0.9 > [exec] Using Java 1.6 from > /Library/Java/JavaVirtualMachines/jdk-9.0.1.jdk/Contents/Home > MORE STUFF > [exec] > [exec] BUILD FAILED > [exec] > /usr/local/Cellar/apache-forrest/0.9/libexec/main/targets/site.xml:180: > Warning: Could not find file > REDACTED/zookeeper/src/docs/build/tmp/brokenlinks.xml to copy. > [exec] > [exec] Total time: 3 seconds > [exec] > -Djava.endorsed.dirs=/usr/local/Cellar/apache-forrest/0.9/libexec/lib/endorsed:${java.endorsed.dirs} > is not supported. Endorsed standards and standalone APIs > [exec] Error: Could not create the Java Virtual Machine. > [exec] in modular form will be supported via the concept of upgradeable > modules. > [exec] Error: A fatal exception has occurred. Program will exit. > [exec] > [exec] Copying broken links file to site root. 
> [exec] > BUILD FAILED > REDACTED/zookeeper/build.xml:501: exec returned: 1 > at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:644) > at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:670) > at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:496) > at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293) > at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) > at org.apache.tools.ant.Task.perform(Task.java:348) > at org.apache.tools.ant.Target.execute(Target.java:435) > at org.apache.tools.ant.Target.performTasks(Target.java:456) > at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405) > at org.apache.tools.ant.Project.executeTarget(Project.java:1376) > at > org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41) > at org.apache.tools.ant.Project.executeTargets(Project.java:1260) > at org.apache.tools.ant.Main.runBuild(Main.java:854) > at org.apache.tools.ant.Main.startAnt(Main.java:236) > at org.apache.tools.ant.launch.Launcher.run(Launcher.java:285) > at org.apache.tools.ant.launch.Launcher.main(Launcher.java:112) > {code} > The build succeeds when I uninstall java 9. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
ZooKeeper_branch34 - Build # 2267 - Failure
See https://builds.apache.org/job/ZooKeeper_branch34/2267/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 125.28 KB...] [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.373 sec [junit] Running org.apache.zookeeper.test.RepeatStartupTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.175 sec [junit] Running org.apache.zookeeper.test.RestoreCommittedLogTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.37 sec [junit] Running org.apache.zookeeper.test.SaslAuthDesignatedClientTest [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.697 sec [junit] Running org.apache.zookeeper.test.SaslAuthDesignatedServerTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.603 sec [junit] Running org.apache.zookeeper.test.SaslAuthFailDesignatedClientTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.643 sec [junit] Running org.apache.zookeeper.test.SaslAuthFailNotifyTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.528 sec [junit] Running org.apache.zookeeper.test.SaslAuthFailTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.645 sec [junit] Running org.apache.zookeeper.test.SaslAuthMissingClientConfigTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.577 sec [junit] Running org.apache.zookeeper.test.SaslClientTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.079 sec [junit] Running org.apache.zookeeper.test.SessionInvalidationTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.668 sec [junit] Running org.apache.zookeeper.test.SessionTest [junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.677 sec [junit] Running org.apache.zookeeper.test.StandaloneTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.871 sec [junit] Running 
org.apache.zookeeper.test.StatTest [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.808 sec [junit] Running org.apache.zookeeper.test.StaticHostProviderTest [junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.228 sec [junit] Running org.apache.zookeeper.test.SyncCallTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.592 sec [junit] Running org.apache.zookeeper.test.TruncateTest [junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.692 sec [junit] Running org.apache.zookeeper.test.UpgradeTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.311 sec [junit] Running org.apache.zookeeper.test.WatchedEventTest [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.09 sec [junit] Running org.apache.zookeeper.test.WatcherFuncTest [junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.371 sec [junit] Running org.apache.zookeeper.test.WatcherTest [junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.693 sec [junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.089 sec [junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.742 sec fail.build.on.test.failure: BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/build.xml:1474: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/build.xml:1382: The following error occurred while executing this line: /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/build.xml:1385: Tests failed! 
Total time: 41 minutes 13 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. FAILED: org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff Error Message: expected:<4294967298> but was:<0> Stack Trace: junit.framework.AssertionFailedError: expected:<4294967298> but was:<0> at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:861) at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:507) at
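The expected value 4294967298 in the assertion above is a zxid: ZooKeeper packs the leader epoch into the high 32 bits and a per-epoch counter into the low 32 bits, so the test expected epoch 1, counter 2 and saw 0. A standalone sketch of that packing arithmetic (ZooKeeper has its own utility class for this; the names below are illustrative):

```java
public class ZxidDemo {
    // High 32 bits: leader epoch; low 32 bits: counter within that epoch.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid)   { return zxid >>> 32; }
    static long counterOf(long zxid) { return zxid & 0xffffffffL; }

    public static void main(String[] args) {
        System.out.println(makeZxid(1, 2));          // 4294967298, the value above
        System.out.println(epochOf(4294967298L));    // 1
        System.out.println(counterOf(4294967298L));  // 2
    }
}
```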
[jira] [Updated] (ZOOKEEPER-2994) Tool required to recover log and snapshot entries with CRC errors
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andor Molnar updated ZOOKEEPER-2994: Description: In the event that the zookeeper transaction log or snapshot become corrupted and fail CRC checks (preventing startup) we should have a mechanism to get the cluster running again. Previously we achieved this by loading the broken transaction log with a modified version of ZK with disabled CRC check and forced it to snapshot. It'd be very handy to have a tool which can do this for us. LogFormatter and SnapshotFormatter have already been designed to dump log and snapshot files, it'd be nice to extend their functionality and add the ability for such recovery. was: In the event that the zookeeper transaction log or snapshot become corrupted and fail CRC checks (preventing startup) we should have a mechanism to get the cluster running again. Previously we achieved this by loading the broken transaction log with a modified version of ZK with disabled CRC check and forced it to snapshot. It'd be very handy to have a tool which can do this for us. LogFormatter and SnapshotFormatter have already been designed to dump log and snapshot files, it'd be nice to extend their functionality and add the ability for such recovery. > Tool required to recover log and snapshot entries with CRC errors > - > > Key: ZOOKEEPER-2994 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994 > Project: ZooKeeper > Issue Type: New Feature >Reporter: Andor Molnar >Assignee: Andor Molnar >Priority: Major > Fix For: 3.5.4, 3.6.0 > > > In the event that the zookeeper transaction log or snapshot become corrupted > and fail CRC checks (preventing startup) we should have a mechanism to get > the cluster running again. > Previously we achieved this by loading the broken transaction log with a > modified version of ZK with disabled CRC check and forced it to snapshot. > It'd be very handy to have a tool which can do this for us. 
LogFormatter and > SnapshotFormatter have already been designed to dump log and snapshot files, > it'd be nice to extend their functionality and add the ability for such recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
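The recovery the ticket asks for boils down to: read each log record, recompute its checksum, and skip corrupt entries instead of aborting the load. A minimal sketch of that loop — the record/checksum layout here is illustrative, not ZooKeeper's actual transaction log format, though the transaction log does use Adler32 checksums:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Adler32;

public class CrcSkipDemo {
    static long checksum(byte[] record) {
        Adler32 a = new Adler32();
        a.update(record, 0, record.length);
        return a.getValue();
    }

    // Keep only records whose stored checksum matches the recomputed one;
    // drop the rest instead of failing the whole load.
    static List<byte[]> recover(List<byte[]> records, List<Long> storedSums) {
        List<byte[]> good = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (checksum(records.get(i)) == storedSums.get(i)) {
                good.add(records.get(i));
            }
            // else: CRC mismatch -- a normal load would throw here; recovery skips
        }
        return good;
    }
}
```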
Failed: ZOOKEEPER-2770 PreCommit Build #3660
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2770 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3660/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 1.26 KB...] > git rev-parse refs/remotes/origin/master^{commit} # timeout=10 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10 Checking out Revision 99c9bbb0ab1eef469e1662086532c58078b9909a (refs/remotes/origin/master) > git config core.sparsecheckout # timeout=10 > git checkout -f 99c9bbb0ab1eef469e1662086532c58078b9909a Commit message: "ZOOKEEPER-2992: The eclipse build target fails due to protocol redirection: http->https" > git rev-list --no-walk 99c9bbb0ab1eef469e1662086532c58078b9909a # timeout=10 No emails were triggered. Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [PreCommit-ZOOKEEPER-Build] $ /bin/bash /tmp/jenkins5209685587972065547.sh /home/jenkins/tools/java/latest1.7/bin/java java version "1.7.0_80" Java(TM) SE Runtime Environment (build 1.7.0_80-b15) Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode) core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 386417 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 6 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 10240 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/tools/ant/launch/Launcher : Unsupported major.minor version 52.0 at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:800) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) at 
java.net.URLClassLoader.access$100(URLClassLoader.java:71) at java.net.URLClassLoader$1.run(URLClassLoader.java:361) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482) Build step 'Execute shell' marked build as failure Archiving artifacts Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Recording test results Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ERROR: Step 'Publish JUnit test result report' failed: No test report files were found. Configuration error? Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 [description-setter] Description set: ZOOKEEPER-2992 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Email was triggered for: Failure - Any Sending email for trigger: Failure - Any Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7 ### ## FAILED TESTS (if any) ## No tests ran.
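The UnsupportedClassVersionError above decodes simply: a class file's major version is the JDK feature release plus 44, so major 52.0 means the Ant launcher on this node was compiled for Java 8 while the build ran it under JDK_1_7_LATEST (Java 7, major 51). A one-method sketch of the mapping:

```java
public class ClassVersion {
    // Class-file major version -> minimum JDK feature release
    // (45 = Java 1.1, 51 = Java 7, 52 = Java 8, 53 = Java 9).
    static int jdkFor(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        // Major 52 needs Java 8, but the PreCommit job launched Ant with 1.7.0_80.
        System.out.println("major 52 requires Java " + jdkFor(52));
    }
}
```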
[jira] [Commented] (ZOOKEEPER-2770) ZooKeeper slow operation log
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389945#comment-16389945 ] ASF GitHub Bot commented on ZOOKEEPER-2770: --- Github user karanmehta93 commented on the issue: https://github.com/apache/zookeeper/pull/307 Hello everyone, Appreciate your efforts in reviewing this patch. @hanm @tdunning @eribeiro @skamille Is there any possibility that the patch will get merged in (with minor changes if required) or shall we 'never' this JIRA and close this PR? Thanks! > ZooKeeper slow operation log > > > Key: ZOOKEEPER-2770 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2770 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Karan Mehta >Assignee: Karan Mehta >Priority: Major > Attachments: ZOOKEEPER-2770.001.patch, ZOOKEEPER-2770.002.patch, > ZOOKEEPER-2770.003.patch > > > ZooKeeper is a complex distributed application. There are many reasons why > any given read or write operation may become slow: a software bug, a protocol > problem, a hardware issue with the commit log(s), a network issue. If the > problem is constant it is trivial to come to an understanding of the cause. > However in order to diagnose intermittent problems we often don't know where, > or when, to begin looking. We need some sort of timestamped indication of the > problem. Although ZooKeeper is not a datastore, it does persist data, and can > suffer intermittent performance degradation, and should consider implementing > a 'slow query' log, a feature very common to services which persist > information on behalf of clients which may be sensitive to latency while > waiting for confirmation of successful persistence. > Log the client and request details if the server discovers, when finally > processing the request, that the current time minus arrival time of the > request is beyond a configured threshold. > Look at the HBase {{responseTooSlow}} feature for inspiration. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
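The check the ticket proposes is small: when the server finally processes a request, compare the current time minus the request's arrival time against a configured threshold, and log the client and request details when it is exceeded. A hedged sketch of that predicate — the class name, threshold, and log line are placeholders, not the patch's actual code:

```java
public class SlowOpLogSketch {
    // Assumed-configurable threshold, in the spirit of HBase's responseTooSlow.
    static final long THRESHOLD_MS = 100;

    // True when the request sat longer than the threshold before being processed.
    static boolean isSlow(long arrivalMs, long processedMs) {
        return processedMs - arrivalMs > THRESHOLD_MS;
    }

    public static void main(String[] args) {
        long arrival = System.currentTimeMillis() - 250; // simulate a stale request
        if (isSlow(arrival, System.currentTimeMillis())) {
            System.out.println("responseTooSlow: request waited more than "
                    + THRESHOLD_MS + " ms before processing");
        }
    }
}
```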
[GitHub] zookeeper issue #307: ZOOKEEPER-2770 ZooKeeper slow operation log
Github user karanmehta93 commented on the issue: https://github.com/apache/zookeeper/pull/307 Hello everyone, Appreciate your efforts in reviewing this patch. @hanm @tdunning @eribeiro @skamille Is there any possibility that the patch will get merged in (with minor changes if required) or shall we 'never' this JIRA and close this PR? Thanks! ---
[jira] [Updated] (ZOOKEEPER-2994) Tool required to recover log and snapshot entries with CRC errors
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andor Molnar updated ZOOKEEPER-2994: Fix Version/s: 3.5.4 > Tool required to recover log and snapshot entries with CRC errors > - > > Key: ZOOKEEPER-2994 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994 > Project: ZooKeeper > Issue Type: New Feature >Reporter: Andor Molnar >Assignee: Andor Molnar >Priority: Major > Fix For: 3.5.4, 3.6.0 > > > In the event that the zookeeper transaction log or snapshot become corrupted > and fail CRC checks (preventing startup) we should have a mechanism to get > the cluster running again. > Previously we achieved this by loading the broken transaction log with a > modified version of ZK with disabled CRC check and forced it to snapshot. > It'd be very handy to have a tool which can do this for us. LogFormatter and > SnapshotFormatter have already been designed to dump log and snapshot files, > it'd be nice to extend their functionality and add the ability for such recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ZOOKEEPER-2994) Tool required to recover log and snapshot entries with CRC errors
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andor Molnar updated ZOOKEEPER-2994: Fix Version/s: 3.6.0 > Tool required to recover log and snapshot entries with CRC errors > - > > Key: ZOOKEEPER-2994 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994 > Project: ZooKeeper > Issue Type: New Feature >Reporter: Andor Molnar >Assignee: Andor Molnar >Priority: Major > Fix For: 3.5.4, 3.6.0 > > > In the event that the zookeeper transaction log or snapshot become corrupted > and fail CRC checks (preventing startup) we should have a mechanism to get > the cluster running again. > Previously we achieved this by loading the broken transaction log with a > modified version of ZK with disabled CRC check and forced it to snapshot. > It'd be very handy to have a tool which can do this for us. LogFormatter and > SnapshotFormatter have already been designed to dump log and snapshot files, > it'd be nice to extend their functionality and add the ability for such recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-2994) Tool required to recover log and snapshot entries with CRC errors
Andor Molnar created ZOOKEEPER-2994: --- Summary: Tool required to recover log and snapshot entries with CRC errors Key: ZOOKEEPER-2994 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994 Project: ZooKeeper Issue Type: New Feature Reporter: Andor Molnar Assignee: Andor Molnar In the event that the zookeeper transaction log or snapshot become corrupted and fail CRC checks (preventing startup) we should have a mechanism to get the cluster running again. Previously we achieved this by loading the broken transaction log with a modified version of ZK with disabled CRC check and forced it to snapshot. It'd be very handy to have a tool which can do this for us. LogFormatter and SnapshotFormatter have already been designed to dump log and snapshot files, it'd be nice to extend their functionality and add the ability for such recovery. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389636#comment-16389636 ] Andor Molnar commented on ZOOKEEPER-2172: - Yes, please subscribe to 'users' list. > Cluster crashes when reconfig a new node as a participant > - > > Key: ZOOKEEPER-2172 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2172 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, quorum, server >Affects Versions: 3.5.0 > Environment: Ubuntu 12.04 + java 7 >Reporter: Ziyou Wang >Assignee: Mohammad Arshad >Priority: Critical > Fix For: 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2172-02.patch, ZOOKEEPER-2172-03.patch, > ZOOKEEPER-2172-04.patch, ZOOKEEPER-2172-06.patch, ZOOKEEPER-2172-07.patch, > ZOOKEEPER-2172.patch, ZOOKEPER-2172-05.patch, history.txt, node-1.log, > node-2.log, node-3.log, zoo-1.log, zoo-2-1.log, zoo-2-2.log, zoo-2-3.log, > zoo-2.log, zoo-2212-1.log, zoo-2212-2.log, zoo-2212-3.log, zoo-3-1.log, > zoo-3-2.log, zoo-3-3.log, zoo-3.log, zoo-4-1.log, zoo-4-2.log, zoo-4-3.log, > zoo.cfg.dynamic.1005d, zoo.cfg.dynamic.next, zookeeper-1.log, > zookeeper-1.out, zookeeper-2.log, zookeeper-2.out, zookeeper-3.log, > zookeeper-3.out > > > The operations are quite simple: start three zk servers one by one, then > reconfig the cluster to add the new one as a participant. When I add the > third one, the zk cluster may enter a weird state and cannot recover. > > I found “2015-04-20 12:53:48,236 [myid:1] - INFO [ProcessThread(sid:1 > cport:-1)::PrepRequestProcessor@547] - Incremental reconfig” in node-1 log. > So the first node received the reconfig cmd at 12:53:48. Later, it logged > “2015-04-20 12:53:52,230 [myid:1] - ERROR > [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception > causing shutdown while sock still open” and “2015-04-20 12:53:52,231 [myid:1] > - WARN [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - *** GOODBYE > /10.0.0.2:55890 ”. 
From then on, the first node and second node > rejected all client connections and the third node didn’t join the cluster as > a participant. The whole cluster was done. > > When the problem happened, all three nodes just used the same dynamic > config file zoo.cfg.dynamic.1005d which only contained the first two > nodes. But there was another unused dynamic config file in node-1 directory > zoo.cfg.dynamic.next which already contained three nodes. > > When I extended the waiting time between starting the third node and > reconfiguring the cluster, the problem didn’t show again. So it should be a > race condition problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
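The reporter's observation that a longer wait between starting the third server and issuing the reconfig avoids the crash suggests a more deterministic workaround: poll the new server for readiness before reconfiguring, rather than sleeping a fixed time. A generic sketch of that polling loop — the readiness probe itself is left abstract and is not a ZooKeeper API:

```java
import java.util.function.BooleanSupplier;

public class WaitBeforeReconfig {
    // Poll a readiness check up to `attempts` times, sleeping between tries.
    // Returns true as soon as the check passes, false if it never does.
    static boolean waitUntil(BooleanSupplier ready, int attempts, long sleepMs)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (ready.getAsBoolean()) {
                return true;
            }
            Thread.sleep(sleepMs);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder probe; in practice this would ask the newly started
        // server whether it is serving before issuing the reconfig command.
        boolean up = waitUntil(() -> true, 10, 100);
        System.out.println(up ? "safe to reconfig" : "server never came up");
    }
}
```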
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389619#comment-16389619 ]

Yuval Dori commented on ZOOKEEPER-2172:
---------------------------------------

So, in order to further investigate it, do I need to subscribe here: https://zookeeper.apache.org/lists.html ?

Thanks,
Yuval

> Cluster crashes when reconfig a new node as a participant
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-2172
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2172
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection, quorum, server
>    Affects Versions: 3.5.0
>         Environment: Ubuntu 12.04 + java 7
>            Reporter: Ziyou Wang
>            Assignee: Mohammad Arshad
>            Priority: Critical
>             Fix For: 3.5.3, 3.6.0
>
>         Attachments: ZOOKEEPER-2172-02.patch, ZOOKEEPER-2172-03.patch,
> ZOOKEEPER-2172-04.patch, ZOOKEEPER-2172-06.patch, ZOOKEEPER-2172-07.patch,
> ZOOKEEPER-2172.patch, ZOOKEPER-2172-05.patch, history.txt, node-1.log,
> node-2.log, node-3.log, zoo-1.log, zoo-2-1.log, zoo-2-2.log, zoo-2-3.log,
> zoo-2.log, zoo-2212-1.log, zoo-2212-2.log, zoo-2212-3.log, zoo-3-1.log,
> zoo-3-2.log, zoo-3-3.log, zoo-3.log, zoo-4-1.log, zoo-4-2.log, zoo-4-3.log,
> zoo.cfg.dynamic.1005d, zoo.cfg.dynamic.next, zookeeper-1.log,
> zookeeper-1.out, zookeeper-2.log, zookeeper-2.out, zookeeper-3.log,
> zookeeper-3.out
>
> The operations are quite simple: start three zk servers one by one, then
> reconfig the cluster to add the new one as a participant. When I add the
> third one, the zk cluster may enter a weird state and cannot recover.
>
> I found "2015-04-20 12:53:48,236 [myid:1] - INFO [ProcessThread(sid:1
> cport:-1)::PrepRequestProcessor@547] - Incremental reconfig" in the node-1
> log, so the first node received the reconfig cmd at 12:53:48. Later, it
> logged "2015-04-20 12:53:52,230 [myid:1] - ERROR
> [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception
> causing shutdown while sock still open" and "2015-04-20 12:53:52,231
> [myid:1] - WARN [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - ***
> GOODBYE /10.0.0.2:55890".
>
> From then on, the first node and second node rejected all client connections
> and the third node didn't join the cluster as a participant. The whole
> cluster was down.
>
> When the problem happened, all three nodes used the same dynamic config file
> zoo.cfg.dynamic.1005d, which only contained the first two nodes. But there
> was another, unused dynamic config file in the node-1 directory,
> zoo.cfg.dynamic.next, which already contained three nodes.
>
> When I extended the waiting time between starting the third node and
> reconfiguring the cluster, the problem didn't show again. So it should be a
> race condition.
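The two dynamic config files named in the report are the key to the race: the active file lists two participants while the staged one already lists three. A hypothetical sketch of their contents (server IDs, ports, and addresses are illustrative; the real files are attached to the issue):

```properties
# zoo.cfg.dynamic.1005d -- the version all three nodes were actually using
# (only the original two participants)
server.1=10.0.0.1:2888:3888:participant;2181
server.2=10.0.0.2:2888:3888:participant;2181

# zoo.cfg.dynamic.next -- staged on node-1 but never activated
# (already lists the third node)
server.1=10.0.0.1:2888:3888:participant;2181
server.2=10.0.0.2:2888:3888:participant;2181
server.3=10.0.0.3:2888:3888:participant;2181
```

Each line follows the 3.5 dynamic-config format `server.<id>=<host>:<quorumPort>:<electionPort>:<role>;<clientPort>`; a reconfig commits by promoting the `.next` file, which is the step that apparently lost the race here.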
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389608#comment-16389608 ]

Andor Molnar commented on ZOOKEEPER-2172:
-----------------------------------------

Could be.
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389604#comment-16389604 ]

Yuval Dori commented on ZOOKEEPER-2172:
---------------------------------------

Thanks Andor. Do you think it's another ZK bug or something else?
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389444#comment-16389444 ]

Andor Molnar commented on ZOOKEEPER-2172:
-----------------------------------------

These are 2 totally different errors, I believe. I'm pretty sure they're not related, because this Jira is about a feature which is available from 3.5 versions only, as mentioned. Would you please kindly move this discussion to the ZooKeeper 'user' mailing list and provide some more information (ensemble topology, config files, log files, step-by-step scenario, etc.)?
Re: [SUGGESTION] Target branches 3.5 and master (3.6) to Java 8
Okay, I dropped a mail on the user list to get some feedback.

Regards,
Andor

On Thu, Feb 22, 2018 at 5:59 PM, Patrick Hunt wrote:

> Perhaps discuss on the user list as Flavio mentioned prior to calling a
> vote? Has anyone looked at dependencies, is this consistent with what the
> rest of the ecosystem has defined: Hadoop/HBase/Kafka/... components,
> Curator, etc.?
>
> Regards,
>
> Patrick
>
> On Thu, Feb 22, 2018 at 7:52 AM, Andor Molnar wrote:
>
> > Is everybody happy with the plan that Tamaas suggested?
> > Shall we start a vote?
> >
> > Andor
> >
> > On Wed, Feb 21, 2018 at 11:34 PM, Mark Fenes wrote:
> >
> > > Hi All,
> > >
> > > I totally support the idea of upgrading to Java 8, and I agree with Abe
> > > that we should not require different minimum versions of Java for the
> > > client and the server.
> > > Also, skipping the non-LTS versions sounds reasonable.
> > >
> > > Regards,
> > > Mark
> > >
> > > On Tue, Feb 20, 2018 at 8:49 PM, Tamás Pénzes wrote:
> > >
> > > > Hi All,
> > > >
> > > > Just to add my 2 cents. // Might be five, I write long. :)
> > > > Hope you find valuable bits.
> > > >
> > > > Like many of us, I also hope that ZooKeeper 3.5 will be released
> > > > soon. Until then most of the changes go into master and branch-3.5
> > > > too, so I would keep them on the same Java version for code
> > > > compatibility. At the same time I'd be happy if it was Java 8.
> > > >
> > > > ZK 3.5+ has supported Java 7 since December 2014, an almost 7-year-old
> > > > Java version today.
> > > > It was a perfect decision in 2014, when nobody expected ZK 3.5 coming
> > > > so late, but things might be different four years later.
> > > >
> > > > Since we have to keep compatibility with Java 6 on branch-3.4, we
> > > > already need manual changes when cherry-picking into that branch. Not
> > > > much difference if branch-3.5 is Java 8.
> > > >
> > > > As Flavio said, changing branch-3.5 to Java 8 might cause issues for
> > > > users already using ZK 3.5.x-beta.
> > > > I totally agree with that concern, but using beta-state software means
> > > > you accept the risk of facing changes.
> > > > And Java 8 is four years old now, so we would not change to the
> > > > bleeding edge, which I guess nobody wanted.
> > > >
> > > > So what I would propose is the following:
> > > >
> > > > - Upgrade branches "master" and "branch-3.5" to Java 8 (LTS) asap.
> > > > - After releasing 3.5 GA, and once the next LTS Java version (Java 11
> > > >   / 18.9-LTS) gets released, upgrade the "master" branch to Java
> > > >   11-LTS. (http://www.oracle.com/technetwork/java/eol-135779.html)
> > > > - I would not upgrade Java to a non-LTS version.
> > > >
> > > > What do you think about it?
> > > >
> > > > Thanks, Tamaas
> > > >
> > > > On Mon, Feb 19, 2018 at 10:32 PM, Flavio Junqueira wrote:
> > > >
> > > > > I'm fine with moving to Java 8 or even 9 in 3.6. Does anyone have a
> > > > > different option? Otherwise, should we start a vote?
> > > > >
> > > > > -Flavio
> > > > >
> > > > > On 16 Feb 2018, at 21:28, Abraham Fine wrote:
> > > > > >
> > > > > > I'm a -1 on requiring different minimum versions of java for the
> > > > > > client and the server. I think this has the potential to create a
> > > > > > lot of confusion for users and contributors.
> > > > > >
> > > > > > I would support moving master (3.6) to java 8; I also think it is
> > > > > > worth considering moving to java 9. Given how long our release
> > > > > > cycle tends to be, I think targeting the latest and greatest this
> > > > > > early in the development cycle is reasonable.
> > > > > >
> > > > > > Thanks,
> > > > > > Abe
> > > > > >
> > > > > > On Fri, Feb 16, 2018, at 06:48, Enrico Olivelli wrote:
> > > > > >
> > > > > > > 2018-02-16 14:20 GMT+01:00 Andor Molnar:
> > > > > > >
> > > > > > > > +1 for setting the Java8 requirement on server side.
> > > > > > > >
> > > > > > > > *Client side.*
> > > > > > > > I like the idea of setting the requirement on the client side
> > > > > > > > too, without introducing anything Java8-specific. I'm not
> > > > > > > > planning to use Java8 features right away, just thinking that
> > > > > > > > opening the gates would be useful in the long run.
> > > > > > > >
> > > > > > > > Additionally, I don't see heavy development on the client
> > > > > > > > side. Users who are tightly coupled to Java7 are still able to
> > > > > > > > use existing clients until we introduce something breaking
> > > > > > > > that they're forced to upgrade for. I'm not sure what the odds
> > > > > > > > of that happening are.
> > > > > > >
> > > > > > > My two cents:
> > > > > > > Actually
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389390#comment-16389390 ]

Yuval Dori commented on ZOOKEEPER-2172:
---------------------------------------

1. The use case here was adding a node to a ZK cluster using zookeeper-3.5.jar. It's not the same use case as for our customers. The first uses 5 machines with 3 ZK instances; they shut down 2 machines (one with ZK, so 2 ZK instances left) and got "java.lang.IllegalStateException: instance must be started before calling this method". The second customer got this error when deploying the application.

This is this issue's stack trace:

2015-04-20 12:53:52,230 [myid:1] - ERROR [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception causing shutdown while sock still open
java.io.EOFException
    at java.io.DataInputStream.readInt(Unknown Source)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
    at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:493)
2015-04-20 12:53:52,231 [myid:1] - WARN [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - *** GOODBYE /10.0.0.2:55890

And this is our customers' stack trace:

2018-02-15T09:58:12.094+0100; ERROR; WSOSTSLXWIT01/MANAGER; P3424/T194; [SPACE/LearnerHandler-/10.17.46.142:49336/LearnerHandler]; Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
    at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:542)

As you can see, the line numbers for BinaryInputArchive.java and LearnerHandler.java are different, but I assumed that's due to the different versions (3.4.8 vs 3.5). The first customer tested it with ZK 3.5.3 and it didn't reproduce! What is the new feature that was added in 3.5? I'll be happy to hear whether you think it's related or not.

Thanks,
Yuval
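Both traces above fail inside the same four-byte read: BinaryInputArchive.readInt() delegates to DataInputStream.readInt(), which throws EOFException when the peer closes the connection before a full int arrives (and SocketTimeoutException when the read times out instead). A minimal stand-alone sketch of the EOF case, using a hypothetical class name and an in-memory stream in place of a socket:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical reproduction (not ZooKeeper code): readInt() needs 4 bytes,
// so a stream that ends early -- like a socket closed by the peer -- raises
// EOFException, the same exception LearnerHandler logged in this issue.
public class ReadIntEof {
    public static boolean readIntHitsEof(byte[] bytesFromPeer) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytesFromPeer));
        try {
            in.readInt();   // consumes exactly 4 bytes, or fails
            return false;   // a full int was available
        } catch (EOFException e) {
            return true;    // stream ended before 4 bytes: "peer hung up"
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readIntHitsEof(new byte[0]));             // prints true
        System.out.println(readIntHitsEof(new byte[] {0, 0, 0, 1})); // prints false
    }
}
```

The difference between the two customer traces is then only how the stream died: an orderly close surfaces as EOFException, a stalled peer as SocketTimeoutException.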
[jira] [Commented] (ZOOKEEPER-2930) Leader cannot be elected due to network timeout of some members.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389384#comment-16389384 ]

ASF GitHub Bot commented on ZOOKEEPER-2930:
-------------------------------------------

Github user JonathanO commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/456#discussion_r172802731

--- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java ---
@@ -318,76 +318,167 @@ public Thread newThread(Runnable r) {
      */
     public void testInitiateConnection(long sid) throws Exception {
         LOG.debug("Opening channel to server " + sid);
-        Socket sock = new Socket();
-        setSockOpts(sock);
-        sock.connect(self.getVotingView().get(sid).electionAddr, cnxTO);
-        initiateConnection(sock, sid);
+        initiateConnection(sid, self.getVotingView().get(sid).electionAddr);
+    }
+
+    private Socket openChannel(long sid, InetSocketAddress electionAddr) {
+        LOG.debug("Opening channel to server " + sid);
+        try {
+            final Socket sock = new Socket();
+            setSockOpts(sock);
+            sock.connect(electionAddr, cnxTO);
+            LOG.debug("Connected to server " + sid);
+            return sock;
+        } catch (UnresolvedAddressException e) {
+            // Sun doesn't include the address that causes this
+            // exception to be thrown, also UAE cannot be wrapped cleanly
+            // so we log the exception in order to capture this critical
+            // detail.
+            LOG.warn("Cannot open channel to " + sid
+                    + " at election address " + electionAddr, e);
+            throw e;
+        } catch (IOException e) {
+            LOG.warn("Cannot open channel to " + sid
+                    + " at election address " + electionAddr,
+                    e);
+            return null;
+        }
     }

     /**
      * If this server has initiated the connection, then it gives up on the
      * connection if it loses challenge. Otherwise, it keeps the connection.
      */
-    public void initiateConnection(final Socket sock, final Long sid) {
+    public boolean initiateConnection(final Long sid, InetSocketAddress electionAddr) {
         try {
-            startConnection(sock, sid);
-        } catch (IOException e) {
-            LOG.error("Exception while connecting, id: {}, addr: {}, closing learner connection",
-                    new Object[] { sid, sock.getRemoteSocketAddress() }, e);
-            closeSocket(sock);
-            return;
+            Socket sock = openChannel(sid, electionAddr);
+            if (sock != null) {
+                try {
+                    startConnection(sock, sid);
+                } catch (IOException e) {
+                    LOG.error("Exception while connecting, id: {}, addr: {}, closing learner connection",
+                            new Object[]{sid, sock.getRemoteSocketAddress()}, e);
+                    closeSocket(sock);
+                }
+                return true;
+            } else {
+                return false;
+            }
+        } finally {
+            inprogressConnections.remove(sid);
        }
     }

-    /**
-     * Server will initiate the connection request to its peer server
-     * asynchronously via separate connection thread.
-     */
-    public void initiateConnectionAsync(final Socket sock, final Long sid) {
+    synchronized private void connectOneAsync(final Long sid, final ZooKeeperThread connectorThread) {
+        if (senderWorkerMap.get(sid) != null) {
+            LOG.debug("There is a connection already for server " + sid);
+            return;
+        }
         if(!inprogressConnections.add(sid)){
             // simply return as there is a connection request to
             // server 'sid' already in progress.
             LOG.debug("Connection request to server id: {} is already in progress, so skipping this request", sid);
-            closeSocket(sock);
             return;
         }
         try {
-            connectionExecutor.execute(
-                    new QuorumConnectionReqThread(sock, sid));
+            connectionExecutor.execute(connectorThread);
             connectionThreadCnt.incrementAndGet();
         } catch (Throwable e) {
             // Imp: Safer side catching all type of exceptions and remove 'sid'
             // from inprogress connections. This is to avoid blocking further
             // connection requests from this 'sid' in case of errors.
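The heart of the refactoring in this diff is the error policy that openChannel() centralizes: an IOException during connect is a transient per-peer failure (log it, return null, let initiateConnection() report false), while UnresolvedAddressException, which signals a misconfigured election address, is rethrown. A hypothetical stand-alone analogue of that policy (the Dialer interface and class name are mine, not ZooKeeper's):

```java
import java.io.IOException;
import java.nio.channels.UnresolvedAddressException;

// Hypothetical sketch of the openChannel() error policy in the diff above:
// transient I/O failures become "no channel" (null) so the caller can retry,
// while an unresolvable address is a configuration bug and propagates.
public class ChannelPolicy {
    interface Dialer {
        String dial() throws IOException;   // stands in for Socket.connect()
    }

    public static String openOrNull(Dialer dialer) {
        try {
            return dialer.dial();
        } catch (UnresolvedAddressException e) {
            throw e;            // misconfiguration: surface it to the caller
        } catch (IOException e) {
            return null;        // transient failure: report "no connection"
        }
    }
}
```

This mirrors why initiateConnection() in the diff now returns boolean: false simply means openChannel() produced no socket this time.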
[GitHub] zookeeper pull request #456: ZOOKEEPER-2930: Leader cannot be elected due to...
Github user JonathanO commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/456#discussion_r172802731
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389378#comment-16389378 ]

Andor Molnar commented on ZOOKEEPER-2172:
-----------------------------------------

[~yuvald] Are you sure it's the same issue? Dynamic reconfig is a 3.5+ feature.
[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389307#comment-16389307 ]

Yuval Dori commented on ZOOKEEPER-2172:
---------------------------------------

Hi,

This issue happens for a few of our customers using version 3.4.8. These days we are upgrading to 3.4.10. As 3.5.3 is in beta, is it possible to backport this fix?

Thanks,
Yuval