[GitHub] zookeeper pull request #587: ZOOKEEPER-3106: Zookeeper client supports IPv6 ...
Github user maoling commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/587#discussion_r207090394

--- Diff: src/java/main/org/apache/zookeeper/client/ConnectStringParser.java ---
@@ -68,14 +69,26 @@ public ConnectStringParser(String connectString) {
        List hostsList = split(connectString, ",");
        for (String host : hostsList) {
            int port = DEFAULT_PORT;
-           int pidx = host.lastIndexOf(':');
-           if (pidx >= 0) {
-               // otherwise : is at the end of the string, ignore
-               if (pidx < host.length() - 1) {
-                   port = Integer.parseInt(host.substring(pidx + 1));
-               }
-               host = host.substring(0, pidx);
+           if (!connectString.startsWith("[")) { //IPv4
+               int pidx = host.lastIndexOf(':');
+               if (pidx >= 0) {
+                   // otherwise : is at the end of the string, ignore
+                   if (pidx < host.length() - 1) {
+                       port = Integer.parseInt(host.substring(pidx + 1));
+                   }
+                   host = host.substring(0, pidx);
+               }
+           } else { //IPv6
--- End diff --

@enixon thanks for your review. After collecting enough suggestions, I will polish up this issue.

---
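The diff above only shows the start of the IPv6 branch. As a point of comparison, here is a minimal, standalone sketch of bracket-aware host:port splitting — class and method names are mine, not ZooKeeper's, and this is an illustration of the parsing idea rather than the actual patch (which works inside ConnectStringParser):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Standalone sketch of bracket-aware host:port splitting, loosely modeled on
// the logic under review. IPv6 literals are expected in brackets ("[::1]:2181"),
// IPv4 addresses and hostnames in the plain "host:port" form.
public class HostPortSketch {
    static final int DEFAULT_PORT = 2181;

    static Map.Entry<String, Integer> parse(String hostPort) {
        String host;
        int port = DEFAULT_PORT;
        if (hostPort.startsWith("[")) {               // IPv6 literal
            int end = hostPort.indexOf(']');
            if (end < 0) {
                throw new IllegalArgumentException("Unclosed bracket: " + hostPort);
            }
            host = hostPort.substring(1, end);        // strip the brackets
            if (end + 1 < hostPort.length() && hostPort.charAt(end + 1) == ':') {
                port = Integer.parseInt(hostPort.substring(end + 2));
            }
        } else {                                      // IPv4 or hostname
            int pidx = hostPort.lastIndexOf(':');
            host = (pidx >= 0) ? hostPort.substring(0, pidx) : hostPort;
            // a trailing ':' with no digits keeps the default port
            if (pidx >= 0 && pidx < hostPort.length() - 1) {
                port = Integer.parseInt(hostPort.substring(pidx + 1));
            }
        }
        return new SimpleEntry<>(host, port);
    }

    public static void main(String[] args) {
        System.out.println(parse("127.0.0.1:2181"));
        System.out.println(parse("[2001:db8::1]:2182"));
        System.out.println(parse("zk-host"));
    }
}
```

Note that the check has to be per-host: a connect string like "10.0.0.1:2181,[::1]:2181" mixes both forms, so deciding IPv4 vs. IPv6 from the whole connect string, as the quoted diff does, would misparse later entries.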
[jira] [Commented] (ZOOKEEPER-3062) introduce fsync.warningthresholdms constant for FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566277#comment-16566277 ]

Hudson commented on ZOOKEEPER-3062:
-----------------------------------

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #131 (See [https://builds.apache.org/job/ZooKeeper-trunk/131/])
ZOOKEEPER-3062: mention fsync.warningthresholdms in FileTxnLog LOG.warn (phunt: rev 7cf8035c3a5ca05bce2d183b41bf410709a5f6ee)
* (edit) src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java
* (edit) src/java/test/org/apache/zookeeper/server/persistence/FileTxnLogTest.java

> introduce fsync.warningthresholdms constant for FileTxnLog LOG.warn message
> ---------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3062
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3062
>             Project: ZooKeeper
>          Issue Type: Task
>    Affects Versions: 3.5.4, 3.6.0, 3.4.13
>            Reporter: Christine Poerschke
>            Assignee: Christine Poerschke
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.6.0, 3.5.5, 3.4.14
>
>         Attachments: ZOOKEEPER-3062.patch, ZOOKEEPER-3062.patch
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Having the
> {code}
> fsync-ing the write ahead log in ... took ... ms which will adversely effect
> operation latency. File size is ... bytes. See the ZooKeeper troubleshooting
> guide
> {code}
> warning mention the {{fsync.warningthresholdms}} configurable property would make the property easier to discover; also, when interpreting historical vs. current logs, or logs from different ensembles, differences in configuration would be easier to spot.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
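The pattern the change introduces can be sketched as follows. Constant and method names below are illustrative, not the exact ones in FileTxnLog; the point is that naming the property once in a constant lets the warning message tell the operator exactly which knob to turn:

```java
// Sketch of the "name the property once, reuse it in the log text" pattern.
// Names here are illustrative, not ZooKeeper's actual FileTxnLog identifiers.
public class FsyncWarnSketch {
    static final String FSYNC_WARNING_THRESHOLD_MS_PROPERTY = "fsync.warningthresholdms";
    static final long DEFAULT_THRESHOLD_MS = 1000;

    // Read the configurable threshold from a system property (assumed
    // "zookeeper."-prefixed here), falling back to the default.
    static long thresholdMs =
        Long.getLong("zookeeper." + FSYNC_WARNING_THRESHOLD_MS_PROPERTY, DEFAULT_THRESHOLD_MS);

    // Returns the warning text when the fsync was slow, or null when it was
    // under the threshold and nothing should be logged.
    static String maybeWarn(long fsyncElapsedMs, long fileSizeBytes) {
        if (fsyncElapsedMs <= thresholdMs) {
            return null;
        }
        // Mentioning the property name makes the knob discoverable from the log
        // line itself, which is the motivation described in the issue.
        return "fsync-ing the write ahead log took " + fsyncElapsedMs
            + "ms which will adversely effect operation latency."
            + " File size is " + fileSizeBytes + " bytes."
            + " See the ZooKeeper troubleshooting guide"
            + " (" + FSYNC_WARNING_THRESHOLD_MS_PROPERTY + " is currently " + thresholdMs + "ms)";
    }

    public static void main(String[] args) {
        System.out.println(maybeWarn(1500, 67108864));
    }
}
```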
ZooKeeper_branch35_jdk8 - Build # 1068 - Failure
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/1068/

### ## LAST 60 LINES OF THE CONSOLE ###
[...truncated 62.63 KB...]
    [junit] Running org.apache.zookeeper.test.SaslSuperUserTest in thread 7
    [junit] Running org.apache.zookeeper.test.ServerCnxnTest in thread 3
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.106 sec, Thread: 7, Class: org.apache.zookeeper.test.SaslSuperUserTest
    [junit] Running org.apache.zookeeper.test.SessionInvalidationTest in thread 7
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.401 sec, Thread: 3, Class: org.apache.zookeeper.test.ServerCnxnTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.82 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionInvalidationTest
    [junit] Running org.apache.zookeeper.test.SessionTest in thread 3
    [junit] Running org.apache.zookeeper.test.SessionTimeoutTest in thread 7
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.441 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionTimeoutTest
    [junit] Running org.apache.zookeeper.test.SessionTrackerCheckTest in thread 7
    [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.11 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionTrackerCheckTest
    [junit] Running org.apache.zookeeper.test.SessionUpgradeTest in thread 7
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 145.339 sec, Thread: 5, Class: org.apache.zookeeper.test.RecoveryTest
    [junit] Running org.apache.zookeeper.test.StandaloneTest in thread 5
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.421 sec, Thread: 5, Class: org.apache.zookeeper.test.StandaloneTest
    [junit] Running org.apache.zookeeper.test.StatTest in thread 5
    [junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.108 sec, Thread: 3, Class: org.apache.zookeeper.test.SessionTest
    [junit] Running org.apache.zookeeper.test.StaticHostProviderTest in thread 3
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.165 sec, Thread: 5, Class: org.apache.zookeeper.test.StatTest
    [junit] Running org.apache.zookeeper.test.StringUtilTest in thread 5
    [junit] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.1 sec, Thread: 3, Class: org.apache.zookeeper.test.StaticHostProviderTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.086 sec, Thread: 5, Class: org.apache.zookeeper.test.StringUtilTest
    [junit] Running org.apache.zookeeper.test.TruncateTest in thread 5
    [junit] Running org.apache.zookeeper.test.SyncCallTest in thread 3
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 92.284 sec, Thread: 1, Class: org.apache.zookeeper.test.RestoreCommittedLogTest
    [junit] Running org.apache.zookeeper.test.WatchEventWhenAutoResetTest in thread 1
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.004 sec, Thread: 3, Class: org.apache.zookeeper.test.SyncCallTest
    [junit] Running org.apache.zookeeper.test.WatchedEventTest in thread 3
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.147 sec, Thread: 3, Class: org.apache.zookeeper.test.WatchedEventTest
    [junit] Running org.apache.zookeeper.test.WatcherFuncTest in thread 3
    [junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.971 sec, Thread: 3, Class: org.apache.zookeeper.test.WatcherFuncTest
    [junit] Running org.apache.zookeeper.test.WatcherTest in thread 3
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.42 sec, Thread: 7, Class: org.apache.zookeeper.test.SessionUpgradeTest
    [junit] Running org.apache.zookeeper.test.X509AuthTest in thread 7
    [junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.147 sec, Thread: 7, Class: org.apache.zookeeper.test.X509AuthTest
    [junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest in thread 7
    [junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.384 sec, Thread: 5, Class: org.apache.zookeeper.test.TruncateTest
    [junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest in thread 5
    [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.384 sec, Thread: 5, Class: org.apache.zookeeper.test.ZooKeeperQuotaTest
    [junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.751 sec, Thread: 1, Class: org.apache.zookeeper.test.WatchEventWhenAutoResetTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.941 sec, Thread: 7, Class: org.apache.zookeeper.test.ZkDatabaseCorruptionTest
    [junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time
[GitHub] zookeeper pull request #588: [ZOOKEEPER-3109] Avoid long unavailable time du...
GitHub user lvfangmin opened a pull request:

    https://github.com/apache/zookeeper/pull/588

    [ZOOKEEPER-3109] Avoid long unavailable time due to voter changed mind during leader election

    For more details, please check descriptions in https://issues.apache.org/jira/browse/ZOOKEEPER-3109

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3109

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zookeeper/pull/588.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #588

commit 9611393b3d4d9e1a0327a5b8bf678e526c7fc5a7
Author: Fangmin Lyu
Date:   2018-08-01T22:49:57Z

    Avoid long unavailable time due to voter changed mind when activating the leader during election

---
[jira] [Updated] (ZOOKEEPER-3109) Avoid long unavailable time due to voter changed mind when activating the leader during election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ZOOKEEPER-3109:
--------------------------------------
    Labels: pull-request-available  (was: )

> Avoid long unavailable time due to voter changed mind when activating the leader during election
> ------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3109
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3109
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: quorum, server
>    Affects Versions: 3.6.0
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.6.0
>
> Occasionally we find that it takes a long time to elect a leader, sometimes longer than 1 minute, depending on how large initLimit and tickTime are set.
>
> This exposes an issue in the leader election protocol. During leader election, before a voter goes to the LEADING/FOLLOWING state, it waits for finalizeWait time before changing its state. Depending on the order of notifications, a voter might change its mind just after voting for a server. If the server it was previously voting for has a majority of votes after counting this one, that server will go to the LEADING state. In some corner cases, the leader may end up timing out while waiting for the epoch ACK from a majority, because of the voter that changed its mind. This usually happens when there is an even number of servers in the ensemble (either because one of the servers is down, or it is being restarted and takes a long time to restart). If there are 5 servers in the ensemble, we'll find two of them in LEADING/FOLLOWING state and another two in LOOKING state, but the LOOKING servers cannot join the quorum, since they're waiting for a majority of servers to be FOLLOWING the current leader before changing to FOLLOWING as well.
>
> As far as we know, a voter will change its mind if it receives a vote from another host which has just started and begun voting for itself, or when a server takes a long time to shut down its previous ZK server and starts voting for itself as it begins leader election.
>
> Also, a follower may abandon the leader if the leader is not ready to accept learner connections when the follower tries to connect to it.
>
> To solve this issue, there are multiple options:
> 1. increase the finalizeWait time
> 2. smartly detect this state on the leader and quit earlier
>
> The 1st option is straightforward and easy to change, but it would cause longer leader election times in the common case.
>
> The 2nd option is more complex, but it can efficiently solve the problem without sacrificing performance in the common case. The leader remembers the first majority of servers voting for it and checks whether any of them changed their mind while it is waiting for the epoch ACK. The leader waits for some time before quitting the LEADING state, since one changed voter need not be a problem if a majority of voters are still voting for it.
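A rough sketch of what the second option amounts to is below. All class, field, and method names are mine for illustration; the real logic lives in ZooKeeper's leader-election code and this is only the quorum-accounting idea, not the actual implementation:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of "remember the first majority, then detect changed
// minds while waiting for the epoch ACK". Not ZooKeeper's actual code.
public class ChangedMindSketch {
    final long proposedLeader;
    final Set<Long> originalQuorum;   // sids that first gave the leader a majority
    final int ensembleSize;

    ChangedMindSketch(long proposedLeader, Set<Long> originalQuorum, int ensembleSize) {
        this.proposedLeader = proposedLeader;
        this.originalQuorum = new HashSet<>(originalQuorum);
        this.ensembleSize = ensembleSize;
    }

    // How many of the original quorum members have since switched votes.
    // currentVotes maps each sid to the leader it is voting for right now.
    int defectors(Map<Long, Long> currentVotes) {
        int n = 0;
        for (long sid : originalQuorum) {
            Long now = currentVotes.get(sid);
            if (now != null && now != proposedLeader) {
                n++;
            }
        }
        return n;
    }

    // One changed mind need not be fatal: quit LEADING only when the
    // defections actually cost the proposed leader its overall majority.
    boolean shouldQuitLeading(Map<Long, Long> currentVotes) {
        int supporting = 0;
        for (Long vote : currentVotes.values()) {
            if (vote == proposedLeader) {
                supporting++;
            }
        }
        return defectors(currentVotes) > 0 && supporting <= ensembleSize / 2;
    }

    public static void main(String[] args) {
        Map<Long, Long> votes = new HashMap<>();
        votes.put(1L, 1L); votes.put(2L, 1L); votes.put(3L, 1L);
        votes.put(4L, 2L); votes.put(5L, 2L);
        ChangedMindSketch s = new ChangedMindSketch(1L, Set.of(1L, 2L, 3L), 5);
        System.out.println("quit leading? " + s.shouldQuitLeading(votes));
    }
}
```

The design point is the last check: detecting a defector alone is not enough to abandon LEADING, since the leader may still hold a majority through other voters; quitting early only makes sense once the majority is actually gone.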
[jira] [Created] (ZOOKEEPER-3109) Avoid long unavailable time due to voter changed mind when activating the leader during election
Fangmin Lv created ZOOKEEPER-3109:
-------------------------------------
             Summary: Avoid long unavailable time due to voter changed mind when activating the leader during election
                 Key: ZOOKEEPER-3109
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3109
             Project: ZooKeeper
          Issue Type: Improvement
          Components: quorum, server
    Affects Versions: 3.6.0
            Reporter: Fangmin Lv
            Assignee: Fangmin Lv
             Fix For: 3.6.0

Occasionally we find that it takes a long time to elect a leader, sometimes longer than 1 minute, depending on how large initLimit and tickTime are set.

This exposes an issue in the leader election protocol. During leader election, before a voter goes to the LEADING/FOLLOWING state, it waits for finalizeWait time before changing its state. Depending on the order of notifications, a voter might change its mind just after voting for a server. If the server it was previously voting for has a majority of votes after counting this one, that server will go to the LEADING state. In some corner cases, the leader may end up timing out while waiting for the epoch ACK from a majority, because of the voter that changed its mind. This usually happens when there is an even number of servers in the ensemble (either because one of the servers is down, or it is being restarted and takes a long time to restart). If there are 5 servers in the ensemble, we'll find two of them in LEADING/FOLLOWING state and another two in LOOKING state, but the LOOKING servers cannot join the quorum, since they're waiting for a majority of servers to be FOLLOWING the current leader before changing to FOLLOWING as well.

As far as we know, a voter will change its mind if it receives a vote from another host which has just started and begun voting for itself, or when a server takes a long time to shut down its previous ZK server and starts voting for itself as it begins leader election.

Also, a follower may abandon the leader if the leader is not ready to accept learner connections when the follower tries to connect to it.

To solve this issue, there are multiple options:
1. increase the finalizeWait time
2. smartly detect this state on the leader and quit earlier

The 1st option is straightforward and easy to change, but it would cause longer leader election times in the common case.

The 2nd option is more complex, but it can efficiently solve the problem without sacrificing performance in the common case. The leader remembers the first majority of servers voting for it and checks whether any of them changed their mind while it is waiting for the epoch ACK. The leader waits for some time before quitting the LEADING state, since one changed voter need not be a problem if a majority of voters are still voting for it.
[jira] [Updated] (ZOOKEEPER-3109) Avoid long unavailable time due to voter changed mind when activating the leader during election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fangmin Lv updated ZOOKEEPER-3109:
----------------------------------
    Description:

Occasionally we find that it takes a long time to elect a leader, sometimes longer than 1 minute, depending on how large initLimit and tickTime are set.

This exposes an issue in the leader election protocol. During leader election, before a voter goes to the LEADING/FOLLOWING state, it waits for finalizeWait time before changing its state. Depending on the order of notifications, a voter might change its mind just after voting for a server. If the server it was previously voting for has a majority of votes after counting this one, that server will go to the LEADING state. In some corner cases, the leader may end up timing out while waiting for the epoch ACK from a majority, because of the voter that changed its mind. This usually happens when there is an even number of servers in the ensemble (either because one of the servers is down, or it is being restarted and takes a long time to restart). If there are 5 servers in the ensemble, we'll find two of them in LEADING/FOLLOWING state and another two in LOOKING state, but the LOOKING servers cannot join the quorum, since they're waiting for a majority of servers to be FOLLOWING the current leader before changing to FOLLOWING as well.

As far as we know, a voter will change its mind if it receives a vote from another host which has just started and begun voting for itself, or when a server takes a long time to shut down its previous ZK server and starts voting for itself as it begins leader election.

Also, a follower may abandon the leader if the leader is not ready to accept learner connections when the follower tries to connect to it.

To solve this issue, there are multiple options:
1. increase the finalizeWait time
2. smartly detect this state on the leader and quit earlier

The 1st option is straightforward and easy to change, but it would cause longer leader election times in the common case.

The 2nd option is more complex, but it can efficiently solve the problem without sacrificing performance in the common case. The leader remembers the first majority of servers voting for it and checks whether any of them changed their mind while it is waiting for the epoch ACK. The leader waits for some time before quitting the LEADING state, since one changed voter need not be a problem if a majority of voters are still voting for it.

    was: (an identical description, with the two options formatted as a JIRA numbered list)

> Avoid long unavailable time due to voter changed mind when activating the leader during election
>
Re: Test failures (SASL) with Java 11 - any ideas?
On Wed, Aug 1, 2018, 21:08 Patrick Hunt wrote:
> We had discussed dropping java6 as a supported platform recently. Perhaps yet another reason to move forward with that?

So if we drop java6 we can use Kerby. It shouldn't be difficult, just port the 3.5 branch config. I don't know if it is possible to drop java6 in a point release.

Enrico

> Patrick
>
> On Sun, Jul 22, 2018 at 8:13 PM Rakesh Radhakrishnan wrote:
>
> > > Do you know why 3.4 is not using kerby?
> >
> > In short, Kerby was failing with java-6. Please refer to the jira:
> > https://jira.apache.org/jira/browse/ZOOKEEPER-2689
> >
> > "ZooKeeper runs in Java, release 1.6 or greater (JDK 6 or greater)."
> > https://zookeeper.apache.org/doc/r3.4.13/zookeeperAdmin.html
> >
> > Rakesh
> >
> > On Sat, Jul 21, 2018 at 9:06 PM, Enrico Olivelli wrote:
> > >
> > > On Sat, Jul 21, 2018, 17:17 Patrick Hunt wrote:
> > >
> > >> On Sat, Jul 21, 2018 at 1:21 AM Enrico Olivelli wrote:
> > >>
> > >> > On Sat, Jul 21, 2018, 09:22 Patrick Hunt wrote:
> > >> >
> > >> > > Interestingly I don't see the auth tests that are failing in 3.4 failing on trunk (they pass); instead a number of tests fail with "Address already in use"
> > >> > > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk-java11/3/#showFailuresLink
> > >> > >
> > >> > > 3.4 is using
> > >> > > value="2.0.0-M15"/>
> > >> > > value="1.0.0-M20"/>
> > >> > > while trunk moved to kerby, wonder if that could be it (sasl fails at least)?
> > >> >
> > >> > True.
> > >> > I can't find any report of errors about Kerby + jdk11 by googling a little.
> > >> > Maybe we are the first :)
> > >>
> > >> To be clear it looks like kerby (master) is working while directory (3.4) is not - perhaps we need to update?
> > >
> > > Do you know why 3.4 is not using kerby?
> > >
> > > Enrico
> >
> >> Patrick
> >>
> >> > I did not start to run extensive tests of my applications on jdk11, I will start next week.
> >> >
> >> > While switching from 8 to 10 I had problems with a bunch of fixes to the Kerberos implementation in Java which made tests not work in testing environments due to stricter checks about the env.
> >> >
> >> > Enrico
> >> >
> >> > > Patrick
> >> > >
> >> > > On Sat, Jul 21, 2018 at 12:02 AM Patrick Hunt wrote:
> >> > >
> >> > > > Thanks Enrico. Possible. However afaict Jenkins is running build 19 (build 11-ea+19) and I didn't notice anything obvious in the notes for 19+ related to sasl/kerb.
> >> > > >
> >> > > > Patrick
> >> > > >
> >> > > > On Fri, Jul 20, 2018 at 11:48 PM Enrico Olivelli wrote:
> >> > > >
> >> > > >> In java11 there are a bunch of news about Kerberos, maybe it is related:
> >> > > >> http://jdk.java.net/11/release-notes
> >> > > >>
> >> > > >> My 2 cents
> >> > > >> Enrico
> >> > > >>
> >> > > >> On Sat, Jul 21, 2018, 08:03 Patrick Hunt wrote:
> >> > > >>
> >> > > >> > Hey folks, I added a couple Jenkins jobs based on Java 11 which is set to release in September. Jenkins is running a pre-release:
> >> > > >> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_java11/
> >> > > >> >
> >> > > >> > java version "11-ea" 2018-09-25
> >> > > >> > Java(TM) SE Runtime Environment 18.9 (build 11-ea+19)
> >> > > >> > Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11-ea+19, mixed mode)
> >> > > >> >
> >> > > >> > Anyone have insight into what's failing here?
> > >> > > >> > > > >> > > >> > 2018-07-20 14:39:27,126 [myid:2] - ERROR > > >> > > >> > [QuorumConnectionThread-[myid=2]-3:QuorumCnxManager@268] - > > >> > Exception > > >> > > >> > while connecting, id: [0, localhost/127.0.0.1:11223], addr: > > {}, > > >> > > >> > closing learner connection > > >> > > >> > javax.security.sasl.SaslException: An error: > > >> > > >> > (java.security.PrivilegedActionException: > > >> > > >> > javax.security.sasl.SaslException: GSS initiate failed > [Caused > > >> by > > >> > > >> > GSSException: No valid credentials provided (Mechanism level: > > >> > Message > > >> > > >> > stream modified (41) - Message stream modified)]) occurred > when > > >> > > >> > evaluating Zookeeper Quorum Member's received SASL token. > > >> > > >> > > > >> > > >> > ... > > >> > > >> > > > >> > > >> > Entered Krb5Context.initSecContext with state=STATE_NEW > > >> > > >> > Found ticket for lear...@example.com to go to > > >> > > >> > krbtgt/example@example.com expiring on Sat Jul 21 > 14:39:01 > >
[jira] [Commented] (ZOOKEEPER-3082) Fix server snapshot behavior when out of disk space
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565860#comment-16565860 ]

Brian Nixon commented on ZOOKEEPER-3082:
----------------------------------------
[~andorm] my (possibly incorrect) read on ZOOKEEPER-1621 is that the issue is related to this one but not strictly a subset. Here we've removed the possibility of the snapshot side of recovery being lost during a disk-full event. There, the issue seems to be in ensuring the transaction log side of recovery is not corrupted by writing empty/incomplete log files. That issue will continue to be present even with the patch from this issue applied.

> Fix server snapshot behavior when out of disk space
> ---------------------------------------------------
>
>                 Key: ZOOKEEPER-3082
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3082
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.0, 3.4.12, 3.5.5
>            Reporter: Brian Nixon
>            Assignee: Brian Nixon
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.6.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When the ZK server tries to make a snapshot and the machine is out of disk space, the snapshot creation fails and throws an IOException. An empty snapshot file is created (probably because the server is able to create an entry in the dir), but the server is not able to write to the file.
>
> If snapshot creation fails, the server commits suicide. When it restarts, it will do so from the last known good snapshot. However, when it tries to make a snapshot again, the same thing happens. This results in lots of empty snapshot files being created. If eventually the DataDirCleanupManager garbage collects the good snapshot files then only the empty files remain. At this point, the server is well and truly screwed.
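The underlying hazard described here, an empty snapshot file left behind by a failed write, is commonly avoided with a write-to-temp, fsync, then atomic-rename pattern. The sketch below illustrates that general technique; it is not ZooKeeper's actual FileSnap code, and the real fix may differ in detail:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Write-to-temp + fsync + atomic rename: if any step throws (e.g. an
// IOException on a full disk), the half-written temp file is removed and a
// file with the final snapshot name never exists in a partial state.
public class AtomicSnapshotSketch {
    static void writeSnapshot(Path finalPath, byte[] data) throws IOException {
        Path tmp = finalPath.resolveSibling(finalPath.getFileName() + ".tmp");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);            // may throw when the disk is full
            }
            ch.force(true);               // fsync: data is durable before rename
        } catch (IOException e) {
            Files.deleteIfExists(tmp);    // never leave a corrupt temp file behind
            throw e;
        }
        // Atomic on POSIX filesystems: readers see either no file or a complete one.
        Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snap");
        Path snap = dir.resolve("snapshot.1");
        writeSnapshot(snap, "snapshot-bytes".getBytes());
        System.out.println(Files.size(snap));
    }
}
```

With this pattern, a crash or disk-full event during the write leaves only the previous good snapshot visible, so recovery never has to distinguish good snapshots from empty ones.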
[GitHub] zookeeper issue #566: ZOOKEEPER-3062: mention fsync.warningthresholdms in Fi...
Github user phunt commented on the issue: https://github.com/apache/zookeeper/pull/566 lgtm. +1, thanks @cpoerschke . Perhaps consider logging the value during startup (initial read of the value) instead? ---
[jira] [Resolved] (ZOOKEEPER-3062) mention fsync.warningthresholdms in FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt resolved ZOOKEEPER-3062.
-------------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed

LGTM. Thanks [~cpoerschke]!
[jira] [Updated] (ZOOKEEPER-3062) introduce fsync.warningthresholdms constant for FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-3062:
------------------------------------
    Summary: introduce fsync.warningthresholdms constant for FileTxnLog LOG.warn message  (was: mention fsync.warningthresholdms in FileTxnLog LOG.warn message)
[jira] [Updated] (ZOOKEEPER-3062) mention fsync.warningthresholdms in FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-3062:
------------------------------------
    Fix Version/s: 3.4.14
                   3.5.5
                   3.6.0
[jira] [Updated] (ZOOKEEPER-3062) mention fsync.warningthresholdms in FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-3062:
------------------------------------
    Affects Version/s: 3.6.0
                       3.5.4
                       3.4.13
[jira] [Assigned] (ZOOKEEPER-3062) mention fsync.warningthresholdms in FileTxnLog LOG.warn message
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-3062:
---------------------------------------
    Assignee: Christine Poerschke
[GitHub] zookeeper pull request #566: ZOOKEEPER-3062: mention fsync.warningthresholdm...
Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/566 ---
Re: Trying to find pattern in Flaky Tests
Looks like 16808 has been resolved - I haven't noticed it after the recent changes.

Note that INFRA recently added openjdk10 to Jenkins and I added a job or two
which seem to be working OK. Java 11 is failing on 3.4 due to broken libraries
(according to Rakesh on another thread), but we're also seeing failures on
trunk which are unrelated to that issue. Perhaps someone can take a look?

Patrick

On Tue, Jul 24, 2018 at 3:49 PM Patrick Hunt wrote:

> FYI, there's also this which I just reported:
> https://issues.apache.org/jira/browse/INFRA-16808
>
> Patrick
>
> On Fri, Jul 20, 2018 at 12:01 AM Patrick Hunt wrote:
>
>> Something that's significantly different about the 3.4 and 3.5/master
>> Jenkins jobs is that 3.5/master has
>>
>> test.junit.threads=8
>>
>> set, while this is not supported in 3.4 (see build.xml). It's very likely
>> that the parallelization of the tests is causing the discrepancy.
>>
>> Setting threads > 1 significantly improves the speed of the jobs; that's
>> why it was originally added to 3.5+.
>> See a358280fb2b3cc7852cded3fe67769765a519beb
>>
>> Perhaps we should try one/more of the 3.5/master jobs with threads=1 and
>> see?
>>
>> Patrick
>>
>> On Thu, Jul 19, 2018 at 1:26 PM Molnár Andor wrote:
>>
>>> Sorry guys for this awful email. Looks like Apache converted my nicely
>>> illustrated email into plain text. :(
>>>
>>> Maybe I could attach the test reports as images, but I think you already
>>> got the idea.
>>>
>>> Andor
>>>
>>> On 07/18/2018 05:42 PM, Andor Molnar wrote:
>>> > Hi,
>>> >
>>> > *branch-3.4*
>>> >
>>> > I've taken a quick look at our Jenkins builds and in terms of flaky tests,
>>> > it looks like branch-3.4 is in pretty good shape. The build hasn't failed
>>> > for 5-6 days on all JDKs, which I think is pretty awesome.
>>> >
>>> > *branch-3.5*
>>> >
>>> > This branch is in very bad condition, which is quite unfortunate given
>>> > we're in the middle of stabilising it.
:)
>>> > Especially on JDK8, last successful build was 11 days ago. JDK9 (50%
>>> > failing) and JDK10 (30% failing) are looking better in the last 10 builds.
>>> >
>>> > Interestingly (apart from a few quite rare ones) it looks like there's
>>> > only one test which is quite nasty on this branch:
>>> > testManyChildWatchersAutoReset
>>> >
>>> > There's a Jira about fixing it, and a fix has been merged by increasing
>>> > the timeout of the test, but a bug on the branch possibly causes the
>>> > test to fail even with the 10 min timeout.
>>> >
>>> > I wasn't able to repro the failing test on my machine (Mac and CentOS7);
>>> > it always finished in 30-40 seconds maximum. On the Jenkins slaves it
>>> > shows the following:
>>> >
>>> > *JDK 8:*
>>> >
>>> > Report creation timed out.
>>> >
>>> > *JDK 9:*
>>> >
>>> > New failures: testManyChildWatchersAutoReset durations in seconds, per
>>> > ZooKeeper_branch35_java9 build (the ~600s entries hit the 10 min
>>> > timeout; per-build reports at
>>> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/<build>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset):
>>> >
>>> >   build 351:  45.604
>>> >   build 350: 600.337
>>> >   build 349:  21.904
>>> >   build 348: 583.063
>>> >   build 347: 600.325
>>> >   build 346: 600.383
>>> >   build 345: 600.362
>>> >   build 344:  21.139
>>> >   build 343:  24.031
Re: Test failures (SASL) with Java 11 - any ideas?
We had discussed dropping java6 as a supported platform recently. Perhaps yet
another reason to move forward with that?

Patrick

On Sun, Jul 22, 2018 at 8:13 PM Rakesh Radhakrishnan wrote:

> Do you know why 3.4 is not using kerby?
>
> In short, Kerby was failing with java-6. Please refer jira:
> https://jira.apache.org/jira/browse/ZOOKEEPER-2689
>
> "ZooKeeper runs in Java, release 1.6 or greater (JDK 6 or greater)."
> https://zookeeper.apache.org/doc/r3.4.13/zookeeperAdmin.html
>
> Rakesh
>
> On Sat, Jul 21, 2018 at 9:06 PM, Enrico Olivelli wrote:
>
>> On Sat, Jul 21, 2018 at 5:17 PM Patrick Hunt wrote:
>>
>>> On Sat, Jul 21, 2018 at 1:21 AM Enrico Olivelli wrote:
>>>
>>>> On Sat, Jul 21, 2018 at 9:22 AM Patrick Hunt wrote:
>>>>
>>>>> Interestingly I don't see the auth tests that are failing in 3.4
>>>>> failing on trunk (they pass), instead a number of tests fail with
>>>>> "Address already in use"
>>>>>
>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk-java11/3/#showFailuresLink
>>>>>
>>>>> 3.4 is using
>>>>>     value="2.0.0-M15"/>
>>>>>     value="1.0.0-M20"/>
>>>>> while trunk moved to kerby, wonder if that could be it (sasl fails at
>>>>> least)?
>>>>
>>>> True.
>>>> I can't find any report of errors about Kerby + jdk11 by googling a
>>>> little. Maybe we are the first :)
>>>
>>> To be clear it looks like kerby (master) is working while directory
>>> (3.4) is not - perhaps we need to update?
>>
>> Do you know why 3.4 is not using kerby?
>>
>> Enrico
>>
>>> Patrick
>>>
>>>> I did not start to run extensive tests of my applications on jdk11, I
>>>> will start next week.
>>>>
>>>> While switching from 8 to 10 I had problems with a bunch of fixes on
>>>> Kerberos impl in java which made tests not work on testing
>>>> environments due to stricter check about the env
>>>>
>>>> Enrico
>>>>
>>>>> Patrick
>>>>>
>>>>> On Sat, Jul 21, 2018 at 12:02 AM Patrick Hunt wrote:
>>>>>
>>>>>> Thanks Enrico. Possible. However afaict Jenkins is running build 19
>>>>>> (build 11-ea+19) and I didn't notice anything obvious in the notes
>>>>>> for 19+ related to sasl/kerb.
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>> On Fri, Jul 20, 2018 at 11:48 PM Enrico Olivelli
>>>>>> <eolive...@gmail.com> wrote:
>>>>>>
>>>>>>> In java11 there are a bunch of news about Kerberos, maybe it is
>>>>>>> related
>>>>>>>
>>>>>>> http://jdk.java.net/11/release-notes
>>>>>>>
>>>>>>> My 2 cents
>>>>>>> Enrico
>>>>>>>
>>>>>>> On Sat, Jul 21, 2018 at 8:03 AM Patrick Hunt wrote:
>>>>>>>
>>>>>>>> Hey folks, I added a couple Jenkins jobs based on Java 11 which
>>>>>>>> is set to release in September. Jenkins is running a pre-release
>>>>>>>>
>>>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_java11/
>>>>>>>>
>>>>>>>> java version "11-ea" 2018-09-25
>>>>>>>> Java(TM) SE Runtime Environment 18.9 (build 11-ea+19)
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11-ea+19, mixed mode)
>>>>>>>>
>>>>>>>> Anyone have insight into what's failing here?
>>>>>>>>
>>>>>>>> 2018-07-20 14:39:27,126 [myid:2] - ERROR
>>>>>>>> [QuorumConnectionThread-[myid=2]-3:QuorumCnxManager@268] - Exception
>>>>>>>> while connecting, id: [0, localhost/127.0.0.1:11223], addr: {},
>>>>>>>> closing learner connection
>>>>>>>> javax.security.sasl.SaslException: An error:
>>>>>>>> (java.security.PrivilegedActionException:
>>>>>>>> javax.security.sasl.SaslException: GSS initiate failed [Caused by
>>>>>>>> GSSException: No valid credentials provided (Mechanism level: Message
>>>>>>>> stream modified (41) - Message stream modified)]) occurred when
>>>>>>>> evaluating Zookeeper Quorum Member's received SASL token.
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> Entered Krb5Context.initSecContext with state=STATE_NEW
>>>>>>>> Found ticket for lear...@example.com to go to
>>>>>>>> krbtgt/example@example.com expiring on Sat Jul 21 14:39:01 UTC 2018
>>>>>>>> Service ticket not found in the subject
>>>>>>>> 2018-07-20 14:39:27,127 [myid:0] - ERROR
>>>>>>>> [QuorumConnectionThread-[myid=0]-2:SaslQuorumAuthServer@133] - Failed
>>>>>>>> to authenticate using SASL
>>>>>>>>
>>>>>>>> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch34_java11/2/#showFailuresLink
>>>>>>>>
>>>>>>>> Patrick
[jira] [Commented] (ZOOKEEPER-3108) deprecated myid file and use a new property "server.id" in the zoo.cfg
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565814#comment-16565814 ]

Brian Nixon commented on ZOOKEEPER-3108:
----------------------------------------

This seems like a good idea to me (provided myid files are still supported) to
give admins a bit more flexibility.

One reason I can think of to keep using a separate myid file is that the
server id is the one property guaranteed to be unique for a given peer across
the ensemble. All other properties and jvm flags may be identical across every
instance. This makes reasoning about configuration files very easy - one
simply propagates the same file everywhere and no custom logic is needed when
comparing them.

Here's a link to an old discussion around myid ->
http://zookeeper-user.578899.n2.nabble.com/The-idea-behind-myid-td3711269.html

> deprecated myid file and use a new property "server.id" in the zoo.cfg
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3108
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3108
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.5.0
>            Reporter: maoling
>            Assignee: maoling
>            Priority: Major
>
> When using zk in distributed mode, we need to touch a myid file in dataDir
> and then write a unique number to it. This is inconvenient and not
> user-friendly. Look at an example from another distributed system such as
> kafka: it just uses broker.id=0 in server.properties to identify a unique
> server node. This issue proposes to abandon the myid file and use a new
> property such as server.id=0 in zoo.cfg. The fix will be applied to the
> master branch and branch-3.5+, keeping branch-3.4 unchanged.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
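For context, here is a minimal before/after sketch of what the ZOOKEEPER-3108 proposal would change. The `server.id` property is only the proposal in this issue, not an existing ZooKeeper option, and the hostnames are made up:

```
# Today: every peer can share the exact same zoo.cfg; the per-peer identity
# lives in a separate dataDir/myid file containing just the id, e.g. "1".
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888

# Proposed (hypothetical): the identity moves into zoo.cfg itself, so no
# myid file is needed, but the file now differs per peer.
server.id=1
```

This illustrates the trade-off in Brian's comment: folding the id into zoo.cfg means the config file is no longer byte-identical across all peers.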
[GitHub] zookeeper pull request #587: ZOOKEEPER-3106: Zookeeper client supports IPv6 ...
Github user enixon commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/587#discussion_r206983579

    --- Diff: src/java/main/org/apache/zookeeper/client/ConnectStringParser.java ---
    @@ -68,14 +69,26 @@ public ConnectStringParser(String connectString) {
             List hostsList = split(connectString,",");
             for (String host : hostsList) {
                 int port = DEFAULT_PORT;
    -            int pidx = host.lastIndexOf(':');
    -            if (pidx >= 0) {
    -                // otherwise : is at the end of the string, ignore
    -                if (pidx < host.length() - 1) {
    -                    port = Integer.parseInt(host.substring(pidx + 1));
    -                }
    -                host = host.substring(0, pidx);
    +            if (!connectString.startsWith("[")) {//IPv4
    +                int pidx = host.lastIndexOf(':');
    +                if (pidx >= 0) {
    +                    // otherwise : is at the end of the string, ignore
    +                    if (pidx < host.length() - 1) {
    +                        port = Integer.parseInt(host.substring(pidx + 1));
    +                    }
    +                    host = host.substring(0, pidx);
    +                }
    +            } else {//IPv6
    +                int pidx = host.lastIndexOf(':');
    +                int bracketIdx = host.lastIndexOf(']');
    +                if (pidx >= 0 && bracketIdx >= 0 && pidx > bracketIdx) {
    +                    if (pidx < host.length() - 1) {
    +                        port = Integer.parseInt(host.substring(pidx + 1));
    +                    }
    +                    host = host.substring(0, pidx);
    +                }
    --- End diff --

    nit - you've added tabs with your whitespace

---
[GitHub] zookeeper pull request #587: ZOOKEEPER-3106: Zookeeper client supports IPv6 ...
Github user enixon commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/587#discussion_r206987004

    --- Diff: src/java/main/org/apache/zookeeper/client/ConnectStringParser.java ---
    @@ -68,14 +69,26 @@ public ConnectStringParser(String connectString) {
             List hostsList = split(connectString,",");
             for (String host : hostsList) {
                 int port = DEFAULT_PORT;
    -            int pidx = host.lastIndexOf(':');
    -            if (pidx >= 0) {
    -                // otherwise : is at the end of the string, ignore
    -                if (pidx < host.length() - 1) {
    -                    port = Integer.parseInt(host.substring(pidx + 1));
    -                }
    -                host = host.substring(0, pidx);
    +            if (!connectString.startsWith("[")) {//IPv4
    +                int pidx = host.lastIndexOf(':');
    +                if (pidx >= 0) {
    +                    // otherwise : is at the end of the string, ignore
    +                    if (pidx < host.length() - 1) {
    +                        port = Integer.parseInt(host.substring(pidx + 1));
    +                    }
    +                    host = host.substring(0, pidx);
    +                }
    +            } else {//IPv6
    --- End diff --

    purely selfish request - could you add an example to this comment? something
    like // IPv6 e.g. [2001:db8:1::242:ac11:2]:1234. Having that on hand made
    reasoning about the string parsing logic much easier for me.

---
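The bracket-aware parsing under review can be tried out in isolation. The sketch below reproduces the logic from the diff in a standalone class; the names `HostPortSketch` and `parseHostPort` are illustrative, not the actual ConnectStringParser API:

```java
// Standalone sketch of the host:port parsing logic discussed in the review.
// An IPv6 literal is bracketed, so the last ':' names a port only if it
// comes after the closing ']'.
public class HostPortSketch {
    static final int DEFAULT_PORT = 2181;

    /** Returns {host, port} for entries like "127.0.0.1:2888" or "[2001:db8:1::242:ac11:2]:1234". */
    static Object[] parseHostPort(String entry) {
        int port = DEFAULT_PORT;
        String host = entry;
        if (entry.startsWith("[")) { // IPv6 e.g. [2001:db8:1::242:ac11:2]:1234
            int bracketIdx = entry.lastIndexOf(']');
            int pidx = entry.lastIndexOf(':');
            // a port suffix exists only when the last ':' follows the ']'
            if (bracketIdx >= 0 && pidx > bracketIdx) {
                if (pidx < entry.length() - 1) { // ignore a trailing ':'
                    port = Integer.parseInt(entry.substring(pidx + 1));
                }
                host = entry.substring(0, pidx);
            }
        } else { // IPv4 or hostname
            int pidx = entry.lastIndexOf(':');
            if (pidx >= 0) {
                if (pidx < entry.length() - 1) { // ignore a trailing ':'
                    port = Integer.parseInt(entry.substring(pidx + 1));
                }
                host = entry.substring(0, pidx);
            }
        }
        return new Object[] { host, port };
    }

    public static void main(String[] args) {
        Object[] a = parseHostPort("127.0.0.1:2888");
        Object[] b = parseHostPort("[2001:db8:1::242:ac11:2]:1234");
        Object[] c = parseHostPort("[::1]");
        System.out.println(a[0] + " " + a[1]); // 127.0.0.1 2888
        System.out.println(b[0] + " " + b[1]); // [2001:db8:1::242:ac11:2] 1234
        System.out.println(c[0] + " " + c[1]); // [::1] 2181
    }
}
```

Note that, like the PR code, this keeps the square brackets on the returned host; stripping them (or checking each host entry rather than the whole connect string for the leading `[`) is one of the follow-ups the review hints at.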
[GitHub] zookeeper pull request #:
Github user maoling commented on the pull request:

    https://github.com/apache/zookeeper/commit/a2623a625a4778720f7d5482d0a66e9b37ae556f#commitcomment-29922823

    @nkalmar Thanks for the nice explanation. Could these security problems
    also exist in JMX and Jetty?

---
[GitHub] zookeeper pull request #:
Github user nkalmar commented on the pull request:

    https://github.com/apache/zookeeper/commit/a2623a625a4778720f7d5482d0a66e9b37ae556f#commitcomment-29917129

    @maoling, the problem is that there is no security implemented. Any user
    who can access ZooKeeper can send commands to the ensemble. While all 4lw
    (four-letter-word) commands are read-only, some take quite some time, so a
    DoS attack is actually possible. So they have been deemed insecure and
    deprecated, as far as I know.

    Originally I implemented the 4lw command for this PR, but it was suggested
    that I remove it from 3.6 and 3.5.

---