Re: Zookeeper at Twitter by Michael
Congrats, Michael. Really cool blog post. :) On Mon, Oct 15, 2018 at 05:26, Norbert Kalmar wrote: > Good blog post! > Can't wait for the PRs ;) > I'm very positive about ZooKeeper's future (heard lot of talks lately about > etcd overthrowing ZooKeeper - no chance :) ) > > Norbert > > On Sun, Oct 14, 2018 at 10:39 PM Andor Molnár wrote: > > > Great stuff Michael! > > > > > > Andor > > > > > > > > On 10/14/2018 12:26 AM, Enrico Olivelli wrote: > > > Hi Michael, > > > I just stepped into this very interesting post ! > > > > > > > > > https://blog.twitter.com/engineering/en_us/topics/infrastructure/2018/zookeeper-at-twitter.html > > > > > > Thank you > > > Enrico > > > > >
Re: improving tolerance to network failures
There have been several comments on the document. I will be porting discussions from the document back to the mailing list each day. Alex Shraer makes a good point that with the design as stated, there is no provision for dealing with the rebalancing of client connections during dynamic reconfiguration. I am very curious whether this needs to be addressed in the design since it seems that if connections are redirected, the same connection logic should apply. I suppose the text needs an update, regardless, even if there is no effect. But is there something I missed here? Will there be a code effect? Another comment points out that if you don't have symmetrical hardware for the servers (i.e. more network interfaces on some), then client connections are likely to be more numerous on servers with more network connections. This is undoubtedly true. I have a question, however, about this. Is this situation actually important enough to make the first version of this change? My own experience is that production settings typically involve Zookeeper servers with very consistent hardware where this would not be an issue. What experience do others have, particularly in production situations? On 2018/10/23 02:02:12, Ted Dunning wrote: > ... > I have started a collaborative document to work on the design approach. > Once that is judged by the community to be sufficiently mature, I will move > it to a JIRA. > > That document is at > https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing > > The design document is currently open to the world for commenting so that > anybody can suggest changes or ask questions. I will act as a bit of a > moderator so that the document can remain completely open. >
Re: improving tolerance to network failures
>> Will there be a code effect? There will be: the current rebalancing algorithm will break unless StaticHostProvider.updateServerList is changed to make it aware that multiple server addresses can belong to the same server. For example, currently if we add a new server through reconfig, the rebalance will kick in. In the new proposal, if we add a new address to an existing server and no code change is made to updateServerList, the rebalance will also kick in, but it should not, as in this case no new real servers are added. >> My own experience is that production settings typically involve Zookeeper servers with very consistent hardware where this would not be an issue. I think this is generally true, but we should consider cases where a user is upgrading hardware, which might take a while, and during this time it would be ideal if ZK offered the capability of balancing client connections across an ensemble with heterogeneous hardware. As a user myself, I'd like to have this feature, especially considering that it does not seem hard to implement. What Alex proposed should work. Another approach might be to assign weights to each address (a single server has weight one), which reduces this to a weighted random selection problem. Overall, I think this proposal has little impact on the server side; most of the impact is on the client side. On Tue, Oct 23, 2018 at 9:34 AM Ted Dunning wrote: > There have been several comments on the document. I will be porting > discussions from the document back to the mailing list each day. > > Alex Shraer makes a good point that with the design as stated, there is no > provision for dealing with the rebalancing of client connections during > dynamic reconfiguration. I am very curious whether this needs to be > addressed in the design since it seems that if connections are redirected, > the same connection logic should apply. I suppose the text needs an update, > regardless, even if there is no effect. But is there something I missed > here? 
Will there be a code effect? > > Another comment points out that if you don't have symmetrical hardware for > the servers (i.e. more network interfaces on some), then client connections > are likely to be more numerous on servers with more network connections. > This is undoubtedly true. > > I have a question, however, about this. Is this situation actually > important enough to make the first version of this change? My own > experience is that production settings typically involve Zookeeper servers > with very consistent hardware where this would not be an issue. > > What experience do others have, particularly in production situations? > > On 2018/10/23 02:02:12, Ted Dunning wrote: > > ... > > I have started a collaborative document to work on the design approach. > > Once that is judged by the community to be sufficiently mature, I will > move > > it to a JIRA. > > > > That document is at > > > https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing > > > > The design document is currently open to the world for commenting so that > > anybody can suggest changes or ask questions. I will act as a bit of a > > moderator so that the document can remain completely open. > > >
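The weighted-selection idea from the message above can be sketched as follows. This is only an illustration under stated assumptions, not the StaticHostProvider code: the class `WeightedServerPicker` and its methods are hypothetical, and the sketch assumes that each server's weight of one is split evenly across its addresses, so that a server reachable through several addresses is no more likely to be picked than a server with a single address.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper (not part of ZooKeeper): weighted random choice over server addresses. */
final class WeightedServerPicker {
    private final List<String> addresses = new ArrayList<>();
    private final List<Double> cumulative = new ArrayList<>();
    private double total = 0.0;

    /** @param serverToAddresses maps a server id to every address that reaches that server */
    WeightedServerPicker(Map<String, List<String>> serverToAddresses) {
        for (List<String> addrs : serverToAddresses.values()) {
            if (addrs.isEmpty()) {
                continue;
            }
            // Assumption: one unit of weight per server, shared equally by its addresses.
            double weightPerAddress = 1.0 / addrs.size();
            for (String addr : addrs) {
                total += weightPerAddress;
                addresses.add(addr);
                cumulative.add(total);
            }
        }
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("no addresses");
        }
    }

    /** Picks an address with probability proportional to its weight. */
    String pick(Random random) {
        double r = random.nextDouble() * total;
        for (int i = 0; i < cumulative.size(); i++) {
            if (r < cumulative.get(i)) {
                return addresses.get(i);
            }
        }
        // Guard against floating-point rounding at the upper edge.
        return addresses.get(addresses.size() - 1);
    }
}
```

With this weighting, adding a second address to an existing server changes which address a client lands on but should not shift load between servers, which is the property the rebalancing discussion is concerned with.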
Re: improving tolerance to network failures
Michael, I wouldn't characterize the current proposal as broken so much as it talks about connection balancing rather than server balancing. Other than that, I think I agree with what you are saying. So we have two folks with a feeling that server balancing from the client side is significantly better than connection balancing. I had thought that this would be desirable to defer in the interest of code simplicity. That may not be the right balance. The point about hardware upgrades is a very good one. On Tue, Oct 23, 2018 at 10:21 AM Michael Han wrote: > >> Will there be a code effect? > > There will be - the current rebalancing algorithm will be broken if no code > is done to StaticHostProvider.updateServerList to teach it aware of > multiple server addresses belong to the same server. For example, currently > if we add a new server through reconfig, the rebalance will kick in. In the > new proposal, if we add a new address to the existing server, if no code > change made to updateServerList, the rebalance will also kick in but it > should not, as in this case no new real servers are added. > > >> My own experience is that production settings typically involve > Zookeeper servers with very consistent hardware where this would not be an > issue. > > I think this is generally true, but we should consider cases where user is > upgrading hardware, which might take a while and during this time it would > be ideal if ZK offer the capability of balanced client connections across > ensemble with heterogeneous hardwares. As a user myself, I'd like to have > this feature, especially consider it seems not hard to implement. What Alex > proposed should work. Another approach might be to assign weights to each > address (a single server has weight one), and this will reduce to a > weighted random selection problem. > > Overall, I think this proposal has little impact on server side, most > impact is on client side. 
> > > On Tue, Oct 23, 2018 at 9:34 AM Ted Dunning wrote: > > > There have been several comments on the document. I will be porting > > discussions from the document back to the mailing list each day. > > > > Alex Shraer makes a good point that with the design as stated, there is > no > > provision for dealing with the rebalancing of client connections during > > dynamic reconfiguration. I am very curious whether this needs to be > > addressed in the design since it seems that if connections are > redirected, > > the same connection logic should apply. I suppose the text needs an > update, > > regardless, even if there is no effect. But is there something I missed > > here? Will there be a code effect? > > > > Another comment points out that if you don't have symmetrical hardware > for > > the servers (i.e. more network interfaces on some), then client > connections > > are likely to be more numerous on servers with more network connections. > > This is undoubtedly true. > > > > I have a question, however, about this. Is this situation actually > > important enough to make the first version of this change? My own > > experience is that production settings typically involve Zookeeper > servers > > with very consistent hardware where this would not be an issue. > > > > What experience do others have, particularly in production situations? > > > > On 2018/10/23 02:02:12, Ted Dunning wrote: > > > ... > > > I have started a collaborative document to work on the design approach. > > > Once that is judged by the community to be sufficiently mature, I will > > move > > > it to a JIRA. > > > > > > That document is at > > > > > > https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing > > > > > > The design document is currently open to the world for commenting so > that > > > anybody can suggest changes or ask questions. I will act as a bit of a > > > moderator so that the document can remain completely open. > > > > > >
[GitHub] zookeeper issue #665: [ZOOKEEPER-3163] Use session map in the Netty to impro...
Github user lvfangmin commented on the issue: https://github.com/apache/zookeeper/pull/665 @maoling, we can port this back to 3.4 in the same Jira, I'll send out a PR separately for that. ---
[GitHub] zookeeper pull request #673: [ZOOKEEPER-3177] Refactor request throttle logi...
Github user lvfangmin commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/673#discussion_r227532671 --- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java --- @@ -1107,6 +1102,19 @@ public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOE BinaryInputArchive bia = BinaryInputArchive.getArchive(bais); RequestHeader h = new RequestHeader(); h.deserialize(bia, "header"); + +// Need to increase the outstanding request count first, otherwise +// there might be a race condition that it enabled recv after +// processing request and then disabled when check throttling. +// +// It changes the semantic a bit, since when check throttling it's --- End diff -- @eolivelli I'll try to rephrase it, meanwhile please comment if you have any suggestion on how to rephrase this? ---
[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...
Github user lvfangmin commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/632#discussion_r227532976 --- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/command/HashCommand.java --- @@ -0,0 +1,49 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.command; + +import java.io.PrintWriter; +import java.util.List; + +import org.apache.zookeeper.server.DataTree.ZxidDigest; +import org.apache.zookeeper.server.ServerCnxn; + +/** + * Command used to dump the latest digest histories. + */ +public class HashCommand extends AbstractFourLetterCommand { --- End diff -- That seems more consistent, will do. ---
[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...
Github user lvfangmin commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/632#discussion_r227533265 --- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/util/AdHash.java --- @@ -0,0 +1,84 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.util; + +/** + * This incremental hash is used to keep track of the hash of + * the data tree to that we can quickly validate that things + * are in sync. + * + * See the excellent paper: A New Paradigm for collision-free hashing: + * Incrementality at reduced cost, M. Bellare and D. Micciancio + */ +public class AdHash { --- End diff -- I'd like to keep it as is, since it's consistent with the name in the paper. ---
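For readers unfamiliar with the incremental hashing idea mentioned in the class comment, here is a minimal sketch. It is not the `AdHash` class from the pull request: the class name `AdditiveHashSketch`, the method names, and the use of CRC32 as a per-node digest are all assumptions made for illustration (CRC32 is only a stand-in and is not collision resistant). The point it shows is that when per-node digests are combined by addition, a change to one node can be applied by removing the old digest and adding the new one, without rehashing the whole tree.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative additive incremental hash; hypothetical, not the class under review. */
final class AdditiveHashSketch {
    private long hash = 0L;

    /** Stand-in digest of a single node (path plus data). */
    static long digestOf(String path, byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(path.getBytes(StandardCharsets.UTF_8));
        crc.update(data);
        return crc.getValue();
    }

    /** Called when a node is created; long addition wraps around modulo 2^64. */
    void add(long digest) {
        hash += digest;
    }

    /** Called when a node is deleted, or before re-adding it with new data. */
    void remove(long digest) {
        hash -= digest;
    }

    long value() {
        return hash;
    }
}
```

Because addition is commutative, two replicas that apply the same set of node changes in different orders end up with the same value, so comparing values is a cheap way to check that they are in sync.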
Re: [VOTE] Maven migration - separation of java files to server and client project?
+1 (for keeping server, client and common together). We can do the client / server split separately. Interestingly, there was an outstanding JIRA about the split, which we could use after the maven build is live: https://issues.apache.org/jira/browse/ZOOKEEPER-233 On Tue, Oct 16, 2018 at 3:30 AM Andor Molnar wrote: > Thanks Norbert for taking care of this. > No surprise here in a 10+ year old project. > > -1 (binding) for the separation > > Let’s keep server + client + common together for now. We can revisit this > later, but the pro is that we keep the release artifact structure and not > introducing breaking changes. > > Regards, > Andor > > > > > On 2018. Oct 16., at 9:15, Enrico Olivelli wrote: > > > > Yes, > > I think it is NOT a good idea to go ahead with this separation. > > > > so -1 (non binding) from my side for now > > > > And your patch is very good at demonstrating this. > > We can't break compatibility in clients. > > > > We can move to Maven first and then re-think about separating client and > server > > > > Enrico > > > > On Mon, Oct 15, 2018 at 23:55 Norbert Kalmar > > wrote: > >> > >> Sorry, I linked the document instead of the PR. I wanted to link the > >> document at the beginning of the letter after "It was said here" > >> > >> The PR: > >> https://github.com/apache/zookeeper/pull/670 > >> > >> Norbert > >> > >> On Mon, Oct 15, 2018 at 11:49 PM Norbert Kalmar > >> wrote: > >> > >>> Hi community! > >>> > >>> As outlined in the start document, it was planned to separate the java > >>> files to server and client, with common files in a separate common > module. > >>> It was said here: > >>> > >>> "Fifth iteration - move src/java/main to zk-server, which will be > further > >>> separated in Phase 2." > >>> > >>> But in order to save rebase for the contributors, I merged this into > one > >>> step. (I had a letter about it) > >>> So I already created zookeeper-server, zookeeper-client and > >>> zookeeper-common. 
> >>> > >>> But after doing the separation, I have to say... this just hardly makes > >>> any sense. > >>> Without breaking backward compatibility by making changes in the > package > >>> structure, it just makes the code more unreadable than before. Multiple > >>> packages has to be present in all 3 modules (as it was never an > intention > >>> to separate it, so many classes are just not in their logical package, > and > >>> even inner classes used when top level would be required for the > >>> separation). Client and server code can not be divided to only depend > on > >>> common. Either server depends on client - which makes more sense than > the > >>> other option - or client depend on server. > >>> (Or make common so fat, only literally a few class remains in server > and > >>> client - which again, makes no sense). > >>> > >>> I created a pull request to illustrate what needs to be done, and this > is > >>> not even half complete: > >>> > >>> > https://docs.google.com/document/d/1WXqhaPlCwchcWc8RCEzbCmVa4WbBDlfR3GQngikGjqc/edit?usp=sharing > >>> > >>> Some more detail in the description. > >>> > >>> My suggestion: > >>> forget about zookeeper-client-java and zookeeper-common, and just leave > >>> zookeeper-server. > >>> > >>> It just doesn't make any sense looking at the result, only makes the > >>> project much more complicated. The java code is too much tangled > together. > >>> > >>> What would this mean if I just create zookeeper-common? All the files > had > >>> to be renamed anyway, so some now would have 2 renames (fortunately > most of > >>> the files are in zookeeper-server anyway), and possible another rebase > for > >>> some PR's. > >>> > >>> Any input is appreciated. > >>> > >>> Regards, > >>> Norbert > >>> > >>> > >>> > >>> > >
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user lavacat commented on the issue: https://github.com/apache/zookeeper/pull/300 @anmolnar I applied the patch to the latest master and ran the tests 10 times with 8 threads. The original error in testNodeDataChanged is gone, but it failed 4 times with: 2018-10-23 09:37:31,566 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@98] - TEST METHOD FAILED testNodeDataChanged java.util.concurrent.TimeoutException: Failed to connect to ZooKeeper server. at org.apache.zookeeper.test.ClientBase$CountdownWatcher.waitForConnected(ClientBase.java:151) at org.apache.zookeeper.test.WatchEventWhenAutoResetTest.testNodeDataChanged(WatchEventWhenAutoResetTest.java:116) I'll investigate more ---
[GitHub] zookeeper pull request #673: [ZOOKEEPER-3177] Refactor request throttle logi...
Github user eolivelli commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/673#discussion_r227557275 --- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java --- @@ -1107,6 +1102,19 @@ public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOE BinaryInputArchive bia = BinaryInputArchive.getArchive(bais); RequestHeader h = new RequestHeader(); h.deserialize(bia, "header"); + +// Need to increase the outstanding request count first, otherwise +// there might be a race condition that it enabled recv after +// processing request and then disabled when check throttling. +// +// It changes the semantic a bit, since when check throttling it's --- End diff -- Something simpler, without comparing current code with the old one. Like: Beware that we are actually checking the global outstanding request before this request. How does it sound to you? ---
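The ordering described in the code comment under review (count the request first, then check the throttle) can be illustrated with a small sketch. This is not the code from the pull request: `ThrottleSketch`, its fields, and its methods are hypothetical, and the limit semantics are an assumption. The idea it shows is that once a request has been counted, there is always a later completion callback that re-checks the limit, so a receiver that disables reads on slightly stale information is not left disabled forever.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of "increment first, then check throttling". */
final class ThrottleSketch {
    private final AtomicInteger outstanding = new AtomicInteger();
    private final int limit;
    private volatile boolean recvEnabled = true;

    ThrottleSketch(int limit) {
        this.limit = limit;
    }

    /** Receiving side: count the new request before deciding whether to throttle. */
    void onRequestReceived() {
        int now = outstanding.incrementAndGet();
        if (now > limit) {
            recvEnabled = false; // stop reading more requests from this connection
        }
    }

    /** Processing side: called when a request has been fully handled. */
    void onRequestProcessed() {
        int now = outstanding.decrementAndGet();
        if (now <= limit) {
            recvEnabled = true; // safe to read again
        }
    }

    boolean isRecvEnabled() {
        return recvEnabled;
    }
}
```

Note that in this sketch the check in `onRequestReceived` includes the request that was just counted, which is the change in semantics the review comment is trying to describe.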
[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers
Github user enixon commented on the issue: https://github.com/apache/zookeeper/pull/628 @anmolnar : I see that you closed #660 without merging. Given that we're guessing it is the cause for the remaining test failures of this PR, is there something that I can do to help address ZOOKEEPER-2320? ---
[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/628 I am looking at https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2486/console; it is not clear to me which C test failed, aside from the ``./zktest-mt': free(): invalid pointer: 0x2b00d6385000` at the end. Note that the console log has: `[exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 4610 : OK`, so this might be a different failure from the one @anmolnar was fixing. Do we know which C test case is failing here? ---
[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers
Github user hanm commented on the issue: https://github.com/apache/zookeeper/pull/628 >> This patch does not touch the c client or the default configurations for those tests so I'm unsure how to proceed. My feeling is the failure is a flaky test, and has nothing to do with this patch. Though, it would be good if we can identify the exact failing test case, and rule out the possibility that it's caused by this patch (since C client depends on same java server code.). Also, sorry for lagging on following up my previous review. I am resuming reviewing this patch this week. ---
[jira] [Commented] (ZOOKEEPER-3179) Add snapshot compression to reduce the disk IO
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661432#comment-16661432 ] Michael Han commented on ZOOKEEPER-3179: Good feature. We can also consider providing an option to offload compression / decompression to dedicated hardware - e.g. FPGA. > Add snapshot compression to reduce the disk IO > -- > > Key: ZOOKEEPER-3179 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3179 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Fangmin Lv >Priority: Major > Fix For: 3.6.0 > > > When the snapshot becomes larger, the periodically snapshot after certain > number of txns will be more expensive. Which will in turn affect the maximum > throughput we can support within SLA, because of the disk contention between > snapshot and txn when they're on the same drive. > > With compression like zstd/snappy/gzip, the actual snapshot size could be > much smaller, the compress ratio depends on the actual data. It might make > the recovery time (loading from disk) faster in some cases, but will take > longer sometimes because of the extra time used to compress/decompress. > > Based on the production traffic, the performance various with different > compress method as well, that's why we provided different implementations, we > can select different compress method for different use cases. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
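The trade-off described in the issue above (a smaller snapshot on disk in exchange for extra compress/decompress time) can be illustrated with the gzip support in the JDK. This is a minimal sketch, not the code from the patch: the class `SnapshotCompressionSketch` is hypothetical, and it assumes the snapshot has already been serialized into a byte array. The zstd and snappy variants mentioned in the issue would need third-party libraries and are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip round trip for an already-serialized snapshot. */
final class SnapshotCompressionSketch {

    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so it must happen before toByteArray().
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```

As the issue notes, the achievable ratio depends on the data; highly repetitive znode data compresses well, while already-compressed payloads may not shrink at all.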
[jira] [Commented] (ZOOKEEPER-3180) Add response cache to improve the throughput of read heavy traffic
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661452#comment-16661452 ] Michael Han commented on ZOOKEEPER-3180: What will we be caching here? Is it the byte buffers that hold the (serialized) response body that is going to be written out to the socket? > Add response cache to improve the throughput of read heavy traffic > --- > > Key: ZOOKEEPER-3180 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3180 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Priority: Minor > Fix For: 3.6.0 > > > On read heavy use case with large response data size, the serialization of > response takes time and added overhead to the GC. > Add response cache helps improving the throughput we can support, which also > reduces the latency in general. > This Jira is going to implement a LRU cache for the response, which shows > some performance gain on some of our production ensembles.
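As background for the LRU cache mentioned in the issue, here is a minimal sketch built on `java.util.LinkedHashMap` in access order. It is not the implementation proposed in the Jira: the class `LruResponseCacheSketch` is hypothetical, and what the key and value would be (for example a znode path mapped to serialized response bytes) is an assumption, which is exactly the open question in the comment above. The sketch is not thread-safe; a server would need external synchronization or a concurrent design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU cache; evicts the least recently accessed entry. */
final class LruResponseCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruResponseCacheSketch(int capacity) {
        // The third argument 'true' selects access order instead of insertion order.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked by put(); returning true drops the least recently used entry.
        return size() > capacity;
    }
}
```

A cached entry would also have to be invalidated when the underlying znode changes, which this sketch leaves out.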
[GitHub] zookeeper pull request #665: [ZOOKEEPER-3163] Use session map in the Netty t...
Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/665 ---
[jira] [Resolved] (ZOOKEEPER-3163) Use session map to improve the performance when closing session in Netty
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3163. Resolution: Fixed Issue resolved by pull request 665 [https://github.com/apache/zookeeper/pull/665] > Use session map to improve the performance when closing session in Netty > > > Key: ZOOKEEPER-3163 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3163 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Previously, it needs to go through all the cnxns to find out the session to > close, which is O(N), N is the total connections we have. > This will affect the performance of close session or renew session if there > are lots of connections on this server, this JIRA is going to reuse the > session map code in NIO implementation to improve the performance.
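The improvement described in the issue, replacing an O(N) scan over all connections with a lookup keyed by session id, can be sketched as below. This is an illustration only and not the code merged in pull request 665: `SessionMapSketch` and its nested `Cnxn` class are hypothetical stand-ins for the real connection factory and connection types.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch: find the connection for a session without scanning all connections. */
final class SessionMapSketch {

    /** Stand-in for a server-side connection object. */
    static final class Cnxn {
        final long sessionId;
        volatile boolean closed;

        Cnxn(long sessionId) {
            this.sessionId = sessionId;
        }

        void close() {
            closed = true;
        }
    }

    private final ConcurrentMap<Long, Cnxn> sessionMap = new ConcurrentHashMap<>();

    /** Called once the session id of a connection is known. */
    void register(Cnxn cnxn) {
        sessionMap.put(cnxn.sessionId, cnxn);
    }

    /** Expected constant-time lookup and removal instead of iterating over every connection. */
    boolean closeSession(long sessionId) {
        Cnxn cnxn = sessionMap.remove(sessionId);
        if (cnxn == null) {
            return false;
        }
        cnxn.close();
        return true;
    }
}
```

The cost is one extra map entry per connection, which must be kept consistent with the connection set when connections are closed for other reasons.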
[GitHub] zookeeper pull request #642: ZOOKEEPER-3151: test Jenkins. Don't merge.
Github user hanm closed the pull request at: https://github.com/apache/zookeeper/pull/642 ---
ZooKeeper-trunk - Build # 245 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk/245/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 334.69 KB...] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testLogCallbackClearLog Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1080: Client environment:zookeeper.version=zookeeper C client 3.6.0] [exec] Log Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1084: Client environment:host.name=asf910.gq1.ygridcore.net] [exec] Log Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1091: Client environment:os.name=Linux] [exec] Log Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1092: Client environment:os.arch=3.13.0-153-generic] [exec] Log Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1093: Client environment:os.version=#203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018] [exec] Log Message Received: [2018-10-24 02:19:56,718:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1101: Client environment:user.name=jenkins] [exec] Log Message Received: [2018-10-24 02:19:56,719:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1109: Client environment:user.home=/home/jenkins] [exec] Log Message Received: [2018-10-24 02:19:56,719:11168(0x2b9a3ec45f40):ZOO_INFO@log_env@1121: Client environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit] [exec] Log Message Received: [2018-10-24 02:19:56,719:11168(0x2b9a3ec45f40):ZOO_INFO@zookeeper_init_internal@1167: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x4639e0 sessionId=0 sessionPasswd= context=0x7ffc4bd556a0 flags=0] [exec] Log Message Received: [2018-10-24 02:19:56,719:11168(0x2b9a40ca8700):ZOO_INFO@check_events@2454: initiated connection to server 127.0.0.1:22181] [exec] Log Message Received: [2018-10-24 02:19:56,742:11168(0x2b9a40ca8700):ZOO_INFO@check_events@2506: session establishment complete on 
server 127.0.0.1:22181, sessionId=0x10225d7e5ee000f, negotiated timeout=1 ] [exec] : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 10495 : OK [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1001 : OK [exec] Zookeeper_simpleSystem::testNonexistentHost : elapsed 1137 : OK [exec] Zookeeper_simpleSystem::testNullData : elapsed 1042 : OK [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1022 : OK [exec] Zookeeper_simpleSystem::testCreate : elapsed 1015 : OK [exec] Zookeeper_simpleSystem::testPath : elapsed 1058 : OK [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1150 : OK [exec] Zookeeper_simpleSystem::testPing : elapsed 17669 : OK [exec] Zookeeper_simpleSystem::testAcl : elapsed 1016 : OK [exec] Zookeeper_simpleSystem::testChroot : elapsed 3090 : OK [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper server started : elapsed 31045 : OK [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1067 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 15726 : OK [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper server started ZooKeeper server started ZooKeeper server started : elapsed 15655 : OK [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1071 : OK [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4537 : OK [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started : elapsed 4698 : OK [exec] Zookeeper_readOnly::testReadOnly : elapsed 4126 : OK [exec] Zookeeper_logClientEnv::testLogClientEnv : elapsed 1 : OK [exec] OK (76) [exec] FAIL: zktest-mt [exec] == [exec] 1 of 2 tests failed [exec] Please report to u...@zookeeper.apache.org [exec] == [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit' [exec] *** Error in `./zktest-mt': free(): invalid pointer: 0x2b9a3ec31000 *** [exec] /bin/bash: line 5: 11168 Aborted ZKROOT=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/zookeeper-client/zookeeper-client-c/../.. CLASSPATH=$CLASSPATH:$CLOVER_HOME/lib/clover.jar ${dir}$tst [exec] make[1]: *** [check-TESTS] Error 1 [exec] make: *** [check-am] Error 2 BUILD FAILED /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1490: The following error occurred while
[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers
Github user enixon commented on the issue: https://github.com/apache/zookeeper/pull/628 The tests that fail are not entirely consistent. I've tried disabling TestLogClientEnv.cc, TestReadOnlyClient.cc, TestReconfigServer.cc, and TestWatchers.cc and this seems to work some of the time. ---
[jira] [Commented] (ZOOKEEPER-3181) ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661697#comment-16661697 ] Akira Ajisaka commented on ZOOKEEPER-3181: -- Apache Hadoop recently upgraded ZooKeeper from 3.4.8 to 3.4.13 due to a security concern (HADOOP-15816). Since then, ZK tests have been failing because Hadoop is using Curator 2.12.0 (YARN-8937). > ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain > -- > > Key: ZOOKEEPER-3181 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3181 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.3, 3.4.11 >Reporter: Akira Ajisaka >Priority: Major > > ZOOKEEPER-2355 added a getQuorumPeer method to QuorumPeerMain > [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java#L194]. > TestingQuorumPeerMain has an identically named method, which is now > unintentionally overriding the one in the base class. > This is fixed by CURATOR-409, however, I'd like this to be fixed in ZooKeeper > as well.
[jira] [Commented] (ZOOKEEPER-3163) Use session map to improve the performance when closing session in Netty
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661701#comment-16661701 ] Hudson commented on ZOOKEEPER-3163: --- FAILURE: Integrated in Jenkins build Zookeeper-trunk-single-thread #72 (See [https://builds.apache.org/job/Zookeeper-trunk-single-thread/72/]) ZOOKEEPER-3163: Use session map in the Netty to improve close session (hanm: rev 1ce2ca8107438d283581d18d064a25bd6b74adf7) * (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NIOServerCnxnFactory.java * (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxn.java * (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxnFactory.java * (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerCnxnFactory.java > Use session map to improve the performance when closing session in Netty > > > Key: ZOOKEEPER-3163 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3163 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Previously, it needs to go through all the cnxns to find out the session to > close, which is O(N), N is the total connections we have. > This will affect the performance of close session or renew session if there > are lots of connections on this server, this JIRA is going to reuse the > session map code in NIO implementation to improve the performance.
[jira] [Created] (ZOOKEEPER-3181) ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
Akira Ajisaka created ZOOKEEPER-3181:

Summary: ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
Key: ZOOKEEPER-3181
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3181
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.4.11, 3.5.3
Reporter: Akira Ajisaka

ZOOKEEPER-2355 added a getQuorumPeer method to QuorumPeerMain [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java#L194]. TestingQuorumPeerMain has an identically named method, which now unintentionally overrides the one in the base class. This is fixed by CURATOR-409; however, I'd like it to be fixed in ZooKeeper as well.
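The accidental override described in this report can be illustrated with a minimal Java sketch. It does not reproduce the ZooKeeper or Curator sources: `Peer`, `BaseMain`, `TestingMain`, `run` and `describe` are hypothetical names, and the assumption that the subclass accessor returns an unset field is made only to keep the effect visible.

```java
class AccidentalOverrideDemo {
    /** Hypothetical stand-in for a peer object; not the real QuorumPeer. */
    static class Peer {
        final String origin;
        Peer(String origin) { this.origin = origin; }
    }

    /** Hypothetical base class, after a factory hook has been added upstream. */
    static class BaseMain {
        // Newly added hook: the base class calls this to obtain its peer.
        protected Peer getQuorumPeer() {
            return new Peer("created by base class");
        }

        Peer run() {
            // Virtual dispatch: a subclass method with the same signature wins.
            return getQuorumPeer();
        }
    }

    /** Hypothetical subclass that was written before the hook existed. */
    static class TestingMain extends BaseMain {
        private Peer cached; // never assigned in this sketch, so it stays null

        // Meant as a plain accessor. Because its name and signature match the
        // new hook, it now overrides the hook without anyone asking for that.
        public Peer getQuorumPeer() {
            return cached;
        }
    }

    /** Reports which peer the base-class logic ended up with. */
    static String describe(BaseMain main) {
        Peer peer = main.run();
        return peer == null ? "no peer" : peer.origin;
    }

    public static void main(String[] args) {
        System.out.println(describe(new BaseMain()));    // created by base class
        System.out.println(describe(new TestingMain())); // no peer
    }
}
```

In this sketch the base class silently receives whatever the subclass accessor returns. Renaming one of the two methods, or turning the subclass method into a deliberate override, are the usual ways to resolve a collision of this kind.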
[jira] [Commented] (ZOOKEEPER-3163) Use session map to improve the performance when closing session in Netty
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661753#comment-16661753 ]

Hudson commented on ZOOKEEPER-3163:
-----------------------------------

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #246 (See [https://builds.apache.org/job/ZooKeeper-trunk/246/])
ZOOKEEPER-3163: Use session map in the Netty to improve close session (hanm: rev 1ce2ca8107438d283581d18d064a25bd6b74adf7)
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxn.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NIOServerCnxnFactory.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/NettyServerCnxnFactory.java
* (edit) zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerCnxnFactory.java

> Use session map to improve the performance when closing session in Netty
> -------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3163
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3163
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.6.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Previously, closing a session required going through all the cnxns to find
> the one to close, which is O(N), where N is the total number of connections.
> This hurts the performance of closing or renewing a session when there are
> many connections on the server. This JIRA reuses the session map code from
> the NIO implementation to improve the performance.
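The difference between scanning every connection and looking a session up in a map, as the issue description puts it, might be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `SessionMapDemo`, `Cnxn`, `register`, `closeSessionByScan` and `closeSessionByMap` are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class SessionMapDemo {
    /** Hypothetical connection object; not the real NettyServerCnxn. */
    static final class Cnxn {
        final long sessionId;
        volatile boolean closed;
        Cnxn(long sessionId) { this.sessionId = sessionId; }
        void close() { closed = true; }
    }

    // All open connections, in no particular order.
    private final Set<Cnxn> cnxns = ConcurrentHashMap.newKeySet();
    // Index from session id to connection (the "session map" idea).
    private final ConcurrentMap<Long, Cnxn> sessionMap = new ConcurrentHashMap<>();

    Cnxn register(long sessionId) {
        Cnxn cnxn = new Cnxn(sessionId);
        cnxns.add(cnxn);
        sessionMap.put(sessionId, cnxn);
        return cnxn;
    }

    /** O(N): walks every connection until the session id matches. */
    boolean closeSessionByScan(long sessionId) {
        for (Cnxn cnxn : cnxns) {
            if (cnxn.sessionId == sessionId) {
                cnxn.close();
                cnxns.remove(cnxn);
                sessionMap.remove(sessionId);
                return true;
            }
        }
        return false;
    }

    /** Expected O(1): a single hash lookup finds the connection. */
    boolean closeSessionByMap(long sessionId) {
        Cnxn cnxn = sessionMap.remove(sessionId);
        if (cnxn == null) {
            return false;
        }
        cnxn.close();
        cnxns.remove(cnxn);
        return true;
    }

    int openConnections() {
        return cnxns.size();
    }

    public static void main(String[] args) {
        SessionMapDemo factory = new SessionMapDemo();
        for (long id = 1; id <= 1000; id++) {
            factory.register(id);
        }
        factory.closeSessionByScan(10L); // may visit many connections first
        factory.closeSessionByMap(20L);  // goes straight to the connection
        System.out.println(factory.openConnections()); // 998
    }
}
```

Both methods leave the registry in the same state; only the lookup cost differs, and that cost is what grows with the number of connections on a server.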