[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers

2018-12-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/628
  
Merged into master, thanks @enixon!


---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-12-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
Here is the PR for 3.5: #714, thanks @mkedwards.


---


[GitHub] zookeeper issue #714: Zookeeper 1818 (on branch-3.5)

2018-12-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/714
  
@mkedwards I just made a 3.5 patch as well and saw this when I tried to 
send out, thanks for porting this over, thee change seems identical with my 3.5 
one except two comments change to follow {@see}, maybe we should update that 
here as well.


---


[GitHub] zookeeper issue #711: ZOOKEEPER-3177. Revert globalOutstandingLimit refactor...

2018-12-07 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/711
  
Sorry, just saw this, @anmolnar what's the findbugs issue caused by that 
refactor?


---


[GitHub] zookeeper issue #732: typo

2018-12-07 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/732
  
+1

Thanks @stanlyDoge for correcting this, can you create a JIRA to make it 
easier to track this: https://issues.apache.org/jira/projects/ZOOKEEPER/issues.


---


[GitHub] zookeeper pull request #628: ZOOKEEPER-3140: Allow Followers to host Observe...

2018-12-06 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/628#discussion_r239578960
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/ObserverMaster.java
 ---
@@ -0,0 +1,514 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.quorum;
+
+import org.apache.zookeeper.jmx.MBeanRegistry;
+import org.apache.zookeeper.server.Request;
+import org.apache.zookeeper.server.ZKDatabase;
+
+import java.io.BufferedInputStream;
+import java.io.ByteArrayInputStream;
+import java.io.DataInputStream;
+import java.io.IOException;
+import java.net.ServerSocket;
+import java.net.Socket;
+import java.net.SocketAddress;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ConcurrentLinkedQueue;
+import java.util.concurrent.Executors;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+
+import org.apache.zookeeper.server.quorum.auth.QuorumAuthServer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Used by Followers to host Observers. This reduces the network load on 
the Leader process by pushing
+ * the responsibility for keeping Observers in sync off the leading peer.
+ *
+ * It is expected that Observers will continue to perform the initial 
vetting of clients and requests.
+ * Observers send the request to the follower where it is received by an 
ObserverMaster.
+ *
+ * The ObserverMaster forwards a copy of the request to the ensemble 
Leader and inserts it into its own
+ * request processor pipeline where it can be matched with the response 
comes back. All commits received
+ * from the Leader will be forwarded along to every Learner connected to 
the ObserverMaster.
+ *
+ * New Learners connecting to a Follower will receive a LearnerHandler 
object and be party to its syncing logic
+ * to be brought up to date.
+ *
+ * The logic is quite a bit simpler than the corresponding logic in Leader 
because it only hosts observers.
+ */
+public class ObserverMaster implements LearnerMaster, Runnable {
+private static final Logger LOG = 
LoggerFactory.getLogger(ObserverMaster.class);
+
+//Follower counter
+private final AtomicLong followerCounter = new AtomicLong(-1);
+
+private QuorumPeer self;
+private FollowerZooKeeperServer zks;
+private int port;
+
+private Set activeObservers = 
Collections.newSetFromMap(new ConcurrentHashMap());
+
+private final ConcurrentHashMap 
connectionBeans = new ConcurrentHashMap<>();
+
+/**
+ * we want to keep a log of past txns so that observers can sync up 
with us when we connect,
+ * but we can't keep everything in memory, so this limits how much 
memory will be dedicated
+ * to keeping recent txns.
+ */
+private final static int PKTS_SIZE_LIMIT = 32 * 1024 * 1024;
+private static volatile int pktsSizeLimit = 
Integer.getInteger("zookeeper.observerMaster.sizeLimit", PKTS_SIZE_LIMIT);
+private ConcurrentLinkedQueue proposedPkts = new 
ConcurrentLinkedQueue<>();
+private ConcurrentLinkedQueue committedPkts = new 
ConcurrentLinkedQueue<>();
+private int pktsSize = 0;
+
+private long lastProposedZxid;
+
+// ensure ordering of revalidations returned to this learner
+private final Object revalidateSessionLock = new Object();
+
+// Throttle when there are too many concurrent snapshots being sent to 
observers
+private static final String MAX_CONCURRENT_SNAPSHOTS = 
"zookeeper.leader.maxConcurrentSnapshots";
+private stati

[GitHub] zookeeper issue #628: ZOOKEEPER-3140: Allow Followers to host Observers

2018-12-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/628
  
Thanks @enixon for doing the update and rebase, went through this again, 
looks legit to me. I also compared with internal version and made sure this has 
included all the improvement and bug fixes.

Maybe consider to add the om_proposal_process_time_ms and 
om_commit_process_time_ms metric, but this is not a must have.

+1 this should be ready to go.



---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-12-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
Thanks @anmolnar, I'll send out the PR for 3.5.


---


[GitHub] zookeeper pull request #731: [ZOOKEEPER-3208] Remove the SSLTest.java.orig i...

2018-12-05 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/731

[ZOOKEEPER-3208] Remove the SSLTest.java.orig introduced in ZOOKEEPER-3032



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3208

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/731.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #731


commit 1ed1820ba13b50724948a195548da0f7f07f2583
Author: Fangmin Lyu 
Date:   2018-12-05T22:21:57Z

Remove the SSLTest.java.orig introduced in ZOOKEEPER-3032




---


[GitHub] zookeeper pull request #729: [ZOOKEEPER-3207] Removing the unexpected WatchM...

2018-12-04 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/729

[ZOOKEEPER-3207] Removing the unexpected WatchManager.java introduced by 
mistake during maven migration



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3207

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #729


commit 9832d6c58f342e6aa7285fd9a76edac371371b64
Author: Fangmin Lyu 
Date:   2018-12-04T20:07:53Z

Removing the unexpected WatchManager.java introduced by mistake during 
maven migration




---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238787735
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/DumbWatcher.java ---
@@ -69,7 +69,7 @@ public void sendCloseSession() { }
 void setSessionId(long sessionId) { }
 
 @Override
-void sendBuffer(ByteBuffer closeConn) { }
+void sendBuffer(ByteBuffer... closeConn) { }
--- End diff --

In order to reduce the GC effort by re-using the cached serialized data, we 
need to break the single response packet into different part, the length 
buffer, the header buffer, and the data buffer, length and header will be built 
every time, but data buffer is reused, that's why we changed this to pass in a 
buffers list.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238786172
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/NIOServerCnxn.java 
---
@@ -151,12 +148,17 @@ void sendBufferSync(ByteBuffer bb) {
  * sendBuffer pushes a byte buffer onto the outgoing buffer queue for
  * asynchronous writes.
  */
-public void sendBuffer(ByteBuffer bb) {
+public void sendBuffer(ByteBuffer... buffers) {
 if (LOG.isTraceEnabled()) {
 LOG.trace("Add a buffer to outgoingBuffers, sk " + sk
   + " is valid: " + sk.isValid());
 }
-outgoingBuffers.add(bb);
+synchronized (outgoingBuffers) {
--- End diff --

We have E2E perf regression detect for different use cases with different 
traffic, haven't seen any obvious perf impact there.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238794022
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ResponseCache.java 
---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.jute.Record;
+import org.apache.zookeeper.data.Stat;
+
+@SuppressWarnings("serial")
+public class ResponseCache {
+private static final int DEFAULT_RESPONSE_CACHE_SIZE = 400;
+
+private static class Entry {
+public Stat stat;
+public byte[] data;
+}
+
+private Map cache = Collections.synchronizedMap(
+new LRUCache(getResponseCacheSize()));
+
+public ResponseCache() {
+}
+
+public void put(String path, byte[] data, Stat stat) {
+Entry entry = new Entry();
+entry.data = data;
+entry.stat = stat;
+cache.put(path, entry);
+}
+
+public byte[] get(String key, Stat stat) {
+Entry entry = cache.get(key);
+if (entry == null) {
+return null;
+}
+if (!stat.equals(entry.stat)) {
--- End diff --

Currently, he stat is also part of the cached data, when it's changed we 
need to invalidate this as well.

We can separate the cache for data and stat, but we didn't see that's 
necessary for now, this is mainly for improving the serializing performance and 
GC effort with large znode data size.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238788593
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ResponseCache.java 
---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.jute.Record;
+import org.apache.zookeeper.data.Stat;
+
+@SuppressWarnings("serial")
+public class ResponseCache {
--- End diff --

Since this is a read cache, it's better to invalidate it when the client 
read the changed data to avoid unnecessary update.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238788174
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/NIOServerCnxn.java 
---
@@ -235,10 +237,12 @@ void handleWrite(SelectionKey k) throws IOException, 
CloseRequestException {
 if (bb == ServerCnxnFactory.closeConn) {
 throw new CloseRequestException("close requested");
 }
+if (bb == packetSentinel) {
--- End diff --

Since we separate the response into multiple pieces, length, header, data, 
we need a way to tell if it's the end of a response or not, and the sentinel is 
added for this purpose.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238790553
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ResponseCache.java 
---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.jute.Record;
+import org.apache.zookeeper.data.Stat;
+
+@SuppressWarnings("serial")
+public class ResponseCache {
+private static final int DEFAULT_RESPONSE_CACHE_SIZE = 400;
+
+private static class Entry {
+public Stat stat;
+public byte[] data;
+}
+
+private Map cache = Collections.synchronizedMap(
+new LRUCache(getResponseCacheSize()));
+
+public ResponseCache() {
+}
+
+public void put(String path, byte[] data, Stat stat) {
+Entry entry = new Entry();
+entry.data = data;
+entry.stat = stat;
+cache.put(path, entry);
+}
+
+public byte[] get(String key, Stat stat) {
+Entry entry = cache.get(key);
+if (entry == null) {
+return null;
+}
+if (!stat.equals(entry.stat)) {
+// The node has been modified, invalidate cache.
+cache.remove(key);
+return null;
+} else {
+return entry.data;
+}
+}
+
+private static int getResponseCacheSize() {
+String value = 
System.getProperty("zookeeper.maxResponseCacheSize");
+return value == null ? DEFAULT_RESPONSE_CACHE_SIZE : 
Integer.parseInt(value);
+}
+
+public static boolean isEnabled() {
+return getResponseCacheSize() != 0;
+}
+
+private static class LRUCache extends LinkedHashMap {
--- End diff --

We can extend that, but it's better to be done in a separate thread, feel 
free to open a new JIRA to follow up on this.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r238789004
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ResponseCache.java 
---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+import org.apache.jute.Record;
+import org.apache.zookeeper.data.Stat;
+
+@SuppressWarnings("serial")
+public class ResponseCache {
+private static final int DEFAULT_RESPONSE_CACHE_SIZE = 400;
--- End diff --

Yes, I think we randomly chose one, it depends on how much data you have, 
we'll update the doc for the best practice. Also there is metric to show the 
cache hit rate, if it's too low, maybe we need to raise the cache size.


---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
@hanm is the latest change looking good to you?


---


[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...

2018-12-04 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/703#discussion_r238742520
  
--- Diff: 
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/FLEOutOfElectionTest.java
 ---
@@ -0,0 +1,136 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.zookeeper.server.quorum;
+
+import java.io.File;
+import java.net.InetSocketAddress;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.server.quorum.QuorumPeer;
+import org.apache.zookeeper.server.quorum.FastLeaderElection.Notification;
+import org.apache.zookeeper.server.quorum.Vote;
+import org.apache.zookeeper.server.quorum.QuorumPeer.QuorumServer;
+import org.apache.zookeeper.server.quorum.QuorumPeer.ServerState;
+import org.apache.zookeeper.server.util.ZxidUtils;
+import org.apache.zookeeper.test.ClientBase;
+import org.apache.zookeeper.test.FLETest;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+/**
+ * Test FastLeaderElection with out of election servers.
+ */
+public class FLEOutOfElectionTest {
+
+private FastLeaderElection fle;
+
+@Before
+public void setUp() throws Exception {
+File tmpdir = ClientBase.createTmpDir();
+Map peers = new HashMap();
+for(int i = 0; i < 5; i++) {
+peers.put(Long.valueOf(i), new QuorumServer(Long.valueOf(i), 
+new InetSocketAddress("127.0.0.1", 
PortAssignment.unique(;
+}
+QuorumPeer peer = new QuorumPeer(peers, tmpdir, tmpdir, 
+PortAssignment.unique(), 3, 3, 1000, 2, 2);
+fle = new FastLeaderElection(peer, peer.createCnxnManager());
+}
+
+@Test
+public void testIgnoringZxidElectionEpoch() {
+Map votes = new HashMap();
+votes.put(0L, new Vote(0x1, 4L, ZxidUtils.makeZxid(1, 1), 1, 2, 
ServerState.FOLLOWING));
+votes.put(1L, new Vote(0x1, 4L, ZxidUtils.makeZxid(1, 2), 1, 2, 
ServerState.FOLLOWING));
+votes.put(3L, new Vote(0x1, 4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.FOLLOWING));
+votes.put(4L, new Vote(0x1, 4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.LEADING));
+
+Assert.assertTrue(fle.getVoteTracker(votes, 
+new Vote(4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.FOLLOWING)).hasAllQuorums());
+}
+
+@Test
+public void testElectionWIthDifferentVersion() {
+Map votes = new HashMap();
+votes.put(0L, new Vote(0x1, 4L, ZxidUtils.makeZxid(1, 1), 1, 1, 
ServerState.FOLLOWING));
+votes.put(1L, new Vote(0x1, 4L, ZxidUtils.makeZxid(1, 1), 1, 1, 
ServerState.FOLLOWING));
+votes.put(3L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.FOLLOWING));
+votes.put(4L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.LEADING));
+
+Assert.assertTrue(fle.getVoteTracker(votes, 
+new Vote(4L, ZxidUtils.makeZxid(2, 1), 2, 2, 
ServerState.FOLLOWING)).hasAllQuorums());
+}
+
+@Test
+public void testLookingNormal() {
+Map votes = new HashMap();
+votes.put(0L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 1, 1, 
ServerState.LOOKING));
+votes.put(1L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 1, 1, 
ServerState.LOOKING));
+votes.put(3L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 1, 1, 
ServerState.LOOKING));
+votes.put(4L, new Vote(4L, ZxidUtils.makeZxid(2, 1), 1, 1, 
ServerState.LEADING));
+
+Assert.assertTrue(fle.getVoteTracker(votes, 
+new Vote(4L, ZxidUtils.makeZxid(2, 1), 1, 1, 
ServerState.LOOKING)).hasAllQuorums());
+}
+
+@Test
+public void testLookingDiffRounds() {
+  

[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...

2018-12-01 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/703#discussion_r238085247
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java
 ---
@@ -1023,9 +1016,9 @@ else if (validVoter(n.sid) && validVoter(n.leader)) {
  */
 if (n == null) {
 setPeerState(proposedLeader, voteSet);
-
 Vote endVote = new Vote(proposedLeader,
-proposedZxid, proposedEpoch);
+proposedZxid, logicalclock.get(), 
--- End diff --

@hanm for the general indention after { we use 4 white spaces, but for the 
statement breaking we usually use 8 white spaces, isn't that true for ZK 
codebase, I saw the previous code is using this rule.

Let me know if this is not true, I can update those.


---


[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...

2018-12-01 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/703#discussion_r238085234
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java
 ---
@@ -1989,6 +1989,38 @@ private boolean updateVote(long designatedLeader, 
long zxid){
 /**
  * Updates leader election info to avoid inconsistencies when
  * a new server tries to join the ensemble.
+ *
+ * Here is the inconsistency scenario we try to solve by updating the 
peer 
+ * epoch after following leader:
+ *
+ * Let's say we have an ensemble with 3 servers z1, z2 and z3.
+ *
+ * 1. z1, z2 were following z3 with peerEpoch to be 0xb8, the new 
epoch is 
+ *0xb9, aka current accepted epoch on disk.
+ * 2. z2 get restarted, which will use 0xb9 as it's peer epoch when 
loading
+ *the current accept epoch from disk.
+ * 3. z2 received notification from z1 and z3, which is following z3 
with 
+ *epoch 0xb8, so it started following z3 again with peer epoch 
0xb8.
+ * 4. before z2 successfully connected to z3, z3 get restarted with 
new 
+ *epoch 0xb9.
+ * 5. z2 will retry around a few round (default 5s) before giving up, 
+ *meanwhile it will report z3 as leader.
+ * 6. z1 restarted, and looking with peer epoch 0xb9.
+ * 7. z1 voted z3, and z3 was elected as leader again with peer epoch 
0xb9.
+ * 8. z2 successfully connected to z3 before giving up, but with peer 
+ *epoch 0xb8.
+ * 9. z1 get restarted, looking for leader with peer epoch 0xba, but 
cannot 
+ *join, because z2 is reporting peer epoch 0xb8, while z3 is 
reporting 
+ *0xb9.
+ *
+ * By updating the election vote after actually following leader, we 
can 
--- End diff --

It's align with the function name which is updating the vote, although we 
only updated the electionEpoch here.


---


[GitHub] zookeeper issue #726: ZOOKEEPER-3205: o.a.jute test cases

2018-11-28 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/726
  
Only a few nits, other LGTM.


---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-28 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
Merged, thanks @tumativ. Btw, what's your Jira user id, I tried to assign 
this JIRA to you but I cannot find you.


---


[GitHub] zookeeper pull request #722: [ZOOKEEPER-3203] Tracking the number of non vot...

2018-11-28 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/722#discussion_r23726
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/LeaderBean.java
 ---
@@ -51,6 +51,15 @@ public String followerInfo() {
 return sb.toString();
 }
 
+@Override
+public String nonVotingFollowerInfo() {
+StringBuilder sb = new StringBuilder();
+for (LearnerHandler handler : leader.getNonVotingFollowers()) {
+sb.append(handler.toString()).append("\n");
--- End diff --

That's a fair point, but I'd like to keep the same behavior in this patch 
as what we're doing now for the followInfo in this LeaderBean.


---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-28 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
LGTM, +1

Thanks @tumativ for all your effort on this, I'll commit this today.


---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-11-28 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
May need an extra +1 from another committer, @hanm do you have time to 
review this?


---


[GitHub] zookeeper pull request #722: [ZOOKEEPER-3203] Tracking the number of non vot...

2018-11-25 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/722

[ZOOKEEPER-3203] Tracking the number of non voting followers in ZK



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3203

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #722


commit f093c2b6306d86efc5da9ad0834553060f99ae63
Author: Fangmin Lyu 
Date:   2018-11-26T00:41:25Z

[ZOOKEEPER-3203] Track the number of non voting followers in ZK




---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-25 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r236098795
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -161,10 +163,10 @@ public void doWork() throws Exception {
 long startTime = Time.currentElapsedTime();
 listener.processDeadWatchers(snapshot);
 long latency = Time.currentElapsedTime() - 
startTime;
-LOG.info("Takes {} to process {} watches", 
latency, total);
+LOG.info("Takes {} to process {} watchers", 
latency, total);
--- End diff --

Watches seems to be more reasonable here, watcher maps to a single client 
session, and it can watch multiple paths, so have multiple watches on a single 
watcher, the total value is the total watches count not watcher.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-25 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r236098818
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -102,12 +103,13 @@ public void addDeadWatcher(int watcherBit) {
 totalDeadWatchers.get() >= maxInProcessingDeadWatchers) {
 try {
 RATE_LOGGER.rateLimitLog("Waiting for dead watchers 
cleaning");
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.wait(100);
+synchronized(processingCompletedEvent) {
+processingCompletedEvent.wait(100);
 }
 } catch (InterruptedException e) {
-LOG.info("Got interrupted while waiting for dead watches " 
+
+LOG.info("Got interrupted while waiting for dead watchers 
" +
--- End diff --

Ditto, watches are more accurate.


---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-11-25 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
@anmolnar following are my understanding about the acceptedEpoch, 
currentEpoch and electionEpoch:

* acceptedEpoch : the previous epoch we accepted so far, usually is the 
epoch is the highest zxid on that server.
* currentEpoch  : the current epoch after syncing with the new leader, it's 
based on the maximum acceptedEpoch in the quorum, and usually it's the 
max(acceptedEpoch) + 1. The currentEpoch is used as the peerEpoch in the leader 
election, as we know (sid, zxid, peerEpoch) are the set used to decide a leader.
* electionEpoch : not part of the factors to decide leader, but it's used 
as a logical clock to avoid considering a vote delayed from a while ago.

Basically, we know there is a corner case where the learner may not update 
it's zxid, peerEpoch, and electionEpoch after leader election (check the new 
comment I added Leader.updateElectionVote), peerEpoch is fixed with a hack 
solution, but we cannot easily update the zxid and electionEpoch, so we try to 
ignore it. But IGNOREVALUE introduced will have compatible issue when rolling 
upgrade ensemble, that's why we introduced version in notification, and only 
compare id or peerEpoch based on version.


---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-24 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
@tumativ thanks for working on this, only a minor comment now, I'll merge 
this when you updated this.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-24 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r236063171
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -102,12 +103,13 @@ public void addDeadWatcher(int watcherBit) {
 totalDeadWatchers.get() >= maxInProcessingDeadWatchers) {
 try {
 RATE_LOGGER.rateLimitLog("Waiting for dead watchers 
cleaning");
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.wait(100);
+synchronized(processingCompletedEvent) {
+   processingCompletedEvent.wait(100);
--- End diff --

Can we change this with 4 extra white spaces relative to the synchronized 
statement?


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-24 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r236063217
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -102,24 +104,24 @@ public void addDeadWatcher(int watcherBit) {
 totalDeadWatchers.get() >= maxInProcessingDeadWatchers) {
 try {
 RATE_LOGGER.rateLimitLog("Waiting for dead watchers 
cleaning");
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.wait(100);
-}
-} catch (InterruptedException e) {
-LOG.info("Got interrupted while waiting for dead watches " 
+
-"queue size");
-}
-}
-synchronized (this) {
-if (deadWatchers.add(watcherBit)) {
-totalDeadWatchers.incrementAndGet();
-if (deadWatchers.size() >= watcherCleanThreshold) {
-synchronized (cleanEvent) {
-cleanEvent.notifyAll();
-}
-}
-}
+   synchronized (processingCompletedEvent) {
--- End diff --

We don't have a general format wiki yet, we'll work on that, in general 
it's 4 white space after { line.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-24 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r236063177
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -163,8 +165,8 @@ public void doWork() throws Exception {
 long latency = Time.currentElapsedTime() - 
startTime;
 LOG.info("Takes {} to process {} watches", 
latency, total);
 totalDeadWatchers.addAndGet(-total);
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.notifyAll();
+synchronized(processingCompletedEvent) {
+   processingCompletedEvent.notifyAll();
--- End diff --

ditto.


---


[GitHub] zookeeper pull request #692: ZOOKEEPER-3184: Use the same method to generate...

2018-11-24 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/692#discussion_r236063130
  
--- Diff: README.md ---
@@ -1,49 +1,24 @@
 ## Generating the static Apache ZooKeeper website
 
-In this directory you will find text files formatted using Markdown, with 
an `.md` suffix.
+In the `src/main/resources/markdown` directory you will find text files 
formatted using Markdown, with an `.md` suffix.
 
-Building the site requires [Jekyll](http://jekyllrb.com/docs) 3.6.2 or 
newer. 
-The easiest way to install jekyll is via a Ruby Gem. Jekyll will create a 
directory called `_site` 
-containing `index.html` as well as the rest of the compiled directories 
and files. _site should not
-be committed to git as this is the generated content.
-
-To install Jekyll and its required dependencies, execute `sudo gem install 
jekyll pygments.rb` 
-and `sudo pip install Pygments`. See the Jekyll installation page for more 
details.
+Building the site requires [Maven](http://maven.apache.org/) 3.5.0 or 
newer. 
+The easiest way to [install Maven](http://maven.apache.org/install.html) 
depends on your OS.
+The build process will create a directory called `target/html` containing 
`index.html` as well as the rest of the
+compiled directories and files. `target` should not be committed to git as 
it is generated content.
 
 You can generate the static ZooKeeper website by running:
 
-1. `jekyll build` in this directory.
-2. `cp -RP _released_docs _site/doc` - this will include the documentation 
(see "sub-dir" section below) in the generated site.
+1. `mvn clean install` in this directory.
+2. `cp -RP _released_docs _target/html` - this will include the 
documentation (see "sub-dir" section below) in the generated site.
--- End diff --

Thanks @tamaashu, can you update the readme to point out how to copy those 
doc? 

One suggest, the top and bottom layout is a bit strange, they're too wide 
comparing to the main content on that page, can we make those not that wide?


---


[GitHub] zookeeper issue #703: [ZOOKEEPER-1818] Correctly handle potential inconsiste...

2018-11-18 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/703
  
@anmolnar this is the fix for the DONTCARE on trunk, please take a look, 
I'll port it to 3.5 when it's being reviewed and merged.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-18 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r234494972
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -102,24 +104,24 @@ public void addDeadWatcher(int watcherBit) {
 totalDeadWatchers.get() >= maxInProcessingDeadWatchers) {
 try {
 RATE_LOGGER.rateLimitLog("Waiting for dead watchers 
cleaning");
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.wait(100);
-}
-} catch (InterruptedException e) {
-LOG.info("Got interrupted while waiting for dead watches " 
+
-"queue size");
-}
-}
-synchronized (this) {
-if (deadWatchers.add(watcherBit)) {
-totalDeadWatchers.incrementAndGet();
-if (deadWatchers.size() >= watcherCleanThreshold) {
-synchronized (cleanEvent) {
-cleanEvent.notifyAll();
-}
-}
-}
+   synchronized (processingCompletedEvent) {
--- End diff --

@tumativ looks like we still have some indent problem for this patch, can 
you help correct those?


---


[GitHub] zookeeper issue #632: [ZOOKEEPER-3150] Add tree digest check and verify data...

2018-11-17 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/632
  
Rebase onto latest master.


---


[GitHub] zookeeper issue #673: [ZOOKEEPER-3177] Refactor request throttle logic in NI...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/673
  
This seems able to be merged now, @anmolnar can you help merge it?


---


[GitHub] zookeeper issue #676: ZOOKEEPER-3181: ZOOKEEPER-2355 broke Curator TestingQu...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/676
  
@Randgalt I'm not familiar with the Curator release here, do you think it's 
easy to make a new release for 2.x for this?


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r234399372
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -50,6 +50,8 @@
 
 private volatile boolean stopped = false;
 private final Object cleanEvent = new Object();
+private final Object produserAndConsumerLock = new Object();
--- End diff --

Can you give it a more explicit name? produserAndConsumerLock seems too 
general here.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r234399479
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -177,6 +180,7 @@ public void shutdown() {
 stopped = true;
 deadWatchers.clear();
 cleaners.stop();
+super.interrupt();
--- End diff --

Maybe just change it to this.interrupt(), since we're not overwriting it.


---


[GitHub] zookeeper pull request #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thr...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/689#discussion_r234399392
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/watch/WatcherCleaner.java
 ---
@@ -102,24 +104,25 @@ public void addDeadWatcher(int watcherBit) {
 totalDeadWatchers.get() >= maxInProcessingDeadWatchers) {
 try {
 RATE_LOGGER.rateLimitLog("Waiting for dead watchers 
cleaning");
-synchronized(totalDeadWatchers) {
-totalDeadWatchers.wait(100);
+synchronized(produserAndConsumerLock) {
+   produserAndConsumerLock.wait(100);
 }
 } catch (InterruptedException e) {
 LOG.info("Got interrupted while waiting for dead watches " 
+
 "queue size");
+break;
 }
 }
-synchronized (this) {
-if (deadWatchers.add(watcherBit)) {
-totalDeadWatchers.incrementAndGet();
-if (deadWatchers.size() >= watcherCleanThreshold) {
-synchronized (cleanEvent) {
-cleanEvent.notifyAll();
+   synchronized (this) {
--- End diff --

The indent seems  not correct.


---


[GitHub] zookeeper issue #690: ZOOKEEPER-3179: Add snapshot compression to reduce the...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/690
  
retest this please


---


[GitHub] zookeeper pull request #692: ZOOKEEPER-3184: Use the same method to generate...

2018-11-16 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/692#discussion_r234399026
  
--- Diff: README.md ---
@@ -1,49 +1,24 @@
 ## Generating the static Apache ZooKeeper website
 
-In this directory you will find text files formatted using Markdown, with 
an `.md` suffix.
+In the `src/main/resources/markdown` directory you will find text files 
formatted using Markdown, with an `.md` suffix.
 
-Building the site requires [Jekyll](http://jekyllrb.com/docs) 3.6.2 or 
newer. 
-The easiest way to install jekyll is via a Ruby Gem. Jekyll will create a 
directory called `_site` 
-containing `index.html` as well as the rest of the compiled directories 
and files. _site should not
-be committed to git as this is the generated content.
-
-To install Jekyll and its required dependencies, execute `sudo gem install 
jekyll pygments.rb` 
-and `sudo pip install Pygments`. See the Jekyll installation page for more 
details.
+Building the site requires [Maven](http://maven.apache.org/) 3.5.0 or 
newer. 
+The easiest way to [install Maven](http://maven.apache.org/install.html) 
depends on your OS.
+The build process will create a directory called `target/html` containing 
`index.html` as well as the rest of the
+compiled directories and files. `target` should not be committed to git as 
it is generated content.
 
 You can generate the static ZooKeeper website by running:
 
-1. `jekyll build` in this directory.
-2. `cp -RP _released_docs _site/doc` - this will include the documentation 
(see "sub-dir" section below) in the generated site.
+1. `mvn clean install` in this directory.
+2. `cp -RP _released_docs _target/html` - this will include the 
documentation (see "sub-dir" section below) in the generated site.
--- End diff --

After 'mvn clean install' there is no _target dir but only target, is this 
a typo or am I missing any step?


---


[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...

2018-11-16 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/703

[ZOOKEEPER-1818] Correctly handle potential inconsistent zxid/electionEpoch 
and peerEpoch during leader election

This is similar to the 3.4 implementation in Jira ZOOKEEPER-1817, main 
differences with that:

1. removed bcVote, in Vote.equals it will skip compare peerEpoch when one 
of the version is 0x0.
2. added detailed scenarios which we tried to solve with peerEpoch update 
and the change in Vote.equals.
3. removed ooePredicate with inlined one, since master code is tracking 
voteSet..
4. improved the tests to make it cleaner.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-1818

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/703.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #703


commit ac2aa6147e5433567a4be433089c35a30a3d
Author: Fangmin Lyu 
Date:   2018-11-15T17:46:51Z

Correctly handle potential inconsitent zxid/electionEpoch and peerEpoch 
during leader election




---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-14 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
@anmolnar that's a good point, I think the reason we didn't use 
ZooKeeperThread or ZooKeeperCriticalThread is because we don't expect to exit 
abnormally from the WatcherCleaner thread, which is true for now. But I think 
it's better to use ZooKeeperThread or even ZooKeeperCriticalThread to cover 
future changes which might cause  the thread exit abnormally.


---


[GitHub] zookeeper pull request #701: [ZOOKEEPER-3125] Only patching the pzxid when i...

2018-11-14 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/701

[ZOOKEEPER-3125] Only patching the pzxid when it's larger than the current 
pzxid

This previous fix in #605 has a corner case which might revert the pzxid, 
it's being fixed when port to 3.5 in #647, update on master as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3125-Update

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/701.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #701


commit e74895f08a923450e8eb95a81abf0c703d505050
Author: Fangmin Lyu 
Date:   2018-11-15T00:43:42Z

Only patching the pzxid when it's larger than the current pzxid




---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-13 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
@tumativ There is a very short window that we'll still add the dead watcher 
to the cleaner thread, I don't expect that will add too much GC overhead for 
these small amount of dead watch bits.

For your 2nd point, you can interrupt and wait for this thread to exit 
using join, although I'm not sure it's worth to, give these clean up could be 
done in background without affecting us starting a new ZK server with new 
ZKDatabase.

So I still prefer to simply interrupt instead of this change, both from 
complexity (which also means error-prone and hard to maintain in the future) 
and efficient sacrificed here.


---


[GitHub] zookeeper pull request #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue ...

2018-11-11 Thread lvfangmin
Github user lvfangmin closed the pull request at:

https://github.com/apache/zookeeper/pull/647


---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-11-11 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
Merged, close this PR.


---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
@tumativ I understand the intention of this JIRA, but to me looks like 
adding this.interrupt() in shutdown() should solve those problems.

We may add a dead watch to the cleaner after this, but it doesn't matter, 
this this watch cleaner won't be referenced anymore after shutdown, and it will 
GCed.

So why not just do this instead of the complexity changes here? Did I miss 
anything?


---


[GitHub] zookeeper issue #689: ZOOKEEPER-3183:Notifying the WatcherCleaner thread and...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/689
  
Thanks @tumativ, I'm reviewing this now.


---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
@anmolnar should we get this in?


---


[GitHub] zookeeper issue #697: ZOOKEEPER-3155: Remove Forrest XMLs and their build pr...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/697
  
retest this please


---


[GitHub] zookeeper issue #698: ZOOKEEPER-3155: Remove Forrest XMLs and their build pr...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/698
  
Do we need this on 3.4, I thought we only do bug fixes on 3.4.


---


[GitHub] zookeeper issue #685: [ZOOKEEPER-3104] Fix potential data inconsistency due ...

2018-11-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/685
  
Thanks Andor, I'll close this one.


---


[GitHub] zookeeper pull request #685: [ZOOKEEPER-3104] Fix potential data inconsisten...

2018-11-08 Thread lvfangmin
Github user lvfangmin closed the pull request at:

https://github.com/apache/zookeeper/pull/685


---


[GitHub] zookeeper issue #676: ZOOKEEPER-3181: ZOOKEEPER-2355 broke Curator TestingQu...

2018-11-04 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/676
  
@eolivelli I agree the naming is a bit confusing here, in this case we 
should rename the Jira and PR to make it more explicitly, we can change the 
name here, but it doesn't make sense to rename it because Curator test is 
failing.

@aajisaka 3.4.x is only accepting bug fixes, I'm not sure renaming for 
making the Curator test happy is allowed there. @anmolnar what's your opinion 
on this?


---


[GitHub] zookeeper issue #682: ZOOKEEPER-2807. Flaky test: org.apache.zookeeper.test....

2018-11-04 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/682
  
Thanks @anmolnar, I've merged this onto master.


---


[GitHub] zookeeper pull request #685: [ZOOKEEPER-3104] Fix potential data inconsisten...

2018-11-02 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/685

[ZOOKEEPER-3104] Fix potential data inconsistency due to NEWLEADER packet 
being sent too early during SNAP sync

Port this fix from master to 3.5.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3104-3.5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/685.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #685


commit 5703e334a1a775b9f8342409924fc3247475aa81
Author: Fangmin Lyu 
Date:   2018-11-02T20:08:07Z

Fix potential data inconsistency due to NEWLEADER packet being sent too 
early during SNAP sync




---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-10-31 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r229870463
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerCnxn.java ---
@@ -68,29 +70,74 @@
 
 private volatile boolean stale = false;
 
+final ZooKeeperServer zkServer;
+
+public ServerCnxn(final ZooKeeperServer zkServer) {
+this.zkServer = zkServer;
+}
+
 abstract int getSessionTimeout();
 
 abstract void close();
 
+public abstract void sendResponse(ReplyHeader h, Record r,
+String tag, String cacheKey, Stat stat) throws IOException;
+
 public void sendResponse(ReplyHeader h, Record r, String tag) throws 
IOException {
-ByteArrayOutputStream baos = new ByteArrayOutputStream();
-// Make space for length
+sendResponse(h, r, tag, null, null);
+}
+
+protected byte[] serializeRecord(Record record) throws IOException {
+ByteArrayOutputStream baos = new ByteArrayOutputStream(
+ZooKeeperServer.intBufferStartingSizeBytes);
 BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
-try {
-baos.write(fourBytes);
-bos.writeRecord(h, "header");
-if (r != null) {
-bos.writeRecord(r, tag);
+bos.writeRecord(record, null);
+return baos.toByteArray();
+}
+
+protected ByteBuffer[] serialize(ReplyHeader h, Record r, String tag,
+String cacheKey, Stat stat) throws IOException {
+byte[] header = serializeRecord(h);
+byte[] data = null;
+if (r != null) {
+ResponseCache cache = zkServer.getReadResponseCache();
+if (cache != null && stat != null && cacheKey != null &&
+!cacheKey.endsWith(Quotas.statNode)) {
+// Use cache to get serialized data.
+//
+// NB: Tag is ignored both during cache lookup and 
serialization,
+// since is is not used in read responses, which are being 
cached.
+data = cache.get(cacheKey, stat);
+if (data == null) {
+// Cache miss, serialize the response and put it in 
cache.
+data = serializeRecord(r);
--- End diff --

We may hit the race condition to serialize the same record multiple times, 
but we made a trade off, instead of having write lock every time, this might be 
more efficient.


---


[GitHub] zookeeper pull request #684: ZOOKEEPER-3180: Add response cache to improve t...

2018-10-31 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/684#discussion_r229871456
  
--- Diff: 
zookeeper-server/src/test/java/org/apache/zookeeper/test/ResponseCacheTest.java 
---
@@ -0,0 +1,118 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.util.Map;
+
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.ZooDefs;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.data.Stat;
+import org.apache.zookeeper.server.ServerStats;
+import org.apache.zookeeper.server.ServerMetrics;
+import org.apache.zookeeper.server.ZooKeeperServerMXBean;
+import org.junit.Assert;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.management.JMX;
+import javax.management.MBeanServerConnection;
+import javax.management.ObjectName;
+
+public class ResponseCacheTest extends ClientBase {
+protected static final Logger LOG =
+LoggerFactory.getLogger(ResponseCacheTest.class);
+
+@Test
+public void testResponseCache() throws Exception {
+ObjectName bean = JMXEnv.getServerBean();
+MBeanServerConnection mbsc = JMXEnv.conn();
+ZooKeeperServerMXBean zkBean = JMX.newMBeanProxy(mbsc, bean, 
ZooKeeperServerMXBean.class);
+ZooKeeper zk = createClient();
+
+try {
+performCacheTest(zk, zkBean, "/cache", true);
+performCacheTest(zk, zkBean, "/nocache", false);
+}
+finally {
+zk.close();
+}
+}
+
+private void checkCacheStatus(long expectedHits, long expectedMisses) {
+Map metrics = ServerMetrics.getAllValues();
+Assert.assertEquals((Long) expectedHits, 
metrics.get("response_packet_cache_hits"));
+Assert.assertEquals((Long) expectedMisses, 
metrics.get("response_packet_cache_misses"));
+}
+
+public void performCacheTest(ZooKeeper zk, ZooKeeperServerMXBean 
zkBean, String path, boolean useCache) throws Exception {
+ServerMetrics.resetAll();
+Stat writeStat = new Stat();
+Stat readStat = new Stat();
+byte[] readData = null;
+int reads = 10;
+long expectedHits = 0;
+long expectedMisses = 0;
+
+zkBean.setResponseCachingEnabled(useCache);
--- End diff --

Brian, looks like we call ZooKeeperServer.setResponseCachingEnabled 
directly instead of introducing the JMX bean here, can you check if we can get 
rid of the JMX bean change in this diff?


---


[GitHub] zookeeper pull request #682: ZOOKEEPER-2807. Flaky test: org.apache.zookeepe...

2018-10-25 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/682#discussion_r228315225
  
--- Diff: 
zookeeper-server/src/test/java/org/apache/zookeeper/test/WatchEventWhenAutoResetTest.java
 ---
@@ -95,6 +96,7 @@ public void setUp() {
 }
 
 @Test
+@Ignore("ZOOKEEPER-3182")
--- End diff --

As I mentioned in the Jira, I don't think this is a ZK code problem, but we 
should improve the reliability in this test case, if the client is disconnected 
we cannot guarantee it will get the 'expected' watch event without doing sync 
first.


---


[GitHub] zookeeper issue #629: ZOOKEEPER-2641:AvgRequestLatency metric improves to be...

2018-10-25 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/629
  
@maoling if I remember correctly, the 3.4 branch is the stable branch, 
which only accept bug fix, new features or improvement will only be applied 
onto 3.5 and 3.6.

I think this is useful, you should port this to master.


---


[GitHub] zookeeper issue #676: ZOOKEEPER-3181: ZOOKEEPER-2355 broke Curator TestingQu...

2018-10-25 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/676
  
-1 

Wait for confirming why we have to do this.


---


[GitHub] zookeeper issue #676: ZOOKEEPER-3181: ZOOKEEPER-2355 broke Curator TestingQu...

2018-10-25 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/676
  
@aajisaka we shouldn't change the code because of a failed test, and this 
is only for a test case which has overriding this function name in Curator 
test. 

Isn't the right thing is only changing it in the Curator, as what you've 
donee in https://github.com/apache/curator/pull/219.


---


[GitHub] zookeeper issue #652: ZOOKEEPER-3156: Add in option to canonicalize host nam...

2018-10-25 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/652
  
+1 

The new change looks good to me, please rebase to resolve the conflict, 
will merge this in after that. Sorry for lately reply, somehow lost tracking 
this session.


---


[GitHub] zookeeper pull request #673: [ZOOKEEPER-3177] Refactor request throttle logi...

2018-10-24 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/673#discussion_r227962259
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java 
---
@@ -1107,6 +1102,19 @@ public void processPacket(ServerCnxn cnxn, 
ByteBuffer incomingBuffer) throws IOE
 BinaryInputArchive bia = BinaryInputArchive.getArchive(bais);
 RequestHeader h = new RequestHeader();
 h.deserialize(bia, "header");
+
+// Need to increase the outstanding request count first, otherwise
+// there might be a race condition that it enabled recv after
+// processing request and then disabled when check throttling.
+//
+// It changes the semantic a bit, since when check throttling it's
--- End diff --

That makes sense to me, it's hard to find out what's the previous behavior 
is, I'll change it as what you suggested to.


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-23 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r227533265
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/util/AdHash.java ---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.util;
+
+/**
+ * This incremental hash is used to keep track of the hash of
+ * the data tree to that we can quickly validate that things
+ * are in sync.
+ *
+ * See the excellent paper: A New Paradigm for collision-free hashing:
+ *   Incrementality at reduced cost,  M. Bellare and D. Micciancio
+ */
+public class AdHash {
--- End diff --

I'd like to keep it as is, since it's consistent with the name in the paper.


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-23 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r227532976
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/command/HashCommand.java
 ---
@@ -0,0 +1,49 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.command;
+
+import java.io.PrintWriter;
+import java.util.List;
+
+import org.apache.zookeeper.server.DataTree.ZxidDigest;
+import org.apache.zookeeper.server.ServerCnxn;
+
+/**
+ * Command used to dump the latest digest histories.
+ */
+public class HashCommand extends AbstractFourLetterCommand {
--- End diff --

That seems more consistent, will do.


---


[GitHub] zookeeper pull request #673: [ZOOKEEPER-3177] Refactor request throttle logi...

2018-10-23 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/673#discussion_r227532671
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java 
---
@@ -1107,6 +1102,19 @@ public void processPacket(ServerCnxn cnxn, 
ByteBuffer incomingBuffer) throws IOE
 BinaryInputArchive bia = BinaryInputArchive.getArchive(bais);
 RequestHeader h = new RequestHeader();
 h.deserialize(bia, "header");
+
+// Need to increase the outstanding request count first, otherwise
+// there might be a race condition that it enabled recv after
+// processing request and then disabled when check throttling.
+//
+// It changes the semantic a bit, since when check throttling it's
--- End diff --

@eolivelli I'll try to rephrase it, meanwhile please comment if you have 
any suggestion on how to rephrase this?


---


[GitHub] zookeeper issue #665: [ZOOKEEPER-3163] Use session map in the Netty to impro...

2018-10-23 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/665
  
@maoling, we can port this back to 3.4 in the same Jira, I'll send out a PR 
separately for that.


---


[GitHub] zookeeper issue #665: [ZOOKEEPER-3163] Use session map in the Netty to impro...

2018-10-22 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/665
  
Yes, it's similar to ZOOKEEPER-1669, which uses sessionMap to reduce the 
cost of close session, most of the code are identical as well.

In 3.4, it has the removeCnxn method defined in NettyServerCnxnFactory 
while it keeps the same logic in NettyServerCnxn.close() in 3.6, I'll leave it 
as is for now, given that it's not quite relative to this Jira and this diff is 
already ready to land.


---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-10-22 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
@anmolnar what's your opinion with @hanm 's reply?


---


[GitHub] zookeeper pull request #673: [ZOOKEEPER-3177] Refactor request throttle logi...

2018-10-18 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/673

[ZOOKEEPER-3177] Refactor request throttle logic in NIO and Netty to keep 
the same behavior and make the code easier to maintain



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3177

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit c2ec463a79a77b678a624ea79713a85cd23cd996
Author: Fangmin Lyu 
Date:   2018-10-19T04:47:45Z

Refactor request throttle logic in NIO and Netty to keep the same behavior 
and make the code easier to maintain




---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-10-18 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
@anmolnar I can understand your concern, let's remove the RetryRule for 
now, we can add it when necessary.


---


[GitHub] zookeeper issue #665: [ZOOKEEPER-3163] Use session map in the Netty to impro...

2018-10-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/665
  
retest this please


---


[GitHub] zookeeper issue #665: [ZOOKEEPER-3163] Use session map in the Netty to impro...

2018-10-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/665
  
The failure seems not related to this code change, I'll trigger another 
round of test, this should be ready to get in.


---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-10-16 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
retest this please


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r224842221
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java ---
@@ -154,6 +160,26 @@
 
 private final ReferenceCountedACLCache aclCache = new 
ReferenceCountedACLCache();
 
+// The maximum number of tree digests that we will keep in our history
+public static final int DIGEST_LOG_LIMIT = 1024;
+
+// Dump digest every 128 txns, in hex it's 80, which will make it 
easier 
+// to align and compare between servers.
+public static final int DIGEST_LOG_INTERVAL = 128;
+
+// If this is not null, we are actively looking for a target zxid that 
we
+// want to validate the digest for
+private ZxidDigest digestFromLoadedSnapshot;
+
+// The digest associated with the highest zxid in the data tree.
+private volatile ZxidDigest lastProcessedZxidDigest;
+
+// Will be notified when digest mismatch event triggered.
+private List digestWatchers = new ArrayList<>();
--- End diff --

We won't have concurrent modification here, this will be a static list and 
registered in the constructor of this class, so I don't expect to change this 
dynamically.




---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r224843197
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java ---
@@ -1521,4 +1566,179 @@ public boolean removeWatch(String path, WatcherType 
type, Watcher watcher) {
 public ReferenceCountedACLCache getReferenceCountedAclCache() {
 return aclCache;
 }
+
+/**
+ * Add the digest to the historical list, and update the latest zxid 
digest.
+ */
+private void logZxidDigest(long zxid, long digest) {
+ZxidDigest zxidDigest = new ZxidDigest(zxid, 
DigestCalculator.DIGEST_VERSION, digest);
+lastProcessedZxidDigest = zxidDigest;
+if (zxidDigest.zxid % DIGEST_LOG_INTERVAL == 0) {
+synchronized (digestLog) {
+digestLog.add(zxidDigest);
+if (digestLog.size() > DIGEST_LOG_LIMIT) {
+digestLog.poll();
+}
+}
+}
+}
+
+/**
+ * Serializing the digest to snapshot, this is done after the data 
tree 
+ * is being serialized, so when we replay the txns and it hits this 
zxid 
+ * we know we should be in a non-fuzzy state, and have the same 
digest. 
+ *
+ * @param oa the output stream to write to 
+ * @return true if the digest is serialized successfully
+ */
+public boolean serializeZxidDigest(OutputArchive oa) throws 
IOException {
+if (!DigestCalculator.digestEnabled()) {
+return false;
+}
+
+ZxidDigest zxidDigest = lastProcessedZxidDigest;
+if (zxidDigest == null) {
+// write an empty digest
+zxidDigest = new ZxidDigest();
+}
+zxidDigest.serialize(oa);
+return true;
+}
+
+/**
+ * Deserializing the zxid digest from the input stream and update the 
+ * digestFromLoadedSnapshot.
+ *
+ * @param ia the input stream to read from
+ * @return the true if it deserialized successfully
+ */
+public boolean deserializeZxidDigest(InputArchive ia) throws 
IOException {
+if (!DigestCalculator.digestEnabled()) {
+return false;
+}
+
+try  {
+ZxidDigest zxidDigest = new ZxidDigest();
+zxidDigest.deserialize(ia);
+if (zxidDigest.zxid > 0) {
+digestFromLoadedSnapshot = zxidDigest;
+}
+return true;
--- End diff --

I usually do early return if there are a bunch of code need to be handled 
after that early return, but I'm not a fan of moving 'return true" to the end 
of this function as a 'default' value, it's actually makes the code a bit 
harder to read to me, given that the 'early return' at line 1630 is also the 
kind of last line of this function.


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r224844559
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/util/AdHash.java ---
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server.util;
+
+/**
+ * This incremental hash is used to keep track of the hash of
+ * the data tree to that we can quickly validate that things
+ * are in sync.
+ *
+ * See the excellent paper: A New Paradigm for collision-free hashing:
+ *   Incrementality at reduced cost,  M. Bellare and D. Micciancio
+ */
+public class AdHash {
--- End diff --

We may want to extend this class to modify the adHash a bit in the future, 
so I would leave it as is for now.


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r224841261
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/DataNode.java ---
@@ -37,6 +37,14 @@
  * 
  */
 public class DataNode implements Record {
+
+// the digest value of this node, calculated from path, data and stat
+private long digest;
--- End diff --

It's a single thread read/write this value for now, so it doesn't matter, 
but I agree it would be better to have it as a volatile in case we need to 
visit this in different thread in the future, will change that.


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r224843869
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/NodeHashMapImpl.java 
---
@@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.zookeeper.server.util.DigestCalculator;
+import org.apache.zookeeper.server.util.AdHash;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * a simple wrapper to ConcurrentHashMap that recalculates a digest after
+ * each mutation.
+ */
+public class NodeHashMapImpl implements NodeHashMap {
+
+private static final Logger LOG = 
LoggerFactory.getLogger(NodeHashMap.class);
+
+private final ConcurrentHashMap nodes;
+
+private AdHash hash;
+
+public NodeHashMapImpl() {
+nodes = new ConcurrentHashMap();
+hash = new AdHash();
--- End diff --

We can do that, previously we have other initialization in this 
constructor, but not anymore after I refactored it, will remove the constructor 
here.


---


[GitHub] zookeeper pull request #667: ZOOKEEPER-3158:firstConnect.countDown() will no...

2018-10-12 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/667#discussion_r224818567
  
--- Diff: 
zookeeper-common/src/main/java/org/apache/zookeeper/ClientCnxnSocketNetty.java 
---
@@ -146,13 +146,13 @@ public void operationComplete(ChannelFuture 
channelFuture) throws Exception {
 } else {
 needSasl.set(false);
 }
-
+
--- End diff --

nit: remove tailing white space.


---


[GitHub] zookeeper issue #624: ZOOKEEPER-3108:use a new property server.id in the zoo...

2018-10-09 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/624
  
Previously, it was not part of the dynamic config, because it’s in a 
separate config file. But after this change it will move to the dynamic config, 
if I remember correctly it check and move all server. prefix to dynamic config 
file, but maybe I remember it wrongly.

Anyway, we need a test case to cover that.


---


[GitHub] zookeeper pull request #663: ZOOKEEPER-3154: Update release process to use t...

2018-10-09 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/663#discussion_r223838835
  
--- Diff: pom.xml ---
@@ -30,7 +30,7 @@
   org.apache.zookeeper
   zookeeper
   pom
-  2.6.0-SNAPSHOT
+  3.5.5-beta-SNAPSHOT
--- End diff --

Should we move this to a separate PR? Seems it’s not highly related to 
the Markdown change here.


---


[GitHub] zookeeper pull request #665: [ZOOKEEPER-3163] Use session map in the Netty t...

2018-10-09 Thread lvfangmin
GitHub user lvfangmin opened a pull request:

https://github.com/apache/zookeeper/pull/665

[ZOOKEEPER-3163] Use session map in the Netty to improve close session 
performance

This is a refactor to make the Netty able to use the same closeSession 
logic in NIOServerCnxn, which is more efficient with the sessionMap. Rely on 
the existing tests for the refactor work here.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-3163

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/665.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #665


commit 4e9f80fa4c556269a64834b84128d84297c92e1d
Author: Fangmin Lyu 
Date:   2018-10-09T16:45:44Z

Use session map in the Netty to improve close session performance




---


[GitHub] zookeeper issue #658: ZOOKEEPER-3032 - MAVEN MIGRATION - branch 3.4 move jav...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/658
  
retest this please


---


[GitHub] zookeeper issue #660: ZOOKEEPER-2320. C-client crashes when removing watcher...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/660
  
@anmolnar do you know why it failing the check here? From the latest 
Jenkins build link I cannot find any failure.


---


[GitHub] zookeeper pull request #659: ZOOKEEPER-3161. Refactor QuorumPeerMainTest.jav...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/659#discussion_r223564880
  
--- Diff: 
zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java
 ---
@@ -402,4 +421,129 @@ public File getConfFile() {
 }
 
 }
+
+// This class holds the servers and clients for those servers
+protected static class Servers {
+MainThread mt[];
+ZooKeeper zk[];
+int[] clientPorts;
+
+public void shutDownAllServers() throws InterruptedException {
+for (MainThread t: mt) {
+t.shutdown();
+}
+}
+
+public void restartAllServersAndClients(Watcher watcher) throws 
IOException, InterruptedException {
+for (MainThread t : mt) {
+if (!t.isAlive()) {
+t.start();
+}
+}
+for (int i = 0; i < zk.length; i++) {
+restartClient(i, watcher);
+}
+}
+
+public void restartClient(int clientIndex, Watcher watcher) throws 
IOException, InterruptedException {
+if (zk[clientIndex] != null) {
+zk[clientIndex].close();
+}
+zk[clientIndex] = new ZooKeeper("127.0.0.1:" + 
clientPorts[clientIndex], ClientBase.CONNECTION_TIMEOUT, watcher);
+}
+
+public int findLeader() {
+for (int i = 0; i < mt.length; i++) {
+if (mt[i].main.quorumPeer.leader != null) {
+return i;
+}
+}
+return -1;
+}
+}
+
+protected Servers LaunchServers(int numServers) throws IOException, 
InterruptedException {
+return LaunchServers(numServers, null);
+}
+
+/** * This is a helper function for launching a set of servers
+ *
+ * @param numServers the number of servers
+ * @param tickTime A ticktime to pass to MainThread
+ * @return
+ * @throws IOException
+ * @throws InterruptedException
+ */
+protected Servers LaunchServers(int numServers, Integer tickTime) 
throws IOException, InterruptedException {
+int SERVER_COUNT = numServers;
+QuorumPeerMainTest.Servers svrs = new QuorumPeerMainTest.Servers();
+svrs.clientPorts = new int[SERVER_COUNT];
+StringBuilder sb = new StringBuilder();
+for(int i = 0; i < SERVER_COUNT; i++) {
+svrs.clientPorts[i] = PortAssignment.unique();
+
sb.append("server."+i+"=127.0.0.1:"+PortAssignment.unique()+":"+PortAssignment.unique()+";"+svrs.clientPorts[i]+"\n");
+}
+String quorumCfgSection = sb.toString();
+
+svrs.mt = new MainThread[SERVER_COUNT];
+svrs.zk = new ZooKeeper[SERVER_COUNT];
+for(int i = 0; i < SERVER_COUNT; i++) {
+if (tickTime != null) {
+svrs.mt[i] = new MainThread(i, svrs.clientPorts[i], 
quorumCfgSection, new HashMap(), tickTime);
+} else {
+svrs.mt[i] = new MainThread(i, svrs.clientPorts[i], 
quorumCfgSection);
+}
+svrs.mt[i].start();
+svrs.restartClient(i, this);
+}
+
+waitForAll(svrs, ZooKeeper.States.CONNECTED);
+
+return svrs;
+}
+
+public static void waitForOne(ZooKeeper zk, ZooKeeper.States state) 
throws InterruptedException {
+int iterations = ClientBase.CONNECTION_TIMEOUT / 500;
+while (zk.getState() != state) {
+if (iterations-- == 0) {
+throw new RuntimeException("Waiting too long " + 
zk.getState() + " != " + state);
+}
+Thread.sleep(500);
+}
+}
+
+protected void waitForAll(Servers servers, ZooKeeper.States state) 
throws InterruptedException {
--- End diff --

Maybe change this to public as well?


---


[GitHub] zookeeper issue #659: ZOOKEEPER-3161. Refactor QuorumPeerMainTest.java: move...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/659
  
retest this please


---


[GitHub] zookeeper issue #647: [ZOOKEEPER-3125] Fixing pzxid consistent issue when re...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/647
  
I remember I commented in Jira ZOOKEEPER-3157, not sure why it didn't show 
up. 

I mentioned that we still need RetryRule, because there might be temporary 
quorum unstable issues like what we found on our test environment. The quorum 
set up in the test might be down due to leader election in case there is heavy 
load/limited resources on that test environment. We have seen this happened 
internally, so it's better to have retry for ConnectionLoss in this case.


---


[GitHub] zookeeper issue #632: [ZOOKEEPER-3150] Add tree digest check and verify data...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/632
  
retest this please


---


[GitHub] zookeeper pull request #632: [ZOOKEEPER-3150] Add tree digest check and veri...

2018-10-08 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/632#discussion_r223538616
  
--- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java ---
@@ -1521,4 +1562,179 @@ public boolean removeWatch(String path, WatcherType 
type, Watcher watcher) {
 public ReferenceCountedACLCache getReferenceCountedAclCache() {
 return aclCache;
 }
+
+/**
+ * Add the digest to the historical list, and update the latest zxid 
digest.
+ */
+private void logZxidDigest(long zxid, long digest) {
+ZxidDigest zxidDigest = new ZxidDigest(zxid, 
DigestCalculator.DIGEST_VERSION, digest);
+lastProcessedZxidDigest = zxidDigest;
+if (zxidDigest.zxid % 128 == 0) {
--- End diff --

I'll add the comment here, basically we want to only export the history of 
digest every 128 txns. It's a random number we picked, but not all random, in 
hex it's 80, which will print nicer when we dump the digest history.


---


[GitHub] zookeeper issue #632: [ZOOKEEPER-3150] Add tree digest check and verify data...

2018-10-06 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/632
  
This is a really useful feature, which helps us find multiple data 
inconsistent issues, like ZOOKEEPER-3144, ZOOKEEPER-3127, ZOOKEEPER-3125. 

It can avoid introducing new inconsistent bugs in ZooKeeper in the future, 
so please take a look when you have time. I'll introduce the 2nd part after 
this got reviewed and merged.

For performance, we saw some very minor impact, will provide the 
micro-benchmark result.


---


[GitHub] zookeeper pull request #650: ZOOKEEPER-1908: setAcl should be have a recursi...

2018-10-03 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/650#discussion_r222432633
  
--- Diff: src/java/main/org/apache/zookeeper/cli/SetAclCommand.java ---
@@ -19,12 +19,17 @@
 
 import java.util.List;
 import org.apache.commons.cli.*;
+import org.apache.zookeeper.AsyncCallback.StringCallback;
 import org.apache.zookeeper.KeeperException;
+import org.apache.zookeeper.ZKUtil;
 import org.apache.zookeeper.data.ACL;
 import org.apache.zookeeper.data.Stat;
 
 /**
- * setAcl command for cli
+ * setAcl command for cli.
+ * Available options are s for printing znode's stats, v for set version 
of znode(s), R for
+ * recursive setting. User can combine v and R options together, but not s 
and R considering the
+ * number of znodes could be large.
--- End diff --

Thanks for adding the description for this.


---


[GitHub] zookeeper issue #650: ZOOKEEPER-1908: setAcl should be have a recursive func...

2018-10-02 Thread lvfangmin
Github user lvfangmin commented on the issue:

https://github.com/apache/zookeeper/pull/650
  
retest this please


---


[GitHub] zookeeper pull request #652: ZOOKEEPER-3156: Add in option to canonicalize h...

2018-10-02 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/652#discussion_r221836418
  
--- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java ---
@@ -793,7 +794,87 @@ public RWServerFoundException(String msg) {
 super(msg);
 }
 }
-
+
+static class MockableInetSocketAddress {
--- End diff --

It's adding too much complexity to the code in order to test it, are we 
able to create the stub to extend the InetSocketAddress in the test case itself?


---


  1   2   3   4   >