[GitHub] zookeeper issue #375: ZOOKEEPER-1363: Categorise unit tests by 'test-commit'...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/375 Hi, This patch would be extremely useful for analysing flaky tests. Any plans for merging it? ---
[GitHub] zookeeper pull request #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zook...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/402 ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.test.LoadFromLogTest This patch is the backport of https://issues.apache.org/jira/browse/ZOOKEEPER-2484 in order to fix related flaky tests on branch-3.4. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2922 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/402.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #402 commit c78ae610b4283135c10d4541f08df0ef3cc72948 Author: Andor Molnár Date: 2017-10-20T09:42:03Z ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.test.LoadFromLogTest ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/402 I got your point @afine, however I'm still under the impression that the root cause is that tests are running in parallel and multiple instances of Zookeeper are trying to run. I've seen "Address already in use exception" as an example of this. I also have a feeling that in some rare cases tests are trying to use the same instance in parallel and modifying each other's state. ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/402 @afine Here is the command I've extracted from build logs showing that junit runs tests on 8 threads: `[ZooKeeper_branch34_jdk8] $ ant -Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml -Djavac.target=1.8 clean test-core-java` Please let me know if I need to check something else too. ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/402 @afine Do you think it's better to cancel this change? Affected tests have been running fine for a couple of builds. ---
[GitHub] zookeeper issue #403: Zookeeper 2923
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/403 You could use git rebase command to squash/fixup the commits to merge them into one. http://fle.github.io/git-tip-keep-your-branch-clean-with-fixup-and-autosquash.html Rebasing is also handy over standard merge when catching up with trunk, because it generates nice, flat history of commits instead of a railway station. ;) ---
[GitHub] zookeeper issue #404: ZOOKEEPER-2690: Update documentation source for ZOOKEE...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/404 This change looks good to me. ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/409 ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java Tests which don't need ZK test server to run have been extracted to LoadFromLogNoServerTest.java. Rest of them have been refactored to do init/shutdown in JUnit Before/After methods. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2924 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/409.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #409 commit 552a7d2a8e4afe297fa36042fd60d7c55a00e9e5 Author: Andor Molnár Date: 2017-10-27T09:49:07Z ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java Tests which don't need ZK test server to run have been extracted to LoadFromLogNoServerTest.java. Rest of them have been refactored to do init/shutdown in JUnit Before/After methods. ---
[GitHub] zookeeper pull request #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zook...
Github user anmolnar closed the pull request at: https://github.com/apache/zookeeper/pull/402 ---
[GitHub] zookeeper issue #402: ZOOKEEPER-2922: Flaky Test fix: org.apache.zookeeper.t...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/402 Obsoleted by https://github.com/apache/zookeeper/pull/409 ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 Thanks @afine . I've updated the PR. ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/412 ZOOKEEPER-2101: Transaction larger than max buffer of jute makes zookeeper unavailable This patch has been created to reanimate an ancient, unclosed Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2101 Original patch was done by Liu Shaohui and applied to latest trunk without any modification. This one would be a nice kick off for implementing jute (max) buffer size monitoring. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2101 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/412.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #412 commit 68a5672aa2df542d7f7979d3876be871b685a197 Author: Andor Molnár Date: 2017-10-30T14:28:28Z ZOOKEEPER-2101: This patch has been created to reanimate an ancient, unclosed Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2101 Original patch was done by Liu Shaohui and applied to latest trunk without any modification. This one would be a nice kick off for implementing jura buffer size monitoring. ---
[GitHub] zookeeper issue #411: ZOOKEEPER-2684 Fix a crashing bug in the mixed workloa...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/411 @kfirlevari yeah, that's an outstanding flaky test. Here's the jira about it: https://issues.apache.org/jira/browse/ZOOKEEPER-2807 Amend your latest commit to trigger another Jenkins build. ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 @phunt would you please kindly commit this patch? ---
[GitHub] zookeeper issue #412: ZOOKEEPER-2101: Transaction larger than max buffer of ...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/412 Thanks @afine for looking into this and the detailed explanation. Unfortunately I cannot answer your question whether there could be other cases than multi when proposal size is bigger than the request. I'd would say *no*, but I'm not sure. Original patch was intended to implement a generic validation on the proposal size, but to stay on safe side I can move that check inside multi to avoid the performance implication in other cases. What do you think? ---
[GitHub] zookeeper issue #413: WriteLock recipe: Fix bug in znode ordering when the s...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/413 @javicacheiro The change looks good to me. I believe too that it'd be nice to add some unit tests to validate the comparison. ---
[GitHub] zookeeper issue #410: [ZOOKEEPER-2338] - set SOCK_CLOEXEC on socket if defin...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/410 @fr0stbyte Reading some Stackoverflow about the flag (https://stackoverflow.com/questions/22304631/what-is-the-purpose-to-set-sock-cloexec-flag-with-accept4-same-as-o-cloexec), the change looks reasonable to me. However it would be beneficial to shed some light on how it's related to your scenario. ---
[GitHub] zookeeper pull request #412: ZOOKEEPER-2101: Transaction larger than max buf...
Github user anmolnar closed the pull request at: https://github.com/apache/zookeeper/pull/412 ---
[GitHub] zookeeper issue #412: ZOOKEEPER-2101: Transaction larger than max buffer of ...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/412 I'm unable to create proper unit test for the issue, because cannot reliably reproduce the problem. Closing this PR. ---
[GitHub] zookeeper pull request #413: WriteLock recipe: Fix bug in znode ordering whe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/413#discussion_r149077678 --- Diff: src/recipes/lock/test/org/apache/zookeeper/recipes/lock/ZNodeNameTest.java --- @@ -37,17 +37,30 @@ public void testOrderWithSamePrefix() throws Exception { @Test public void testOrderWithDifferentPrefixes() throws Exception { String[] names = { "r-3", "r-2", "r-1", "w-2", "w-1" }; -String[] expected = { "r-1", "r-2", "r-3", "w-1", "w-2" }; +// names with duplicated sequence numbers are not included in the result +String[] expected = { "r-1", "r-2", "r-3" }; --- End diff -- Why have you removed "w-1" and "w-2" elements from here? I think the expected list could be: `String[] expected = { "r-1", "w-1", "r-2", "w-2", "r-3"};` ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/415 ZOOKEEPER-2933: Added last/min/max proposal size JMX beans https://issues.apache.org/jira/browse/ZOOKEEPER-2933 - Refactor proposal serialization logic to a common place (SerializeUtils), - Add JMX metric to monitor jute.maxbuffer setting - Add JMX metric to monitor min/max/last size of generated proposals - Unit tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2933 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/415.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #415 commit 67efc8553d80b374146aac07613eb3e217ba080f Author: Andor Molnár Date: 2017-11-06T16:31:44Z ZOOKEEPER-2933: Added last/min/max proposal size JMX beans - Refactor proposal serialization logic to a common place, - Add JMX metric to monitor jute.maxbuffer setting - Add JMX metric to monitor min/max/last size of generated proposals ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 @phunt Do you think we can merge this now into branch-3.4? ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149945747 --- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServerBean.java --- @@ -167,4 +167,9 @@ public String getSecureClientAddress() { public long getTxnLogElapsedSyncTime() { return zks.getTxnLogElapsedSyncTime(); } + +@Override +public int getJuteMaxBufferSize() { +return Integer.getInteger("jute.maxbuffer", 0xf); --- End diff -- I deal with it in the other way around. Directly access BinaryInputArchive.maxBuffer from here. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149945916 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Leader.java --- @@ -34,12 +33,12 @@ import java.util.List; import java.util.Map; import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; --- End diff -- Good catch, thanks. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149946142 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/Leader.java --- @@ -984,8 +990,6 @@ public void commitAndActivate(long zxid, long designatedLeader) { /** * Create an inform packet and send it to all observers. - * @param zxid - * @param proposal --- End diff -- Intellij keep notifying me about these comments not referring to valid parameters in the method definition anymore. I thought it'd be nice to do some cleanup together with the patch. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149946413 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/LeaderStats.java --- @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +/** + * Provides live statistics about a running Leader. + */ +class LeaderStats { --- End diff -- Done. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149946468 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/LeaderStats.java --- @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +/** + * Provides live statistics about a running Leader. + */ +class LeaderStats { +/** + * Size of the last generated proposal. This should fit into server's jute.maxbuffer setting. + */ +private int lastProposalSize = 0; + +/** + * Size of the smallest proposal which has been generated since the server was started. + */ +private int minProposalSize = -1; + +/** + * Size of the largest proposal which has been generated since the server was started. + */ +private int maxProposalSize = 0; --- End diff -- Done. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149946634 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderStatsTest.java --- @@ -0,0 +1,39 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.junit.Test; + +import static org.junit.Assert.assertEquals; + +public class LeaderStatsTest { +@Test +public void testSetProposalSize_SetMinMax() { --- End diff -- Removed. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149947085 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/LeaderStatsTest.java --- @@ -0,0 +1,39 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.junit.Test; + +import static org.junit.Assert.assertEquals; + +public class LeaderStatsTest { +@Test +public void testSetProposalSize_SetMinMax() { +LeaderStats stats = new LeaderStats(); --- End diff -- Good point. I've also added checking the lastProposalSize size value besides min/max. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2933: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r149948702 --- Diff: src/java/test/org/apache/zookeeper/server/util/SerializeUtilsTest.java --- @@ -0,0 +1,73 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.util; + +import org.apache.jute.OutputArchive; +import org.apache.jute.Record; +import org.apache.zookeeper.server.Request; +import org.apache.zookeeper.txn.TxnHeader; +import org.junit.Test; +import org.mockito.InOrder; + +import java.io.IOException; + +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertNull; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.inOrder; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.verify; + +public class SerializeUtilsTest { + +@Test +public void testSerializeRequest_RequestIsNull() { +byte[] data = SerializeUtils.serializeRequest(null); +assertNull(data); +} + +@Test +public void testSerializeRequest_RequestHeaderIsNull() { +Request request = new Request(0, 0, 0, null, null, 0); +byte[] data = SerializeUtils.serializeRequest(request); +assertNull(data); +} + +@Test +public void testSerializeRequest_WithoutTxn() throws IOException { +TxnHeader header = mock(TxnHeader.class); +Request request = new Request(1, 2, 3, header, null, 4); +byte[] data = SerializeUtils.serializeRequest(request); +assertNotNull(data); +verify(header).serialize(any(OutputArchive.class), eq("hdr")); +} + +@Test +public void testSerializeRequest_WithTxn() throws IOException { +Record txn = mock(Record.class); +TxnHeader header = mock(TxnHeader.class); +Request request = new Request(1, 2, 3, header, txn, 4); +byte[] data = SerializeUtils.serializeRequest(request); +assertNotNull(data); +InOrder inOrder = inOrder(header, txn); --- End diff -- Very good question, I'd been thinking about it for a while. The reason why I eventually left it out is that this unit test verifies the behaviour of serializeRequest() method and _not_ the serialization itself. So basically we have to verify that the right serialize() methods get called in the right order instead of what they actually do. On the other hand, you're right, because if we mock the serialize() methods, they won't have sideeffect on our test. What do you think? ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2933: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @afine I've created 2 subtasks for the Jira as you suggested and changed this PR to refer to the right one. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @afine Take a look at please the latest commit, I've added verification of the byte array itself to serializeRequest() unit test. I'm not 100% sure it's needed, but it might makes sense this way. ---
[GitHub] zookeeper pull request #416: ZOOKEEPER-2934: Updated usage of LOG_DEBUG in r...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/416 ZOOKEEPER-2934: Updated usage of LOG_DEBUG in recipes to follow changes in ZK client LOG_DEBUG macro has been changed recently in https://issues.apache.org/jira/browse/ZOOKEEPER-1400 This patch updates 'lock' and 'queue' recipes to follow the changes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2934 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/416.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #416 commit 86dc7190aeda82ffb3cf32aaa45426294ab7395b Author: Andor Molnár Date: 2017-11-07T12:09:02Z ZOOKEEPER-2934: Updated usage of LOG_DEBUG in recipes to follow changes in ZK client ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @afine Thanks for approving. I've added the same byte array verification to the other unit test too. Also squashed everything into single commit. Ready to merge. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt Thanks for looking into this. I'll include the reset capability into this patch as suggested. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt I've added capability of resetting proposal statistics as well as unit tests for LeaderBean class. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt Sorry, I've completely forgotten about it. I believe the easiest would be to add proposal stats to 'stat' 4lw in this patch, so that it could be easily applied to all major branches. After that I'd create a separate PR for jetty changes which will target trunk and 3.5 only. ---
[GitHub] zookeeper pull request #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTe...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/420 ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java This patch is the trunk version of https://github.com/apache/zookeeper/pull/409 You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2924-trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/420.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #420 commit c2fa3e179bc73937863176ec8538cb065b38ee30 Author: Andor Molnár Date: 2017-11-17T09:52:26Z ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt 4lw changes have been added. @afine please start the review over, because so many things have been changed since you approved it. For some weird reason I cannot get a green build, but seems like it's unrelated to this patch. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @afine You're not missing anything, I didn't add jetty support to this patch intentionally, because afaik jetty is not supported in 3.4. So the plan is to keep this patch compatible with all branches and I'll create a separate one for 3.5/trunk only. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2939: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r152276479 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/StatCommandTest.java --- @@ -0,0 +1,106 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.zookeeper.server.ServerCnxn; +import org.apache.zookeeper.server.ServerCnxnFactory; +import org.apache.zookeeper.server.ServerStats; +import org.apache.zookeeper.server.ZKDatabase; +import org.apache.zookeeper.server.command.FourLetterCommands; +import org.apache.zookeeper.server.command.StatCommand; +import org.junit.Before; +import org.junit.Test; + +import java.io.IOException; +import java.io.PrintWriter; +import java.io.StringWriter; +import java.util.ArrayList; +import java.util.List; + +import static org.hamcrest.core.StringContains.containsString; --- End diff -- Thanks for the investigation. All done. ---
[GitHub] zookeeper pull request #415: ZOOKEEPER-2939: Added last/min/max proposal siz...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/415#discussion_r152276508 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/StatResetCommandTest.java --- @@ -0,0 +1,112 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.zookeeper.server.quorum; + +import org.apache.zookeeper.server.ServerCnxn; +import org.apache.zookeeper.server.ServerStats; +import org.apache.zookeeper.server.ZooKeeperServer; +import org.apache.zookeeper.server.command.StatResetCommand; +import org.junit.Before; +import org.junit.Test; + +import java.io.PrintWriter; +import java.io.StringWriter; + +import static org.junit.Assert.assertEquals; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +public class StatResetCommandTest { +private static final String ZK_NOT_SERVING = "This ZooKeeper instance is not currently serving requests\n"; --- End diff -- Done. ---
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/420 @phunt I rebased the branch on trunk and the build is now green. Thanks. Please review. ---
[GitHub] zookeeper pull request #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/420#discussion_r153610044 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -18,77 +18,58 @@ package org.apache.zookeeper.test; -import java.io.ByteArrayOutputStream; import java.io.File; -import java.io.FileInputStream; import java.io.IOException; -import java.nio.ByteBuffer; -import java.util.ArrayList; -import java.util.List; - -import org.apache.zookeeper.common.Time; -import org.apache.jute.BinaryInputArchive; -import org.apache.jute.BinaryOutputArchive; -import org.apache.jute.Record; + import org.apache.zookeeper.CreateMode; import org.apache.zookeeper.KeeperException.NoNodeException; -import org.apache.zookeeper.PortAssignment; -import org.apache.zookeeper.ZKTestCase; import org.apache.zookeeper.ZooDefs.Ids; -import org.apache.zookeeper.ZooDefs.OpCode; import org.apache.zookeeper.ZooKeeper; import org.apache.zookeeper.data.Stat; -import org.apache.zookeeper.server.DataNode; -import org.apache.zookeeper.server.DataTree; import org.apache.zookeeper.server.ServerCnxnFactory; import org.apache.zookeeper.server.SyncRequestProcessor; import org.apache.zookeeper.server.ZooKeeperServer; -import org.apache.zookeeper.server.persistence.FileHeader; import org.apache.zookeeper.server.persistence.FileTxnLog; import org.apache.zookeeper.server.persistence.FileTxnSnapLog; import org.apache.zookeeper.server.persistence.Util; import org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator; import org.apache.zookeeper.server.persistence.TxnLog.TxnIterator; -import org.apache.zookeeper.txn.CreateTxn; -import org.apache.zookeeper.txn.DeleteTxn; -import org.apache.zookeeper.txn.MultiTxn; -import org.apache.zookeeper.txn.Txn; import org.apache.zookeeper.txn.TxnHeader; +import org.eclipse.jetty.util.SocketAddressResolver; +import org.junit.After; import org.junit.Assert; +import org.junit.Before; import org.junit.Test; import org.slf4j.Logger; import org.slf4j.LoggerFactory; -public class LoadFromLogTest extends ZKTestCase { -private static final String HOST = "127.0.0.1:"; -private static final int CONNECTION_TIMEOUT = 3000; +public class LoadFromLogTest extends ClientBase { private static final int NUM_MESSAGES = 300; protected static final Logger LOG = LoggerFactory.getLogger(LoadFromLogTest.class); // setting up the quorum has a transaction overhead for creating and closing the session private static final int TRANSACTION_OVERHEAD = 2; private static final int TOTAL_TRANSACTIONS = NUM_MESSAGES + TRANSACTION_OVERHEAD; +@Before +public void setUp() throws Exception { +SyncRequestProcessor.setSnapCount(100); +super.setUp(); +} + +@After +public void tearDown() throws Exception { +stopServer(); --- End diff -- Good point. super.tearDown() includes stopServer() so I'll remove this one completely. ---
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/420 @phunt @afine thanks for the detail comments. I think @afine made a good point, I've removed setting the SnapCount explicitly in each test, leaving only a common setting in the setUp method. Only those 2 tests really care about the setting and with a little modification, they can share the same setting. ---
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/420 Tests don't seem to be related to the snapCount apart from the 2 that @afine mentioned previously. This way we can keep the structure of this testfile nice and clean with doing the initialization in the setUp method and reusing everything from the base class. I prefer to keep it this way. Regarding making it volatile: ability to change the snapCount is for testing-purposes only. Which is already not the nicest thing in the world, I wouldn't make it even worse by making it thread safe just for test code. Commented lines have been removed, were there by accident. ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r154363054 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -327,6 +257,95 @@ public void run() { LOG.info("CommitProcessor exited loop!"); } +private void processCommittedRequest() throws IOException, InterruptedException { +// In case of a spurious wakeup in waitForCommittedRequests we should not +// remove the request from the queue until it has been processed +Request request = committedRequests.peek(); + +if (request == null) { +committedRequests.poll(); --- End diff -- If request==null then the committedRequests queue is already empty. Why do need to poll() here? Could it be a small mistake and originally you wanted to poll() after the peek() was successful...? ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r154367742 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -240,84 +240,14 @@ public void run() { } // Process committed head -if ((request = committedRequests.poll()) == null) { -throw new IOException("Error: committed head is null"); -} - -/* - * Check if request is pending, if so, update it with the committed info - */ -LinkedList sessionQueue = pendingRequests -.get(request.sessionId); -if (sessionQueue != null) { -// If session queue != null, then it is also not empty. -Request topPending = sessionQueue.poll(); -if (request.cxid != topPending.cxid) { -/* - * TL;DR - we should not encounter this scenario often under normal load. - * We pass the commit to the next processor and put the pending back with a warning. - * - * Generally, we can get commit requests that are not at the queue head after - * a session moved (see ZOOKEEPER-2684). Let's denote the previous server of the session - * with A, and the server that the session moved to with B (keep in mind that it is - * possible that the session already moved from B to a new server C, and maybe C=A). - * 1. If request.cxid < topPending.cxid : this means that the session requested this update - * from A, then moved to B (i.e., which is us), and now B receives the commit - * for the update after the session already performed several operations in B - * (and therefore its cxid is higher than that old request). - * 2. If request.cxid > topPending.cxid : this means that the session requested an updated - * from B with cxid that is bigger than the one we know therefore in this case we - * are A, and we lost the connection to the session. Given that we are waiting for a commit - * for that update, it means that we already sent the request to the leader and it will - * be committed at some point (in this case the order of cxid won't follow zxid, since zxid - * is an increasing order). It is not safe for us to delete the session's queue at this - * point, since it is possible that the session has newer requests in it after it moved - * back to us. We just leave the queue as it is, and once the commit arrives (for the old - * request), the finalRequestProcessor will see a closed cnxn handle, and just won't send a - * response. - * Also note that we don't have a local session, therefore we treat the request - * like any other commit for a remote request, i.e., we perform the update without sending - * a response. - */ -LOG.warn("Got request " + request + -" but we are expecting request " + topPending); -sessionQueue.addFirst(topPending); -} else { -/* - * Generally, we want to send to the next processor our version of the request, - * since it contains the session information that is needed for post update processing. - * In more details, when a request is in the local queue, there is (or could be) a client - * attached to this server waiting for a response, and there is other bookkeeping of - * requests that are outstanding and have originated from this server - * (e.g., for setting the max outstanding requests) - we need to update this info when an - * outstanding request completes. Note that in the other case (above), the operation
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/300 @afine Generally speaking, I like the idea of using LinkedBlockingQueue's intrinsic lock to wait for becoming empty, but in this particular case I think it's possible that committedRequests will never be empty if the leader is constantly sending commit requests. Correct me if I'm wrong please (there's a very good chance that I completely misunderstand something), but my feeling is that the following situation is possible: 1. Learner starts syncing with leader in syncWithLeader() method, 2. Learner blocks and wait for all commits to be processed before finishing the sync, 3. FollowerZookeeperServer is already running and keep receiving commits from the Leader including non-syncing ones, 4. Learner will never be notified or only at some point in the future much more later then sync complete or way before that. To address this, if we could get the number of commits that we must wait before proceeding, we would be able to implement a CountDownLatch in CommitProcessor and wait for the number of commits which are expected in the sync process. However that does not guarantee that we received all sync-related commits either. Otherwise I could also agree with @shralex in the Jira saying: "Intuitively this may not be the right place for such a fix - this probably should be higher level - **making sure that follower does not even accept local ops before properly completing the sync.** Even if you drain the committedRequests, I'm not sure that guarantees that there are no more that will arrive." That would be the best solution here in my opinion. ---
[GitHub] zookeeper pull request #423: ZOOKEEPER-2949: using hostname and port to crea...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/423#discussion_r154133881 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxnSocketNetty.java --- @@ -340,13 +340,24 @@ public static Packet getInstance() { return instance; } } + /** * ZKClientPipelineFactory is the netty pipeline factory for this netty * connection implementation. */ private class ZKClientPipelineFactory implements ChannelPipelineFactory { private SSLContext sslContext = null; private SSLEngine sslEngine = null; +private String host; +private int port; + +public ZKClientPipelineFactory() { --- End diff -- I second that. host & port are now mandatory parameters. ---
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/420 @phunt Correct. These 2 tests: testLoad and testLoadFailure make assertion on the number of log files. Rest of the tests don't care. Originally they've reset snapCount to 1 to produce only 1 test file, but they work perfectly with other setting as well. ---
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/300 @afine If I'm not mistaken there's a thread running which receives messages from the Leader in QuorumPeer.java:997: Call stack is: QuorumPeer.run() -> Follower.followLeader() -> Follower.processPacket() -> FollowerZookeeperServer.commit() This code path could potentially add more commit messages to the queue which are not related to the sync. Even if I'm wrong with the above, @shralex 's concern is still valid: "Even if you drain the committedRequests, I'm not sure that guarantees that there are no more that will arrive." It's possible to drain the commit queue faster than leader sends messages. In other words: draining the commit queue doesn't seem to guarantee that the sync is complete. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @afine @phunt Findbugs issues have been resolved. Please review & commit. ---
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/300 @afine You're right. I missed that syncWithLeader() call is on the same path and in the same thread as adding commits to the queue. In which case this must be right: - syncWithLeader() blocks the follower, - Leader sends commits for the sync process, - Leader sends UPTODATE at the very end, - Follower drains the commit queue - Follower starts following. ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r154851878 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -240,84 +240,14 @@ public void run() { } // Process committed head -if ((request = committedRequests.poll()) == null) { -throw new IOException("Error: committed head is null"); -} - -/* - * Check if request is pending, if so, update it with the committed info - */ -LinkedList sessionQueue = pendingRequests -.get(request.sessionId); -if (sessionQueue != null) { -// If session queue != null, then it is also not empty. -Request topPending = sessionQueue.poll(); -if (request.cxid != topPending.cxid) { -/* - * TL;DR - we should not encounter this scenario often under normal load. - * We pass the commit to the next processor and put the pending back with a warning. - * - * Generally, we can get commit requests that are not at the queue head after - * a session moved (see ZOOKEEPER-2684). Let's denote the previous server of the session - * with A, and the server that the session moved to with B (keep in mind that it is - * possible that the session already moved from B to a new server C, and maybe C=A). - * 1. If request.cxid < topPending.cxid : this means that the session requested this update - * from A, then moved to B (i.e., which is us), and now B receives the commit - * for the update after the session already performed several operations in B - * (and therefore its cxid is higher than that old request). - * 2. If request.cxid > topPending.cxid : this means that the session requested an updated - * from B with cxid that is bigger than the one we know therefore in this case we - * are A, and we lost the connection to the session. Given that we are waiting for a commit - * for that update, it means that we already sent the request to the leader and it will - * be committed at some point (in this case the order of cxid won't follow zxid, since zxid - * is an increasing order). It is not safe for us to delete the session's queue at this - * point, since it is possible that the session has newer requests in it after it moved - * back to us. We just leave the queue as it is, and once the commit arrives (for the old - * request), the finalRequestProcessor will see a closed cnxn handle, and just won't send a - * response. - * Also note that we don't have a local session, therefore we treat the request - * like any other commit for a remote request, i.e., we perform the update without sending - * a response. - */ -LOG.warn("Got request " + request + -" but we are expecting request " + topPending); -sessionQueue.addFirst(topPending); -} else { -/* - * Generally, we want to send to the next processor our version of the request, - * since it contains the session information that is needed for post update processing. - * In more details, when a request is in the local queue, there is (or could be) a client - * attached to this server waiting for a response, and there is other bookkeeping of - * requests that are outstanding and have originated from this server - * (e.g., for setting the max outstanding requests) - we need to update this info when an - * outstanding request completes. Note that in the other case (above), the operation
[GitHub] zookeeper issue #420: ZOOKEEPER-2924. Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/420 @phunt do you still have concerns? ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt do you think we can commit this patch? ---
[GitHub] zookeeper pull request #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeepe...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/300#discussion_r155103949 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java --- @@ -240,84 +240,14 @@ public void run() { } // Process committed head -if ((request = committedRequests.poll()) == null) { -throw new IOException("Error: committed head is null"); -} - -/* - * Check if request is pending, if so, update it with the committed info - */ -LinkedList sessionQueue = pendingRequests -.get(request.sessionId); -if (sessionQueue != null) { -// If session queue != null, then it is also not empty. -Request topPending = sessionQueue.poll(); -if (request.cxid != topPending.cxid) { -/* - * TL;DR - we should not encounter this scenario often under normal load. - * We pass the commit to the next processor and put the pending back with a warning. - * - * Generally, we can get commit requests that are not at the queue head after - * a session moved (see ZOOKEEPER-2684). Let's denote the previous server of the session - * with A, and the server that the session moved to with B (keep in mind that it is - * possible that the session already moved from B to a new server C, and maybe C=A). - * 1. If request.cxid < topPending.cxid : this means that the session requested this update - * from A, then moved to B (i.e., which is us), and now B receives the commit - * for the update after the session already performed several operations in B - * (and therefore its cxid is higher than that old request). - * 2. If request.cxid > topPending.cxid : this means that the session requested an updated - * from B with cxid that is bigger than the one we know therefore in this case we - * are A, and we lost the connection to the session. Given that we are waiting for a commit - * for that update, it means that we already sent the request to the leader and it will - * be committed at some point (in this case the order of cxid won't follow zxid, since zxid - * is an increasing order). It is not safe for us to delete the session's queue at this - * point, since it is possible that the session has newer requests in it after it moved - * back to us. We just leave the queue as it is, and once the commit arrives (for the old - * request), the finalRequestProcessor will see a closed cnxn handle, and just won't send a - * response. - * Also note that we don't have a local session, therefore we treat the request - * like any other commit for a remote request, i.e., we perform the update without sending - * a response. - */ -LOG.warn("Got request " + request + -" but we are expecting request " + topPending); -sessionQueue.addFirst(topPending); -} else { -/* - * Generally, we want to send to the next processor our version of the request, - * since it contains the session information that is needed for post update processing. - * In more details, when a request is in the local queue, there is (or could be) a client - * attached to this server waiting for a response, and there is other bookkeeping of - * requests that are outstanding and have originated from this server - * (e.g., for setting the max outstanding requests) - we need to update this info when an - * outstanding request completes. Note that in the other case (above), the operation
[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/300 +1 looks good to me. @afine Would you please trigger another build to get a green one? ---
[GitHub] zookeeper issue #410: [ZOOKEEPER-2338] - set SOCK_CLOEXEC on socket if defin...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/410 @fr0stbyte According to my tiny C knowledge, it looks good to me. ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 @phunt Yes, I'm working on it. ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 @phunt Patch has been updated with the trunk review changes. Please take a look. ---
[GitHub] zookeeper issue #427: [ZOOKEEPER-2338] - set SOCK_CLOEXEC on socket if defin...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/427 @fr0stbyte no thanks, that's fine, I've dug it out already. Going forward I suggest to make linter / indentation / whitespace changes in a commit separate from the code changes. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt @afine are u happy with the change? ---
[GitHub] zookeeper issue #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTest.java
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/409 @phunt Build is green now. Please review & commit if you have a chance. ---
[GitHub] zookeeper pull request #429: ZOOKEEPER-2952. Upgrade third party libs: netty...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/429 ZOOKEEPER-2952. Upgrade third party libs: netty, slf4j You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2952 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/429.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #429 commit 949ada4ae0ad87388269bcc728acce325f04 Author: Andor Molnar Date: 2017-12-12T11:04:48Z ZOOKEEPER-2952. Upgrade third party libs: netty, slf4j ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. Make 'addr' variable available ...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/430 ZOOKEEPER-2893. Make 'addr' variable available for error handling code to give a chance to fallback if the socket hasn't been initialized yet 'addr' variable is used to identify which server to connect to. I've made this available for error handling code in order to let it fallback to this address if the remote socket hasn't been initialised yet. This will give us better error messages if the client is unable to connect to server for some reason. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2893 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/430.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #430 commit fbe4ccde8516150cfd69d2a2260266fd1c0bf10d Author: Andor Molnar Date: 2017-12-12T13:46:34Z ZOOKEEPER-2893. Make 'addr' variable available for error handling code to give a chance to fallback if the socket hasn't been initialized yet ---
[GitHub] zookeeper pull request #409: ZOOKEEPER-2924: Refactor tests of LoadFromLogTe...
Github user anmolnar closed the pull request at: https://github.com/apache/zookeeper/pull/409 ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. very poor choice of logging if ...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/430#discussion_r156597536 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -1232,11 +1233,12 @@ public void run() { } else if (e instanceof RWServerFoundException) { LOG.info(e.getMessage()); } else { +SocketAddress remoteAddr = clientCnxnSocket.getRemoteSocketAddress(); LOG.warn( "Session 0x" + Long.toHexString(getSessionId()) + " for server " -+ clientCnxnSocket.getRemoteSocketAddress() ++ (remoteAddr == null ? addr : remoteAddr) --- End diff -- Makes sense. I'll just use 'addr' here. ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. very poor choice of logging if ...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/430#discussion_r156599702 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -1041,6 +1041,8 @@ private void sendPing() { private InetSocketAddress rwServerAddress = null; +private InetSocketAddress addr = null; --- End diff -- Remote side of the connection. Basically the ip address that the client is trying to connect to. ```rwServerAddress``` is the address of non-R/O server found if there's any. I'll change this to ```serverAddress```. ---
[GitHub] zookeeper issue #429: ZOOKEEPER-2952. Upgrade third party libs: netty, slf4j
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/429 @phunt I've updated the slf4j license. ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156657072 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); --- End diff -- It'd be useful to utilize the class-wide ```servers``` field which is used to shutdown test servers in the ```tearDown()``` method. For example ```testHighestZxidJoinLate()``` test does it that way. ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156660196 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); + + // find the leader + int trueLeader = -1; + for (int i = 0; i < numServers; i++) { +if (servers.mt[i].main.quorumPeer.leader != null) { + trueLeader = i; +} + } + Assert.assertTrue("There should be a leader", trueLeader >= 0); + + // find a follower + int falseLeader = (trueLeader + 1) % numServers; + Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null); + + // to keep the quorum peer running and force it to go into the looking state, we kill leader election + // and close the connection to the leader + servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown(); + servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close(); + + // wait for the falseLeader to disconnect + waitForOne(servers.zk[falseLeader], States.CONNECTING); + + // convince falseLeader that it is the leader + servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING); + + // provide time for the falseleader to realize no followers have connected + // (this is twice the timeout used in Leader#getEpochToPropose) + Thread.sleep(2 * servers.mt[falseLeader].main.quorumPeer.initLimit * servers.mt[falseLeader].main.quorumPeer.tickTime); + + // Restart leader election + servers.mt[falseLeader].main.quorumPeer.startLeaderElection(); --- End diff -- I wonder if this one has to be called explicitly. The peer should automatically realise that it's no longer the leader, stop leading and start leader election (which is basically looking). That's what this test is intended to validate and shouldn't be started explicitly. Correct me if I'm wrong. ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156658012 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); + + // find the leader + int trueLeader = -1; + for (int i = 0; i < numServers; i++) { +if (servers.mt[i].main.quorumPeer.leader != null) { + trueLeader = i; +} + } + Assert.assertTrue("There should be a leader", trueLeader >= 0); + + // find a follower + int falseLeader = (trueLeader + 1) % numServers; + Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null); --- End diff -- Minor: I believe that it's generally a good idea to add a short message to asserts to describe what could be the problem when assertion fails. ---
[GitHub] zookeeper issue #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeade...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/432 @afine Generally the new test looks good to me, but would you please elaborate why the old test was flaky and how the new fixes that? ---
[GitHub] zookeeper pull request #435: ZOOKEEPER-2952. Upgrade third party libs: netty...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/435 ZOOKEEPER-2952. Upgrade third party libs: netty, slf4j, log4j This is the branch-3.4 version of https://github.com/apache/zookeeper/pull/429 netty, log4j, slf4j You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2952-branch34 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/435.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #435 commit 7aa2bcd7d8c6932e3517e45985c6ba3d5db2ab82 Author: Andor Molnar Date: 2017-12-13T14:04:44Z ZOOKEEPER-2952. Upgraded third party libraries + updated related license files ---
[GitHub] zookeeper issue #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeade...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/432 @afine got it ---
[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/432#discussion_r156793172 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); + + // find the leader + int trueLeader = -1; + for (int i = 0; i < numServers; i++) { +if (servers.mt[i].main.quorumPeer.leader != null) { + trueLeader = i; +} + } + Assert.assertTrue("There should be a leader", trueLeader >= 0); + + // find a follower + int falseLeader = (trueLeader + 1) % numServers; + Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null); + + // to keep the quorum peer running and force it to go into the looking state, we kill leader election + // and close the connection to the leader + servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown(); + servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close(); + + // wait for the falseLeader to disconnect + waitForOne(servers.zk[falseLeader], States.CONNECTING); + + // convince falseLeader that it is the leader + servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING); + + // provide time for the falseleader to realize no followers have connected + // (this is twice the timeout used in Leader#getEpochToPropose) + Thread.sleep(2 * servers.mt[falseLeader].main.quorumPeer.initLimit * servers.mt[falseLeader].main.quorumPeer.tickTime); + + // Restart leader election + servers.mt[falseLeader].main.quorumPeer.startLeaderElection(); --- End diff -- I see. Hence the shutdown() call a few lines before. Makes sense. ---
[GitHub] zookeeper pull request #435: ZOOKEEPER-2952. Upgrade third party libs: netty...
Github user anmolnar closed the pull request at: https://github.com/apache/zookeeper/pull/435 ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. very poor choice of logging if ...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/430#discussion_r156951799 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -1041,6 +1041,8 @@ private void sendPing() { private InetSocketAddress rwServerAddress = null; +private InetSocketAddress serverAddress = null; --- End diff -- Good idea. I've made the change. ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. very poor choice of logging if ...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/430#discussion_r156952287 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -1236,7 +1237,7 @@ public void run() { "Session 0x" + Long.toHexString(getSessionId()) + " for server " -+ clientCnxnSocket.getRemoteSocketAddress() ++ serverAddress + ", unexpected error" + RETRY_CONN_MSG, e); --- End diff -- We're talking about the same thing in the Jira. It's arguable which exception should be logged at INFO level without the stack trace, I ended up separating SocketExceptions altogether and leaving the rest for the original handler. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt That's a perfectly valid point Pat, I've been asked about it several times before. Therefore I added some more comments to the parent Jira of monitoring, please take a look at the description: https://issues.apache.org/jira/browse/ZOOKEEPER-2933 I've also populated the description of subtasks too to help understanding them. Regarding to statistics tracking in Zookeeper: I wonder how much does it make sense to implementing something complicated within the server. One approach (which this PR tries to follow) is to expose basic values which can be sampled by a more sophisticated monitoring tool. Average, sliding window, etc. can be implemented in there. On the other hand we could add some more clever logic to ZK as well such as https://google.github.io/guava/releases/21.0/api/docs/com/google/common/math/Quantiles.html or https://commons.apache.org/proper/commons-math/javadocs/api-3.0/org/apache/commons/math3/stat/descriptive/rank/Percentile.html Basically calculating percentiles (50-90-99-Median-Max-etc.) with the appropriate library and expose the values with JMX. This will probably require us to gather samples with a sliding window within Zookeeper. ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/433#discussion_r157001279 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; + +// used for assertions later +boolean foundLeading = false; +boolean foundLooking = false; +boolean foundFollowing = false; + +try { + // spin up a quorum, we use a small ticktime to make the test run faster + Servers servers = LaunchServers(numServers, 500); --- End diff -- Here too. ---
[GitHub] zookeeper pull request #433: ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLead...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/433#discussion_r157001258 --- Diff: src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java --- @@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws Exception { output[0], 2); } +/** + * This test validates that if a quorum member determines that it is leader without the support of the rest of the + * quorum (the other members do not believe it to be the leader) it will stop attempting to lead and become a follower. + * + * @throws IOException + * @throws InterruptedException + */ +@Test +public void testElectionFraud() throws IOException, InterruptedException { +// capture QuorumPeer logging +Layout layout = Logger.getRootLogger().getAppender("CONSOLE").getLayout(); +ByteArrayOutputStream os = new ByteArrayOutputStream(); +WriterAppender appender = new WriterAppender(layout, os); +appender.setThreshold(Level.INFO); +Logger qlogger = Logger.getLogger(QuorumPeer.class); +qlogger.addAppender(appender); + +int numServers = 3; --- End diff -- Please use the global field. ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt I added proposal stats to 'mntr' command output as requested. In the meantime I found the following metrics lib which could be useful and easily integrated into Zookeeper: http://metrics.dropwizard.io/3.2.3/getting-started.html Do you think it's worth merging this PR and refactor it in a separate one to use that lib? ---
[GitHub] zookeeper pull request #430: ZOOKEEPER-2893. very poor choice of logging if ...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/430#discussion_r157266599 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -1231,14 +1231,14 @@ public void run() { LOG.info(e.getMessage() + RETRY_CONN_MSG); } else if (e instanceof RWServerFoundException) { LOG.info(e.getMessage()); +} else if (e instanceof SocketException) { +LOG.info("Socket error occurred: {}: {}", serverAddress, e.getMessage()); } else { -LOG.warn( -"Session 0x" -+ Long.toHexString(getSessionId()) -+ " for server " -+ clientCnxnSocket.getRemoteSocketAddress() -+ ", unexpected error" -+ RETRY_CONN_MSG, e); +LOG.warn("Session 0x{} for server {}, unexpected error{}", --- End diff -- No, because the RETRY_CONN_MSG should go right after it. (starts with ", ") ---
[GitHub] zookeeper issue #415: ZOOKEEPER-2939: Added last/min/max proposal size JMX b...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/415 @phunt I asked around quickly: HBase - own library (looks like they've started to use dropwizard somehow) HDFS - Hadoop Metrics2 (included in hadoop-common) Kafka - Yammer (dropwizard predecessor) Cassandra - Yammer I'll give dropwizard a try first. ---
[GitHub] zookeeper issue #430: ZOOKEEPER-2893. very poor choice of logging if client ...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/430 @paulmillar Please approve it, if you're happy with the change. @phunt Do you think it can be committed? ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/440 ZOOKEEPER-2939 Deal with maxbuffer as it relates to proposals - Use dropwizard stats library This PR is intended to be the successor of https://github.com/apache/zookeeper/pull/415 Using dropwizard library's Histogram component we're able to provide more sophisticated statistics on Proposal sizes. From the docs: "A histogram measures the statistical distribution of values in a stream of data. In addition to minimum, maximum, mean, etc., it also measures median, 75th, 90th, 95th, 98th, 99th, and 99.9th percentiles." http://metrics.dropwizard.io/3.1.0/manual/core/#histograms You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2939-dropwizard Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/440.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #440 commit 6738e2d36b07f4643a97afd7e251a9d6c1c115f2 Author: Andor Molnár Date: 2017-11-06T16:31:44Z ZOOKEEPER-2939: Added last/min/max proposal size JMX beans - Refactor proposal serialization logic to a common place, - Add JMX metric to monitor jute.maxbuffer setting - Add JMX metric to monitor min/max/last size of generated proposals - This patch deals with server side maxbuffer monitoring only. commit 68328d9027bf2a26d3425b9c83b4a9d5a526fc9b Author: Andor Molnár Date: 2017-12-01T11:59:48Z ZOOKEEPER-2939. Fixed findbugs issue: newline in format string commit ece8f09cb09c456ccb77903c9a2f6977c34c1543 Author: Andor Molnar Date: 2017-12-15T18:11:17Z ZOOKEEPER-2933. Added proposal size statistics to 'mntr' command commit 89376898995ecb60e6f26252853fce1d6d85ba2b Author: Andor Molnar Date: 2017-12-18T16:34:24Z ZOOKEEPER-2939. Added dropwizard library to calculate histogram of proposal sizes commit 70faba6c8f7478d44edf5fd22557df533bbbc6ce Author: Andor Molnar Date: 2018-01-02T15:24:12Z ZOOKEEPER-2939. Fixed/added unit tests for Stat and Monitor commands ---
[GitHub] zookeeper pull request #441: ZOOKEEPER-2960. The dataDir and dataLogDir are ...
GitHub user anmolnar opened a pull request: https://github.com/apache/zookeeper/pull/441 ZOOKEEPER-2960. The dataDir and dataLogDir are used opposingly Previously a small refactoring has swapped the usage of dataDir and dataLogDir when QuorumPeerMain instantiates FileTxnSnapLog class. This PR fixes it + adds unit test for verification. You can merge this pull request into a Git repository by running: $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2960 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/441.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #441 commit 516147499541f7eb2b30f7a8cb4f50ed204469bb Author: Andor Molnar Date: 2018-01-03T14:47:26Z ZOOKEEPER-2960. Fixed usage of dataDir and dataLogDir when instantiating FileTxnSnapLog ---
[GitHub] zookeeper pull request #441: ZOOKEEPER-2960. The dataDir and dataLogDir are ...
Github user anmolnar closed the pull request at: https://github.com/apache/zookeeper/pull/441 ---
[GitHub] zookeeper issue #438: ZOOKEEPER-2959: ignore accepted epoch and LEADERINFO a...
Github user anmolnar commented on the issue: https://github.com/apache/zookeeper/pull/438 Hi @xyq000 Thanks for the contribution. I think fixing this issues makes sense, would you please add at least one unit test to reproduce the problem? ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160243609 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -307,4 +315,104 @@ public void testReloadSnapshotWithMissingParent() throws Exception { startServer(); } + +/** + * Verify that FileTxnIterator doesn't throw an EOFException when the + * transaction log header is incomplete. + */ +@Test +public void testIncompleteHeader() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +int numTransactions = 0; +while (fileItr.next()) { +numTransactions++; +} +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); +Assert.assertTrue("Verify the number of transactions", + numTransactions >= NUM_MESSAGES); + +// Truncate the last log file. +File lastLogFile = logFiles.get(logFiles.size() - 1); +FileChannel channel = new FileOutputStream(lastLogFile).getChannel(); +channel.truncate(0); +channel.close(); + +// This shouldn't thow Exception. +fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +logFiles = fileItr.getStoredFiles(); +numTransactions = 0; +while (fileItr.next()) { +} + +// Verify that the truncated log file does not exist anymore. +Assert.assertFalse("Verify truncated log file has been deleted", + lastLogFile.exists()); +} + +/** + * Verifies that FileTxnIterator throws CorruptedStreamException if the + * magic number is corrupted. + */ +@Test(expected = StreamCorruptedException.class) +public void testCorruptMagicNumber() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); + +// Corrupt the magic number. +File lastLogFile = logFiles.get(logFiles.size() - 1); +RandomAccessFile file = new RandomAccessFile(lastLogFile, "rw"); +file.seek(0); +file.writeByte(123); +file.close(); + +// This should throw CorruptedStreamException. +while (fileItr.next()) { +} +} + +/** + * Starts a standalone server and create znodes. + */ +public void loadDatabase(File dataDir, int numEntries) throws Exception { +final String hostPort = HOST + PortAssignment.unique(); +// setup a single server cluster +ZooKeeperServer zks = new ZooKeeperServer(dataDir, dataDir, 3000); +SyncRequestProcessor.setSnapCount(100); +final int PORT = Integer.parseInt(hostPort.split(":")[1]); +ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1); +f.startup(zks); +Assert.assertTrue("waiting for server being up ", +ClientBase.waitForServerUp(hostPort,CONNECTION_TIMEOUT)); +ZooKeeper zk = ClientBase.createZKClient(hostPort); --- End diff -- Down to this line the logic is already implemented in the base class. Please consider re-using it in your tests. ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160244289 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -38,10 +40,16 @@ import org.slf4j.LoggerFactory; import java.io.File; +import java.io.FileOutputStream; import java.io.IOException; +import java.io.RandomAccessFile; +import java.io.StreamCorruptedException; +import java.nio.channels.FileChannel; +import java.util.List; public class LoadFromLogTest extends ClientBase { private static final int NUM_MESSAGES = 300; +private static final String HOST = "127.0.0.1:"; --- End diff -- It's already in the base class. ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160244385 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -307,4 +315,104 @@ public void testReloadSnapshotWithMissingParent() throws Exception { startServer(); } + +/** + * Verify that FileTxnIterator doesn't throw an EOFException when the + * transaction log header is incomplete. + */ +@Test +public void testIncompleteHeader() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); --- End diff -- Startup / shutdown logic should be in setUp() / tearDown() methods. ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160244035 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -307,4 +315,104 @@ public void testReloadSnapshotWithMissingParent() throws Exception { startServer(); } + +/** + * Verify that FileTxnIterator doesn't throw an EOFException when the + * transaction log header is incomplete. + */ +@Test +public void testIncompleteHeader() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +int numTransactions = 0; +while (fileItr.next()) { +numTransactions++; +} +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); +Assert.assertTrue("Verify the number of transactions", + numTransactions >= NUM_MESSAGES); + +// Truncate the last log file. +File lastLogFile = logFiles.get(logFiles.size() - 1); +FileChannel channel = new FileOutputStream(lastLogFile).getChannel(); +channel.truncate(0); +channel.close(); + +// This shouldn't thow Exception. +fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +logFiles = fileItr.getStoredFiles(); +numTransactions = 0; +while (fileItr.next()) { +} + +// Verify that the truncated log file does not exist anymore. +Assert.assertFalse("Verify truncated log file has been deleted", + lastLogFile.exists()); +} + +/** + * Verifies that FileTxnIterator throws CorruptedStreamException if the + * magic number is corrupted. + */ +@Test(expected = StreamCorruptedException.class) +public void testCorruptMagicNumber() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); + +// Corrupt the magic number. +File lastLogFile = logFiles.get(logFiles.size() - 1); +RandomAccessFile file = new RandomAccessFile(lastLogFile, "rw"); +file.seek(0); +file.writeByte(123); +file.close(); + +// This should throw CorruptedStreamException. +while (fileItr.next()) { +} +} + +/** + * Starts a standalone server and create znodes. + */ +public void loadDatabase(File dataDir, int numEntries) throws Exception { +final String hostPort = HOST + PortAssignment.unique(); +// setup a single server cluster +ZooKeeperServer zks = new ZooKeeperServer(dataDir, dataDir, 3000); +SyncRequestProcessor.setSnapCount(100); +final int PORT = Integer.parseInt(hostPort.split(":")[1]); +ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1); +f.startup(zks); +Assert.assertTrue("waiting for server being up ", +ClientBase.waitForServerUp(hostPort,CONNECTION_TIMEOUT)); +ZooKeeper zk = ClientBase.createZKClient(hostPort); + +// Generate some transactions that will get logged. +try { +for (int i = 0; i < numEntries; i++) { +zk.create("/load-database-" + i, new byte[0], + Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT); +} +} finally { +zk.close(); +} +f.shutdown(); --- End diff -- Starting the server is already implemented in base class' setUp() method and shutdown is already in tearDown(). It's safer to let junit's infrastructure to deal with startup / shutdown methods which makes sure the server(s) gets shutdown even if some error occurs. ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160243418 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -307,4 +315,104 @@ public void testReloadSnapshotWithMissingParent() throws Exception { startServer(); } + +/** + * Verify that FileTxnIterator doesn't throw an EOFException when the + * transaction log header is incomplete. + */ +@Test +public void testIncompleteHeader() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +int numTransactions = 0; +while (fileItr.next()) { +numTransactions++; +} +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); +Assert.assertTrue("Verify the number of transactions", + numTransactions >= NUM_MESSAGES); + +// Truncate the last log file. +File lastLogFile = logFiles.get(logFiles.size() - 1); +FileChannel channel = new FileOutputStream(lastLogFile).getChannel(); +channel.truncate(0); +channel.close(); + +// This shouldn't thow Exception. +fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +logFiles = fileItr.getStoredFiles(); +numTransactions = 0; +while (fileItr.next()) { +} + +// Verify that the truncated log file does not exist anymore. +Assert.assertFalse("Verify truncated log file has been deleted", + lastLogFile.exists()); +} + +/** + * Verifies that FileTxnIterator throws CorruptedStreamException if the + * magic number is corrupted. + */ +@Test(expected = StreamCorruptedException.class) +public void testCorruptMagicNumber() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); + +// Corrupt the magic number. +File lastLogFile = logFiles.get(logFiles.size() - 1); +RandomAccessFile file = new RandomAccessFile(lastLogFile, "rw"); +file.seek(0); +file.writeByte(123); +file.close(); + +// This should throw CorruptedStreamException. +while (fileItr.next()) { +} +} + +/** + * Starts a standalone server and create znodes. + */ +public void loadDatabase(File dataDir, int numEntries) throws Exception { --- End diff -- This method could be private. ---
[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/439#discussion_r160245528 --- Diff: src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java --- @@ -307,4 +315,104 @@ public void testReloadSnapshotWithMissingParent() throws Exception { startServer(); } + +/** + * Verify that FileTxnIterator doesn't throw an EOFException when the + * transaction log header is incomplete. + */ +@Test +public void testIncompleteHeader() throws Exception { +ClientBase.setupTestEnv(); +File dataDir = ClientBase.createTmpDir(); +loadDatabase(dataDir, NUM_MESSAGES); + +File logDir = new File(dataDir, FileTxnSnapLog.version + +FileTxnSnapLog.VERSION); +FileTxnLog.FileTxnIterator fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +List logFiles = fileItr.getStoredFiles(); +int numTransactions = 0; +while (fileItr.next()) { +numTransactions++; +} +Assert.assertTrue("Verify the number of log files", + logFiles.size() > 0); +Assert.assertTrue("Verify the number of transactions", + numTransactions >= NUM_MESSAGES); + +// Truncate the last log file. +File lastLogFile = logFiles.get(logFiles.size() - 1); +FileChannel channel = new FileOutputStream(lastLogFile).getChannel(); +channel.truncate(0); +channel.close(); + +// This shouldn't thow Exception. +fileItr = new FileTxnLog.FileTxnIterator(logDir, 0); +logFiles = fileItr.getStoredFiles(); +numTransactions = 0; --- End diff -- logFiles & numTransactions variables are not being used in the rest of this test. ---
[GitHub] zookeeper pull request #446: ZOOKEEPER-1580:QuorumPeer.setRunning is not use...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/446#discussion_r160366222 --- Diff: src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java --- @@ -1751,7 +1751,7 @@ public synchronized void initConfigInZKDatabase() { if (zkDb != null) zkDb.initConfigInZKDatabase(getQuorumVerifier()); } -public void setRunning(boolean running) { +private void setRunning(boolean running) { --- End diff -- I can't see the point of using a private setter. Set the field directly instead. ---
[GitHub] zookeeper pull request #440: ZOOKEEPER-2939 Deal with maxbuffer as it relate...
Github user anmolnar commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/440#discussion_r160398663 --- Diff: build.xml --- @@ -198,7 +198,7 @@ xmlns:cs="antlib:com.puppycrawl.tools.checkstyle.ant"> - + --- End diff -- Not sure how to get an evidence, but if remember correctly it was dropwizard-core which forced me to use an earlier version of slf4j. ---