[
https://issues.apache.org/jira/browse/ZOOKEEPER-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316819#comment-15316819
]
Martin Kuchta commented on ZOOKEEPER-2355:
------------------------------------------
[~arshad.mohammad]:
I'm looking into fixing this since I'm seeing the same issue. I understand the
reasoning behind your fix, but it seems to be causing some other tests to fail
consistently when applied to trunk. The Jenkins build is too old to view, but
I'm guessing it failed for similar reasons. Were you seeing these failures, and
did you look into what was happening? I've only scratched the surface of
investigating this bug and your patch, but I wanted to check to avoid repeating
any work you have already done. I'll keep investigating to see if I can find a
solution.
Failures listed below:
Zab1_0Test:
{noformat}
Testcase: testNormalFollowerRun took 4.198 sec
FAILED
expected:<4294967297> but was:<4294967296>
junit.framework.AssertionFailedError: expected:<4294967297> but was:<4294967296>
at org.apache.zookeeper.server.quorum.Zab1_0Test$4.converseWithFollower(Zab1_0Test.java:705)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:511)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRun(Zab1_0Test.java:643)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)
Testcase: testNormalFollowerRunWithDiff took 4.073 sec
FAILED
expected:<4294967298> but was:<4294967296>
junit.framework.AssertionFailedError: expected:<4294967298> but was:<4294967296>
at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:847)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:511)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:771)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)
Testcase: testNormalObserverRun took 4.054 sec
FAILED
expected:<4294967298> but was:<4294967296>
junit.framework.AssertionFailedError: expected:<4294967298> but was:<4294967296>
at org.apache.zookeeper.server.quorum.Zab1_0Test$8.converseWithObserver(Zab1_0Test.java:1072)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testObserverConversation(Zab1_0Test.java:562)
at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalObserverRun(Zab1_0Test.java:997)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)
{noformat}
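For reading the numbers in these assertions: assuming they are zxids in the usual layout (leader epoch in the high 32 bits, per-epoch transaction counter in the low 32 bits), the small sketch below splits each value into its two parts. The class and method names are illustrative only and are not taken from the ZooKeeper code base.

```java
public class ZxidDecode {
    // Assumption: the upper 32 bits of a zxid hold the leader epoch.
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Assumption: the lower 32 bits hold the per-epoch transaction counter.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    static String describe(long zxid) {
        return "epoch=" + epochOf(zxid) + " counter=" + counterOf(zxid);
    }

    public static void main(String[] args) {
        // The three distinct values that appear in the failures above.
        long[] values = {4294967296L, 4294967297L, 4294967298L};
        for (long v : values) {
            System.out.println(v + " -> " + describe(v));
        }
    }
}
```

Read this way, all three tests saw epoch 1 with counter 0 where counter 1 or 2 was expected, which would suggest that the transactions the tests expected had not been applied at the point of the assertion.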
ZxidRolloverTest:
{noformat}
Testcase: testRolloverThenFollowerRestart took 23.677 sec
Caused an ERROR
KeeperErrorCode = ConnectionLoss for /foofoofoo-connected
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /foofoofoo-connected
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1846)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1874)
at org.apache.zookeeper.server.ZxidRolloverTest.checkClientConnected(ZxidRolloverTest.java:119)
at org.apache.zookeeper.server.ZxidRolloverTest.checkClientsConnected(ZxidRolloverTest.java:90)
at org.apache.zookeeper.server.ZxidRolloverTest.start(ZxidRolloverTest.java:165)
at org.apache.zookeeper.server.ZxidRolloverTest.testRolloverThenFollowerRestart(ZxidRolloverTest.java:345)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)
{noformat}
WatchEventWhenAutoResetTest:
{noformat}
Testcase: testNodeChildrenChanged took 0.001 sec
Caused an ERROR
Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
{noformat}
> Ephemeral node is never deleted if follower fails while reading the proposal
> packet
> -----------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-2355
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum, server
> Reporter: Arshad Mohammad
> Assignee: Arshad Mohammad
> Priority: Critical
> Fix For: 3.4.9
>
> Attachments: ZOOKEEPER-2355-01.patch, ZOOKEEPER-2355-02.patch
>
>
> A ZooKeeper ephemeral node is never deleted if a follower fails while reading
> the proposal packet.
> The scenario is as follows:
> # Configure a three-node ZooKeeper cluster, say nodes A, B and C, and start
> all of them. Assume A is the leader and B and C are followers.
> # Connect to any of the servers and create ephemeral node /e1.
> # Close the session; ephemeral node /e1 is now due for deletion.
> # While Follower B is receiving the delete proposal, make it fail with
> {{SocketTimeoutException}}. The fault has to be injected to reproduce the
> scenario; in a production environment it happens because of a network fault.
> # Remove the fault and check that the faulted Follower is connected to the
> quorum again.
> # Connect to any of the servers and create the same ephemeral node /e1; the
> create succeeds.
> # Close the session; ephemeral node /e1 is again due for deletion.
> # {color:red}/e1 is not deleted from the faulted Follower B. It should have
> been deleted, as it was created again with another session.{color}
> # {color:green}/e1 is deleted from Leader A and the other Follower C.{color}