[
https://issues.apache.org/jira/browse/CURATOR-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279066#comment-16279066
]
Borja Bravo Alférez commented on CURATOR-444:
---------------------------------------------
Just for reference: I am struggling to create a deterministic and faster test.
But at least I have seen a pattern after injecting connectionStateListeners.
Errors appear 95% of the time with the following sequence.
Client A: Connection SUSPENDED
Client A: Connection RECONNECTED
Client B or A: Connection SUSPENDED
Client A or B: Connection SUSPENDED
Client A: Connection RECONNECTED
Client B: Connection RECONNECTED
This results in two toLeader() events in both clients. I have tried to reduce the
timing to make it faster, but then the errors tend to disappear :(
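To make the failure visible in a test, one option is to record the leadership callbacks of every client in a shared helper and flag any moment at which more than one client believes it is the leader. The following is only a minimal sketch: LeadershipOverlapDetector and the client ids are hypothetical names, not part of Curator, and it assumes each client's LeaderLatchListener forwards its isLeader() and notLeader() callbacks to the helper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical test helper (not part of Curator): records leadership
 * callbacks from several clients and remembers every moment at which
 * more than one client believed it was the leader.
 */
final class LeadershipOverlapDetector {
    // Clients that currently believe they hold leadership, in arrival order.
    private final Set<String> currentLeaders = new LinkedHashSet<>();
    // One entry per detected overlap, e.g. "B,A".
    private final List<String> overlaps = new ArrayList<>();

    /** Assumed to be called from the client's isLeader() callback. */
    synchronized void isLeader(String clientId) {
        currentLeaders.add(clientId);
        if (currentLeaders.size() > 1) {
            overlaps.add(String.join(",", currentLeaders));
        }
    }

    /** Assumed to be called from the client's notLeader() callback. */
    synchronized void notLeader(String clientId) {
        currentLeaders.remove(clientId);
    }

    /** Returns a read-only snapshot of all overlaps seen so far. */
    synchronized List<String> overlaps() {
        return Collections.unmodifiableList(new ArrayList<>(overlaps));
    }
}
```

A test could then assert that overlaps() is empty once both clients have reconnected. A non-empty list would correspond to the simultaneous leadership described above.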
> LeaderLatch sends events that leads to simultaneously leadership after
> blocking zookeeper peer communication
> ------------------------------------------------------------------------------------------------------------
>
> Key: CURATOR-444
> URL: https://issues.apache.org/jira/browse/CURATOR-444
> Project: Apache Curator
> Issue Type: Bug
> Components: Recipes
> Affects Versions: 2.12.0, 4.0.0
> Environment: Linux
> Reporter: Borja Bravo Alférez
> Priority: Critical
> Attachments: blockZookeeperPort.sh
>
>
> How to reproduce the error:
> - 3 Zookeeper nodes.
> - Using LeaderLatch recipe with a listener.
> - We block the zookeeper network with the attached script. It blocks
> communication between zookeepers. Note that communication is lost for only 10
> seconds, which is exactly our default session timeout.
> Our curator configuration:
> - Base sleep 100ms
> - Max sleep 5000ms
> - Default connection timeout 15000ms
> - Default Session timeout 10000ms
> I am working on a pull request. The fix seems trivial, but creating the test
> is not.