Antoine DESSAIGNE created ZOOKEEPER-3725:
--------------------------------------------

             Summary: Zookeeper fails to establish quorum with 2 servers using 3.5.6
                 Key: ZOOKEEPER-3725
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3725
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.5.6
            Reporter: Antoine DESSAIGNE
         Attachments: failure-3.5.6.txt, success-3.4.14.txt, success-3.5.6.txt

Hello everyone,

We noticed that ZooKeeper 3.5.6 regularly fails to establish quorum on a new 
deployment (approximately 50% of the time).

We reduced the reproduction steps to the bare minimum. 
Consider the following docker-compose.yml file:
{noformat}
version: '2'
services:
  orchestrator1.cameltest.int:
    image: zookeeper:3.5.6
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888 server.2=orchestrator2.cameltest.int:2888:3888
  orchestrator2.cameltest.int:
    image: zookeeper:3.5.6
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=orchestrator1.cameltest.int:2888:3888 server.2=0.0.0.0:2888:3888
{noformat}
When launched with {{docker-compose up}}, it fails to establish quorum about 
half of the time with 3.5.6 and never with 3.4.14.
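
To tell a successful run from a failed one without reading the logs, one option is to ask each server for its role. The following is a minimal Python sketch under several assumptions: the client port is ZooKeeper's default 2181 (the compose file above does not publish it, so the script would have to run from a container on the same network), the {{srvr}} four-letter command is allowed on the servers, and its reply contains a {{Mode:}} line once the server is serving. The function names {{query_srvr}}, {{parse_mode}} and {{quorum_established}} are hypothetical helpers, not part of ZooKeeper.

```python
import socket


def query_srvr(host, port=2181, timeout=5.0):
    """Send the 'srvr' four-letter command and return the raw text reply.

    Port 2181 is assumed (ZooKeeper's default client port).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the 'Mode:' line, or None if there is none.

    A reply without a 'Mode:' line is treated as "not serving".
    """
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def quorum_established(modes):
    """True when exactly one server is leader and all others are followers."""
    return (
        modes.count("leader") == 1
        and modes.count("follower") == len(modes) - 1
    )
```

With the two hosts from the compose file, a check could look like {{quorum_established([parse_mode(query_srvr(h)) for h in ("orchestrator1.cameltest.int", "orchestrator2.cameltest.int")])}}. A connection error or a missing {{Mode:}} line would both indicate that the server is not serving yet, so a real check would probably retry for a while before declaring failure.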

You'll find attached 3 logs:
* a failure one using 3.5.6
* a success one using 3.5.6
* a success one using 3.4.14

I don't think it's related to a docker/docker-compose issue, as it works 
with 3.4.14 on the same server.

I'll try to check each intermediate release to pinpoint the first affected version.

Unfortunately, I don't know my way around the ZooKeeper code yet. What can I do 
to help? Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
