[jira] [Created] (ZOOKEEPER-3822) Zookeeper 3.6.1 EndOfStreamException

Sebastian Schmitz (Jira) Thu, 07 May 2020 13:23:26 -0700

Sebastian Schmitz created ZOOKEEPER-3822:
--------------------------------------------


             Summary: Zookeeper 3.6.1 EndOfStreamException
                 Key: ZOOKEEPER-3822
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3822
             Project: ZooKeeper
          Issue Type: Bug
    Affects Versions: 3.6.1
            Reporter: Sebastian Schmitz
         Attachments: zookeeper.log

Hello,

ZooKeeper 3.6.1 fixed the issue where leader election used the IP and so 
failed across separate networks, as in our Docker setup. I therefore updated 
from 3.4.14 to 3.6.1 in our Dev and Test environments. Everything went 
smoothly and ran for one day. Last night the environment was updated again. 
Because we deploy all containers (Kafka, ZooKeeper, MirrorMaker etc.) as one 
package, the ZooKeeper containers are also replaced with the latest ones. In 
this case nothing had changed; the containers were just removed and deployed 
again. As the ZooKeeper config and data are not stored inside the containers, 
that is normally not a problem, but this time it broke the ZooKeeper clusters 
entirely, and so Kafka was down as well.
 * zookeeper_node_1 was stopped and the container removed and created again
 * zookeeper_node_1 starts up and the election takes place
 * zookeeper_node_2 is elected as leader again
 * zookeeper_node_2 is stopped and the container removed and created again
 * zookeeper_node_3 is elected as the leader while zookeeper_node_2 is down
 * zookeeper_node_2 starts up and zookeeper_node_3 remains leader
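
The role each node reports after every step of this sequence could be checked 
with a small script. This is only a minimal sketch in Python, assuming the 
`srvr` four-letter command is permitted by `4lw.commands.whitelist` and that 
the nodes listen on client port 2181. The port and the helper names 
(`four_letter_word`, `parse_mode`, `cluster_roles`) are assumptions for 
illustration and are not taken from this report.

```python
import socket


def four_letter_word(host, port, command="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_reply):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def cluster_roles(nodes, port=2181):
    """Map each hostname to its reported mode, or 'unreachable' on socket errors."""
    roles = {}
    for host in nodes:
        try:
            roles[host] = parse_mode(four_letter_word(host, port))
        except OSError:
            roles[host] = "unreachable"
    return roles
```

Calling `cluster_roles(["zookeeper_node_1", "zookeeper_node_2", 
"zookeeper_node_3"])` after each container is recreated would show whether 
every node has rejoined as leader or follower before the next one is stopped. 
A node that answers but is not serving requests yields `None`.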

And from there all servers just report

2020-05-07 14:07:57,187 [myid:3] - WARN  [NIOWorkerThread-2:NIOServerCnxn@364] - Unexpected exception
EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /z.z.z.z:46060, session = 0x2014386bbde0000
    at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163)
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326)
    at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
    at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

and they do not recover.

I was able to recover the cluster in Test-Environment by stopping and starting 
all the zookeeper-nodes. The cluster in dev is still in that state and I'm 
checking the logs to find out more...

The full log of the deployment, which started at 02:00, is attached. The first 
timestamp is local NZ time and the second is UTC. I replaced the IPs with 
x.x.x.x for node_1, y.y.y.y for node_2 and z.z.z.z for node_3.

The Kafka servers are running on the same machines. This means that the 
EndOfStreamExceptions could also come from Kafka connections, as I don't think 
that zookeeper_node_3 would establish a session with itself?
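
One way to narrow this down is to group the warnings by client address and 
session id. The following is a minimal sketch in Python, assuming each 
EndOfStreamException message sits on a single line of zookeeper.log in the 
format quoted above. `count_closed_clients` is a hypothetical helper name, not 
part of ZooKeeper.

```python
import re
from collections import Counter

# Matches the message text quoted in this report, e.g.
# "EndOfStreamException: ... address = /z.z.z.z:46060, session = 0x2014386bbde0000"
PATTERN = re.compile(
    r"EndOfStreamException: .*?address = /(?P<host>[^:\s]+):(?P<port>\d+), "
    r"session = (?P<session>0x[0-9a-fA-F]+)"
)


def count_closed_clients(log_text):
    """Count EndOfStreamException warnings per (client host, session id)."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(match.group("host"), match.group("session"))] += 1
    return counts
```

Running it as `count_closed_clients(open("zookeeper.log").read())` gives one 
counter entry per client host and session. The host alone cannot tell Kafka and 
ZooKeeper apart when both run on the same machine, but the session ids can be 
compared with the sessions the Kafka brokers log on their side.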

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
