Hi,
I have noticed the following pattern in our cluster today:
The leader reports:
> Unexpected exception causing shutdown while sock still open
> java.io.EOFException
> ******* GOODBYE ... ********
All the other servers report:
> Exception when following the leader
> java.net.SocketTimeoutException: Read timed out
This is a five-node cluster running ZK 3.3.3 (yes, it's very old, sorry).
It all happened within a second across the whole cluster. Does that
sound like a network issue?
As a side effect, the database got corrupted as well. Does anybody know if
this is a known issue in 3.3.3? I checked the release notes and JIRA
tickets but didn't find anything that looks like the pattern we saw.
-Gunnar
--
Gunnar Wagenknecht
[email protected]
http://wagenknecht.org/