Stéphane Loeuillet created ZOOKEEPER-4783:
---------------------------------------------
Summary: leader crash because of zxid 32b rollover but no other
server takes the lead
Key: ZOOKEEPER-4783
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4783
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.8.3
Environment: Linux amd64 Ubuntu 20.04.5
Java OpenJDK17U-jre_x64_linux_hotspot_17.0.8.1_1.tar.gz
Reporter: Stéphane Loeuillet
Attachments: zookeeper_crash.log
I have a 5-node ZooKeeper cluster running on bare-metal servers (with NVMe),
used by a separate ClickHouse cluster.
This morning, a crash of the leader left my clusters unusable: after the
leader crashed, none of the 4 followers took the lead.
The ZooKeeper leader was zookeeper08.
zookeeper05, zookeeper06, zookeeper07 and zookeeper09 were the followers.
Only restarting the zookeeper05 process unfroze the whole cluster.
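
For context on the rollover named in the summary: as far as I understand it, a zxid is a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a per-epoch transaction counter, so once the counter is exhausted a new epoch (and therefore a new election) is needed. Below is a minimal sketch of that layout. The class and method names are hypothetical and only illustrate the arithmetic; this is not ZooKeeper's own code.

```java
public class ZxidSketch {
    // Assumption: the low 32 bits of a zxid are the per-epoch counter.
    static final long COUNTER_MASK = 0xffffffffL;

    // Assumption: the high 32 bits of a zxid are the leader epoch.
    static long epoch(long zxid) {
        return zxid >>> 32;
    }

    static long counter(long zxid) {
        return zxid & COUNTER_MASK;
    }

    // Combine an epoch and a counter into one 64-bit zxid.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & COUNTER_MASK);
    }

    // True when the counter has no room left within the current epoch.
    static boolean counterExhausted(long zxid) {
        return counter(zxid) == COUNTER_MASK;
    }

    public static void main(String[] args) {
        // Hypothetical values: epoch 5 with a fully used counter.
        long last = makeZxid(5, COUNTER_MASK);
        System.out.printf("zxid=0x%x epoch=%d counter=0x%x exhausted=%b%n",
                last, epoch(last), counter(last), counterExhausted(last));

        // After a successful election the next zxid would start a new epoch at counter 0.
        long next = makeZxid(epoch(last) + 1, 0);
        System.out.printf("zxid=0x%x epoch=%d counter=0x%x exhausted=%b%n",
                next, epoch(next), counter(next), counterExhausted(next));
    }
}
```

In the incident above, the step from the exhausted zxid to the first zxid of the next epoch is the one that apparently never completed until zookeeper05 was restarted.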
--
This message was sent by Atlassian Jira
(v8.20.10#820010)