[ https://issues.apache.org/jira/browse/ZOOKEEPER-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881320#comment-17881320 ]
Margaret Figura edited comment on ZOOKEEPER-4444 at 9/12/24 2:30 PM: --------------------------------------------------------------------- Hi, We seem to have also run into the same issue. ZooKeeper version is 3.8.1, and it is running on Kubernetes using the Bitnami chart. It is a 3 broker ensemble. The issue persisted until we deleted the data (PVC) of the affected broker. One thing that is odd... Unfortunately I'm unable to access the system to check current status, but the timestamps on the logs we collected are far from current time (8 months ago). I noticed [~fshlin]'s logs also have timestamps that don't match up with the time he posted his comment (by about 6 months). However, [~dbros]'s logs are at least close to the time he posted. So _possibly_ it has something to do with time synchronization across the nodes. Thank you! Our logs: {code:java} [38;5;6mzookeeper [38;5;5m02:45:38.53 [0m [38;5;6mzookeeper [38;5;5m02:45:38.53 [0m[1mWelcome to the Bitnami zookeeper container[0m [38;5;6mzookeeper [38;5;5m02:45:38.53 [0mSubscribe to project updates by watching [1mhttps://github.com/bitnami/containers[0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0mSubmit issues and feature requests at [1mhttps://github.com/bitnami/containers/issues[0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0m[38;5;2mINFO [0m ==> ** Starting ZooKeeper setup ** [38;5;6mzookeeper [38;5;5m02:45:38.59 [0m[38;5;3mWARN [0m ==> You have set the environment variable ALLOW_ANONYMOUS_LOGIN=yes. For safety reasons, do not use this flag in a production environment. [38;5;6mzookeeper [38;5;5m02:45:38.61 [0m[38;5;2mINFO [0m ==> Initializing ZooKeeper... [38;5;6mzookeeper [38;5;5m02:45:38.62 [0m[38;5;2mINFO [0m ==> No injected configuration file found, creating default config files... [38;5;6mzookeeper [38;5;5m02:45:38.68 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-0.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 1 [38;5;6mzookeeper [38;5;5m02:45:38.69 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-1.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 2 [38;5;6mzookeeper [38;5;5m02:45:38.69 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-2.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 3 [38;5;6mzookeeper [38;5;5m02:45:38.72 [0m[38;5;2mINFO [0m ==> Enabling Prometheus metrics... [38;5;6mzookeeper [38;5;5m02:45:38.75 [0m[38;5;2mINFO [0m ==> Deploying ZooKeeper with persisted data... [38;5;6mzookeeper [38;5;5m02:45:38.75 [0m[38;5;2mINFO [0m ==> ** ZooKeeper setup finished! **[38;5;6mzookeeper [38;5;5m02:45:38.77 [0m[38;5;2mINFO [0m ==> ** Starting ZooKeeper ** /opt/bitnami/java/bin/java ZooKeeper JMX enabled by default Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg 2024-01-11 02:45:39,746 [myid:1] - ERROR [main:o.a.z.s.DataTree@1807] - First digest mismatch on txn: 74196379424522240,3141,64424510482,1701895146384,14 , v{s{13,#000112f636f6e74726f6c6c65725f65706f636800022},s{5,#0003d2f62726f6b6572732f746f706963732f4f64735f4564725f736d70705f736d735f7864725f696e6465782f706172746974696f6e732f312f73746174650004b7b22636f6e74726f6c6c65725f65706f6368223a33332c226c6561646572223a312c2276657273696f6e223a312c226c65616465725f65706f6368223a3235312c22697372223a5b315d7d00179}} , expected digest is 2,1621536443130 , actual digest is 1632889486483, 2024-01-11 02:45:39,755 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510482(next log) for type 14 2024-01-11 02:45:39,755 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510483(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510484(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510485(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510486(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510487(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510488(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510489(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510490(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510491(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510492(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510493(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510494(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510495(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510496(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510497(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510498(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510499(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510500(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510501(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510502(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510503(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510504(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510505(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510506(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510507(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510508(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510509(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510510(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510511(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510512(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510513(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510514(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510515(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510516(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510517(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510518(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510519(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510520(next log) for type 14 {code} was (Author: mfigura): Hi, We seem to have also run into the same issue. ZooKeeper version is 3.8.1, and it is running on Kubernetes using the Bitnami chart. It is a 3 broker ensemble. The issue persisted until we deleted the data (PVC) of the affected broker. One thing that is odd... Unfortunately I'm unable to access the system to check current status, but the timestamps on the logs we collected are far from current time (8 months ago). I noticed [~fshlin]'s logs also have timestamps that don't match up with the time he posted his comment (by about 6 months). However, [~dbros]'s logs are at least close to the time he posted. Thank you! Our logs: {code:java} [38;5;6mzookeeper [38;5;5m02:45:38.53 [0m [38;5;6mzookeeper [38;5;5m02:45:38.53 [0m[1mWelcome to the Bitnami zookeeper container[0m [38;5;6mzookeeper [38;5;5m02:45:38.53 [0mSubscribe to project updates by watching [1mhttps://github.com/bitnami/containers[0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0mSubmit issues and feature requests at [1mhttps://github.com/bitnami/containers/issues[0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0m [38;5;6mzookeeper [38;5;5m02:45:38.54 [0m[38;5;2mINFO [0m ==> ** Starting ZooKeeper setup ** [38;5;6mzookeeper [38;5;5m02:45:38.59 [0m[38;5;3mWARN [0m ==> You have set the environment variable ALLOW_ANONYMOUS_LOGIN=yes. For safety reasons, do not use this flag in a production environment. [38;5;6mzookeeper [38;5;5m02:45:38.61 [0m[38;5;2mINFO [0m ==> Initializing ZooKeeper... [38;5;6mzookeeper [38;5;5m02:45:38.62 [0m[38;5;2mINFO [0m ==> No injected configuration file found, creating default config files... [38;5;6mzookeeper [38;5;5m02:45:38.68 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-0.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 1 [38;5;6mzookeeper [38;5;5m02:45:38.69 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-1.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 2 [38;5;6mzookeeper [38;5;5m02:45:38.69 [0m[38;5;2mINFO [0m ==> Adding server: kafka-zookeeper-2.kafka-zookeeper-headless.empirix-cloud.svc.cluster.local:2888:3888 with id: 3 [38;5;6mzookeeper [38;5;5m02:45:38.72 [0m[38;5;2mINFO [0m ==> Enabling Prometheus metrics... [38;5;6mzookeeper [38;5;5m02:45:38.75 [0m[38;5;2mINFO [0m ==> Deploying ZooKeeper with persisted data... [38;5;6mzookeeper [38;5;5m02:45:38.75 [0m[38;5;2mINFO [0m ==> ** ZooKeeper setup finished! **[38;5;6mzookeeper [38;5;5m02:45:38.77 [0m[38;5;2mINFO [0m ==> ** Starting ZooKeeper ** /opt/bitnami/java/bin/java ZooKeeper JMX enabled by default Using config: /opt/bitnami/zookeeper/bin/../conf/zoo.cfg 2024-01-11 02:45:39,746 [myid:1] - ERROR [main:o.a.z.s.DataTree@1807] - First digest mismatch on txn: 74196379424522240,3141,64424510482,1701895146384,14 , v{s{13,#000112f636f6e74726f6c6c65725f65706f636800022},s{5,#0003d2f62726f6b6572732f746f706963732f4f64735f4564725f736d70705f736d735f7864725f696e6465782f706172746974696f6e732f312f73746174650004b7b22636f6e74726f6c6c65725f65706f6368223a33332c226c6561646572223a312c2276657273696f6e223a312c226c65616465725f65706f6368223a3235312c22697372223a5b315d7d00179}} , expected digest is 2,1621536443130 , actual digest is 1632889486483, 2024-01-11 02:45:39,755 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510482(next log) for type 14 2024-01-11 02:45:39,755 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510483(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510484(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510485(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510486(next log) for type 14 2024-01-11 02:45:39,756 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510487(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510488(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510489(next log) for type 14 2024-01-11 02:45:39,757 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510490(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510491(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510492(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510493(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510494(next log) for type 14 2024-01-11 02:45:39,758 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510495(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510496(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510497(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510498(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510499(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510500(next log) for type 14 2024-01-11 02:45:39,759 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510501(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510502(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510503(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510504(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510505(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510506(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510507(next log) for type 14 2024-01-11 02:45:39,760 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510508(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510509(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510510(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510511(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510512(next log) for type 14 2024-01-11 02:45:39,761 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510513(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510514(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510515(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510516(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510517(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510518(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510519(next log) for type 14 2024-01-11 02:45:39,762 [myid:1] - ERROR [main:o.a.z.s.p.FileTxnSnapLog@345] - 64424510521(highestZxid) > 64424510520(next log) for type 14 {code} > Follower doesn't get synchronized after process restart > ------------------------------------------------------- > > Key: ZOOKEEPER-4444 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4444 > Project: ZooKeeper > Issue Type: Bug > Affects Versions: 3.6.3 > Reporter: Andreas Weber > Priority: Major > > Hi folks, I've got an issue with 3.6.3. > I can't provide a simple test, because it seems to depend on timing in a > cluster environment, but I tried to reduce the scenario as far as possible: > * zookeeper cluster with 5 nodes, all of them Followers (no Observers) > * start some parallel threads which do some writes in the cluster (e.g. > create/delete znodes) > * kill one of the zookeeper processes, let's say on node X (where node X is > not the Leader) > * restart zookeeper process on node X > * wait a few seconds > * kill zookeeper process on node X again > * restart zookeeper process on node X again > In some cases (every 3-4 runs) I see the following in the log of node X: > After first restart of node X: > {noformat} > WARN persistence.FileTxnLog - Current zxid 4294968525 is <= > 4294969524 for 15 > WARN persistence.FileTxnLog - Current zxid 4294968526 is <= > 4294969524 for 15 > WARN persistence.FileTxnLog - Current zxid 4294968527 is <= > 4294969524 for 15 > ... (this kind of WARN is repeated some hundred times) > WARN quorum.SendAckRequestProcessor - Closing connection to leader, > exception during packet send java.net.SocketException: Socket closed ... > ... (this kind of WARN is repeated some hundred times) > {noformat} > After second restart of node X: > {noformat} > ERROR persistence.FileTxnSnapLog - 4294970146(highestZxid) > > 4294969147(next log) for type 2 > WARN server.DataTree - Message:Digests are not matching. > Value is Zxid. Value:4294969147 > ERROR server.DataTree - First digest mismatch on txn: > 360466402305310720,3870,4294969147,1639258399998,2 > , ... > , expected digest is 2,1365261838770 > , actual digest is 1098406565142, > ERROR persistence.FileTxnSnapLog - 4294970146(highestZxid) > > 4294969148(next log) for type 2 > ERROR persistence.FileTxnSnapLog - 4294970146(highestZxid) > > 4294969149(next log) for type 5 > ERROR persistence.FileTxnSnapLog - 4294970146(highestZxid) > > 4294969150(next log) for type 2 > ... (this kind of ERROR is repeated some hundred times) > {noformat} > And afterwards (in the actual application), zookepeer on node X seems to have > a different view of the cluster state and doesn't get synchronized, at least > for a few hours. > This e.g. leads to phantom reads of znodes that were deleted a long time ago. > (The resulting behaviour looks a little bit similar as described in > ZOOKEEPER-3911.) > This does not happen with zookeeper 3.6.2 ! > (at least I can't reproduce it with this version) -- This message was sent by Atlassian Jira (v8.20.10#820010)