[ https://issues.apache.org/jira/browse/ZOOKEEPER-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dharani updated ZOOKEEPER-4878:
-------------------------------
    Description: 
We are running ZooKeeper in Kubernetes as a StatefulSet with 3 replicas. When we 
performed a Chaos Mesh IO fault experiment, the ZooKeeper servers did not recover.
{code:java}
2024-10-24T09:43:40.896+0000 [myid:] - ERROR [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=[0:0:0:0:0:0:0:0]:2281):o.a.z.s.ZooKeeperServer@552] - Severe unrecoverable error, exiting
java.io.FileNotFoundException: /var/lib/zookeeper/data/version-2/snapshot.1100000859 (Input/output error)
        at java.base/java.io.FileOutputStream.open0(Native Method)
        at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
        at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:187)
        at org.apache.zookeeper.server.persistence.SnapStream.getOutputStream(SnapStream.java:133)
        at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:242)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:481)
        at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:550)
        at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:544)
        at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:540)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:597)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1552)
2024-10-24T09:43:40.898+0000 [myid:] - ERROR [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=[0:0:0:0:0:0:0:0]:2281):o.a.z.u.ServiceUtils@48] - Exiting JVM with code 10
{code}
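
The trace shows the failure happening when FileOutputStream opens the new snapshot file; Java reports any open failure, including an I/O error from the volume, as a FileNotFoundException. As a rough sketch for checking whether the data directory accepts new files again once the fault is lifted, something like the probe below could be run inside the pod. The class SnapshotDirProbe and its probe method are hypothetical helpers written for this report, not part of ZooKeeper.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical diagnostic helper, not a ZooKeeper class.
class SnapshotDirProbe {

    /**
     * Tries to create, write, and sync a temporary file in dir.
     * Returns null on success, otherwise a short description of the failure.
     */
    static String probe(File dir) {
        File f = new File(dir, "probe-" + System.nanoTime() + ".tmp");
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(0);
            out.getFD().sync(); // force the byte to the underlying device
            return null;
        } catch (FileNotFoundException e) {
            // Thrown when the file cannot be opened for any reason,
            // which is the same path the stack trace above goes through.
            return "open failed: " + e.getMessage();
        } catch (IOException e) {
            return "write failed: " + e.getMessage();
        } finally {
            f.delete(); // best-effort cleanup; ignored if the file was never created
        }
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String failure = probe(dir);
        System.out.println(failure == null
                ? "OK: " + dir + " accepts new files"
                : "FAILED: " + failure);
    }
}
```

Passing the snapshot directory from the log (/var/lib/zookeeper/data/version-2) as the argument would indicate whether the volume is still returning errors after the experiment ends.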
 

  was:
We are running ZooKeeper in Kubernetes as a StatefulSet with 3 replicas. When we 
performed a Chaos Mesh IO fault experiment, the ZooKeeper servers did not recover.

"[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=[0:0:0:0:0:0:0:0]:2281):o.a.z.s.ZooKeeperServer@552] - Severe unrecoverable error, exiting"

java.io.FileNotFoundException: /var/lib/zookeeper/data/version-2/snapshot.400000ed9 (Input/output error)


> Zookeeper servers not running after Chaos mesh IO fault experiment
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4878
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4878
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.8.3
>            Reporter: Dharani
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
