[jira] [Updated] (ZOOKEEPER-4781) ZooKeeper not starting because the accepted epoch is less than the current epoch.

zhanglu153 (Jira) Thu, 21 Dec 2023 04:12:04 -0800


     [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


zhanglu153 updated ZOOKEEPER-4781:
----------------------------------
    Description: 
This issue occurred in our abnormal testing environment, where the disk was 
injected with anomalies and frequently filled up.

The the scenario is as follows:
 * Configure three node ZooKeeper cluster, lets say nodes are A, B and C.
 * Start the cluster, and node C becomes the leader.
 * The disk of Node C is injected with a full disk exception.
 * Node C called the org.apache.zookeeper.server.quorum.Leader#lead method.

{code:java}
void lead() throws IOException, InterruptedException {
    self.end_fle = Time.currentElapsedTime();
    long electionTimeTaken = self.end_fle - self.start_fle;
    self.setElectionTimeTaken(electionTimeTaken);
    LOG.info("LEADING - LEADER ELECTION TOOK - {}", electionTimeTaken);
    self.start_fle = 0;
    self.end_fle = 0;

    zk.registerJMX(new LeaderBean(this, zk), self.jmxLocalPeerBean);

    try {
        self.tick.set(0);
        zk.loadData();
        
        leaderStateSummary = new StateSummary(self.getCurrentEpoch(), 
zk.getLastProcessedZxid());

        // Start thread that waits for connection requests from 
        // new followers.
        cnxAcceptor = new LearnerCnxAcceptor();
        cnxAcceptor.start();
        
        readyToStart = true;
        long epoch = getEpochToPropose(self.getId(), self.getAcceptedEpoch());
        
        zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
        
        synchronized(this){
            lastProposed = zk.getZxid();
        }
        
        newLeaderProposal.packet = new QuorumPacket(NEWLEADER, zk.getZxid(),
                null, null);


        if ((newLeaderProposal.packet.getZxid() & 0xffffffffL) != 0) {
            LOG.info("NEWLEADER proposal has Zxid of "
                    + Long.toHexString(newLeaderProposal.packet.getZxid()));
        }
        
        waitForEpochAck(self.getId(), leaderStateSummary);
        self.setCurrentEpoch(epoch);
... {code}
 * Node C, as the leader, will start the LearnerCnxAcceptor thread, and then 
call the org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method. At 
this time, the value of waitingForNewEpoch is true, and the size of 
connectingFollowers is not greater than n/2. Node C directly calls 
connectingFollowers.wait to wait. The maximum waiting time is 
self.getInitLimit()*self.getTickTime() ms.

{code:java}
public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
InterruptedException, IOException {
    synchronized(connectingFollowers) {
        if (!waitingForNewEpoch) {
            return epoch;
        }
        if (lastAcceptedEpoch >= epoch) {
            epoch = lastAcceptedEpoch+1;
        }
        if (isParticipant(sid)) {
            connectingFollowers.add(sid);
        }
        QuorumVerifier verifier = self.getQuorumVerifier();
        if (connectingFollowers.contains(self.getId()) && 
                                        
verifier.containsQuorum(connectingFollowers)) {
            self.setAcceptedEpoch(epoch);
            waitingForNewEpoch = false;
            connectingFollowers.notifyAll();
        } else {
            long start = Time.currentElapsedTime();
            long cur = start;
            long end = start + self.getInitLimit()*self.getTickTime();
            while(waitingForNewEpoch && cur < end) {
                connectingFollowers.wait(end - cur);
                cur = Time.currentElapsedTime();
            }
            if (waitingForNewEpoch) {
                throw new InterruptedException("Timeout while waiting for epoch 
from quorum");        
            }
        }
        return epoch;
    }
}{code}
 * Node B connects to the 2888 communication port of node C and starts a new 
LeanerHandler thread.
 * Node A connects to the 2888 communication port of node C and starts a new 
LeanerHandler thread.
 * After node B connects to node C, call the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
LearnerHandler thread.At this point, the value of waitingForNewEpoch is true, 
and the size of connectingFollowers is greater than n/2. Then, set the value of 
waitingForNewEpoch to false. Due to the disk of node C being full, calling 
setAcceptedEpoch to write the acceptedEpoch value failed with an IO exception. 
Node C fails to update the acceptedEpoch file and did not successfully call the 
connectingFollowers.notifyAll() method. This will cause node C to wait at 
connectingFollowers.wait, with a maximum wait of 
self.getInitLimit()*self.getTickTime() ms. Due to an IO exception thrown, the 
socket connection between node B and node C was disconnected, and the 
LeanerHandler thread exited.

{code:java}
2023-12-15 15:08:35,572 [myid:3] - ERROR 
[LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - Unexpected 
exception causing shutdown while sock still open2023-12-15 15:08:35,572 
[myid:3] - ERROR [LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - 
Unexpected exception causing shutdown while sock still 
openjava.io.FileNotFoundException: 
/cloud/data/zookeeper/data/version-2/acceptedEpoch.tmp (No space left on 
device) at java.io.FileOutputStream.open0(Native Method) at 
java.io.FileOutputStream.open(FileOutputStream.java:270) at 
java.io.FileOutputStream.<init>(FileOutputStream.java:213) at 
java.io.FileOutputStream.<init>(FileOutputStream.java:162) at 
org.apache.zookeeper.common.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:59)
 at 
org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:1393)
 at 
org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:1437)
 at 
org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:885) at 
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)  
{code}
 * After node A connects to node C, the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method is called in 
the LearnerHandler thread. At this time, the value of waitingForNewEpoch is 
false, and epoch is returned directly.
 * Node B reconnects to the 2888 cluster communication port of node C and 
starts a new LeanerHandler thread. Call the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
LearnerHandler thread. At this time, the value of waitingForNewEpoch is false, 
and epoch is returned directly.
 * After node A connects to node C, the LeanerHandler thread will then call the 
org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
point, selectingFollowers does not include node C, Call the 
selectingFollowers.wait method to wait.
 * After node B connects to node C, the LeanerHandler thread will then call the 
org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
point, selectingFollowers does not include node C, Call the 
selectingFollowers.wait method to wait.

{code:java}
public void waitForEpochAck(long id, StateSummary ss) throws IOException, 
InterruptedException {
    synchronized(electingFollowers) {
        if (electionFinished) {
            return;
        }
        if (ss.getCurrentEpoch() != -1) {
            if (ss.isMoreRecentThan(leaderStateSummary)) {
                throw new IOException("Follower is ahead of the leader, leader 
summary: " 
                                                + 
leaderStateSummary.getCurrentEpoch()
                                                + " (current epoch), "
                                                + 
leaderStateSummary.getLastZxid()
                                                + " (last zxid)");
            }
            if (isParticipant(id)) {
                electingFollowers.add(id);
            }
        }
        QuorumVerifier verifier = self.getQuorumVerifier();
        if (electingFollowers.contains(self.getId()) && 
verifier.containsQuorum(electingFollowers)) {
            electionFinished = true;
            electingFollowers.notifyAll();
        } else {                
            long start = Time.currentElapsedTime();
            long cur = start;
            long end = start + self.getInitLimit()*self.getTickTime();
            while(!electionFinished && cur < end) {
                electingFollowers.wait(end - cur);
                cur = Time.currentElapsedTime();
            }
            if (!electionFinished) {
                throw new InterruptedException("Timeout while waiting for epoch 
to be acked by quorum");
            }
        }
    }
} {code}
 * Node C waits for self.getInitLimit()*self.getTickTime() ms in the 
connectingFollowers.wait method. However, since the value of waitingForNewEpoch 
is false, it does not throw a timeout exception and returns the epoch directly.
 * Node C calls the org.apache.zookeeper.server.quorum.Leader#waitForEpochAck 
method. At this time, selectingFollowers contains node C and the size of 
selectingFollowers is greater than n/2. Then call the 
selectingFollowers.notifyAll() method to release the lock.
 * The disk space of node C has been partially released, so subsequent calls to 
setCurrentEpoch have successfully updated the currentEpoch file.
 * The cluster provides services normally for a period of time.
 * Stop C.
 * Start C, bellow exception with message "The accepted epoch, c is less than 
the current epoch, d" is thrown.

{code:java}
2023-12-15 15:57:00,051 [myid:3] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: The accepted epoch, c is less than the current epoch, d
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2023-12-15 15:57:00,132 [myid:3] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: The accepted epoch, c is less than the current 
epoch, d
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695) 
{code}

  was:
This issue occurred in our abnormal testing environment, where the disk was 
injected with anomalies and frequently filled up.

The the scenario is as follows:
 # Configure three node ZooKeeper cluster, lets say nodes are A, B and C.
 # Start the cluster, and node C becomes the leader.
 # The disk of Node C is injected with a full disk exception.
 # Node C called the org.apache.zookeeper.server.quorum.Leader#lead method.
{code:java}
void lead() throws IOException, InterruptedException {
    self.end_fle = Time.currentElapsedTime();
    long electionTimeTaken = self.end_fle - self.start_fle;
    self.setElectionTimeTaken(electionTimeTaken);
    LOG.info("LEADING - LEADER ELECTION TOOK - {}", electionTimeTaken);
    self.start_fle = 0;
    self.end_fle = 0;

    zk.registerJMX(new LeaderBean(this, zk), self.jmxLocalPeerBean);

    try {
        self.tick.set(0);
        zk.loadData();
        
        leaderStateSummary = new StateSummary(self.getCurrentEpoch(), 
zk.getLastProcessedZxid());

        // Start thread that waits for connection requests from 
        // new followers.
        cnxAcceptor = new LearnerCnxAcceptor();
        cnxAcceptor.start();
        
        readyToStart = true;
        long epoch = getEpochToPropose(self.getId(), self.getAcceptedEpoch());
        
        zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
        
        synchronized(this){
            lastProposed = zk.getZxid();
        }
        
        newLeaderProposal.packet = new QuorumPacket(NEWLEADER, zk.getZxid(),
                null, null);


        if ((newLeaderProposal.packet.getZxid() & 0xffffffffL) != 0) {
            LOG.info("NEWLEADER proposal has Zxid of "
                    + Long.toHexString(newLeaderProposal.packet.getZxid()));
        }
        
        waitForEpochAck(self.getId(), leaderStateSummary);
        self.setCurrentEpoch(epoch);
... {code}

 # Node C, as the leader, will start the LearnerCnxAcceptor thread, and then 
call the org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method. At 
this time, the value of waitingForNewEpoch is true, and the size of 
connectingFollowers is not greater than n/2. Node C directly calls 
connectingFollowers.wait to wait. The maximum waiting time is 
self.getInitLimit()*self.getTickTime() ms.
{code:java}
public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
InterruptedException, IOException {
    synchronized(connectingFollowers) {
        if (!waitingForNewEpoch) {
            return epoch;
        }
        if (lastAcceptedEpoch >= epoch) {
            epoch = lastAcceptedEpoch+1;
        }
        if (isParticipant(sid)) {
            connectingFollowers.add(sid);
        }
        QuorumVerifier verifier = self.getQuorumVerifier();
        if (connectingFollowers.contains(self.getId()) && 
                                        
verifier.containsQuorum(connectingFollowers)) {
            self.setAcceptedEpoch(epoch);
            waitingForNewEpoch = false;
            connectingFollowers.notifyAll();
        } else {
            long start = Time.currentElapsedTime();
            long cur = start;
            long end = start + self.getInitLimit()*self.getTickTime();
            while(waitingForNewEpoch && cur < end) {
                connectingFollowers.wait(end - cur);
                cur = Time.currentElapsedTime();
            }
            if (waitingForNewEpoch) {
                throw new InterruptedException("Timeout while waiting for epoch 
from quorum");        
            }
        }
        return epoch;
    }
}{code}

 # Node B connects to the 2888 communication port of node C and starts a new 
LeanerHandler thread. 
 # Node A connects to the 2888 communication port of node C and starts a new 
LeanerHandler thread.
 # After node B connects to node C, call the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
LearnerHandler thread.At this point, the value of waitingForNewEpoch is true, 
and the size of connectingFollowers is greater than n/2. Then, set the value of 
waitingForNewEpoch to false. Due to the disk of node C being full, calling 
setAcceptedEpoch to write the acceptedEpoch value failed with an IO exception. 
Node C fails to update the acceptedEpoch file and did not successfully call the 
connectingFollowers.notifyAll() method. This will cause node C to wait at 
connectingFollowers.wait, with a maximum wait of 
self.getInitLimit()*self.getTickTime() ms. Due to an IO exception thrown, the 
socket connection between node B and node C was disconnected, and the 
LeanerHandler thread exited.
{code:java}
2023-12-15 15:08:35,572 [myid:3] - ERROR 
[LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - Unexpected 
exception causing shutdown while sock still open2023-12-15 15:08:35,572 
[myid:3] - ERROR [LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - 
Unexpected exception causing shutdown while sock still 
openjava.io.FileNotFoundException: 
/cloud/data/zookeeper/data/version-2/acceptedEpoch.tmp (No space left on 
device) at java.io.FileOutputStream.open0(Native Method) at 
java.io.FileOutputStream.open(FileOutputStream.java:270) at 
java.io.FileOutputStream.<init>(FileOutputStream.java:213) at 
java.io.FileOutputStream.<init>(FileOutputStream.java:162) at 
org.apache.zookeeper.common.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:59)
 at 
org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:1393)
 at 
org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:1437)
 at 
org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:885) at 
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)  
{code}

 # After node A connects to node C, the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method is called in 
the LearnerHandler thread. At this time, the value of waitingForNewEpoch is 
false, and epoch is returned directly.
 # Node B reconnects to the 2888 cluster communication port of node C and 
starts a new LeanerHandler thread. Call the 
org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
LearnerHandler thread. At this time, the value of waitingForNewEpoch is false, 
and epoch is returned directly.
 # After node A connects to node C, the LeanerHandler thread will then call the 
org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
point, selectingFollowers does not include node C, Call the 
selectingFollowers.wait method to wait.
 # After node B connects to node C, the LeanerHandler thread will then call the 
org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
point, selectingFollowers does not include node C, Call the 
selectingFollowers.wait method to wait.
{code:java}
public void waitForEpochAck(long id, StateSummary ss) throws IOException, 
InterruptedException {
    synchronized(electingFollowers) {
        if (electionFinished) {
            return;
        }
        if (ss.getCurrentEpoch() != -1) {
            if (ss.isMoreRecentThan(leaderStateSummary)) {
                throw new IOException("Follower is ahead of the leader, leader 
summary: " 
                                                + 
leaderStateSummary.getCurrentEpoch()
                                                + " (current epoch), "
                                                + 
leaderStateSummary.getLastZxid()
                                                + " (last zxid)");
            }
            if (isParticipant(id)) {
                electingFollowers.add(id);
            }
        }
        QuorumVerifier verifier = self.getQuorumVerifier();
        if (electingFollowers.contains(self.getId()) && 
verifier.containsQuorum(electingFollowers)) {
            electionFinished = true;
            electingFollowers.notifyAll();
        } else {                
            long start = Time.currentElapsedTime();
            long cur = start;
            long end = start + self.getInitLimit()*self.getTickTime();
            while(!electionFinished && cur < end) {
                electingFollowers.wait(end - cur);
                cur = Time.currentElapsedTime();
            }
            if (!electionFinished) {
                throw new InterruptedException("Timeout while waiting for epoch 
to be acked by quorum");
            }
        }
    }
} {code}

 # Node C waits for self.getInitLimit()*self.getTickTime() ms in the 
connectingFollowers.wait method. However, since the value of waitingForNewEpoch 
is false, it does not throw a timeout exception and returns the epoch directly.
 # Node C calls the org.apache.zookeeper.server.quorum.Leader#waitForEpochAck 
method. At this time, selectingFollowers contains node C and the size of 
selectingFollowers is greater than n/2. Then call the 
selectingFollowers.notifyAll() method to release the lock.
 # The disk space of node C has been partially released, so subsequent calls to 
setCurrentEpoch have successfully updated the currentEpoch file.
 # The cluster provides services normally for a period of time.
 # Stop C.
 # Start C, bellow exception with message "The accepted epoch, c is less than 
the current epoch, d" is thrown.

{code:java}
2023-12-15 15:57:00,051 [myid:3] - ERROR [main:QuorumPeer@698] - Unable to load 
database on disk
java.io.IOException: The accepted epoch, c is less than the current epoch, d
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
2023-12-15 15:57:00,132 [myid:3] - ERROR [main:QuorumPeerMain@92] - Unexpected 
exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at 
org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
Caused by: java.io.IOException: The accepted epoch, c is less than the current 
epoch, d
    at 
org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695) 
{code}


> ZooKeeper not starting because the accepted epoch is less than the current 
> epoch.
> ---------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4781
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4781
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.10, 3.4.14, 3.6.4, 3.7.2, 3.8.3, 3.9.1
>            Reporter: zhanglu153
>            Priority: Major
>         Attachments: 0001-zk-leader-acceptedEpoch-zk-zk.patch
>
>
> This issue occurred in our abnormal testing environment, where the disk was 
> injected with anomalies and frequently filled up.
> The the scenario is as follows:
>  * Configure three node ZooKeeper cluster, lets say nodes are A, B and C.
>  * Start the cluster, and node C becomes the leader.
>  * The disk of Node C is injected with a full disk exception.
>  * Node C called the org.apache.zookeeper.server.quorum.Leader#lead method.
> {code:java}
> void lead() throws IOException, InterruptedException {
>     self.end_fle = Time.currentElapsedTime();
>     long electionTimeTaken = self.end_fle - self.start_fle;
>     self.setElectionTimeTaken(electionTimeTaken);
>     LOG.info("LEADING - LEADER ELECTION TOOK - {}", electionTimeTaken);
>     self.start_fle = 0;
>     self.end_fle = 0;
>     zk.registerJMX(new LeaderBean(this, zk), self.jmxLocalPeerBean);
>     try {
>         self.tick.set(0);
>         zk.loadData();
>         
>         leaderStateSummary = new StateSummary(self.getCurrentEpoch(), 
> zk.getLastProcessedZxid());
>         // Start thread that waits for connection requests from 
>         // new followers.
>         cnxAcceptor = new LearnerCnxAcceptor();
>         cnxAcceptor.start();
>         
>         readyToStart = true;
>         long epoch = getEpochToPropose(self.getId(), self.getAcceptedEpoch());
>         
>         zk.setZxid(ZxidUtils.makeZxid(epoch, 0));
>         
>         synchronized(this){
>             lastProposed = zk.getZxid();
>         }
>         
>         newLeaderProposal.packet = new QuorumPacket(NEWLEADER, zk.getZxid(),
>                 null, null);
>         if ((newLeaderProposal.packet.getZxid() & 0xffffffffL) != 0) {
>             LOG.info("NEWLEADER proposal has Zxid of "
>                     + Long.toHexString(newLeaderProposal.packet.getZxid()));
>         }
>         
>         waitForEpochAck(self.getId(), leaderStateSummary);
>         self.setCurrentEpoch(epoch);
> ... {code}
>  * Node C, as the leader, will start the LearnerCnxAcceptor thread, and then 
> call the org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method. 
> At this time, the value of waitingForNewEpoch is true, and the size of 
> connectingFollowers is not greater than n/2. Node C directly calls 
> connectingFollowers.wait to wait. The maximum waiting time is 
> self.getInitLimit()*self.getTickTime() ms.
> {code:java}
> public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
> InterruptedException, IOException {
>     synchronized(connectingFollowers) {
>         if (!waitingForNewEpoch) {
>             return epoch;
>         }
>         if (lastAcceptedEpoch >= epoch) {
>             epoch = lastAcceptedEpoch+1;
>         }
>         if (isParticipant(sid)) {
>             connectingFollowers.add(sid);
>         }
>         QuorumVerifier verifier = self.getQuorumVerifier();
>         if (connectingFollowers.contains(self.getId()) && 
>                                         
> verifier.containsQuorum(connectingFollowers)) {
>             self.setAcceptedEpoch(epoch);
>             waitingForNewEpoch = false;
>             connectingFollowers.notifyAll();
>         } else {
>             long start = Time.currentElapsedTime();
>             long cur = start;
>             long end = start + self.getInitLimit()*self.getTickTime();
>             while(waitingForNewEpoch && cur < end) {
>                 connectingFollowers.wait(end - cur);
>                 cur = Time.currentElapsedTime();
>             }
>             if (waitingForNewEpoch) {
>                 throw new InterruptedException("Timeout while waiting for 
> epoch from quorum");        
>             }
>         }
>         return epoch;
>     }
> }{code}
>  * Node B connects to the 2888 communication port of node C and starts a new 
> LeanerHandler thread.
>  * Node A connects to the 2888 communication port of node C and starts a new 
> LeanerHandler thread.
>  * After node B connects to node C, call the 
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
> LearnerHandler thread.At this point, the value of waitingForNewEpoch is true, 
> and the size of connectingFollowers is greater than n/2. Then, set the value 
> of waitingForNewEpoch to false. Due to the disk of node C being full, calling 
> setAcceptedEpoch to write the acceptedEpoch value failed with an IO 
> exception. Node C fails to update the acceptedEpoch file and did not 
> successfully call the connectingFollowers.notifyAll() method. This will cause 
> node C to wait at connectingFollowers.wait, with a maximum wait of 
> self.getInitLimit()*self.getTickTime() ms. Due to an IO exception thrown, the 
> socket connection between node B and node C was disconnected, and the 
> LeanerHandler thread exited.
> {code:java}
> 2023-12-15 15:08:35,572 [myid:3] - ERROR 
> [LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - Unexpected 
> exception causing shutdown while sock still open2023-12-15 15:08:35,572 
> [myid:3] - ERROR [LearnerHandler-/192.168.174.227:41696:LearnerHandler@648] - 
> Unexpected exception causing shutdown while sock still 
> openjava.io.FileNotFoundException: 
> /cloud/data/zookeeper/data/version-2/acceptedEpoch.tmp (No space left on 
> device) at java.io.FileOutputStream.open0(Native Method) at 
> java.io.FileOutputStream.open(FileOutputStream.java:270) at 
> java.io.FileOutputStream.<init>(FileOutputStream.java:213) at 
> java.io.FileOutputStream.<init>(FileOutputStream.java:162) at 
> org.apache.zookeeper.common.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:59)
>  at 
> org.apache.zookeeper.server.quorum.QuorumPeer.writeLongToFile(QuorumPeer.java:1393)
>  at 
> org.apache.zookeeper.server.quorum.QuorumPeer.setAcceptedEpoch(QuorumPeer.java:1437)
>  at 
> org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:885) 
> at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:358)
>   {code}
>  * After node A connects to node C, the 
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method is called 
> in the LearnerHandler thread. At this time, the value of waitingForNewEpoch 
> is false, and epoch is returned directly.
>  * Node B reconnects to the 2888 cluster communication port of node C and 
> starts a new LeanerHandler thread. Call the 
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose method in the 
> LearnerHandler thread. At this time, the value of waitingForNewEpoch is 
> false, and epoch is returned directly.
>  * After node A connects to node C, the LeanerHandler thread will then call 
> the org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
> point, selectingFollowers does not include node C, Call the 
> selectingFollowers.wait method to wait.
>  * After node B connects to node C, the LeanerHandler thread will then call 
> the org.apache.zookeeper.server.quorum.Leader#waitForEpochAck method. At this 
> point, selectingFollowers does not include node C, Call the 
> selectingFollowers.wait method to wait.
> {code:java}
> public void waitForEpochAck(long id, StateSummary ss) throws IOException, 
> InterruptedException {
>     synchronized(electingFollowers) {
>         if (electionFinished) {
>             return;
>         }
>         if (ss.getCurrentEpoch() != -1) {
>             if (ss.isMoreRecentThan(leaderStateSummary)) {
>                 throw new IOException("Follower is ahead of the leader, 
> leader summary: " 
>                                                 + 
> leaderStateSummary.getCurrentEpoch()
>                                                 + " (current epoch), "
>                                                 + 
> leaderStateSummary.getLastZxid()
>                                                 + " (last zxid)");
>             }
>             if (isParticipant(id)) {
>                 electingFollowers.add(id);
>             }
>         }
>         QuorumVerifier verifier = self.getQuorumVerifier();
>         if (electingFollowers.contains(self.getId()) && 
> verifier.containsQuorum(electingFollowers)) {
>             electionFinished = true;
>             electingFollowers.notifyAll();
>         } else {                
>             long start = Time.currentElapsedTime();
>             long cur = start;
>             long end = start + self.getInitLimit()*self.getTickTime();
>             while(!electionFinished && cur < end) {
>                 electingFollowers.wait(end - cur);
>                 cur = Time.currentElapsedTime();
>             }
>             if (!electionFinished) {
>                 throw new InterruptedException("Timeout while waiting for 
> epoch to be acked by quorum");
>             }
>         }
>     }
> } {code}
>  * Node C waits for self.getInitLimit()*self.getTickTime() ms in the 
> connectingFollowers.wait method. However, since the value of 
> waitingForNewEpoch is false, it does not throw a timeout exception and 
> returns the epoch directly.
>  * Node C calls the org.apache.zookeeper.server.quorum.Leader#waitForEpochAck 
> method. At this time, selectingFollowers contains node C and the size of 
> selectingFollowers is greater than n/2. Then call the 
> selectingFollowers.notifyAll() method to release the lock.
>  * The disk space of node C has been partially released, so subsequent calls 
> to setCurrentEpoch have successfully updated the currentEpoch file.
>  * The cluster provides services normally for a period of time.
>  * Stop C.
>  * Start C, bellow exception with message "The accepted epoch, c is less than 
> the current epoch, d" is thrown.
> {code:java}
> 2023-12-15 15:57:00,051 [myid:3] - ERROR [main:QuorumPeer@698] - Unable to 
> load database on disk
> java.io.IOException: The accepted epoch, c is less than the current epoch, d
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
> 2023-12-15 15:57:00,132 [myid:3] - ERROR [main:QuorumPeerMain@92] - 
> Unexpected exception, exiting abnormally
> java.lang.RuntimeException: Unable to run quorum server 
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:699)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:636)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
> Caused by: java.io.IOException: The accepted epoch, c is less than the 
> current epoch, d
>     at 
> org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:695)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (ZOOKEEPER-4781) ZooKeeper not starting because the accepted epoch is less than the current epoch.

Reply via email to