Ramnatthan Alagappan created ZOOKEEPER-2553:
-----------------------------------------------

             Summary: ZooKeeper cluster unavailable due to corrupted log file 
during power failures -- java.io.IOException: Unreasonable length
                 Key: ZOOKEEPER-2553
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2553
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.4.8
         Environment: Normal ZooKeeper cluster with 3 nodes running Linux
            Reporter: Ramnatthan Alagappan


I am running a three-node ZooKeeper cluster.

When a new log file is created by ZooKeeper, I see the following sequence of 
system calls:

1. creat(new_log)
2. write(new_log, count=16) // This is the log header, I believe.
3. truncate(new_log, from 16 bytes to 16 KBytes) // I have configured the log 
size to be 16K. 
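
For illustration, the sequence above can be approximated in Java as below. This
is only a sketch, not ZooKeeper's actual code: the class name
LogPreallocSketch, the method createLog, and the placeholder header bytes are
assumptions, and RandomAccessFile.setLength stands in for the truncate call.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

class LogPreallocSketch {
    static final int HEADER_BYTES = 16;            // header size seen in the write call
    static final long PREALLOC_BYTES = 16 * 1024;  // 16K log size from the report

    // Mirrors the observed sequence: create the file, write a header, extend it.
    static void createLog(Path log) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(log.toFile(), "rw")) { // 1. creat
            byte[] header = new byte[HEADER_BYTES];
            java.util.Arrays.fill(header, (byte) 0x5A); // hypothetical header content
            raf.write(header);                          // 2. write, count=16
            raf.setLength(PREALLOC_BYTES);              // 3. extend from 16 bytes to 16K
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("log-sketch", ".bin");
        createLog(log);
        System.out.println(Files.size(log)); // prints 16384
        Files.delete(log);
    }
}
```

Nothing in this sketch forces the header or the new length to disk, so what
survives a power failure between steps 2 and 3 is up to the file system.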

When the above sequence of operations completes, it is reasonable to expect the 
newly created log file to contain the 16-byte header followed by zeros up to 
the end of the log.

But if a crash occurs (due to a power failure) while the truncate system call 
is in progress, the log can contain garbage data when the system restarts 
after the crash. Note that if the crash occurs just after the truncate system 
call completes, there is no problem. In other words, the truncate needs to be 
persisted atomically for ZooKeeper to recover from crashes correctly. 

If a crash occurs during the truncate system call, ZooKeeper fails to start 
with the following exception and stack trace:

java.io.IOException: Unreasonable length = -295704495
        at org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
        at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
        at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
        at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
        at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
[myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:558)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:500)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:153)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
Caused by: java.io.IOException: Unreasonable length = -295704495
        at org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
        at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
        at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:233)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:527)
        at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:354)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:132)
        at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:510)
        ... 4 more
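
To show how garbage bytes can lead to this message, here is a minimal sketch of
a length-prefixed read with a sanity check. It is not the jute implementation:
the class LengthCheckSketch, the method readLength, and the MAX_BUFFER limit
are hypothetical. The four example bytes are simply the big-endian encoding of
the value in the trace; I am not claiming these exact bytes were on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class LengthCheckSketch {
    static final int MAX_BUFFER = 1024 * 1024; // hypothetical upper bound on a record

    // Reads a 4-byte big-endian length prefix and rejects implausible values.
    static int readLength(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int len = in.readInt();
            if (len < 0 || len > MAX_BUFFER) {
                throw new IOException("Unreasonable length = " + len);
            }
            return len;
        }
    }

    public static void main(String[] args) throws IOException {
        // A zero-filled region decodes to length 0, which passes the check.
        System.out.println(readLength(new byte[4])); // prints 0

        // Garbage where zeros were expected decodes to a negative length.
        byte[] garbage = {(byte) 0xEE, (byte) 0x5F, (byte) 0xE8, (byte) 0x51};
        try {
            readLength(garbage);
        } catch (IOException e) {
            System.out.println(e.getMessage()); // prints Unreasonable length = -295704495
        }
    }
}
```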


Furthermore, two nodes of a 3-node ZooKeeper cluster can reach this same state. 
In that case, both will fail to start up, rendering the entire cluster 
unavailable. 
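
The availability claim follows from majority arithmetic, which the sketch
below spells out. The class QuorumSketch and its methods are hypothetical
names used only for this illustration.

```java
class QuorumSketch {
    // Smallest number of servers that forms a strict majority of the ensemble.
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // True if the servers that are still up can form a majority.
    static boolean hasQuorum(int ensembleSize, int failed) {
        return ensembleSize - failed >= majority(ensembleSize);
    }

    public static void main(String[] args) {
        System.out.println(hasQuorum(3, 1)); // prints true: 2 of 3 are up
        System.out.println(hasQuorum(3, 2)); // prints false: 1 of 3 is up
    }
}
```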



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
