[jira] [Updated] (ZOOKEEPER-4813) Make zookeeper start successfully when the last log file is dirty during the restore progress
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andor Molnar updated ZOOKEEPER-4813:
    Fix Version/s: 3.9.3
                       (was: 3.9.2)

> Make zookeeper start successfully when the last log file is dirty during the restore progress
> ---------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4813
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4813
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.9.1
>            Reporter: Yan Zhao
>            Assignee: Yan Zhao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.9.3
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ZooKeeper restarts, it restores its data from the last valid snapshot file and then replays the transaction log to apply the later changes.
> But if the last log file is empty for some reason, the restore fails and ZooKeeper cannot restart.
> The logs are as follows:
> {noformat}
> 14:12:16.023 [main] INFO org.apache.zookeeper.server.persistence.SnapStream - Invalid snapshot snapshot.188700025d87. len = 761554294, byte = 45
> 14:12:16.024 [main] INFO org.apache.zookeeper.server.persistence.FileSnap - Reading snapshot /pulsar/data/zookeeper/version-2/snapshot.188700025a05
> 14:12:17.350 [main] INFO org.apache.zookeeper.server.DataTree - The digest in the snapshot has digest version of 2, with zxid as 0x188700025b07, and digest value as 510776662607117
> 14:12:17.492 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeer - Unable to load database on disk
> java.io.EOFException: null
>     at java.io.DataInputStream.readInt(DataInputStream.java:386) ~[?:?]
>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:67) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:725) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:743) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:711) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:792) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:361) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:267) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:312) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:288) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1149) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1135) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
> 14:12:17.502 [main] INFO org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider - Shutdown executor service with timeout 1000
> 14:12:17.508 [main] INFO org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@2484f433{HTTP/1.1, (http/1.1)}{0.0.0.0:8000}
> 14:12:17.510 [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@59a67c3a{/,null,STOPPED}
> 14:12:17.515 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain - Unexpected exception, exiting abnormally
> java.lang.RuntimeException: Unable to run quorum server
>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1204)
[jira] [Updated] (ZOOKEEPER-4813) Make zookeeper start successfully when the last log file is dirty during the restore progress
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhao updated ZOOKEEPER-4813:
    Description: 
When ZooKeeper restarts, it restores its data from the last valid snapshot file and then replays the transaction log to apply the later changes.

But if the last log file is empty for some reason, the restore fails and ZooKeeper cannot restart.

The logs are as follows:

{noformat}
14:12:16.023 [main] INFO org.apache.zookeeper.server.persistence.SnapStream - Invalid snapshot snapshot.188700025d87. len = 761554294, byte = 45
14:12:16.024 [main] INFO org.apache.zookeeper.server.persistence.FileSnap - Reading snapshot /pulsar/data/zookeeper/version-2/snapshot.188700025a05
14:12:17.350 [main] INFO org.apache.zookeeper.server.DataTree - The digest in the snapshot has digest version of 2, with zxid as 0x188700025b07, and digest value as 510776662607117
14:12:17.492 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeer - Unable to load database on disk
java.io.EOFException: null
    at java.io.DataInputStream.readInt(DataInputStream.java:386) ~[?:?]
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:67) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:725) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:743) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:711) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:792) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:361) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:267) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:312) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:288) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1149) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1135) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
14:12:17.502 [main] INFO org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider - Shutdown executor service with timeout 1000
14:12:17.508 [main] INFO org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@2484f433{HTTP/1.1, (http/1.1)}{0.0.0.0:8000}
14:12:17.510 [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@59a67c3a{/,null,STOPPED}
14:12:17.515 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1204) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1135) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:386) ~[?:?]
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96)
[jira] [Updated] (ZOOKEEPER-4813) Make zookeeper start successfully when the last log file is dirty during the restore progress
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ZOOKEEPER-4813:
    Labels: pull-request-available  (was: )

> Make zookeeper start successfully when the last log file is dirty during the restore progress
> ---------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4813
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4813
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.9.1
>            Reporter: Yan Zhao
>            Assignee: Yan Zhao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.9.2
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ZooKeeper restarts, it restores its data from the last valid snapshot file and then replays the transaction log to apply the later changes.
> But if the last log file is empty for some reason, the restore fails and ZooKeeper cannot restart.
> The logs are as follows:
> {noformat}
> 14:12:16.023 [main] INFO org.apache.zookeeper.server.persistence.SnapStream - Invalid snapshot snapshot.188700025d87. len = 761554294, byte = 45
> 14:12:16.024 [main] INFO org.apache.zookeeper.server.persistence.FileSnap - Reading snapshot /pulsar/data/zookeeper/version-2/snapshot.188700025a05
> 14:12:17.350 [main] INFO org.apache.zookeeper.server.DataTree - The digest in the snapshot has digest version of 2, with zxid as 0x188700025b07, and digest value as 510776662607117
> 14:12:17.492 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeer - Unable to load database on disk
> java.io.EOFException: null
>     at java.io.DataInputStream.readInt(DataInputStream.java:386) ~[?:?]
>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:96) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:67) ~[org.apache.zookeeper-zookeeper-jute-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:725) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:743) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:711) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:792) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:361) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:267) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:312) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:288) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1149) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:1135) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:229) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:137) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:91) ~[org.apache.zookeeper-zookeeper-3.9.1.jar:3.9.1]
> 14:12:17.502 [main] INFO org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider - Shutdown executor service with timeout 1000
> 14:12:17.508 [main] INFO org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@2484f433{HTTP/1.1, (http/1.1)}{0.0.0.0:8000}
> 14:12:17.510 [main] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@59a67c3a{/,null,STOPPED}
> 14:12:17.515 [main] ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain - Unexpected exception, exiting abnormally
> java.lang.RuntimeException: Unable to run quorum server
>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:1204)
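
The failure described in this issue is an EOFException raised while reading the header of an empty last log file. The following is a minimal, self-contained sketch of one way such a file could be tolerated; it is not the patch attached to ZOOKEEPER-4813 and does not use ZooKeeper's classes. The class `DirtyLogSketch`, the helpers `hasReadableHeader` and `replayableLogs`, the 16-byte header layout, and the `ZKLG` magic value are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DirtyLogSketch {
    // Assumed header layout for this sketch: magic (int), version (int), dbid (long).
    static final int MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    /** Returns true if the file starts with a complete header carrying the expected magic. */
    static boolean hasReadableHeader(Path logFile) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(logFile))) {
            int magic = in.readInt(); // throws EOFException on an empty file
            in.readInt();             // version, ignored here
            in.readLong();            // dbid, ignored here
            return magic == MAGIC;
        } catch (EOFException e) {
            return false;             // empty or truncated header
        }
    }

    /**
     * Returns the log files that can be replayed, in order. An unreadable file is
     * tolerated only if it is the last one; anywhere else it is treated as corruption.
     */
    static List<Path> replayableLogs(List<Path> logsInOrder) throws IOException {
        List<Path> result = new ArrayList<>();
        for (int i = 0; i < logsInOrder.size(); i++) {
            Path log = logsInOrder.get(i);
            boolean isLast = (i == logsInOrder.size() - 1);
            if (hasReadableHeader(log)) {
                result.add(log);
            } else if (isLast) {
                System.err.println("Skipping dirty last log file: " + log);
            } else {
                throw new IOException("Unreadable log file before the last one: " + log);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("txnlog-sketch");
        Path good = dir.resolve("log.1");
        Path empty = dir.resolve("log.2");
        // A file holding only a valid header, followed by an empty "last" file.
        Files.write(good, ByteBuffer.allocate(16).putInt(MAGIC).putInt(2).putLong(0L).array());
        Files.write(empty, new byte[0]);
        System.out.println("Replayable: " + replayableLogs(Arrays.asList(good, empty)));
    }
}
```

The sketch only skips an unreadable file when it is the last in the sequence, on the assumption that an empty file in the middle of the log would indicate real data loss rather than an interrupted write.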