I confirmed that the fix is included in 3.4.13. That’s why I asked whether you can see an ‘updatingEpoch’ file in the data folder.
I don’t think the issue is unrelated, but I want to make sure that you’re running the right version by verifying the beginning of the ZK logs.

Andor

> On 2019. Aug 26., at 13:43, Debraj Manna <subharaj.ma...@gmail.com> wrote:
>
> Below is the content of currentEpoch.tmp
>
> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
> 8support@platform2
>
> Starting zookeeper logs are rolled over as the issue was there for some
> time. Will the current log with the node in this state help? Btw why do you
> think this issue may not be related to zookeeper?
>
> On Mon, Aug 26, 2019 at 4:56 PM Andor Molnar <an...@apache.org> wrote:
>
>> Hi Debraj,
>>
>> The fix should be in all 3.4 versions from 3.4.6 onward, including 3.4.13.
>> Can you see ‘updatingEpoch’ file in /var/lib/zookeeper/version-2 ?
>> Also what is ‘currentEpoch.tmp’ ? I’m not sure if it relates to ZooKeeper.
>>
>> Would you please share full startup logs of the failing node?
>>
>> Regards,
>> Andor
>>
>>> On 2019. Aug 23., at 18:53, Debraj Manna <subharaj.ma...@gmail.com> wrote:
>>>
>>> Can someone answer my query below?
>>>
>>> I am getting confused after going through ZOOKEEPER-1653
>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354> . The issues say it
>>> is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in 3.4.13
>>> also. Can someone let me know if the issue is present in 3.4.13 also?
>>>
>>> On Wed 21 Aug, 2019, 12:35 PM Debraj Manna, <subharaj.ma...@gmail.com>
>>> wrote:
>>>
>>>> With the other two zookeeper servers running I stopped the zookeeper in
>>>> the broken node, then deleted all the contents inside
>>>> /var/lib/zookeeper/version-2 and started the zookeeper back on the node.
>>>> It is running fine now and got all the data from the other servers.
>>>>
>>>> I am getting confused after going through ZOOKEEPER-1653
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-1653> and ZOOKEEPER-2354
>>>> <https://issues.apache.org/jira/browse/ZOOKEEPER-2354> . The issues say
>>>> it is fixed in 3.4.6 but exists in 3.5.x. But I am seeing the issue in
>>>> 3.4.13 also. Can someone let me know if the issue is present in 3.4.13
>>>> also?
>>>>
>>>> On Wed, Aug 21, 2019 at 8:54 AM Debraj Manna <subharaj.ma...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks for replying.
>>>>>
>>>>> What is the recommended way to remove a node and delete all data from it
>>>>> and make it start fresh?
>>>>>
>>>>> On Wed 21 Aug, 2019, 12:58 AM Enrico Olivelli, <eolive...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>> Sorry for so late reply.
>>>>>> If you have 3 servers you can nuke the broken one and make it start from
>>>>>> scratch, it will join the cluster and then recover data from the other
>>>>>> servers
>>>>>>
>>>>>> Try it in a staging env, not in production
>>>>>>
>>>>>> Enrico
>>>>>>
>>>>>> On Tue 20 Aug 2019, 20:30 Debraj Manna <subharaj.ma...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The same has been asked in stackoverflow
>>>>>>> <https://stackoverflow.com/questions/57574298/zookeeper-error-the-current-epoch-is-older-than-the-last-zxid>
>>>>>>> also. But no response there also.
>>>>>>>
>>>>>>> Anyone any thoughts on this one?
>>>>>>>
>>>>>>> On Tue, Aug 20, 2019 at 4:43 PM Debraj Manna <subharaj.ma...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Posted wrong Jira link. I meant
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2354. Can someone let
>>>>>>>> me know what is the recommended way to recover the node?
>>>>>>>>
>>>>>>>> support@platform2:/var/lib/zookeeper/version-2$ sudo cat acceptedEpoch
>>>>>>>> 8support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch
>>>>>>>> 7support@platform2:/var/lib/zookeeper/version-2$ sudo cat currentEpoch.tmp
>>>>>>>> 8support@platform2
>>>>>>>>
>>>>>>>> On Tue, Aug 20, 2019 at 3:14 PM Debraj Manna <subharaj.ma...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I am using a zookeeper ensemble of 3 nodes running 3.4.13. Sometimes
>>>>>>>>> after reboot of machine zookeeper is not starting and I am seeing the
>>>>>>>>> below errors in logs.
>>>>>>>>>
>>>>>>>>> I have seen https://issues.apache.org/jira/browse/ZOOKEEPER-1653 . Can
>>>>>>>>> someone let me know if this is fixed in 3.4.13 or not as I can see the
>>>>>>>>> issue still open? Also can someone suggest what is the recommended way
>>>>>>>>> to recover the set-up ?
>>>>>>>>>
>>>>>>>>> 2019-08-19 04:18:36,906 [myid:2] - ERROR [main:QuorumPeer@692] - Unable to load database on disk
>>>>>>>>> java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
>>>>>>>>> 2019-08-19 04:18:36,908 [myid:2] - ERROR [main:QuorumPeerMain@92] - Unexpected exception, exiting abnormally
>>>>>>>>> java.lang.RuntimeException: Unable to run quorum server
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:693)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:635)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
>>>>>>>>> Caused by: java.io.IOException: The current epoch, 7, is older than the last zxid, 34359738370
>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:674)
>>>>>>>>>     ... 4 more
>>>>>>>>> ----
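
A note on the numbers in the quoted error: a ZooKeeper zxid is a 64-bit value whose high 32 bits hold the epoch and whose low 32 bits hold a counter. The sketch below is plain Python, not ZooKeeper code, and the helper name split_zxid is made up for illustration. It splits the zxid from the error message and compares its epoch with the currentEpoch value shown in the thread. It only illustrates the arithmetic behind the message and is not the server's actual startup check.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter).

    Assumes the usual ZooKeeper layout: high 32 bits = epoch,
    low 32 bits = counter. The function name is hypothetical.
    """
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


# Values taken from the thread: the zxid from the error message and
# the contents of the currentEpoch file on the failing node.
last_zxid = 34359738370
current_epoch = 7

epoch, counter = split_zxid(last_zxid)
print("zxid %d = 0x%x: epoch %d, counter %d" % (last_zxid, last_zxid, epoch, counter))

if epoch > current_epoch:
    # This is the situation the error message describes.
    print("current epoch %d is older than the epoch of the last zxid (%d)"
          % (current_epoch, epoch))
```

With these values the zxid decodes to epoch 8, counter 2. That agrees with the 8 stored in acceptedEpoch and currentEpoch.tmp, while currentEpoch still holds 7.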