[
https://issues.apache.org/jira/browse/ZOOKEEPER-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442495#comment-16442495
]
ASF GitHub Bot commented on ZOOKEEPER-2994:
-------------------------------------------
Github user nkalmar commented on the issue:
https://github.com/apache/zookeeper/pull/487
I used your updated documentation, and managed to recover a corrupted log
file:
bin/zkTxnLogToolkit.sh -d ~/workspace/zookeeper/standalone/version-2/log.1
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
4/9/18 3:13:19 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x1
createSession 30000
4/9/18 3:15:21 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x2
closeSession null
4/9/18 3:17:41 PM CEST session 0x10000ebe13a0001 cxid 0x0 zxid 0x3
createSession 30000
4/9/18 3:18:13 PM CEST session 0x10000ebe13a0001 cxid 0x0 zxid 0x4
closeSession null
EOF reached after 4 txns.
Dump of the corrupted log.1 file:
bin/zkTxnLogToolkit.sh -d ~/workspace/zookeeper/standalone/version-2/log.1
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
CRC ERROR - 4/10/18 5:12:11 AM CEST session 0x10000ebe13a0000 cxid 0x0 zxid
0x1 createSession 30000
4/10/18 5:12:11 AM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x1
createSession 30000
4/9/18 3:15:21 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x2
closeSession null
CRC ERROR - 4/9/18 3:17:41 PM CEST session 0x10044aa44aaaaaa cxid 0x0 zxid
0x3 createSession 30000
4/9/18 3:17:41 PM CEST session 0x10044aa44aaaaaa cxid 0x0 zxid 0x3
createSession 30000
4/9/18 3:18:13 PM CEST session 0x10000ebe13a0001 cxid 0x0 zxid 0x4
closeSession null
EOF reached after 4 txns.
bin/zkTxnLogToolkit.sh -r ~/workspace/zookeeper/standalone/version-2/log.1
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
CRC ERROR - 4/10/18 5:12:11 AM CEST session 0x10000ebe13a0000 cxid 0x0 zxid
0x1 createSession 30000
Would you like to fix it (Yes/No/Abort) ? Y
EOF reached after 4 txns.
Recovery file
/Users/norbertkalmar/workspace/zookeeper/standalone/version-2/log.1.fixed has
been written with 1 fixed CRC error(s)
bin/zkTxnLogToolkit.sh -d ~/workspace/zookeeper/standalone/version-2/log.1.fixed
ZooKeeper Transactional Log File with dbid 0 txnlog format version 2
4/9/18 3:13:19 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x1
createSession 30000
4/9/18 3:15:21 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x2
closeSession null
4/9/18 3:17:41 PM CEST session 0x10044aa44aaaaaa cxid 0x0 zxid 0x3
createSession 30000
4/9/18 3:18:13 PM CEST session 0x10000ebe13a0001 cxid 0x0 zxid 0x4
closeSession null
EOF reached after 4 txns.
LGTM!
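
When a dump is long, it can help to list only the flagged entries. The following is a minimal sketch, not part of zkTxnLogToolkit. It assumes each entry sits on a single line in the tool's raw output (the mail archive has wrapped the lines above) and that flagged entries start with "CRC ERROR - ", as seen in the dump. The function name `flagged_zxids` is hypothetical.

```python
import re

# Assumption: a flagged entry is one line that starts with "CRC ERROR - "
# and contains "zxid 0x<hex>". The format is inferred from the dump above,
# not taken from the tool's source.
CRC_LINE = re.compile(r"^CRC ERROR - .*\bzxid (0x[0-9a-fA-F]+)")


def flagged_zxids(dump_text):
    """Return the zxids of entries reported with a CRC error, in order."""
    zxids = []
    for line in dump_text.splitlines():
        match = CRC_LINE.match(line)
        if match:
            zxids.append(match.group(1))
    return zxids


# Sample lines copied from the dump above, joined back onto one line each.
sample = """\
CRC ERROR - 4/10/18 5:12:11 AM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x1 createSession 30000
4/9/18 3:15:21 PM CEST session 0x10000ebe13a0000 cxid 0x0 zxid 0x2 closeSession null
CRC ERROR - 4/9/18 3:17:41 PM CEST session 0x10044aa44aaaaaa cxid 0x0 zxid 0x3 createSession 30000
"""

print(flagged_zxids(sample))
```

The output of `bin/zkTxnLogToolkit.sh -d` could be saved to a file and passed to `flagged_zxids` as a string.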
> Tool required to recover log and snapshot entries with CRC errors
> -----------------------------------------------------------------
>
> Key: ZOOKEEPER-2994
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994
> Project: ZooKeeper
> Issue Type: New Feature
> Reporter: Andor Molnar
> Assignee: Andor Molnar
> Priority: Major
> Fix For: 3.5.4, 3.6.0, 3.4.13
>
>
> In the event that the ZooKeeper transaction log or snapshot becomes corrupted
> and fails CRC checks (preventing startup), we should have a mechanism to get
> the cluster running again.
> Previously we achieved this by loading the broken transaction log with a
> modified version of ZK with the CRC check disabled and forcing it to snapshot.
> It'd be very handy to have a tool which can do this for us. LogFormatter and
> SnapshotFormatter have already been designed to dump log and snapshot files;
> it'd be nice to extend their functionality and add the ability for such recovery.
> It has been proven that once you end up with a corrupt txn log there is no way
> to recover except by manually modifying the CRC check. That's basically why the
> tool is needed.
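
The idea of detecting a failed checksum and rewriting it can be sketched in a few lines. This is an illustration only, not the tool's implementation. The record layout (8-byte checksum, 4-byte length, payload) is hypothetical and is not the ZooKeeper on-disk format, Adler-32 is used as a stand-in checksum, and the helper names `pack_record`, `check_record` and `repair_record` are made up for the example.

```python
import struct
import zlib

# Hypothetical record layout for illustration only:
#   8-byte big-endian checksum | 4-byte big-endian payload length | payload
HEADER = ">qi"
HEADER_SIZE = struct.calcsize(HEADER)  # 12 bytes


def pack_record(payload, checksum=None):
    """Build a record; compute the checksum unless one is forced."""
    if checksum is None:
        checksum = zlib.adler32(payload)
    return struct.pack(HEADER, checksum, len(payload)) + payload


def check_record(record):
    """Return (checksum_matches, payload) for one record."""
    stored, length = struct.unpack_from(HEADER, record, 0)
    payload = record[HEADER_SIZE:HEADER_SIZE + length]
    return stored == zlib.adler32(payload), payload


def repair_record(record):
    """Rewrite the stored checksum so it matches the payload as found."""
    ok, payload = check_record(record)
    return record if ok else pack_record(payload)


good = pack_record(b"createSession 30000")
bad = pack_record(b"createSession 30000", checksum=0)  # simulate a bad checksum
fixed = repair_record(bad)

print(check_record(good)[0], check_record(bad)[0], check_record(fixed)[0])
```

Recomputing the checksum accepts the payload as it is found on disk. It makes the entry loadable again but does not restore any bytes that were altered, so the repaired entry should still be reviewed.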