[
https://issues.apache.org/jira/browse/ZOOKEEPER-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400098#comment-13400098
]
Patrick Hunt commented on ZOOKEEPER-1453:
-----------------------------------------
Marshall, are you sure you have this fix?
https://issues.apache.org/jira/browse/ZOOKEEPER-1156
I think it would be good to fix this issue; however, I'm more concerned that
you're seeing corruption in your stress test environment at all. We need to
track down the cause of that. My initial reaction is that you are hitting some
bug that we need to fix, and the ability to handle a corrupted log file is
secondary to fixing whatever is causing the corruption in the first place. Any
insight into what might be causing the problem you are seeing? Perhaps you
could file a bug report with the log4j logs and a copy of your datadir?
To answer your question, I think we need to report any failure to verify the
CRC as a failure rather than as EOF (as you suggest).
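A minimal sketch of that change, assuming a next()-style record iterator; the
method and exception names (readStoredCrc, readTxnBytes, CrcMismatchException)
are hypothetical and do not mirror the actual FileTxnLog.FileTxnIterator
internals:
{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.Adler32;
import java.util.zip.Checksum;

// Illustrative sketch only: a CRC mismatch is surfaced as a failure instead
// of being folded into the end-of-file path. Names and structure are
// assumptions, not the real FileTxnLog code.
class TxnLogIteratorSketch {

    /** Thrown when a record's stored checksum does not match its payload. */
    static class CrcMismatchException extends IOException {
        CrcMismatchException(String msg) { super(msg); }
    }

    /**
     * Advances to the next transaction record.
     *
     * @return false only on a clean end of the log; a checksum mismatch is
     *         reported as an exception rather than being treated as EOF.
     */
    boolean next() throws IOException {
        long storedCrc;
        byte[] payload;
        try {
            storedCrc = readStoredCrc();   // hypothetical: read the record header
            payload = readTxnBytes();      // hypothetical: read the record body
        } catch (EOFException e) {
            return false;                  // genuine end of log: stop quietly
        }

        Checksum crc = new Adler32();
        crc.update(payload, 0, payload.length);
        if (crc.getValue() != storedCrc) {
            // The key change: raise the corruption instead of returning false.
            throw new CrcMismatchException("CRC check failed on txn log record");
        }
        deserialize(payload);              // hypothetical: build the txn from bytes
        return true;
    }

    // Placeholders so the sketch is self-contained; real code reads the log stream.
    private long readStoredCrc() throws IOException { return 0L; }
    private byte[] readTxnBytes() throws IOException { return new byte[0]; }
    private void deserialize(byte[] bytes) { }
}
{code}
The point is simply that a mismatch between the stored and recomputed checksum
becomes an error the recovery code can act on, instead of looking like a clean
end of the log.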
> corrupted logs may not be correctly identified by FileTxnIterator
> -----------------------------------------------------------------
>
> Key: ZOOKEEPER-1453
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1453
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.3.3
> Reporter: Patrick Hunt
> Priority: Critical
>
> See ZOOKEEPER-1449 for background on this issue. The main problem is that
> during server recovery
> org.apache.zookeeper.server.persistence.FileTxnLog.FileTxnIterator.next()
> does not indicate whether the available logs are valid. In some cases (say,
> a truncated record and a single txnlog in the datadir) we cannot distinguish
> a corrupt file from a clean end of file.
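To make the quoted description concrete, here is a caller-side sketch, again
with hypothetical names rather than the actual recovery code: because next()
returns false both at end of file and when it stops at a bad record, the
replay loop below finishes the same way in either case, so the caller cannot
tell that part of the log was dropped.
{code:java}
import java.io.IOException;

// Illustrative sketch of the caller-side ambiguity; the interface and method
// names are assumptions, not ZooKeeper's actual FileTxnSnapLog/FileTxnLog API.
class RecoveryLoopSketch {

    interface TxnIterator {
        boolean next() throws IOException;   // today: false on EOF *or* corruption
        byte[] getTxn();                     // hypothetical accessor for the record
    }

    static void replay(TxnIterator it) throws IOException {
        while (it.next()) {
            apply(it.getTxn());              // hypothetical: apply txn to the data tree
        }
        // Reaching this point looks like a successful recovery even if the
        // iterator actually stopped early at a corrupt or truncated record.
    }

    private static void apply(byte[] txn) { /* no-op for the sketch */ }
}
{code}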