[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992293#comment-14992293 ]
Colin Patrick McCabe commented on HDFS-8965:
--------------------------------------------
bq. 1. What do we do with the corrupted edit log entry? skip it?
We would only attempt to skip corrupted edit log entries if recovery mode were
turned on. Otherwise, an exception would be thrown, which would ordinarily
lead us to try reading a different copy of the edit log.
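For illustration (the class and method names below are hypothetical sketches, not the actual HDFS edit log loader code), the control flow would look roughly like this:
{code:java}
import java.io.IOException;

// Rough sketch only; these names do not match the real HDFS classes.
class EditLogReadSketch {
  interface OpReader {
    Object readOp() throws IOException;        // throws on a corrupt entry
    void skipToNextValidOp() throws IOException;
  }

  static Object readNext(OpReader reader, boolean recoveryMode)
      throws IOException {
    try {
      return reader.readOp();
    } catch (IOException corrupt) {
      if (recoveryMode) {
        // Recovery mode: skip past the corrupt entry and keep reading.
        reader.skipToNextValidOp();
        return reader.readOp();
      }
      // Normal operation: let the exception propagate so the caller can
      // fall back to a different copy of the edit log.
      throw corrupt;
    }
  }
}
{code}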
bq. 2. Do we have any idea whether it is a bug that corrupted the edit log, or
a disk/network problem that corrupted it?
In general, I don't see how we could know the answer to that question without
more information.
> Harden edit log reading code against out of memory errors
> ---------------------------------------------------------
>
> Key: HDFS-8965
> URL: https://issues.apache.org/jira/browse/HDFS-8965
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.0.0-alpha
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Fix For: 2.8.0
>
> Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch,
> HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch,
> HDFS-8965.006.patch, HDFS-8965.007.patch
>
>
> We should harden the edit log reading code against out of memory errors. Now
> that each op has a length prefix and a checksum, we can validate the checksum
> before trying to load the Op data. This should avoid out of memory errors
> when trying to load garbage data as Op data.
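> As a rough sketch (hypothetical names, not the actual FSEditLogOp.Reader code), the idea is to sanity-check the length prefix and verify the checksum before handing the bytes to the Op decoder:
> {code:java}
> import java.io.DataInputStream;
> import java.io.IOException;
> import java.util.zip.CRC32;
> import java.util.zip.Checksum;
>
> // Hypothetical reader, for illustration only: read a length-prefixed,
> // checksummed record and validate it before decoding, so a corrupt length
> // or payload cannot cause a huge allocation or an OutOfMemoryError.
> class CheckedRecordReader {
>   // A sane upper bound on a single record is the first defense against
>   // allocating a buffer from a garbage length field.
>   private static final int MAX_RECORD_LENGTH = 64 * 1024 * 1024;
>
>   byte[] readRecord(DataInputStream in) throws IOException {
>     int length = in.readInt();               // length prefix
>     if (length < 0 || length > MAX_RECORD_LENGTH) {
>       throw new IOException("Invalid record length " + length);
>     }
>     byte[] body = new byte[length];
>     in.readFully(body);                      // record payload
>     long storedChecksum = in.readLong();     // trailing checksum (assumed layout)
>
>     Checksum crc = new CRC32();
>     crc.update(body, 0, body.length);
>     if (crc.getValue() != storedChecksum) {
>       // Corrupt entry: fail here instead of decoding garbage as an Op.
>       throw new IOException("Checksum mismatch on record of length " + length);
>     }
>     return body;                             // now safe to decode as an Op
>   }
> }
> {code}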