[
https://issues.apache.org/jira/browse/HBASE-29158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937330#comment-17937330
]
Nick Dimiduk commented on HBASE-29158:
--------------------------------------
We haven't done a perf analysis of this change, but IMHO it's a necessary
correctness fix. Over here at my employer, we hit exceptions related to reads
of unverified block headers; these errors make it through to our alerting at
least once a year _that I'm aware of_. This heuristic still doesn't give
perfect coverage, but to achieve perfect coverage I think we need to change
the HFile format.
FYI [~apurtell] [~vjasani] This one applies cleanly to branch-2.5.
> Unknown checksum type code exception occurred while reading HFileBlock
> ----------------------------------------------------------------------
>
> Key: HBASE-29158
> URL: https://issues.apache.org/jira/browse/HBASE-29158
> Project: HBase
> Issue Type: Bug
> Components: HFile
> Affects Versions: 2.2.6, 2.6.2
> Reporter: Guanglei Xia
> Assignee: Guanglei Xia
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> In our HBase cluster, we encountered frequent checksum type error messages.
> After reviewing the related Jira issues, we found that HBASE-28605 had
> previously discussed this HBase checksum problem. Currently, HBase checksum
> verification does not check the hfile header cache, which can cause problems
> when an HFile is corrupted. The HBASE-28605 patch fixes several cases of
> corrupt HFiles, but it cannot solve the checksum type error that occurs when
> the HFile header itself is corrupted. We propose a new patch to fix that
> error. The patch checks the checksum type value in the hfile header before
> verifying the checksum. If the value is invalid, the hfile header is
> corrupted and can no longer be used. We have applied this patch in our HBase
> cluster, and it resolved the bug there. We are sharing the patch with the
> community and will post the error stack in the comments, hoping to receive
> some guidance.
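
The check described above (validate the checksum type code in the header before trusting the header for checksum verification) can be sketched roughly as follows. This is an illustrative sketch, not the actual patch: the class and method names are hypothetical, and the header size (33 bytes), the offset of the checksum type byte (24), and the set of valid codes (0 through 2) are assumptions about the checksum-enabled HFileBlock header layout that should be confirmed against the HBase source.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; not part of HBase. */
public final class HeaderChecksumTypeCheck {
    // Assumption: size in bytes of a block header that carries checksum fields.
    static final int HEADER_SIZE = 33;
    // Assumption: offset of the one-byte checksum type code within the header.
    static final int CHECKSUM_TYPE_OFFSET = 24;
    // Assumption: valid checksum type codes form the contiguous range 0..2.
    static final byte MIN_CODE = 0;
    static final byte MAX_CODE = 2;

    private HeaderChecksumTypeCheck() {}

    /** Returns true if the code falls inside the assumed valid range. */
    static boolean isKnownChecksumType(byte code) {
        return code >= MIN_CODE && code <= MAX_CODE;
    }

    /**
     * Returns false when the buffer is too short to hold a header or when the
     * checksum type byte is not a known code, so the caller can treat the
     * header as corrupt instead of failing later with an unknown-type error.
     */
    static boolean headerLooksValid(ByteBuffer header) {
        if (header == null || header.remaining() < HEADER_SIZE) {
            return false;
        }
        // Absolute read relative to the current position; does not move it.
        byte code = header.get(header.position() + CHECKSUM_TYPE_OFFSET);
        return isKnownChecksumType(code);
    }
}
```

A caller would run `headerLooksValid` on the header bytes first and, on `false`, discard the header and fall back to whatever recovery path the reader already has, rather than attempting checksum verification with an unknown type. As noted in the comment above, a heuristic like this only catches corruption that happens to land on the checksum type byte.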