[
https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275084#comment-15275084
]
Appy commented on HBASE-11625:
------------------------------
Uploading the patch.
Testing:
The bug fixed here manifests only when the data is actually corrupted.
We already have tests that 'simulate' a checksum failure, i.e. when a checksum
request comes,
[this|https://github.com/apache/hbase/blob/513ca3483f1d32450ffa0c034e7a7f97b63ff582/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java#L347]
simply returns false. But that is not sufficient. Consider this example:
say the correct logical order for two operations is A --> B, but the code actually
has B --> A. Since we 'simulate' the failure exactly at point A, the test
doesn't care about the position of B relative to A. If instead the data were
corrupted for real, we would have seen an unexpected crash at B in the buggy
case (while expecting a crash at A) and caught the bug earlier.
The test change does exactly that. The first output below is from running the test
on current master; the second output is with the patch.
{noformat}
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hbase.io.hfile.TestChecksum
Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.538 sec <<< FAILURE! - in org.apache.hadoop.hbase.io.hfile.TestChecksum
testChecksumCorruption(org.apache.hadoop.hbase.io.hfile.TestChecksum)  Time elapsed: 0.048 sec  <<< ERROR!
java.io.IOException: Invalid HFile block magic: D\x00TABLK*
  at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:159)
  at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:172)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:337)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1695)
  at org.apache.hadoop.hbase.io.hfile.TestChecksum$CorruptedFSReaderImpl.readBlockDataInternal(TestChecksum.java:372)
  at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1527)
  at org.apache.hadoop.hbase.io.hfile.TestChecksum.testChecksumCorruptionInternals(TestChecksum.java:197)
  at org.apache.hadoop.hbase.io.hfile.TestChecksum.testChecksumCorruption(TestChecksum.java:152)

Results :

Tests in error:
  TestChecksum.testChecksumCorruption:152->testChecksumCorruptionInternals:197 » IO
{noformat}
{noformat}
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hbase.io.hfile.TestChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.862 sec - in org.apache.hadoop.hbase.io.hfile.TestChecksum

Results :
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{noformat}
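As an aside, the ordering argument above can be sketched outside HBase. The toy block format, the class/method names ({{CorruptionOrderDemo}}, {{makeBlock}}, {{readBuggy}}), and the choice of CRC32 below are all hypothetical and are not HBase's actual HFile layout; the point is only that corrupting a header byte for real surfaces a crash at the header parse (point B), which a checksum check that is simply stubbed to return false (point A) would never exercise:

```java
import java.util.Arrays;
import java.util.zip.CRC32;

public class CorruptionOrderDemo {
    static final byte[] MAGIC = {'D', 'A', 'T', 'A'};

    // Toy block layout: 4-byte magic + payload + 8-byte CRC32 over magic+payload.
    static byte[] makeBlock(byte[] payload) {
        byte[] block = new byte[4 + payload.length + 8];
        System.arraycopy(MAGIC, 0, block, 0, 4);
        System.arraycopy(payload, 0, block, 4, payload.length);
        CRC32 crc = new CRC32();
        crc.update(block, 0, 4 + payload.length);
        long sum = crc.getValue();
        for (int i = 0; i < 8; i++) {
            block[4 + payload.length + i] = (byte) (sum >>> (56 - 8 * i));
        }
        return block;
    }

    // Point A: the checksum validation step.
    static boolean checksumOk(byte[] block) {
        int dataLen = block.length - 8;
        CRC32 crc = new CRC32();
        crc.update(block, 0, dataLen);
        long stored = 0;
        for (int i = 0; i < 8; i++) {
            stored = (stored << 8) | (block[dataLen + i] & 0xFF);
        }
        return stored == crc.getValue();
    }

    // Buggy order B --> A: parses the header (B) before validating the checksum (A).
    // A test that stubs checksumOk() to false on intact data still sees the graceful
    // fallback path and passes, because the header parse succeeds.
    static String readBuggy(byte[] block) {
        if (!Arrays.equals(Arrays.copyOf(block, 4), MAGIC)) {
            throw new IllegalStateException("Invalid block magic");  // unexpected crash at B
        }
        if (!checksumOk(block)) {
            return "checksum failed, fall back and retry";  // expected failure at A
        }
        return "ok";
    }

    public static void main(String[] args) {
        byte[] block = makeBlock("hello".getBytes());
        block[1] ^= 0x20;  // corrupt a header byte for real
        try {
            System.out.println(readBuggy(block));
        } catch (IllegalStateException e) {
            // Real corruption exposes the ordering bug: we crash at B instead of
            // failing gracefully at A.
            System.out.println("crashed: " + e.getMessage());
        }
    }
}
```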
Note:
On the local filesystem (specifying the path as file:///....), we don't enable
HBase checksums, so a simple {{hbase hfile -p -f ...}} doesn't work. This actually
makes the last repro method meaningless. Ref
[1|https://github.com/apache/hbase/blob/8ace5bbfcea01e02c5661f75fe9458e04fa3b60f/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java#L117]
and
[2|https://github.com/apache/hbase/blob/8ace5bbfcea01e02c5661f75fe9458e04fa3b60f/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java#L542]
> Reading datablock throws "Invalid HFile block magic" and can not switch to
> hdfs checksum
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-11625
> URL: https://issues.apache.org/jira/browse/HBASE-11625
> Project: HBase
> Issue Type: Bug
> Components: HFile
> Affects Versions: 0.94.21, 0.98.4, 0.98.5, 1.0.1.1, 1.0.3
> Reporter: qian wang
> Assignee: Pankaj Kumar
> Fix For: 2.0.0
>
> Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz, HBASE-11625.patch,
> correct-hfile, corrupted-header-hfile
>
>
> When using HBase checksums, readBlockDataInternal() in HFileBlock.java can hit
> file corruption, but it can only switch to the HDFS checksum input stream at
> validateBlockChecksum(). If the data block's header is corrupted, then
> "b = new HFileBlock()" throws "Invalid HFile block magic" first, and the RPC
> call fails.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)