[
https://issues.apache.org/jira/browse/HDFS-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang resolved HDFS-16161.
------------------------------------
Resolution: Duplicate
Turns out it was fixed by HDFS-14706
> Corrupt block checksum is not reported to NameNode
> --------------------------------------------------
>
> Key: HDFS-16161
> URL: https://issues.apache.org/jira/browse/HDFS-16161
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Wei-Chiu Chuang
> Priority: Major
>
> One of our users reported this error in the log:
> {noformat}
> 2021-07-30 09:51:27,509 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> an02nphda5777.example.com:1004:DataXceiver error processing READ_BLOCK
> operation src: /10.30.10.68:35680 dst: /10.30.10.67:1004
> java.lang.IllegalArgumentException: id=-46 out of range [0, 5)
> at
> org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:76)
> at
> org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:167)
> {noformat}
> Analysis:
> it looks like the first few bytes of the checksum were bad. The first few
> bytes determine the checksum type (CRC32, CRC32C, etc.). But the block was
> never reported to the NameNode and removed.
> If the DN hits an IOException while reading a block, it starts another thread
> to scan the block. If the block is indeed bad, it tells the NN it has a bad
> block. But here the error is an IllegalArgumentException, which is a
> RuntimeException, not an IOException, so it is not handled that way.
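> A minimal standalone sketch (not HDFS code; the method names and the
> "scan-scheduled" branch are hypothetical simplifications) of why the
> handler is missed: a catch clause for IOException does not intercept an
> IllegalArgumentException, because the latter is unchecked.
> {noformat}
```java
import java.io.IOException;

public class CatchDemo {
    // Stand-in for the DataChecksum.Type.valueOf check that rejects a
    // corrupt checksum-type byte (hypothetical, not the real HDFS code).
    static void readMetaHeader(int typeByte) throws IOException {
        if (typeByte < 0 || typeByte >= 5) {
            throw new IllegalArgumentException(
                "id=" + typeByte + " out of range [0, 5)");
        }
    }

    // Sketch of the READ_BLOCK path: only IOExceptions reach the
    // "schedule a block scan" branch; unchecked exceptions fly past it.
    static String serveRead(int typeByte) {
        try {
            readMetaHeader(typeByte);
            return "served";
        } catch (IOException e) {
            return "scan-scheduled"; // never reached for IllegalArgumentException
        }
    }

    public static void main(String[] args) {
        System.out.println(serveRead(1)); // prints "served"
        try {
            serveRead(-46);
        } catch (IllegalArgumentException e) {
            // The exception escaped the IOException handler entirely.
            System.out.println("escaped: " + e.getMessage());
        }
    }
}
```
> {noformat}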
> It's a bug in the error handling code. It should be made more graceful.
> Suggest: catch the IllegalArgumentException in
> BlockMetadataHeader.preadHeader() and throw CorruptMetaHeaderException, so
> that the DN catches the exception and performs the regular block scan check.
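> A self-contained sketch of the suggested fix, under the assumption that
> CorruptMetaHeaderException extends IOException (the nested class and the
> checksumTypeOf helper here are stand-ins, not the real HDFS signatures):
> wrap the unchecked parse failure in a checked, IOException-derived
> exception so the DN's existing bad-block handling applies.
> {noformat}
```java
import java.io.IOException;

public class PreadHeaderSketch {
    // Stand-in for HDFS's CorruptMetaHeaderException; the key property is
    // that it extends IOException, so the DN's existing IOE handling applies.
    static class CorruptMetaHeaderException extends IOException {
        CorruptMetaHeaderException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    // Hypothetical simplification of the checksum-type parse that fails
    // with an unchecked exception on a corrupt header byte.
    static int checksumTypeOf(int id) {
        if (id < 0 || id >= 5) {
            throw new IllegalArgumentException(
                "id=" + id + " out of range [0, 5)");
        }
        return id;
    }

    // Sketch of the proposed preadHeader() change: translate the unchecked
    // parse failure into a checked exception the caller already handles.
    static int preadHeader(int typeByte) throws IOException {
        try {
            return checksumTypeOf(typeByte);
        } catch (IllegalArgumentException e) {
            throw new CorruptMetaHeaderException("corrupt checksum header", e);
        }
    }

    public static void main(String[] args) {
        try {
            preadHeader(-46);
        } catch (IOException e) {
            // Now takes the same path as any other read-time IOException.
            System.out.println("caught as IOE: " + e.getMessage());
        }
    }
}
```
> {noformat}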
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]