deepujain opened a new pull request, #8310: URL: https://github.com/apache/hadoop/pull/8310
### Summary

When a block read fails with `CorruptMetaHeaderException` (e.g. a corrupt or too-short block meta file on the DataNode), `DFSInputStream` did not add the block to the corrupted set or report it to the NameNode, unlike `ChecksumException`. Corrupt blocks were therefore not invalidated or re-replicated. This change treats `CorruptMetaHeaderException` the same as `ChecksumException` in the client read path, so the block is reported to the NameNode and can be re-replicated.

### Change

- **DFSInputStream**: Import `CorruptMetaHeaderException`. In `readBuffer()` (block read loop), add a `catch (CorruptMetaHeaderException)` that logs, adds the block to `corruptedBlocks`, and sets `retryCurrentNode = false`, mirroring the existing `ChecksumException` handling. In `actualGetFromOneDataNode()` (`fetchBlockByteRange`), add a `catch (CorruptMetaHeaderException)` that logs, adds the block to `corruptedBlocks`, marks the datanode dead, and throws an `IOException`, mirroring the `ChecksumException` handling.
- **TestCorruptMetadataFile**: Add `testReportCorruptMetaHeaderToNameNode()`: create a file, corrupt the block meta file with an invalid 7-byte header, read from the client (expecting an `IOException`), then wait until the NameNode's corrupt block count is 1 (HDFS-17179).

### JIRA

Fixes HDFS-17179
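The `readBuffer()` change can be sketched with a minimal, self-contained model. The exception classes, block identifiers, and loop below are simplified stand-ins, not the actual HDFS internals; the point is that both failure types now take the same corrupt-block-reporting path:

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class CorruptMetaHeaderSketch {
    // Hypothetical stand-ins for the real HDFS exception types.
    static class ChecksumException extends IOException {}
    static class CorruptMetaHeaderException extends IOException {}

    // Simplified model of the set DFSInputStream reports to the NameNode.
    static Set<String> corruptedBlocks = new HashSet<>();

    // Simplified model of the readBuffer() retry loop: a checksum failure
    // and a corrupt-meta-header failure both mark the block as corrupt
    // and stop retrying the current node.
    static int readBuffer(String block, IOException simulatedFailure) {
        boolean retryCurrentNode = true;
        try {
            if (simulatedFailure != null) {
                throw simulatedFailure;
            }
            return 0; // bytes read on success
        } catch (ChecksumException ce) {
            corruptedBlocks.add(block); // existing behavior
            retryCurrentNode = false;
        } catch (CorruptMetaHeaderException cmhe) {
            // New in this change: mirrored from the ChecksumException path.
            corruptedBlocks.add(block);
            retryCurrentNode = false;
        } catch (IOException ioe) {
            // Other I/O failures: would fall through to node failover.
        }
        return retryCurrentNode ? 0 : -1;
    }

    public static void main(String[] args) {
        readBuffer("blk_1", new ChecksumException());
        readBuffer("blk_2", new CorruptMetaHeaderException());
        System.out.println(corruptedBlocks.size()); // both blocks marked corrupt
    }
}
```

Before this change, only the first catch clause existed in the model above, so a block failing with `CorruptMetaHeaderException` never entered `corruptedBlocks` and was never reported.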
