deepujain opened a new pull request, #8310:
URL: https://github.com/apache/hadoop/pull/8310

   ### Summary
   
When a block read fails with `CorruptMetaHeaderException` (e.g. a corrupt or 
too-short block meta file on the DataNode), DFSInputStream neither added the 
block to the corrupted set nor reported it to the NameNode, unlike the 
existing handling for `ChecksumException`. Corrupt blocks were therefore never 
invalidated or re-replicated. This change treats `CorruptMetaHeaderException` 
the same as `ChecksumException` in the client read path, so the block is 
reported to the NameNode and can be re-replicated.
   
   ### Change
   
   - **DFSInputStream**: Import `CorruptMetaHeaderException`. In `readBuffer()` 
(the block read loop), add a `catch (CorruptMetaHeaderException)` that logs the 
failure, adds the block to `corruptedBlocks`, and sets 
`retryCurrentNode = false`, mirroring the existing `ChecksumException` 
handling. In `actualGetFromOneDataNode()` (used by `fetchBlockByteRange()`), 
add a `catch (CorruptMetaHeaderException)` that logs the failure, adds the 
block to `corruptedBlocks`, marks the DataNode as dead, and rethrows as an 
`IOException`, again mirroring the `ChecksumException` handling.
   - **TestCorruptMetadataFile**: Add 
`testReportCorruptMetaHeaderToNameNode()`: create a file, corrupt its block 
meta file with an invalid 7-byte header, read the file from the client 
(expecting an `IOException`), then wait until the NameNode's corrupt block 
count reaches 1 (HDFS-17179).
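
   The mirrored handling described above can be sketched roughly as follows. 
This is a simplified illustration, not the actual patch: the exception classes 
and fields here are stand-ins for the real Hadoop types 
(`org.apache.hadoop.hdfs.DFSInputStream`, `CorruptMetaHeaderException`, etc.), 
and the method shape is reduced to the corruption-tracking logic only.

   ```java
   import java.io.IOException;
   import java.util.HashSet;
   import java.util.Set;

   // Stand-in exception types for illustration; the real classes live in
   // the Hadoop HDFS client and datatransfer packages.
   class ChecksumException extends IOException {}
   class CorruptMetaHeaderException extends IOException {}

   public class ReadBufferSketch {
       // Hypothetical simplified state mirroring DFSInputStream fields.
       final Set<String> corruptedBlocks = new HashSet<>();
       boolean retryCurrentNode = true;

       int readBuffer(String blockId, boolean failWithCorruptMeta) {
           try {
               if (failWithCorruptMeta) {
                   // Simulate the DataNode serving a corrupt/truncated meta header.
                   throw new CorruptMetaHeaderException();
               }
               return 0; // bytes read (placeholder for the real read)
           } catch (ChecksumException | CorruptMetaHeaderException e) {
               // Same handling for both exception types: record the corrupt
               // replica so it gets reported to the NameNode, and do not
               // retry the current DataNode.
               corruptedBlocks.add(blockId);
               retryCurrentNode = false;
               return -1; // caller seeks to a different replica
           }
       }
   }
   ```

   The key point of the change is the multi-catch (or a parallel catch clause): 
`CorruptMetaHeaderException` now takes the same corruption-reporting path that 
`ChecksumException` always did.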
   
   ### JIRA
   
   Fixes HDFS-17179
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

