[
https://issues.apache.org/jira/browse/HDFS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andres Perez updated HDFS-5280:
-------------------------------
Attachment: HDFS-5280.patch
> Corrupted meta files on data nodes prevent DFSClient from connecting to data
> nodes and updating corruption status to name node.
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5280
> URL: https://issues.apache.org/jira/browse/HDFS-5280
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs-client
> Affects Versions: 1.1.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha, 2.7.2
> Environment: Red hat enterprise 6.4
> Hadoop-2.1.0
> Reporter: Jinghui Wang
> Assignee: Andres Perez
> Attachments: HDFS-5280.patch
>
>
> When a block's meta file is corrupted, the DFSClient cannot connect to the
> datanode to access the block, so it never performs a read on the block. The
> read is what throws the ChecksumException when block data is corrupted and
> causes the client to report the block to the namenode as corrupt. Because the
> client never gets that far, the file and all of its blocks remain reported as
> healthy.
> To reproduce the error, put a file onto HDFS. Running
> hadoop fsck /tmp/bogus.csv -files -blocks -location
> produces the following output:
> FSCK started for path /tmp/bogus.csv at 11:33:29
> /tmp/bogus.csv 109 bytes, 1 block(s): OK
> 0. blk_-4255166695856420554_5292 len=109 repl=3
> Find the block and meta files for 4255166695856420554 by running
> ssh datanode1.address find /hadoop/ -name "*4255166695856420554*", which
> produces the following output:
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta
> Now corrupt the meta file by running
> ssh datanode1.address "sed -i -e '1i 1234567891'
> /hadoop/data1/hdfs/current/subdir2/blk_-4255166695856420554_5292.meta"
> Now run hadoop fs -cat /tmp/bogus.csv. It shows the stack trace of the
> DFSClient failing to connect to the datanode that holds the corrupted meta
> file.
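
The sed step above prepends the text line "1234567891" to the meta file, which shifts every byte of the file, including its header. The following is a minimal Python sketch of why that likely breaks the datanode before any block data is read. It is not taken from this report: it assumes the meta file starts with a 2-byte big-endian version, a 1-byte checksum type id and a 4-byte big-endian bytes-per-checksum value, and the names parse_meta_header and prepend_line are hypothetical helpers made up for illustration.

```python
import struct

# ASSUMPTION (not stated in the report): the block .meta file begins with
# a 2-byte big-endian version, a 1-byte checksum type id, and a 4-byte
# big-endian bytes-per-checksum value, followed by the checksums.
HEADER = struct.Struct(">hBi")
KNOWN_CHECKSUM_TYPES = {0: "NULL", 1: "CRC32", 2: "CRC32C"}


def parse_meta_header(data: bytes):
    """Hypothetical parser: return (version, type name, bytes per checksum)."""
    if len(data) < HEADER.size:
        raise ValueError("meta file too short for header")
    version, type_id, bytes_per_checksum = HEADER.unpack_from(data)
    if type_id not in KNOWN_CHECKSUM_TYPES:
        raise ValueError(f"unknown checksum type id {type_id}")
    if bytes_per_checksum <= 0:
        raise ValueError(f"invalid bytes per checksum {bytes_per_checksum}")
    return version, KNOWN_CHECKSUM_TYPES[type_id], bytes_per_checksum


def prepend_line(data: bytes, text: str) -> bytes:
    """Mimic sed '1i TEXT': insert TEXT plus a newline before the first line."""
    return text.encode("ascii") + b"\n" + data


# A made-up healthy meta file: header plus one placeholder checksum.
healthy = HEADER.pack(1, 1, 512) + struct.pack(">I", 0xDEADBEEF)
corrupted = prepend_line(healthy, "1234567891")

print(parse_meta_header(healthy))
try:
    parse_meta_header(corrupted)
except ValueError as err:
    # The ASCII digits "1234567" are now interpreted as the header fields.
    print("header rejected:", err)
```

Under these assumptions the corrupted header is rejected while it is being parsed, before any checksum is compared against block data. That would be consistent with the behavior described above, where the client fails to connect instead of hitting a ChecksumException.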
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)