[ 
https://issues.apache.org/jira/browse/HADOOP-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhuGuanyin updated HADOOP-4291:
-------------------------------

    Description: 
In some special cases, all replicas of a given file have been truncated to zero, 
but the namenode still holds the original size (we don't know why). A MapReduce 
streaming job will hang if we don't specify mapred.task.timeout when the input 
files contain such a corrupted file; even the dfs shell "cat" hangs when 
fetching data from the corrupted file.
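
For reference, any plain Java client that reads the file through the public 
FileSystem API should exhibit the same hang as "bin/hadoop dfs -cat", since the 
shell is just such a client. A minimal sketch (the path /tmp/corrupted and the 
class name are placeholders for illustration):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Minimal read client, equivalent to "bin/hadoop dfs -cat".
// /tmp/corrupted stands in for a file whose block files have been
// truncated to zero on every datanode.
public class CatCorruptedFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FSDataInputStream in = fs.open(new Path("/tmp/corrupted"));
    try {
      // Hangs in DFSInputStream.blockSeekTo(): the namenode still
      // reports the original length, so the client keeps looking for
      // a datanode that can serve bytes that no longer exist.
      IOUtils.copyBytes(in, System.out, conf, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}
{code}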

We found that the job hangs at DFSInputStream.blockSeekTo() when choosing a 
datanode.  The following test demonstrates it:
1)      Copy a small file to hdfs. 
2)      Get the file's blocks, log in to the datanodes that hold them, and 
truncate those blocks to zero.
3)      Cat the file through the dfs shell "cat".
4)      The cat command enters an infinite loop.
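
The dead loop is consistent with an unbounded datanode-selection retry: every 
replica fails, the dead-node list is cleared when block locations are 
refetched, and selection starts over. Below is a simplified, self-contained 
model of that failure mode (not the actual DFSClient code; all names are 
illustrative):

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the dead loop in DFSInputStream.blockSeekTo():
// every replica is truncated, each read attempt fails, and the
// dead-node list is cleared on refetch, so the loop never terminates.
public class BlockSeekLoopSketch {
  private static final List<String> REPLICAS =
      Arrays.asList("dn1", "dn2", "dn3");

  // Stand-in for reading from a datanode: every block file has been
  // truncated to zero, so no node can serve the requested offset.
  private static boolean tryRead(String datanode) {
    return false;
  }

  public static void main(String[] args) {
    List<String> deadNodes = new ArrayList<String>();
    while (true) {                       // no retry cap: spins forever
      String chosen = null;
      for (String dn : REPLICAS) {
        if (!deadNodes.contains(dn)) {
          chosen = dn;
          break;
        }
      }
      if (chosen == null) {
        // All replicas failed: refetch block locations from the
        // namenode (which still reports the stale length) and retry.
        deadNodes.clear();
        continue;
      }
      if (!tryRead(chosen)) {
        deadNodes.add(chosen);           // mark node dead, try next
      }
    }
  }
}
{code}

A bounded retry count, or treating a zero-length block that contradicts the 
namenode-reported length as a hard I/O error, would break this cycle.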



> MapReduce Streaming job hangs when all replicas of the input file are 
> corrupted
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4291
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4291
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.1
>            Reporter: ZhuGuanyin
>            Priority: Critical
>
> In some special cases, all replicas of a given file have been truncated to 
> zero, but the namenode still holds the original size (we don't know why). A 
> MapReduce streaming job will hang if we don't specify mapred.task.timeout 
> when the input files contain such a corrupted file; even the dfs shell "cat" 
> hangs when fetching data from the corrupted file.
> We found that the job hangs at DFSInputStream.blockSeekTo() when choosing a 
> datanode.  The following test demonstrates it:
> 1)    Copy a small file to hdfs. 
> 2)    Get the file's blocks, log in to the datanodes that hold them, and 
> truncate those blocks to zero.
> 3)    Cat the file through the dfs shell "cat".
> 4)    The cat command enters an infinite loop.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
