distcp does not fail if source directory has files with missing blocks
----------------------------------------------------------------------

                 Key: HADOOP-2049
                 URL: https://issues.apache.org/jira/browse/HADOOP-2049
             Project: Hadoop
          Issue Type: Bug
          Components: util
    Affects Versions: 0.15.0
         Environment: Nightly build: Oct 11, 2007.
            Reporter: Murtaza A. Basrai
            Priority: Critical


I copied a directory using distcp (to another directory on the same file 
system).

There were 9 data blocks missing in the files in the source directory, which 
caused distcp to print messages like the following:

...
07/10/13 00:09:16 INFO mapred.JobClient:  map 1% reduce 0%
07/10/13 00:09:16 INFO mapred.JobClient: Task Id : task_200710120717_0081_m_000020_0, Status : FAILED
java.io.IOException: Could not obtain block: blk_6787282547149034655 file=/srcdir/file1
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1136)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:988)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1094)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:348)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:216)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1753)
...

The corresponding tasks failed, but their retries succeeded: every file with 
missing blocks in the source directory was copied as an empty file in the 
target directory, so the job as a whole did not fail.

I think that distcp should fail if it cannot successfully copy all the files 
(at least when no command-line options are given).

This is critical for us because we intend to use distcp to copy databases from 
one DFS to another. If failures can be silent, we would have to check each 
distcp run manually to confirm that it succeeded.
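
As an illustration of the kind of manual check this would force on us, here is 
a minimal sketch that compares file sizes between the source and target 
directories after a copy. It is not part of distcp. The function name 
find_suspect_copies and the sample listings are hypothetical, and the sketch 
assumes that the sizes have already been collected from a recursive listing of 
each directory and that the source listing still reports the original file 
lengths even though some blocks are missing.

```python
def find_suspect_copies(source_sizes, target_sizes):
    """Return sorted (path, reason) pairs for files that did not copy intact.

    Both arguments map a path (relative to the copied directory) to a size
    in bytes. How these listings are gathered is outside this sketch.
    """
    problems = []
    for path, src_len in source_sizes.items():
        if path not in target_sizes:
            problems.append((path, "missing in target"))
        elif target_sizes[path] != src_len:
            problems.append((path, f"size {target_sizes[path]} != {src_len}"))
    return sorted(problems)


# Hypothetical listings: file1 was copied as an empty file, file3 never arrived.
source = {"file1": 134217728, "file2": 1024, "file3": 10}
target = {"file1": 0, "file2": 1024}

problems = find_suspect_copies(source, target)
for path, reason in problems:
    print(path, reason)
```

A size comparison only catches truncated or empty copies like the ones 
described above; it would not detect content that differs at the same length.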
