Distcp setup is slow
--------------------

                 Key: HADOOP-2379
                 URL: https://issues.apache.org/jira/browse/HADOOP-2379
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
    Affects Versions: 0.14.3
         Environment: from 35 node cluster to 10 node cluster
            Reporter: Johan Oskarsson
            Priority: Minor


When starting a distcp, the setup phase often takes a very long time. For 
example, in the distcp I just ran, the setup phase took 15 minutes and the 
actual copy took 3 minutes. Could this be improved? Or could a progress bar 
at least be added so the user doesn't think it has stalled?
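
To illustrate the kind of progress reporting meant here, below is a minimal 
sketch. It assumes the setup phase visits the source files one at a time. 
SetupProgress and its methods are hypothetical names, not part of CopyFiles 
or any Hadoop API, and the reporting interval is arbitrary.

```java
/**
 * Hypothetical helper, not part of Hadoop: counts the files seen during
 * setup and prints a status line every `interval` files.
 */
public class SetupProgress {
    private final long interval; // print a status line every `interval` files
    private long filesSeen = 0;
    private long bytesSeen = 0;

    public SetupProgress(long interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.interval = interval;
    }

    /**
     * Record one file found during setup.
     * Returns true if a status line was printed for this file.
     */
    public boolean fileListed(long lengthInBytes) {
        filesSeen++;
        bytesSeen += lengthInBytes;
        if (filesSeen % interval == 0) {
            System.err.println(status());
            return true;
        }
        return false;
    }

    /** Human-readable summary of the work done so far. */
    public String status() {
        return "setup: " + filesSeen + " files listed, " + bytesSeen + " bytes";
    }

    public long getFilesSeen() {
        return filesSeen;
    }

    public long getBytesSeen() {
        return bytesSeen;
    }

    public static void main(String[] args) {
        // Stand-in for walking the source tree; the lengths are made up.
        long[] fileLengths = {10, 20, 30, 40, 50};
        SetupProgress progress = new SetupProgress(2);
        for (long length : fileLengths) {
            progress.fileListed(length);
        }
        System.err.println(progress.status() + " (done)");
    }
}
```

Even output this coarse would show that the setup phase is still making 
progress rather than hanging.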

I also often see exceptions like this during setup, although the distcp 
eventually finishes:
java.io.EOFException
        at java.io.DataInputStream.readShort(DataInputStream.java:298)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1672)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1744)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
        at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:774)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.setup(CopyFiles.java:351)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:773)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:854)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:864)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
