Bryan,

hftp://ds-nn1:7276
hdfs://ds-nn2:7276

Are you using the same port number for hftp and hdfs?
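hftp goes through the namenode's embedded HTTP server (dfs.http.address, default port 50070 on 0.18/0.19), while hdfs:// talks to the RPC port from fs.default.name, so the two ports normally differ. You can confirm what ds-nn1 is using in its config; a sketch (the values below are examples, not taken from your setup):

  <!-- conf/hadoop-site.xml on ds-nn1 -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ds-nn1:7276</value>  <!-- RPC port, used by hdfs:// -->
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>0.0.0.0:50070</value>       <!-- HTTP port, used by hftp:// -->
  </property>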

Looking at the stack trace, it failed in DistCp.checkSrcPath(), before a
distcp job was even submitted. The "Incorrect header or version mismatch"
warning in your namenode log fits the same theory: it looks like a non-RPC
request (such as an HTTP GET from hftp) hitting the namenode's RPC port.
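
If that's what's happening, pointing the source URI at the 0.19 namenode's
HTTP port should fix it. A sketch, assuming the HTTP port is the default
50070 (check dfs.http.address on ds-nn1 first):

  hadoop distcp hftp://ds-nn1:50070/ hdfs://ds-nn2:7276/cluster-a

Running it from the destination (0.18.3) side, as you already are, is the
right way to copy between different Hadoop versions.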

Koji

-----Original Message-----
From: Bryan Duxbury [mailto:[email protected]] 
Sent: Wednesday, April 08, 2009 11:40 PM
To: [email protected]
Subject: Issue distcp'ing from 0.19.2 to 0.18.3

Hey all,

I was trying to copy some data from our cluster on 0.19.2 to a new
cluster on 0.18.3 by using distcp and the hftp:// filesystem.
Everything seemed to be going fine for a few hours, but then a few
tasks failed because a few files returned 500 errors when read from
the 0.19 cluster, and the job died. Now when I try to restart it, I
get this error:

[rapl...@ds-nn2 ~]$ hadoop distcp hftp://ds-nn1:7276/ hdfs://ds-nn2:7276/cluster-a
09/04/08 23:32:39 INFO tools.DistCp: srcPaths=[hftp://ds-nn1:7276/]
09/04/08 23:32:39 INFO tools.DistCp: destPath=hdfs://ds-nn2:7276/cluster-a
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.SocketException: Unexpected end of file from server
         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766)
         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1000)
         at org.apache.hadoop.dfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:183)
         at org.apache.hadoop.dfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:193)
         at org.apache.hadoop.dfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:222)
         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
         at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:588)
         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:609)
         at org.apache.hadoop.tools.DistCp.run(DistCp.java:768)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
         at org.apache.hadoop.tools.DistCp.main(DistCp.java:788)

I changed nothing at all between the first attempt and the subsequent
failed attempts. The only clue in the namenode log for the 0.19
cluster is:

2009-04-08 23:29:09,786 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.100.50.252:47733 got version 47 expected version 2

Anyone have any ideas?

-Bryan
