Ah, never mind. It turns out that I just shouldn't rely on command
history so much. I accidentally pointed the hftp:// URI at the actual
namenode RPC port, not the namenode HTTP port. It appears to be
starting a regular copy again.
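[For anyone who finds this thread later: a sketch of the corrected invocation. hftp:// talks to the namenode's embedded HTTP server (dfs.http.address), not the RPC port. The actual HTTP port on this cluster isn't stated in the thread; 50070 below is only the stock default and is an assumption.]

```shell
# Sketch only: hftp:// must point at the namenode HTTP port.
# 50070 is the default dfs.http.address port (an assumption here);
# substitute whatever HTTP port the 0.19 namenode actually binds.
# The hdfs:// destination keeps using the RPC port as before.
hadoop distcp hftp://ds-nn1:50070/ hdfs://ds-nn2:7276/cluster-a
```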
-Bryan
On Apr 8, 2009, at 11:57 PM, Todd Lipcon wrote:
Hey Bryan,
Any chance you can get a tshark trace on the 0.19 namenode? Maybe:
tshark -s 100000 -w nndump.pcap port 7276
Also, are the clocks synced on the two machines? The failure of your
distcp is at 23:32:39, but the namenode log message you posted was
23:29:09. Did those messages actually pop out at the same time?
Thanks
-Todd
On Wed, Apr 8, 2009 at 11:39 PM, Bryan Duxbury <br...@rapleaf.com> wrote:
Hey all,
I was trying to copy some data from our cluster on 0.19.2 to a new
cluster on 0.18.3 by using distcp and the hftp:// filesystem.
Everything seemed to be going fine for a few hours, but then a few
tasks failed because a few files got 500 errors when being read from
the 0.19 cluster. As a result, the job died. Now that I'm trying to
restart it, I get this error:
[rapl...@ds-nn2 ~]$ hadoop distcp hftp://ds-nn1:7276/ hdfs://ds-nn2:7276/cluster-a
09/04/08 23:32:39 INFO tools.DistCp: srcPaths=[hftp://ds-nn1:7276/]
09/04/08 23:32:39 INFO tools.DistCp: destPath=hdfs://ds-nn2:7276/cluster-a
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.SocketException: Unexpected end of file from server
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1000)
        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:183)
        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:193)
        at org.apache.hadoop.dfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:222)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:588)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:609)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:768)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:788)
I changed nothing at all between the first attempt and the subsequent
failed attempts. The only clues in the namenode log for the 0.19
cluster are:
2009-04-08 23:29:09,786 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.100.50.252:47733 got version 47 expected version 2
Anyone have any ideas?
-Bryan
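[A note on that "got version 47 expected version 2" warning, which fits the wrong-port explanation: assuming the 0.19 IPC server reads a short magic header followed by a one-byte protocol version, an HTTP request arriving on the RPC port starts with "GET /...", and the fifth byte is "/", which is 47 in decimal. A minimal sketch of the decoding:]

```python
# An HTTP request line begins "GET /...". If the IPC server treats the
# first 4 bytes as its magic header and the 5th byte as the protocol
# version (an assumption about the 0.19 wire format), it sees:
request = b"GET /listPaths/ HTTP/1.1"

magic = request[:4]    # b'GET ' -- not the IPC magic the server expects
version = request[4]   # the byte after the magic

print(magic)           # b'GET '
print(version)         # 47, i.e. ord("/") -- the "version 47" in the log
```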