Hi Tom.

Ah... From reading (your?) article:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873&categoryID=112
I got confused; it seems to suggest that distcp is used to move
ordinary S3 objects onto HDFS. Thanks for the clarification. (A sketch
of the working sequence is at the end of this message.)

Cheers,

Einar

On Sat, May 31, 2008 at 11:58 PM, Tom White <[EMAIL PROTECTED]> wrote:
> Hi Einar,
>
> How did you put the data onto S3: using Hadoop's S3 FileSystem, or
> using other S3 tools? If it's the latter, then it won't work, as the
> s3 scheme is for Hadoop's block-based S3 storage. Native S3 support
> is coming - see https://issues.apache.org/jira/browse/HADOOP-930 -
> but it's not integrated yet.
>
> Tom
>
> On Thu, May 29, 2008 at 10:15 PM, Einar Vollset
> <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I'm using the current Hadoop EC2 image (ami-ee53b687), and am having
>> some trouble getting Hadoop to access S3. Specifically, I'm trying
>> to copy files from my bucket into HDFS on the running cluster, so
>> (on the master of the booted cluster) I do:
>>
>> hadoop-0.17.0 einar$ bin/hadoop distcp s3://ID:[EMAIL PROTECTED]/ input
>> 08/05/29 14:10:44 INFO util.CopyFiles: srcPaths=[s3://ID:[EMAIL PROTECTED]/]
>> 08/05/29 14:10:44 INFO util.CopyFiles: destPath=input
>> 08/05/29 14:10:46 WARN fs.FileSystem: "localhost:9000" is a deprecated
>> filesystem name. Use "hdfs://localhost:9000/" instead.
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input
>> source s3://ID:[EMAIL PROTECTED]/ does not exist.
>>         at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:578)
>>         at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:594)
>>         at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>         at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)
>>
>> ...which clearly doesn't work. The ID and SECRET are right, because
>> if I change them I get:
>>
>> org.jets3t.service.S3ServiceException: S3 HEAD request failed.
>> ResponseCode=403, ResponseMessage=Forbidden
>> ...etc.
>>
>> I suspect it might be a more general problem, because if I do:
>>
>> bin/hadoop fs -ls s3://ID:[EMAIL PROTECTED]/
>>
>> I get:
>>
>> ls: Cannot access s3://ID:[EMAIL PROTECTED]/ :
>> No such file or directory.
>>
>> ...even though the bucket is there and has a lot of data in it.
>>
>> Any thoughts?
>>
>> Cheers,
>>
>> Einar
>>

--
Einar Vollset
Chief Scientist
Something Simpler Systems
690 - 220 Cambie St
Vancouver, BC V6B 2M9
Canada
ph: +1-778-987-4256
http://somethingsimpler.com
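
A minimal sketch of the sequence Tom describes, with ID, SECRET, and
BUCKET as placeholders and /local/data as a hypothetical local path:
the s3:// scheme only sees data that Hadoop's block-based S3 FileSystem
wrote, so the bucket has to be populated through Hadoop itself, not
through ordinary S3 tools.

# Write through Hadoop's block-based S3 FileSystem first; only data
# written this way is visible via the s3:// scheme.
bin/hadoop fs -put /local/data s3://ID:SECRET@BUCKET/data

# distcp can then read it back into HDFS on the cluster.
bin/hadoop distcp s3://ID:SECRET@BUCKET/data hdfs://localhost:9000/input

The credentials can also go in hadoop-site.xml as fs.s3.awsAccessKeyId
and fs.s3.awsSecretAccessKey instead of being embedded in the URI.
Buckets filled by other S3 tools only become readable once the native
S3 support from HADOOP-930 is integrated (it later shipped as the
s3n:// scheme).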