Hi, I'm using the current Hadoop EC2 image (ami-ee53b687), and I'm having some trouble getting Hadoop to access S3. Specifically, I'm trying to copy files from my bucket into HDFS on the running cluster, so (on the master of the booted cluster) I do:
hadoop-0.17.0 einar$ bin/hadoop distcp s3://ID:SECRET@BUCKET/ input
08/05/29 14:10:44 INFO util.CopyFiles: srcPaths=[s3://ID:SECRET@BUCKET/]
08/05/29 14:10:44 INFO util.CopyFiles: destPath=input
08/05/29 14:10:46 WARN fs.FileSystem: "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
With failures, global counters are inaccurate; consider running with -i
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source s3://ID:SECRET@BUCKET/ does not exist.
        at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:578)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:594)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:743)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:763)

...which clearly doesn't work. The ID and SECRET are right: if I change them I instead get

org.jets3t.service.S3ServiceException: S3 HEAD request failed. ResponseCode=403, ResponseMessage=Forbidden

...etc. I suspect it might be a more general problem, because if I do:

bin/hadoop fs -ls s3://ID:SECRET@BUCKET/

I get:

ls: Cannot access s3://ID:SECRET@BUCKET/ : No such file or directory.

...even though the bucket is there and has a lot of data in it.

Any thoughts?

Cheers,
Einar
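
P.S. In case it helps frame the question: my understanding (from the Hadoop S3 wiki page) is that the AWS credentials can also be set in conf/hadoop-site.xml rather than embedded in the URI, roughly like the sketch below (ID and SECRET are placeholders for my real keys), so the source could then be written as just s3://BUCKET/. I mention it only in case the URI form itself is part of the problem.

  <!-- AWS credentials for the s3:// filesystem; ID and SECRET are placeholders -->
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>SECRET</value>
  </property>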