problem using top level s3 buckets as input/output directories
--------------------------------------------------------------

                 Key: HADOOP-5805
                 URL: https://issues.apache.org/jira/browse/HADOOP-5805
             Project: Hadoop Core
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 0.18.3
         Environment: ec2, cloudera AMI, 20 nodes
            Reporter: Arun Jacob


When I specify top-level S3 buckets as input or output directories, I get the following exception:

hadoop jar subject-map-reduce.jar s3n://infocloud-input s3n://infocloud-output

java.lang.IllegalArgumentException: Path must be absolute: s3n://infocloud-output
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.pathToKey(NativeS3FileSystem.java:246)
        at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:319)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
        at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:109)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:738)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
        at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.run(SubjectMRDriver.java:63)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at com.evri.infocloud.prototype.subjectmapreduce.SubjectMRDriver.main(SubjectMRDriver.java:25)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

The workaround is to specify the input/output buckets with sub-directories:

hadoop jar subject-map-reduce.jar s3n://infocloud-input/input-subdir s3n://infocloud-output/output-subdir
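For context, the failure appears to come from the key-derivation step: a bare bucket URI such as s3n://infocloud-output has an empty path component, so an absolute-path check in NativeS3FileSystem.pathToKey trips. Below is a minimal sketch of that behavior using java.net.URI; the method name and check are modeled on the stack trace, not copied from the Hadoop source.

```java
import java.net.URI;

public class BucketRootDemo {
    // Hypothetical re-creation of the check suggested by the stack trace:
    // the S3 key is the URI path minus its leading slash, and the path
    // must therefore be absolute (start with "/").
    static String pathToKey(URI uri) {
        String path = uri.getPath();
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("Path must be absolute: " + uri);
        }
        return path.substring(1); // strip leading slash to get the S3 key
    }

    public static void main(String[] args) {
        // Bucket plus sub-directory: path component is "/output-subdir",
        // so the check passes and the key is "output-subdir".
        System.out.println(pathToKey(URI.create("s3n://infocloud-output/output-subdir")));

        // Bare bucket: "infocloud-output" is parsed as the URI authority and
        // the path component is "" (empty), so the check fails.
        try {
            pathToKey(URI.create("s3n://infocloud-output"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the sketch prints the derived key for the sub-directory form and the "Path must be absolute" message for the bare-bucket form, mirroring the failing and working commands above.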



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.