[ https://issues.apache.org/jira/browse/HADOOP-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth updated HADOOP-13237:
-----------------------------------
    Attachment: HADOOP-13237.001.patch

Hello [~ste...@apache.org]. I got curious about this, and I think I have a solution, so I'm reopening and attaching a patch. This is an incomplete patch just to communicate the idea, so I won't click Submit Patch yet.

I mentioned before that I think anonymous access should be opt-in only through explicit configuration, so users don't mistakenly set up an insecure deployment. Instead of adding a new property, I now think the existing {{fs.s3a.aws.credentials.provider}} should be fine for this. By setting it equal to {{AnonymousAWSCredentialsProvider}}, it should bypass the credentials chain (which insists on finding non-null credentials) and instead use anonymous credentials directly.

Unfortunately, there is a bug with that. The reflection-based credential provider initialization logic demands that the class have a constructor that accepts a {{URI}} and a {{Configuration}}. That wouldn't make sense for an {{AnonymousAWSCredentialsProvider}}, so I've added a fallback path to the initialization to support calling a default constructor.

I tested this by removing my S3A credentials from configuration and trying to access the public landsat-pds bucket. I was able to repro the bug you reported. Then, I applied my patch, retried, and it worked fine.

{code}
> hadoop fs -cat s3a://landsat-pds/run_info.json
cat: doesBucketExist on landsat-pds: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain: Unable to load AWS credentials from any provider in the chain

> hadoop fs -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider -cat s3a://landsat-pds/run_info.json
{"active_run": "unknown on ip-10-144-75-61 started at 2016-06-06 18:09:24.791372 (landsat_ingestor_exec.py)", "last_run": 4215}
{code}

Is this what you had in mind?
If so, let me know, and I'll finish off the remaining work for this patch:
# Add a unit test for anonymous access.
# Update documentation of {{fs.s3a.aws.credentials.provider}} in core-default.xml.
# Update hadoop-aws site documentation with more discussion of {{fs.s3a.aws.credentials.provider}}.
# Any other feedback from you or other code reviewers.

> s3a initialization against public bucket fails if caller lacks any credentials
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-13237
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13237
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-13237.001.patch
>
>
> If an S3 bucket is public, anyone should be able to read from it.
> However, you cannot create an s3a client bonded to a public bucket unless you
> have some credentials; the {{doesBucketExist()}} check rejects the call.
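
For readers following along, the constructor fallback described in the comment above can be illustrated with a small, self-contained sketch. This is not the attached patch and does not use Hadoop or AWS SDK classes: {{Conf}}, {{Provider}}, {{UriConfProvider}}, {{AnonymousProvider}} and {{create}} are hypothetical stand-ins invented for this example. It only shows the general technique of trying a two-argument constructor reflectively and falling back to a default constructor when none exists.

```java
import java.lang.reflect.Constructor;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

final class ProviderFallbackSketch {

    /** Hypothetical stand-in for a configuration object (not Hadoop's Configuration). */
    static final class Conf {
        final Map<String, String> values = new HashMap<>();
    }

    /** Hypothetical stand-in for a credentials provider interface. */
    interface Provider {
        String describe();
    }

    /** A provider that needs the filesystem URI and configuration to build itself. */
    static final class UriConfProvider implements Provider {
        private final URI uri;

        public UriConfProvider(URI uri, Conf conf) {
            this.uri = uri;
        }

        @Override
        public String describe() {
            return "uri-conf:" + uri.getHost();
        }
    }

    /** A provider with nothing to configure, so it only has a default constructor. */
    static final class AnonymousProvider implements Provider {
        public AnonymousProvider() {
        }

        @Override
        public String describe() {
            return "anonymous";
        }
    }

    /**
     * Instantiates cls reflectively: first tries the (URI, Conf) constructor,
     * and falls back to the no-argument constructor if that one is missing.
     */
    static Provider create(Class<? extends Provider> cls, URI uri, Conf conf) {
        try {
            Constructor<? extends Provider> ctor =
                cls.getDeclaredConstructor(URI.class, Conf.class);
            return ctor.newInstance(uri, conf);
        } catch (NoSuchMethodException noTwoArgCtor) {
            // Fallback path: the class does not accept (URI, Conf).
            try {
                return cls.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("No usable constructor on " + cls, e);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + cls, e);
        }
    }

    public static void main(String[] args) {
        URI uri = URI.create("s3a://landsat-pds/run_info.json");
        Conf conf = new Conf();
        System.out.println(create(UriConfProvider.class, uri, conf).describe());
        System.out.println(create(AnonymousProvider.class, uri, conf).describe());
    }
}
```

Catching only {{NoSuchMethodException}} for the fallback means that a two-argument constructor which exists but throws is still reported as an error instead of being silently replaced by the default constructor.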