[
https://issues.apache.org/jira/browse/HADOOP-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HADOOP-13237:
-----------------------------------
Attachment: HADOOP-13237.001.patch
Hello [[email protected]]. I got curious about this, and I think I have a
solution, so I'm reopening and attaching a patch. This is an incomplete patch
just to communicate the idea, so I won't click Submit Patch yet.
I mentioned before that I think anonymous access should be opt-in only through
explicit configuration, so users don't mistakenly set up an insecure
deployment. Instead of adding a new property, I now think the existing
{{fs.s3a.aws.credentials.provider}} should be fine for this. Setting it to
{{AnonymousAWSCredentialsProvider}} should bypass the credentials chain (which
insists on finding non-null credentials) and use anonymous credentials
directly.
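To illustrate the opt-in idea, here is a minimal, self-contained sketch. It is not the code in the patch: the {{Credentials}} class, the {{resolve}} and {{fromChain}} methods, and the use of a plain {{Map}} in place of a Hadoop {{Configuration}} are all hypothetical stand-ins. It only shows the intended behaviour: the chain fails when nothing supplies credentials, and anonymous access happens only when the property names the anonymous provider explicitly.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class AnonymousOptIn {
    static final String PROVIDER_KEY = "fs.s3a.aws.credentials.provider";
    static final String ANONYMOUS =
        "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider";

    /** Hypothetical credentials holder; both fields null means anonymous. */
    static final class Credentials {
        final String accessKey;
        final String secretKey;

        Credentials(String accessKey, String secretKey) {
            this.accessKey = accessKey;
            this.secretKey = secretKey;
        }

        boolean isAnonymous() {
            return accessKey == null && secretKey == null;
        }
    }

    /** Walks the chain and insists on finding credentials somewhere. */
    static Credentials fromChain(List<Supplier<Optional<Credentials>>> chain) {
        for (Supplier<Optional<Credentials>> link : chain) {
            Optional<Credentials> found = link.get();
            if (found.isPresent()) {
                return found.get();
            }
        }
        throw new IllegalStateException(
            "Unable to load credentials from any provider in the chain");
    }

    /** Anonymous access is used only when explicitly configured (opt-in). */
    static Credentials resolve(Map<String, String> conf,
                               List<Supplier<Optional<Credentials>>> chain) {
        if (ANONYMOUS.equals(conf.get(PROVIDER_KEY))) {
            return new Credentials(null, null); // bypass the chain entirely
        }
        return fromChain(chain);
    }

    public static void main(String[] args) {
        // A chain in which no link can supply credentials.
        Supplier<Optional<Credentials>> nothingConfigured = Optional::empty;
        List<Supplier<Optional<Credentials>>> chain = List.of(nothingConfigured);

        Credentials c = resolve(Map.of(PROVIDER_KEY, ANONYMOUS), chain);
        System.out.println("anonymous: " + c.isAnonymous());
    }
}
```

With an empty configuration map, the same call to {{resolve}} throws instead of silently falling back to anonymous access, which is the property that keeps users from setting up an insecure deployment by mistake.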
Unfortunately, there is a bug with that. The reflection-based credential
provider initialization logic demands that the class have a constructor that
accepts a {{URI}} and a {{Configuration}}. That wouldn't make sense for an
{{AnonymousAWSCredentialsProvider}}, so I've added a fallback path to the
initialization to support calling a default constructor.
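The fallback can be sketched roughly as follows. This is an illustration of the technique only, not the code in the attached patch: {{Conf}} is a hypothetical stand-in for Hadoop's {{Configuration}}, and {{CredentialsProvider}}, {{UriAwareProvider}}, {{AnonymousProvider}} and {{create}} are made-up names so that the example runs without Hadoop or the AWS SDK on the classpath.

```java
import java.lang.reflect.Constructor;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class CredentialProviderLoader {

    /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
    static class Conf {
        private final Map<String, String> props = new HashMap<>();

        void set(String key, String value) {
            props.put(key, value);
        }

        String get(String key) {
            return props.get(key);
        }
    }

    /** Hypothetical stand-in for the AWS credentials provider interface. */
    interface CredentialsProvider {
        String describe();
    }

    /** A provider that needs the filesystem URI and the configuration. */
    static class UriAwareProvider implements CredentialsProvider {
        private final URI uri;

        public UriAwareProvider(URI uri, Conf conf) {
            this.uri = uri;
        }

        public String describe() {
            return "uri-aware provider for " + uri;
        }
    }

    /** A provider with nothing to configure: default constructor only. */
    static class AnonymousProvider implements CredentialsProvider {
        public AnonymousProvider() {
        }

        public String describe() {
            return "anonymous provider";
        }
    }

    /**
     * Tries the (URI, Conf) constructor first; if the class does not declare
     * one, falls back to the default constructor.
     */
    static CredentialsProvider create(String className, URI uri, Conf conf)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);
        try {
            Constructor<?> twoArg =
                clazz.getDeclaredConstructor(URI.class, Conf.class);
            return (CredentialsProvider) twoArg.newInstance(uri, conf);
        } catch (NoSuchMethodException e) {
            // Fallback path: no (URI, Conf) constructor, so use the no-arg one.
            Constructor<?> noArg = clazz.getDeclaredConstructor();
            return (CredentialsProvider) noArg.newInstance();
        }
    }

    public static void main(String[] args) throws Exception {
        URI uri = URI.create("s3a://landsat-pds/");
        Conf conf = new Conf();
        System.out.println(
            create(UriAwareProvider.class.getName(), uri, conf).describe());
        System.out.println(
            create(AnonymousProvider.class.getName(), uri, conf).describe());
    }
}
```

Only the lookup of the two-argument constructor is wrapped in the fallback; if the class has neither constructor, the second {{NoSuchMethodException}} propagates to the caller, so a misconfigured class name or an unsuitable class still fails loudly.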
I tested this by removing my S3A credentials from configuration and trying to
access the public landsat-pds bucket. I was able to reproduce the bug you
reported. Then I applied my patch, retried, and it worked.
{code}
> hadoop fs -cat s3a://landsat-pds/run_info.json
cat: doesBucketExist on landsat-pds: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain: Unable to load AWS credentials from any provider in the chain
> hadoop fs -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider -cat s3a://landsat-pds/run_info.json
{"active_run": "unknown on ip-10-144-75-61 started at 2016-06-06 18:09:24.791372 (landsat_ingestor_exec.py)", "last_run": 4215}
{code}
Is this what you had in mind? If so, let me know, and I'll finish off the
remaining work for this patch:
# Add a unit test for anonymous access.
# Update the documentation of {{fs.s3a.aws.credentials.provider}} in core-default.xml.
# Update the hadoop-aws site documentation with more discussion of {{fs.s3a.aws.credentials.provider}}.
# Address any other feedback from you or other code reviewers.
> s3a initialization against public bucket fails if caller lacks any credentials
> ------------------------------------------------------------------------------
>
> Key: HADOOP-13237
> URL: https://issues.apache.org/jira/browse/HADOOP-13237
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HADOOP-13237.001.patch
>
>
> If an S3 bucket is public, anyone should be able to read from it.
> However, you cannot create an s3a client bonded to a public bucket unless you
> have some credentials; the {{doesBucketExist()}} check rejects the call.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)