[ 
https://issues.apache.org/jira/browse/HADOOP-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HADOOP-13237:
-----------------------------------
    Attachment: HADOOP-13237.001.patch

Hello [~ste...@apache.org].  I got curious about this, and I think I have a 
solution, so I'm reopening and attaching a patch.  This is an incomplete patch 
just to communicate the idea, so I won't click Submit Patch yet.

I mentioned before that I think anonymous access should be opt-in only through 
explicit configuration, so users don't mistakenly set up an insecure 
deployment.  Instead of adding a new property, I now think the existing 
{{fs.s3a.aws.credentials.provider}} should be fine for this.  Setting it to 
{{AnonymousAWSCredentialsProvider}} should bypass the credentials chain (which 
insists on finding non-null credentials) and use anonymous credentials 
directly.

Unfortunately, there is a bug with that.  The reflection-based credential 
provider initialization logic demands that the class have a constructor that 
accepts a {{URI}} and a {{Configuration}}.  That wouldn't make sense for an 
{{AnonymousAWSCredentialsProvider}}, so I've added a fallback path to the 
initialization to support calling a default constructor.
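
To illustrate the fallback idea, here is a minimal, self-contained sketch. It is not the code from the attached patch: the {{CredentialsProvider}} interface, the two provider classes, and {{ProviderLoader}} are hypothetical names, and {{java.util.Properties}} stands in for Hadoop's {{Configuration}} so the example runs without any Hadoop or AWS dependencies.

```java
import java.lang.reflect.Constructor;
import java.net.URI;
import java.util.Properties;

/** Hypothetical stand-in for a credentials provider interface. */
interface CredentialsProvider {
    String describe();
}

/** Provider that wants the filesystem URI and the configuration. */
class ConfiguredProvider implements CredentialsProvider {
    private final URI uri;

    public ConfiguredProvider(URI uri, Properties conf) {
        this.uri = uri;
    }

    public String describe() {
        return "configured for " + uri;
    }
}

/** Provider with only a default constructor, like an anonymous provider. */
class AnonymousProvider implements CredentialsProvider {
    public AnonymousProvider() {
    }

    public String describe() {
        return "anonymous";
    }
}

class ProviderLoader {
    /**
     * Instantiates the named provider class. Tries the (URI, Properties)
     * constructor first; if the class does not declare one, falls back to
     * the default constructor.
     */
    static CredentialsProvider load(String className, URI uri, Properties conf)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        if (!CredentialsProvider.class.isAssignableFrom(cls)) {
            throw new IllegalArgumentException(
                className + " is not a CredentialsProvider");
        }
        try {
            Constructor<?> ctor =
                cls.getDeclaredConstructor(URI.class, Properties.class);
            return (CredentialsProvider) ctor.newInstance(uri, conf);
        } catch (NoSuchMethodException e) {
            // Fallback path: no (URI, Properties) constructor exists.
            Constructor<?> ctor = cls.getDeclaredConstructor();
            return (CredentialsProvider) ctor.newInstance();
        }
    }

    public static void main(String[] args) throws Exception {
        URI uri = URI.create("s3a://landsat-pds/");
        Properties conf = new Properties();
        System.out.println(
            load(ConfiguredProvider.class.getName(), uri, conf).describe());
        System.out.println(
            load(AnonymousProvider.class.getName(), uri, conf).describe());
    }
}
```

Only {{NoSuchMethodException}} triggers the fallback in this sketch, so a failure thrown from inside an existing two-argument constructor still propagates instead of being masked by a second attempt.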

I tested this by removing my S3A credentials from configuration and trying to 
access the public landsat-pds bucket.  I was able to repro the bug you 
reported.  Then, I applied my patch, retried, and it worked fine.

{code}
> hadoop fs -cat s3a://landsat-pds/run_info.json
cat: doesBucketExist on landsat-pds: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain: Unable to load AWS credentials from any provider in the chain

> hadoop fs -Dfs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider -cat s3a://landsat-pds/run_info.json
{"active_run": "unknown on ip-10-144-75-61 started at 2016-06-06 18:09:24.791372 (landsat_ingestor_exec.py)", "last_run": 4215}
{code}

Is this what you had in mind?  If so, let me know, and I'll finish off the 
remaining work for this patch:

# Add a unit test for anonymous access.
# Update documentation of {{fs.s3a.aws.credentials.provider}} in core-default.xml.
# Update hadoop-aws site documentation with more discussion of {{fs.s3a.aws.credentials.provider}}.
# Address any other feedback from you or other code reviewers.

> s3a initialization against public bucket fails if caller lacks any credentials
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-13237
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13237
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-13237.001.patch
>
>
> If an S3 bucket is public, anyone should be able to read from it.
> However, you cannot create an s3a client bonded to a public bucket unless you 
> have some credentials; the {{doesBucketExist()}} check rejects the call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
