[
https://issues.apache.org/jira/browse/HADOOP-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575509#comment-13575509
]
Andy Sautins commented on HADOOP-9293:
--------------------------------------
Steven,
Correct, this is intended largely for clients. The particular use case I am
trying to address is that when an Amazon Elastic MapReduce ( EMR ) cluster is
created, I want to be able to perform client-side actions on S3 with the same
credentials mechanism I use to manage the cluster ( the elastic-mapreduce
command ). Example actions include an ls of an S3 bucket or performing a
require in Pig from an S3 bucket. As I understand it, both are client-side
activities and require S3 authentication.
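For reference, the credentials file that the elastic-mapreduce command reads is a small JSON document, roughly like the following sketch ( field names are my understanding of the EMR CLI format; all values are placeholders ):

```json
{
  "access_id": "AKIAEXAMPLEEXAMPLE",
  "private_key": "exampleSecretKeyExampleSecretKey",
  "keypair": "my-ec2-keypair",
  "key-pair-file": "/path/to/my-ec2-keypair.pem",
  "log_uri": "s3n://my-bucket/emr-logs/",
  "region": "us-east-1"
}
```

Only the access_id and private_key fields matter for S3 authentication; the rest are EMR cluster-management settings.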
Your solution of using XInclude definitely works ( and is something I didn't
know about, so thank you for pointing it out ). The only downside I see to the
approach you describe is that anyone who also wants to use other tools like
s3cmd or elastic-mapreduce would need to maintain two different files: the
JSON file those tools use and the XML file I use through Hadoop, like the one
you describe. From a credential-management standpoint that is less ideal than
having a single place on the client to manage credentials.
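For other readers of this issue, the XInclude approach you describe would look something like this in hadoop-site.xml ( file path and name are illustrative ):

```xml
<?xml version="1.0"?>
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Pull the S3 key properties in from a separately managed XML file. -->
  <xi:include href="/etc/hadoop/s3-credentials.xml"/>
  <!-- ... other site properties ... -->
</configuration>
```

That included file would then carry the fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey properties, which is exactly the second XML file this patch tries to avoid.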
I agree that from a security standpoint the -site.xml file needs the same
level of trust as the credentials file. This patch is suggested as a way to
limit the number of copies of the credentials: by reading the same JSON file
as the EMR Ruby client, I can use both the EMR client tools
( elastic-mapreduce, s3cmd ) and Hadoop commands with a single credentials
file. The other alternative is to change the Amazon tools to read XML instead
of JSON, but given the size of that change, this patch seems to make more
sense to me.
Does it make more sense now, or do you still think it's not the right approach?
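To make the intended behavior concrete, here is a minimal sketch of pulling the two S3 keys out of an EMR-style JSON credentials file without adding a JSON library dependency ( class and method names are illustrative, not the patch's actual code; field names assume the EMR CLI format ):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AwsCredentialsSketch {

    // Extract a top-level string field (e.g. "access_id", "private_key")
    // from a flat EMR-style credentials JSON document using a simple regex.
    static String extract(String json, String field) {
        Pattern p = Pattern.compile(
            "\"" + Pattern.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Placeholder contents of ~/.credentials.json; real code would read
        // the path configured in fs.s3n.awsCredentialsFile.
        String json = "{ \"access_id\": \"AKIAEXAMPLE\", "
                    + "\"private_key\": \"secretExample\" }";
        System.out.println("access key: " + extract(json, "access_id"));
        System.out.println("secret key: " + extract(json, "private_key"));
    }
}
```

The actual patch can of course parse the file however the EMR Ruby client does; the point is only that the two key fields are trivially recoverable from the shared file.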
> For S3 use credentials file
> ---------------------------
>
> Key: HADOOP-9293
> URL: https://issues.apache.org/jira/browse/HADOOP-9293
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 1.0.2
> Environment: Linux
> Reporter: Andy Sautins
> Priority: Minor
> Labels: features, newbie
> Attachments: HADOOP-9293.patch
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The following document describes the current way that S3 credentials can be
> specified ( http://wiki.apache.org/hadoop/AmazonS3 ). In summary they are:
> * in the S3 URI.
> * in the hadoop-site.xml file as
> ** fs.s3.awsAccessKeyId
> ** fs.s3.awsSecretAccessKey
> ** fs.s3n.awsAccessKeyId
> ** fs.s3n.awsSecretAccessKey
> The Amazon EMR tool elastic-mapreduce already provides the ability to use a
> credentials file ( see
> http://s3.amazonaws.com/awsdocs/ElasticMapReduce/latest/emr-qrc.pdf ).
> I propose that Hadoop allow roughly the same credentials-file access that
> elastic-mapreduce currently provides. This would allow centralized
> administration of credentials on the client, which should be a positive for
> security.
> I propose the following properties:
> {quote}
>
> <property><name>fs.s3.awsCredentialsFile</name><value>/path/to/file</value></property>
>
> <property><name>fs.s3n.awsCredentialsFile</name><value>/path/to/file</value></property>
> {quote}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira