[
https://issues.apache.org/jira/browse/HADOOP-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575824#comment-13575824
]
Steve Loughran commented on HADOOP-9293:
----------------------------------------
# I thought EMR set the credentials in the site XML config, at least for the
standard source/dest - meaning anything running in the cluster can get at them.
But I could be mistaken; EMR isn't something I've used much.
# In theory you could use XSL to take the credentials out of the XML config
file and generate a properties file. In practice it'd be hard, and in my
experience there'd probably be spurious whitespace in the output. It'd be
simpler to write a 10-line piece of Python code to load the XML file and
extract the relevant settings - see the sketch below.
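Something along these lines would do it - an untested sketch, assuming the
standard <configuration>/<property>/<name>/<value> layout of a Hadoop site
file:
{code:python}
#!/usr/bin/env python
# Untested sketch: pull the S3 credential properties out of a Hadoop
# site XML file and print them as key=value properties lines.
import sys
import xml.etree.ElementTree as ET

WANTED = [
    "fs.s3.awsAccessKeyId", "fs.s3.awsSecretAccessKey",
    "fs.s3n.awsAccessKeyId", "fs.s3n.awsSecretAccessKey",
]

def main(conf_path):
    root = ET.parse(conf_path).getroot()
    for prop in root.findall("property"):
        name = prop.findtext("name", "").strip()
        if name in WANTED:
            # strip() guards against the stray-whitespace problem noted above
            print("%s=%s" % (name, prop.findtext("value", "").strip()))

if __name__ == "__main__":
    main(sys.argv[1])
{code}
Run it as extract_s3_creds.py /path/to/hadoop-site.xml > s3.properties (the
script name is mine, not anything that exists in the tree).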
Returning to the patch - which, as you can guess, I'm not over-enthusiastic about:
* the IOException conversion to InvalidArgument should include the inner
exception, for anyone trying to diagnose what went wrong.
* the exceptions raised on a missing key/value would need to be extended to
mention the extra way of setting the properties.
* all other S3-related docs would need changing.
* the S3 test suite needs to include something that verifies that you can
authenticate this way too.
I wouldn't bother with that unless you get support from others that this is
going to be useful, because I don't see it being useful server-side, where the
item that gets serialized around the cluster is the XML jobconf file. Though
even there I'm thinking I should verify that the XInclude expansion happens
before the file is sent around the network...
Summary: I'd go for the Python option. If you do write a script to extract the
credentials from the XML, Apache Whirr could be a good home for it - there are
other .py helper scripts there already.
> For S3 use credentials file
> ---------------------------
>
> Key: HADOOP-9293
> URL: https://issues.apache.org/jira/browse/HADOOP-9293
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 1.0.2
> Environment: Linux
> Reporter: Andy Sautins
> Priority: Minor
> Labels: features, newbie
> Attachments: HADOOP-9293.patch
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The following document describes the current ways that S3 credentials can be
> specified ( http://wiki.apache.org/hadoop/AmazonS3 ). In summary, they are:
> * in the S3 URI.
> * in the hadoop-site.xml file as
> ** fs.s3.awsAccessKeyId
> ** fs.s3.awsSecretAccessKey
> ** fs.s3n.awsAccessKeyId
> ** fs.s3n.awsSecretAccessKey
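> For example, with placeholder keys (the URI form is described on the wiki
> page above; mybucket is illustrative):
> {quote}
> s3n://ACCESS_KEY_ID:SECRET_ACCESS_KEY@mybucket/path
> <property><name>fs.s3n.awsAccessKeyId</name><value>ACCESS_KEY_ID</value></property>
> <property><name>fs.s3n.awsSecretAccessKey</name><value>SECRET_ACCESS_KEY</value></property>
> {quote}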
> The Amazon EMR tool elastic-mapreduce already provides the ability to use a
> credentials file ( see
> http://s3.amazonaws.com/awsdocs/ElasticMapReduce/latest/emr-qrc.pdf ).
> I would propose that we allow roughly the same access to credentials through
> a credentials file as is currently provided by elastic-mapreduce. This would
> allow for centralized administration of credentials, which should be a
> positive for security.
> I propose the following properties:
> {quote}
>
> <property><name>fs.s3.awsCredentialsFile</name><value>/path/to/file</value></property>
>
> <property><name>fs.s3n.awsCredentialsFile</name><value>/path/to/file</value></property>
> {quote}
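> To sketch the lookup order the proposal implies (hypothetical Python, not the
> attached patch; the key=value file format and the accessKeyId/secretAccessKey
> names inside it are my assumptions - elastic-mapreduce's own file is JSON):
> {code:python}
> # Hypothetical sketch of the proposed lookup order, not the attached patch.
> def get_s3_credentials(conf):
>     key_id = conf.get("fs.s3.awsAccessKeyId")
>     secret = conf.get("fs.s3.awsSecretAccessKey")
>     if key_id and secret:               # inline configuration wins
>         return key_id, secret
>     path = conf.get("fs.s3.awsCredentialsFile")
>     if path is None:
>         raise ValueError("no S3 credentials configured")
>     creds = {}
>     with open(path) as f:               # centrally administered file
>         for line in f:
>             line = line.strip()
>             if line and not line.startswith("#"):
>                 name, _, value = line.partition("=")
>                 creds[name.strip()] = value.strip()
>     return creds["accessKeyId"], creds["secretAccessKey"]
> {code}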