Laszlo Gaal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8294 )
Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
......................................................................
IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers
to facilitate s3 access without having to carry AWS credentials around
in environment variables, where they were prone to leaking into log
files posted in public places.
This change paves the way for Impala build and test jobs to use the IAM
roles to access s3 buckets. There are a few minor changes that allow
this to happen:
Changes to the configuration script:
1. bin/impala-config.sh stops setting the AWS_* environment variables
to dummy default values. When AWS credentials are not supplied in
the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
these variables are unset (removed from the environment), because
otherwise they would preempt authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
optional. They are still accepted to allow for private test runs
accessing private/nondefault buckets with custom credentials.
3. bin/impala-config.sh now checks if credentials are supplied in the
AWS_* variables or via the IAM role.
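The configuration-script behavior described above could be sketched roughly as follows. This is an illustration only, not the code in bin/impala-config.sh: the function names normalize_aws_env and aws_credential_source are hypothetical, the caller is assumed to pass in the IAM role name it discovered (if any), and treating a half-supplied key pair as "not supplied" is an assumption.

```shell
# Hypothetical helpers; names are illustrative, not from impala-config.sh.

# Unset both AWS_* credential variables unless both hold non-empty values,
# so that empty or partial values cannot preempt IAM role authentication.
# Assumption: a half-supplied pair counts as "not supplied".
normalize_aws_env() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
  fi
}

# Print where credentials would come from: "env", "iam-role", or "none".
# $1: IAM role name found by the caller (empty or omitted if there is none).
aws_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "env"
  elif [ -n "${1:-}" ]; then
    echo "iam-role"
  else
    echo "none"
  fi
}
```

A caller would run normalize_aws_env first and then aws_credential_source with the role name, which on an EC2 VM could for example be obtained from the instance metadata service. Explicit credentials take precedence in this sketch, which matches the note that custom credentials are still accepted for private test runs.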
Changes to the frontend tests:
1. Some front-end tests still referenced the old s3 connector s3n:,
which does not support s3 auth via IAM roles. These locations are
changed to use the newer s3a:, which is the connector capable of
using IAM roles for authentication and which is used in all other
code locations.
Changes to the minicluster setup:
1. As a corollary, the s3n: configuration sections are removed from
core-site.xml.tmpl.
2. Remove empty AWS credentials from core-site.xml.tmpl:
The minicluster setup script substitutes values from environment
variables into Hadoop *-site.xml config files when setting up
the minicluster runtime environment. The configuration file
core-site.xml.tmpl contains a section for s3 access, including
AWS credentials.
Impala can now use IAM roles for s3 access; this requires the removal
of environment variables holding AWS credentials, which
1. breaks the substitution logic in testdata/cluster/admin, and
2. would break the IAM-based credentials if empty credentials were
supplied in core-site.xml.
The fix for all of the above issues is to remove the AWS credential
settings from the generated core-site.xml if both AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY environment variables are absent or empty.
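As a rough illustration of that rule, the sketch below emits the s3a credential properties only when at least one of the two variables is non-empty. It is not the logic in testdata/cluster/admin: the function name s3a_credential_properties is hypothetical, and it assumes the credentials are carried by the standard Hadoop properties fs.s3a.access.key and fs.s3a.secret.key.

```shell
# Hypothetical helper; not the actual substitution code in testdata/cluster/admin.

# Print the s3a credential <property> elements for core-site.xml.
# Prints nothing when both variables are absent or empty, so the generated
# file carries no empty credentials that could shadow the IAM role.
# Note: values are inserted verbatim, without XML escaping.
s3a_credential_properties() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] && [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    return 0
  fi
  cat <<EOF
<property>
  <name>fs.s3a.access.key</name>
  <value>${AWS_ACCESS_KEY_ID:-}</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>${AWS_SECRET_ACCESS_KEY:-}</value>
</property>
EOF
}
```

With neither variable set, the function produces no output and the generated configuration leaves authentication to the IAM role.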
Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
M bin/impala-config.sh
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
5 files changed, 52 insertions(+), 29 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/1
--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal <[email protected]>