Laszlo Gaal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8294 )


Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
......................................................................

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers
to facilitate s3 access without having to carry around AWS credentials
in environment variables, from where they were prone to escape to log
files posted in public places.

This change paves the way for Impala build and test jobs to use the IAM
roles to access s3 buckets. There are a few minor changes that allow
this to happen:

Changes to the configuration script:
1. bin/impala-config.sh stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY,
   these variables are unset (removed from the environment), otherwise
   they would preempt authentication based on the IAM role.
2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.
3. bin/impala-config.sh now checks if credentials are supplied in the
   AWS_* variables or via the IAM role.
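The environment handling described above could look roughly like the following sketch. It is not the actual code from bin/impala-config.sh: the function name choose_s3_auth and the variable S3_AUTH_SOURCE are hypothetical, and the sketch does not probe the instance for an attached IAM role; it only decides which credential source the environment implies.

```shell
# Hypothetical helper; choose_s3_auth and S3_AUTH_SOURCE are illustrative
# names, not identifiers from bin/impala-config.sh.
choose_s3_auth() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] && [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    # No credentials supplied: remove the variables entirely so that empty
    # values cannot preempt authentication based on the IAM role.
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    S3_AUTH_SOURCE="iam-role"
  else
    # Credentials supplied by the caller: keep and export them.
    # (Assumption: a partial pair is passed through without validation.)
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    S3_AUTH_SOURCE="environment"
  fi
}
```

Calling choose_s3_auth in the current shell (not in a command substitution) matters here, because the unset has to affect the environment that later build and test steps inherit.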

Changes to the frontend tests:
1. Some frontend tests still referenced the old s3 connector s3n:,
   which does not support s3 auth via IAM roles. These locations are
   changed to use the newer s3a:, the connector that can use IAM roles
   for authentication and that is used in all other code locations.

Changes to the minicluster setup:
1. As a corollary, the s3n: configuration sections are removed from
   core-site.xml.tmpl.
2. Remove empty AWS credentials from core-site.xml.tmpl:

   The minicluster setup script substitutes values from environment
   variables into Hadoop *-site.xml config files when setting up
   the minicluster runtime environment. The configuration file
   core-site.xml.tmpl contains a section for s3 access, including
   AWS credentials.

   Impala can now use IAM roles for s3 access; this requires the removal
   of environment variables holding AWS credentials, which
   1. breaks the substitution logic in testdata/cluster/admin, and
   2. would break the IAM-based credentials if empty credentials were
      supplied in core-site.xml

   The fix for all of the above issues is to remove the AWS credential
   settings from the generated core-site.xml if both AWS_ACCESS_KEY_ID and
   AWS_SECRET_ACCESS_KEY environment variables are absent or empty.
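As an illustration of that fix, here is a minimal sketch of a filter that drops the credential properties from a generated core-site.xml. It is an assumption about how such a step could be written, not the logic in testdata/cluster/admin: the function name strip_empty_s3_credentials is hypothetical, and the sketch assumes the credentials are configured through the standard Hadoop s3a properties fs.s3a.access.key and fs.s3a.secret.key, each in its own <property> block.

```shell
# Hypothetical filter: reads core-site.xml on stdin and writes it to stdout.
# When both AWS_* variables are absent or empty, the <property> blocks for
# fs.s3a.access.key and fs.s3a.secret.key are dropped.
strip_empty_s3_credentials() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] || [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    cat   # credentials were supplied: pass the file through unchanged
    return
  fi
  awk '
    # Start buffering at the opening tag of a property block.
    /<property>/ { buffering = 1; block = "" }
    buffering {
      block = block $0 "\n"
      if ($0 ~ /<\/property>/) {
        # Emit the finished block unless it names an s3a credential.
        if (block !~ /<name>fs\.s3a\.(access|secret)\.key<\/name>/)
          printf "%s", block
        buffering = 0
      }
      next
    }
    { print }
  '
}
```

A possible invocation would be `strip_empty_s3_credentials < core-site.xml > core-site.xml.new`, run after the template substitution step.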

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
M bin/impala-config.sh
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
5 files changed, 52 insertions(+), 29 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/1
--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal <[email protected]>
