Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, David Knupp, Joe McDonnell, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8294

to look at the new patch set (#3).

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
......................................................................

IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

For some time Impala in a production environment has been able to
access data stored in Amazon S3 buckets using credentials specified
in a number of ways:
- storing Amazon access keys in environment variables or in
  core-site.xml.
- using proprietary management tools to store Amazon access keys
  securely
- using Amazon IAM roles bound to VMs running in EC2.

The development minicluster environment used the first approach,
which risked leaking these keys.

This change enables Impala builds to use IAM roles to access S3
buckets when running on an Amazon EC2 virtual machine.
The changes mainly ensure that environment variables and/or Jenkins
parameters carrying the traditional AWS credentials do not conflict
with credentials supplied by the IAM role attached to the VM instance.
The change also moves the logic performing the S3 access checks
into a separate script file: bin/check-s3-access.sh.

IAM role based credentials are accessible through the EC2
instance-property mechanism; for further details see Amazon's docs at
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials

Changes to the configuration script:

1. bin/impala-config.sh stops setting the AWS_* environment variables
   to dummy default values. When AWS credentials are not supplied in
   the environment variables AWS_ACCESS_KEY_ID and
   AWS_SECRET_ACCESS_KEY, these variables are unset (removed from the
   environment), otherwise they would interfere with authentication
   based on the IAM role.

2. Having AWS credentials in the AWS_* environment variables is now
   optional. They are still accepted to allow for private test runs
   accessing private/nondefault buckets with custom credentials.

3. bin/impala-config.sh now calls bin/check-s3-access.sh to perform
   the actual S3-dependent checks. check-s3-access.sh contains the
   S3-specific logic and network access needed to check if the
   requested S3 bucket is accessible for the build.

Changes to the minicluster configuration:

1. Security credentials for the s3n: connector, located in
   core-site.xml are no longer replaced with actual AWS_ credentials
   when configuring the minicluster.
   These parameters are used for some front-end tests, which don't
   actually reach out to S3, the s3n: notation just simulates
   non-HDFS storage. For these tests to work s3n: authentication
   parameters still need to exist in core-site.xml. Their values do
   not matter, so the configuration template now has fixed dummy
   values for these parameters.

2. Remove empty s3a: security parameter sections from core-site.xml:
   The testdata/cluster/admin setup script substitutes values from
   environment variables into core-site.xml when it sets up the
   minicluster runtime environment.
   The configuration section for s3a: credentials is now completely
   removed if both of the following conditions are met:
   - the target filesystem is set to "s3"
   - the AWS credential environment variables AWS_ACCESS_KEY_ID and
     AWS_SECRET_ACCESS_KEY are both empty or missing.
   The configuration file core-site.xml.tmpl is extended with comment
   markers that delimit the section to be removed in this case.

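
For illustration only, the credential handling described in item 1 under "Changes to the configuration script" might look roughly like the sketch below. It is not the code from this patch set: the function name normalize_aws_credentials and the printed mode strings are hypothetical, and it assumes that a missing or empty value in either variable counts as "not supplied".

```shell
#!/usr/bin/env bash
# Sketch only; not the actual contents of bin/impala-config.sh.
# normalize_aws_credentials is a hypothetical helper name.
#
# If either AWS credential variable is missing or empty (an assumption
# about what "not supplied" means), remove both from the environment so
# that empty values cannot shadow credentials from the IAM role.
normalize_aws_credentials() {
  if [[ -z "${AWS_ACCESS_KEY_ID:-}" || -z "${AWS_SECRET_ACCESS_KEY:-}" ]]; then
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    echo "iam-role"      # hypothetical label: rely on the instance's IAM role
  else
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
    echo "static-keys"   # hypothetical label: use the supplied access keys
  fi
}
```

The function has to be called in the current shell (not inside a command substitution), since an unset performed in a subshell would not affect the caller's environment.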
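
The removal of the s3a: credential section (item 2 under "Changes to the minicluster configuration") could be sketched along the following lines. This is an assumption-laden illustration, not the logic in testdata/cluster/admin: the marker strings, the function name render_core_site, and the use of a TARGET_FILESYSTEM variable to carry the target filesystem are all assumptions, and the substitution of environment values into the template is left out.

```shell
#!/usr/bin/env bash
# Sketch only; marker strings and function name are hypothetical and are
# not taken from core-site.xml.tmpl or testdata/cluster/admin.
S3A_BEGIN_MARKER='<!-- BEGIN_S3A_CREDENTIALS -->'
S3A_END_MARKER='<!-- END_S3A_CREDENTIALS -->'

# Prints the template given as $1 to stdout. The marker-delimited section
# is dropped only when the target filesystem is "s3" AND both AWS
# credential variables are empty or missing; otherwise the template is
# passed through unchanged.
render_core_site() {
  local template="$1"
  if [[ "${TARGET_FILESYSTEM:-}" == "s3" \
        && -z "${AWS_ACCESS_KEY_ID:-}" \
        && -z "${AWS_SECRET_ACCESS_KEY:-}" ]]; then
    # Delete every line from the begin marker through the end marker.
    sed "/${S3A_BEGIN_MARKER}/,/${S3A_END_MARKER}/d" "$template"
  else
    cat "$template"
  fi
}
```

Deleting the whole section, rather than leaving empty property values behind, matches the stated goal of keeping empty s3a: parameters out of core-site.xml when the IAM role is expected to supply the credentials.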

Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
---
A bin/check-s3-access.sh
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
4 files changed, 163 insertions(+), 28 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/3
--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 3
Gerrit-Owner: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>