[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 3: (3 comments) > I also don't want to block progress on making this good improvement > to our S3 development, but my fear is that this script will get > more and more out of control if we don't draw a line somewhere > about what belongs in it. That's a valid point. I moved the logic to bin/check-s3-access.sh, which actually lets us reuse the the same check elsewhere if it becomes necessary or useful. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: export LOCAL_FS="file:${WAREHOUSE_LOCATION_PREFIX}" > Yeah, I'd question whether "aws ls" belongs in this script either. Done, moved to bin/check-s3-access.sh. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@303 PS2, Line 303: set AWS_A > This variable will leak out into the user's shell. Done, I moved these checks into a separate script, check-s3-access.sh http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@307 PS2, Line 307: > Good point; I'll check if wget can be set up the same way. Done. Although the check was moved to bin/check-s3-access.sh, I have replaced curl with wget, using equivalent parameters for silencing and short timeouts. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 3 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Nov 2017 22:26:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, David Knupp, Joe McDonnell, Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8294 to look at the new patch set (#3). Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs For some time Impala in a production environment has been able to access data stored in Amazon S3 buckets using credentials specified in a number of ways: - storing Amazon access keys in environment variables or in core-site.xml. - using proprietary management tools to store Amazon access keys securely - using Amazon IAM roles bound to VMs running in EC2. The development minicluster environment used the first approach, which risked leaking these keys. This change enables Impala builds to use IAM roles to access S3 buckets when running on an Amazon EC2 virtual machine. The changes mainly ensure that environment variables and/or Jenkins parameters carrying the traditional AWS credentials do not conflict with credentials supplied by the IAM role attached to the VM instance. The change also moves the logic performing the S3 access checks into a separate script file: bin/check-s3-access.sh. IAM role based credentials are accessible through the EC2 instance-property mechanism; for further details see Amazon's docs at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials Changes to the configuration script: 1. bin/impala-config.sh stops setting the AWS_* environment variables to dummy default values. When AWS credentials are not supplied in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, these variables are unset (removed from the environment), otherwise they would interfere with authentication based on the IAM role. 2. Having AWS credentials in the AWS_* environment variables is now optional. They are still accepted to allow for private test runs accessing private/nondefault buckets with custom credentials. 3. bin/impala-config.sh now calls bin/check-s3-access.sh to perform the actual S3-dependent checks. check-s3-access.sh contains the S3-specific logic and network access needed to check if the requested S3 bucket is accessible for the build. Changes to the minicluster configuration: 1. Security credentials for the s3n: connector, located in core-site.xml are no longer replaced with actual AWS_ credentials when configuring the minicluster. These parameters are used for some front-end tests, which don't actually reach out to S3, the s3n: notation just simulates non-HDFS storage. For these tests to work s3n: authentication parameters still need to exist in core-site.xml. Their values do not matter, so the configuration template now has fixed dummy values for these parameters. 2. Remove empty s3a: security parameter sections from core-site.xml: The testdata/cluster/admin setup script substitutes values from environment variables into core-site.xml when it sets up the minicluster runtime environment. The configuration section for s3a: credentials is now completely removed if both of the following conditions are met: - the target filesystem is set to "s3" - the AWS credential environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are both empty or missing. The configuration file core-site.xml.tmpl is extended with comment markers that delimit the section to be removed in this case. Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae --- A bin/check-s3-access.sh M bin/impala-config.sh M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 4 files changed, 163 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/3 -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 3 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: I also don't want to block progress on making this good improvement to our S3 development, but my fear is that this script will get more and more out of control if we don't draw a line somewhere about what belongs in it. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 09 Nov 2017 02:22:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then > I'm not sure that putting the checks into a separate script would change an Lines 252-264 seem ok - I don't have a problem with setting environment variables in this script. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then > In most of the use cases the network access is avoided, it happens only if Yeah, I'd question whether "aws ls" belongs in this script either. The performance is one thing but I just generally think we should restrict this script to doing the minimum possible to set up environment variables. I don't see why it should be extended to do heavyweight things to validate the configuration - we don't do anything to validate the vast majority of other variables that are set in this script. It does appear that people have added validations in an ad-hoc way before so if a majority of people think that that's a good idea, I can yield to that. We should also keep in mind that this script is a crappy programming environment, because it's running in the context of the user's shell and we can't use things like "set -x" and have to be careful setting variables that we don't intend to leak into the user's shell. So I think regardless this logic would be more maintainable in a separate script to validate the AWS config. My preference is that we also run that script from elsewhere to keep impala-config.sh lightweight but if other people feel strongly that impala-config.sh should be doing more validation of configs, etc then that's not the worst thing in the world. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@303 PS2, Line 303: CURL_ARGS This variable will leak out into the user's shell. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 09 Nov 2017 02:21:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then > (I think it's fine if this is put in a separate script and called from buil I'm not sure that putting the checks into a separate script would change anything. The part in lines 252-264 have to remain here because they touch environment variables that are used further in this script. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then > Does impala-config.sh really have to talk to the internet? It just tends to In most of the use cases the network access is avoided, it happens only if the build is set up to run the tests on S3. -- and talking to S3 means lots of network access anyway. Even the current version of the script performs a network access in the S3 case: line 321 uses AWS CLI to check the existence of the specified bucket. Checking the credential URL is gated by the following environment variables: - TARGET_FILESYSTEM has to be set to "s3" - S3_BUCKET has to be non-empty - AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID both have to be empty or missing before the script attempts to look for credentials by checking the URL. Even if a network check is performed, it will not go out to the internet. The specific network address belongs to the "link-local address" category (see https://en.wikipedia.org/wiki/Link-local_address), which is valid only in the immediate network neighborhood. Within EC2 the request should be server quickly; outside of EC2 I don't expect the target URL to exist at all. Based on this I could set short timeouts, so that the call fails quickly even in rare failure cases. http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@307 PS2, Line 307: if ! curl "${CURL_ARGS[@]}" ; then > curl is not a development dependency in bin/bootstrap_system.sh. I'd sugges Good point; I'll check if wget can be set up the same way. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 21:40:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@307 PS2, Line 307: if ! curl "${CURL_ARGS[@]}" ; then curl is not a development dependency in bin/bootstrap_system.sh. I'd suggest using wget or adding curl to the apt-get install list in bin/bootstrap_system.sh -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 18:20:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then > Does impala-config.sh really have to talk to the internet? It just tends to (I think it's fine if this is put in a separate script and called from buildall.sh or testdata/bin/run-all.sh or something like that) -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: Code-Review-1 (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294 PS2, Line 294: if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z ${AWS_SECRET_ACCESS_KEY-} ]]); then Does impala-config.sh really have to talk to the internet? It just tends to cause problems if impala-config.sh does things aside from setting environment variables. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 17:05:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 2: > Hi Laszlo, > > We're interested in this change for avoiding some compatibility > problems with Hadoop 3. Any news here? > > Thanks! Hi Philip, Sorry for the late update on this patch, I was detoured by other, more timeboxed activities. I have just posted a new patch set and I'm looking forward to your comments on it. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 15:27:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@240 PS1, Line 240: #export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY-}" : #export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID-}" > Self-inflicted review finding: remove commented-out lines. Done http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256 PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then > (Bash hackery) Thanks for the tip, Philip, I ended up using it. http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@268 PS1, Line 268: test > tests Done http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: curl -sf --connect-timeout 1 --max-time 5 > Are these curl options common across a wide variety of OSs? If they are new Yes. I have tested the curl options on all the platforms on which the packaging build runs. Curl understands these options on all these platforms. http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: http://169.254.169.254/latest/meta-data/iam/security-credentials/ > Is there a AWS reference page you can add as a comment so I can read up mor Done http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: 169.254.169.254 > http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.ht I have put the URL to the Amazon doc page where this mechanism is described. http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3000 PS1, Line 3000: AnalysisError(String.format("load data inpath '%s' %s into table " + : "tpch.lineitem", "s3a://bucket/test-warehouse/test.out", overwrite), : "INPATH location 's3a://bucket/test-warehouse/test.out' must point to an " + : "HDFS, S3A or ADL filesystem."); > Impala cannot load data from s3n. I think this test is intended to verify o That's right; I was pattern-matching too eagerly. Thanks for catching this. I have in fact reverted all the s3n->s3a changes in the FE tests; looks like they don't in fact touch S3 and they are OK with some existing but obviously fake credentials in core-site.xml. http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl File testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl: http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl@61 PS1, Line 61: > Maybe have a comment explaining testdata/cluster/admin needs this (and the Done, referred the Amazon doc site as well. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 15:24:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Hello Lars Volker, Michael Brown, Jim Apple, Philip Zeyliger, Sailesh Mukil, David Knupp, Joe McDonnell, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8294 to look at the new patch set (#2). Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs For some time Impala in a production environment has been able to access data stored in Amazon S3 buckets using credentials specified in a number of ways: - storing Amazon access keys in environment variables or in core-site.xml. - using proprietary management tools to store Amazon access keys securely - using Amazon IAM roles bound to VMs running in EC2. The development minicluster environment used the first approach, which risked leaking these keys. This change paves the way for Impala development setups to use IAM roles to access S3 buckets when running on an Amazon EC2 virtual machine. The changes mainly ensure that traditional credentials supplied in environment variables do not conflict with credentials supplied by the IAM role attached to the VM instance. The IAM role based credentials are accessible through the EC2 instance-property mechanism; for further details see Amazon's docs at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html#instance-metadata-security-credentials Changes to the configuration script: 1. bin/impala-config.sh stops setting the AWS_* environment variables to dummy default values. When AWS credentials are not supplied in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, these variables are unset (removed from the environment), otherwise they would preempt authentication based on the IAM role. 2. Having AWS credentials in the AWS_* environment variables is now optional. They are still accepted to allow for private test runs accessing private/nondefault buckets with custom credentials. 3. bin/impala-config.sh now checks if credentials are supplied in the AWS_* variables or via the IAM role. Changes to the minicluster configuration: 1. Some front-end tests still refer to the old S3 connector s3n:. This connector does not support s3 auth via IAM roles, but this is not a problem: these front-end tests don't actually reach out to S3, the s3n: notation is just for the FE. For these tests to work authentication parameters still need to exist for the s3n: connector in core-site.xml, but the values do not matter, so the configuration template now has fixed dummy values for the s3n: AWS credentials. 2. Remove empty AWS credentials from core-site.xml.tmpl: The testdata/cluster/admin setup script substitutes values from environment variables into Hadoop *-site.xml configuration files when setting up the minicluster runtime environment. The configuration section for s3a: credentials are now completely removed if: - the target filesystem is set to "s3" - and the AWS credential environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are both empty or missing. The configuration file core-site.xml.tmpl was extended with comment markers that delimit the section to be removed in this case. Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae --- M bin/impala-config.sh M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 3 files changed, 92 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/2 -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 2 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: Hi Laszlo, We're interested in this change for avoiding some compatibility problems with Hadoop 3. Any news here? Thanks! -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 24 Oct 2017 23:03:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: 169.254.169.254 > What is this address? Can we use a domain name and https? http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html This is EC2 magic. If you're not on EC2, this will tell you to set the environment variables (if you're testing S3). If you're on EC2, this will return something. We don't actually check that those credentials can access the relevant bucket, but at least it lets you through. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 21:06:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: 169.254.169.254 What is this address? Can we use a domain name and https? What would a new Impala developer get as a result of this? -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 20:55:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256 PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then > We may want to include a check here that "set -x" has not been enabled? (Bash hackery) If you want, you can use a subshell: $(set -x; (set +x; [ $USER == philip ])) && echo yes || echo no + set +x yes 10:46:02 mystery ~ $(set -x; (set +x; [ $USER == phili ])) && echo yes || echo no + set +x no Return values survive, and you can just do if (set +x; [[ -z ${AWS_SECRET_ACCESS_KEY_ID-} ]]); then ... fi I tested something similar above. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 17:47:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@256 PS1, Line 256: if [[ -z ${AWS_ACCESS_KEY_ID-} ]]; then We may want to include a check here that "set -x" has not been enabled? # Prevent leaking the AWS keys to the log set_x=0 if set +o | grep -q "set -o xtrace"; then set_x=1 set +x fi DO STUFF # Restore xtrace flag if [[ $set_x -eq 1 ]]; then set -x fi -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 18 Oct 2017 17:32:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3000 PS1, Line 3000: AnalysisError(String.format("load data inpath '%s' %s into table " + : "tpch.lineitem", "s3a://bucket/test-warehouse/test.out", overwrite), : "INPATH location 's3a://bucket/test-warehouse/test.out' must point to an " + : "HDFS, S3A or ADL filesystem."); Impala cannot load data from s3n. I think this test is intended to verify our error message when given an s3n path, so I don't think an s3a path will work here. The source of the error is LoadDataStmt::analyzePaths(). -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 17 Oct 2017 23:30:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/8294/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8294/1//COMMIT_MSG@9 PS1, Line 9: JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers : to facilitate s3 access without having to carry around AWS credentials : in environment variables, from where they were prone to escape to log : files posted in public places. This looks like something specific to a downstream environment and should be removed from the commit message. http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@268 PS1, Line 268: test tests http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: curl -sf --connect-timeout 1 --max-time 5 Are these curl options common across a wide variety of OSs? If they are newer and only work under, say, Ubuntu 16, they would be a problem for contributors using older distributions. http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@276 PS1, Line 276: http://169.254.169.254/latest/meta-data/iam/security-credentials/ Is there a AWS reference page you can add as a comment so I can read up more on this? http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl File testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl: http://gerrit.cloudera.org:8080/#/c/8294/1/testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl@61 PS1, Line 61: Maybe have a comment explaining testdata/cluster/admin needs this (and the END) as a marker and not to remove? -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 17 Oct 2017 23:09:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/8294 ) Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/8294/1/bin/impala-config.sh@240 PS1, Line 240: #export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY-}" : #export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID-}" Self-inflicted review finding: remove commented-out lines. -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 17 Oct 2017 19:45:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
Laszlo Gaal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8294 Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs .. IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs JENKINS-1102 added the IAM role ImpalaDev to the Impala Jenkins workers to facilitate s3 access without having to carry around AWS credentials in environment variables, from where they were prone to escape to log files posted in public places. This change paves the way for Impala build and test jobs to use the IAM roles to access s3 buckets. There are a few minor changes that allow this to happen: Changes to the configuration script: 1. bin/impala-config.sh stops setting the AWS_* environment variables to dummy default values. When AWS credentials are not supplied in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, these variables are unset (removed from the environment), otherwise they would preempt authentication based on the IAM role. 2. Having AWS credentials in the AWS_* environment variables is now optional. They are still accepted to allow for private test runs accessing private/nondefault buckets with custom credentials. 3. bin/impala-config.sh now checks if credentials are supplied in the AWS_* variables or via the IAM role. Changes to the frontend tests: 1. Some front-end tests still referenced the old s3 connector s3n:, this connector does not support s3 auth via IAM roles. These locations are changed to use the newer s3a:, which is the connector capable of using IAM roles for authentication and which is used in all other code locations. Changes to the minicluster setup: 1. As a corollary the s3n: configuration sections are removed from core-site.xml.tmpl. 2. Remove empty AWS credentials from core-site.xml.tmpl: The minicluster setup script susbstitutes values from environment variables into Hadoop *-site.xml config files when setting up the minicluster runtime environment. The configuration file core-site.xml.tmpl contains a section for s3 access, including AWS credentials. Impala can now use IAM roles for s3 access; this requires the removal of environment variables holding AWS credentials, which 1. breaks the substitution logic in testdata/cluster/admin, and 2. would break the IAM-based credentials if empty credentials were supplied in core-site.xml The fix for all of the above issues is to remove the AWS credential settings from the generated core-site.xml if both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are absent or empty. Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae --- M bin/impala-config.sh M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 5 files changed, 52 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8294/1 -- To view, visit http://gerrit.cloudera.org:8080/8294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae Gerrit-Change-Number: 8294 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal