Laszlo Gaal has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21442 )


Change subject: IMPALA-13017: Add S3 bucket endpoint parameters for the 
minicluster
......................................................................

IMPALA-13017: Add S3 bucket endpoint parameters for the minicluster

Extend the minicluster configuration with S3-specific parameters for
setting up S3 bucket locations or access endpoints.

When the Impala minicluster is set up for testing against an S3 bucket,
currently the only parameter that can be set is the name of the targeted
S3 bucket. Some other settings are set to preset or precomputed values,
but the majority of the settings are left empty, implying default
values.

For bucket access this means going through the global legacy endpoint
for all requests, which is not optimal for several reasons, including
latency and disregarding possibly cheaper (or free) access paths.

This patch extends the minicluster configuration with two extra
parameters for S3 buckets:

- S3_BUCKET_REGION: specifies the AWS region where the test target S3
  bucket is located
- S3_BUCKET_ENDPOINT_URL: specifies an S3 endpoint URL servicing the
  targeted S3 bucket

Non-empty values for both parameters will flow down into the core-site.xml
Hadoop configuration file for the minicluster. The Hadoop-AWS s3a:
provider will pick them up from there.
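
As an illustration only, the flow from the two parameters into core-site
properties could be sketched as below. The helper name
s3_endpoint_properties is hypothetical, and the mapping onto the Hadoop S3A
keys fs.s3a.endpoint.region and fs.s3a.endpoint is an assumption; the
actual template in this patch may be written differently.

```python
import os


def s3_endpoint_properties(env=None):
    """Return core-site properties for the optional S3 parameters.

    Hypothetical helper. Assumption: the values map onto the standard
    Hadoop S3A keys used below; the real core-site.xml.py may differ.
    """
    env = os.environ if env is None else env
    props = {}
    region = env.get("S3_BUCKET_REGION", "").strip()
    endpoint = env.get("S3_BUCKET_ENDPOINT_URL", "").strip()
    # Only non-empty values are emitted, so an unset parameter leaves the
    # corresponding Hadoop default untouched.
    if region:
        props["fs.s3a.endpoint.region"] = region
    if endpoint:
        props["fs.s3a.endpoint"] = endpoint
    return props
```

Skipping empty values keeps the current behavior intact when neither
parameter is set.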

Usage:

- Both parameters are optional.
- If both parameters are omitted, then bucket access happens via the
  legacy global S3 endpoint in us-east-1 (this is the current behavior).
- When either parameter (but not both) is supplied, then the specified
  value is used to route the request to the desired endpoint.
- If both parameters are supplied, they need to be consistent for the
  configuration to be valid; if they are inconsistent, bucket access
  will fail.
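
The four cases above can be summarized in a small sketch. The function
describe_s3_routing and its return labels are hypothetical and exist only
to restate the rules; the sketch does not attempt to check whether a region
and an endpoint URL are actually consistent, since that is decided when the
bucket is accessed.

```python
def describe_s3_routing(region="", endpoint_url=""):
    """Classify a parameter combination according to the usage rules.

    Hypothetical helper; the labels are illustrative, not part of the patch.
    """
    if region and endpoint_url:
        # Validity depends on the two values agreeing with each other.
        return "both-set-must-be-consistent"
    if region:
        return "routed-by-region"
    if endpoint_url:
        return "routed-by-endpoint-url"
    # Neither parameter set: the current behavior is preserved.
    return "legacy-global-endpoint"
```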

Tested with S3 test runs using multiple combinations of the above
settings on private infrastructure.

Change-Id: I8411e934a6b40fcd183bf597efbeb701a35e0db6
---
M bin/check-s3-access.sh
M bin/impala-config.sh
M bin/jenkins/release_cloud_resources.sh
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
M tests/common/impala_test_suite.py
M tests/util/filesystem_utils.py
7 files changed, 53 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/21442/1
--
To view, visit http://gerrit.cloudera.org:8080/21442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8411e934a6b40fcd183bf597efbeb701a35e0db6
Gerrit-Change-Number: 21442
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal <laszlo.g...@cloudera.com>
