[
https://issues.apache.org/jira/browse/HADOOP-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15805590#comment-15805590
]
Aaron Fabbri commented on HADOOP-13336:
---------------------------------------
+1 (non-binding) once you remove the change of debug log level at end of patch.
I like this approach. I had thought about this a bit before you started work
and was thinking of having a "flatten" (a.k.a. propagate) function as you do
here. Seems more efficient at runtime and easier to debug. I like the use of
the Configuration source feature too.
It looks like your omission of <prefix>.impl keys still works with
fs.s3a.metadatastore.impl, since {{"impl".equals("metadatastore.impl") ==
false}}. (You compare the whole key, not just the suffix.) This is good
because we'd like to be able to enable/disable s3guard on a per-bucket basis.
Other patch comments:
{code}
+ public static Configuration propagateBucketOptions(Configuration source,
+ String bucket) {
+
<snip>
+ for (Map.Entry<String, String> entry : source) {
+ final String key = entry.getKey();
+ // get the (unexpanded) value.
+ final String value = entry.getValue();
+ if (!key.startsWith(bucketPrefix)
+ || bucketPrefix.equals(key)
+ || value == null) {
+ continue;
+ }
{code}
Was curious about the {{value == null}} part.. Does that ever happen? Anyhow,
seems safe to include the check.
Minor nit in the docs:
{code}
+Different S3 buckets can be accessed with S3A client configurations.
{code}
Should this read "can be accessed with different S3A client configurations"?
{code}
--- a/hadoop-tools/hadoop-aws/src/test/resources/log4j.properties
+++ b/hadoop-tools/hadoop-aws/src/test/resources/log4j.properties
@@ -21,7 +21,7 @@ log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
# for debugging low level S3a operations, uncomment these lines
# Log all S3A classes
-#log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
+log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
# Log S3Guard classes
#log4j.logger.org.apache.hadoop.fs.s3a.s3guard=DEBUG
{code}
Was this intentional?
> S3A to support per-bucket configuration
> ---------------------------------------
>
> Key: HADOOP-13336
> URL: https://issues.apache.org/jira/browse/HADOOP-13336
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13336-HADOOP-13345-001.patch,
> HADOOP-13336-HADOOP-13345-002.patch
>
>
> S3a now supports different regions, by way of declaring the endpoint —but you
> can't do things like read in one region, write back in another (e.g. a distcp
> backup), because only one region can be specified in a configuration.
> If s3a supported region declaration in the URL, e.g. s3a://b1.frankfurt
> s3a://b2.seol , then this would be possible.
> Swift does this with a full filesystem binding/config: endpoints, username,
> etc, in the XML file. Would we need to do that much? It'd be simpler
> initially to use a domain suffix of a URL to set the region of a bucket from
> the domain and have the aws library sort the details out itself, maybe with
> some config options for working with non-AWS infra
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]