[
https://issues.apache.org/jira/browse/HADOOP-13876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830928#comment-15830928
]
Aaron Fabbri commented on HADOOP-13876:
---------------------------------------
Thanks [~steve_l].
I agree that most of this is addressed by per-bucket config. On the "one
DynamoDB table per cluster" part, however, there are still assumptions in the
DynamoDB (DDB) code that a DynamoDBMetadataStore is 1:1 with an S3AFileSystem:
- Paths stored in DDB do not include the bucket name.
- The DDB code uses the {{S3AFileSystem#getUri()}} value in its call to
{{Path#makeQualified()}}. See the callers of {{itemToPathMetadata()}}. (This
part actually breaks when the new
{{DynamoDBMetadataStore#initialize(Configuration)}} method, added for the CLI
work, is used.)
I want to fix this part, as the single DDB table per cluster is the main use
case my users want. I already went through this exercise in LocalMetadataStore
(which stores bucket name with path), so it should be straightforward.
I could see us merging to trunk without this fixed, if we could enforce that
users can't access the same fs.s3a.s3guard.ddb.table with multiple buckets. If
they did that, it appears they'd risk collisions (e.g. s3a://bucket-a/path1 ==
s3a://bucket-b/path1).
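To make the collision concrete, here is a minimal, self-contained Java sketch.
It does not use the real DynamoDBMetadataStore or any Hadoop classes: the class
name {{DdbKeyCollisionSketch}}, both key helpers, and the in-memory map standing
in for a shared table are hypothetical, and the "/bucket/path" key layout is
only an assumption about what a bucket-qualified key could look like.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustration only: a HashMap stands in for one shared metadata table.
class DdbKeyCollisionSketch {

    // Hypothetical key scheme that drops the bucket and keeps only the path.
    static String pathOnlyKey(URI s3aUri) {
        return s3aUri.getPath();
    }

    // Hypothetical key scheme that prefixes the path with the bucket name.
    static String bucketQualifiedKey(URI s3aUri) {
        return "/" + s3aUri.getAuthority() + s3aUri.getPath();
    }

    public static void main(String[] args) {
        URI a = URI.create("s3a://bucket-a/path1");
        URI b = URI.create("s3a://bucket-b/path1");

        // Path-only keys: the second put overwrites the first entry.
        Map<String, String> pathOnly = new HashMap<>();
        pathOnly.put(pathOnlyKey(a), "metadata from bucket-a");
        pathOnly.put(pathOnlyKey(b), "metadata from bucket-b");
        System.out.println("path-only entries: " + pathOnly.size());

        // Bucket-qualified keys: both entries survive.
        Map<String, String> qualified = new HashMap<>();
        qualified.put(bucketQualifiedKey(a), "metadata from bucket-a");
        qualified.put(bucketQualifiedKey(b), "metadata from bucket-b");
        System.out.println("bucket-qualified entries: " + qualified.size());
    }
}
```

With path-only keys the two buckets share the single key "/path1", so the
table ends up with one entry; with the bucket in the key it holds two.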
> S3Guard: better support for multi-bucket access including read-only
> -------------------------------------------------------------------
>
> Key: HADOOP-13876
> URL: https://issues.apache.org/jira/browse/HADOOP-13876
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Aaron Fabbri
> Assignee: Mingliang Liu
> Attachments: HADOOP-13876-HADOOP-13345.000.patch
>
>
> HADOOP-13449 adds support for DynamoDBMetadataStore.
> The code currently supports two options for choosing DynamoDB table names:
> 1. Use name of each s3 bucket and auto-create a DynamoDB table for each.
> 2. Configure a table name in the {{fs.s3a.s3guard.ddb.table}} parameter.
> One of the issues is with accessing read-only buckets. If a user accesses a
> read-only bucket with credentials that do not have DynamoDB write
> permissions, they will get errors when trying to access the read-only bucket.
> This manifests as test failures for {{ITestS3AAWSCredentialsProvider}}.
> Goals for this JIRA:
> - Fix {{ITestS3AAWSCredentialsProvider}} in a way that makes sense for the
> real use-case.
> - Allow for a "one DynamoDB table per cluster" configuration with a way to
> choose which credentials are used for DynamoDB.
> - Document limitations etc. in the s3guard.md site doc.