[ https://issues.apache.org/jira/browse/HADOOP-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401858#comment-17401858 ]
Steve Loughran commented on HADOOP-17855:
-----------------------------------------
I'm pretty reluctant to do this, at least on my personal development schedule.
* When we do things with directories, we often create markers in parent dirs.
This complicates life, as we'd have to choose which encryption settings to use
there too.
* S3A delegation tokens pass down all encryption settings so that you can
submit work into a shared cluster where all encryption options, including your
secrets, come with the job. This would need to be extended to carry the
per-path settings.
* All the usual stuff related to hierarchical references, duplicate or
conflicting entries, et cetera.
* Would you support different SSE options (SSE-C vs SSE-KMS)? SSE-KMS is the
only sensible option, really.
That said, these are all tractable and I can see the rationale for it. If you
were to work on this, I and others will do what we can to help nurture the
change in.
I look forward to your submission; please follow the documented test process.
(This probably complicates testing even more, as you will need 2+ KMS keys. Docs
will need to be updated too.)
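To make the hierarchical-reference and conflicting-entry concerns concrete, here is a rough Python sketch of one possible resolution rule: longest matching path prefix wins, and duplicate prefixes with different keys are rejected up front. Everything here is an assumption for illustration (`build_table`, `resolve_key`, the placeholder key names); it is not an S3A API or a proposed design.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping entry: (path prefix, KMS key id). Not a Hadoop API.
Mapping = Tuple[str, str]


def build_table(entries: List[Mapping]) -> Dict[str, str]:
    """Normalise prefixes; reject the same prefix mapped to two different keys."""
    table: Dict[str, str] = {}
    for prefix, key in entries:
        norm = prefix.rstrip("/")
        if norm in table and table[norm] != key:
            raise ValueError(f"conflicting keys for prefix {norm}")
        table[norm] = key
    return table


def resolve_key(table: Dict[str, str], path: str) -> Optional[str]:
    """Return the key of the longest matching prefix, or None if nothing matches.

    Matching walks up whole path elements, so 'country=uk' does not
    accidentally match 'country=ukraine'.
    """
    current = path.rstrip("/")
    while current:
        if current in table:
            return table[current]
        if "/" not in current:
            break
        # Drop the last path element and try the parent.
        current = current.rsplit("/", 1)[0]
    return None


table = build_table([
    ("s3a://bucket/my_table", "table-key"),
    ("s3a://bucket/my_table/country=uk/", "uk-key"),
])
print(resolve_key(table, "s3a://bucket/my_table/country=uk/part-0000"))
print(resolve_key(table, "s3a://bucket/my_table/country=ukraine/part-0000"))
print(resolve_key(table, "s3a://bucket/elsewhere/file"))
```

Note that under this rule a marker written in a parent directory resolves against the parent's own path, so it may pick up a different key from the child data; that is the choice the directory-marker point above is about. A `None` result would presumably fall back to the bucket-level settings.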
> S3A: Allow SSE configurations per object path
> ---------------------------------------------
>
> Key: HADOOP-17855
> URL: https://issues.apache.org/jira/browse/HADOOP-17855
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Mike Dias
> Priority: Major
>
> Currently, we can map the SSE configurations at bucket level only:
> {code:xml}
> <property>
>   <name>fs.s3a.bucket.ireland-dev.server-side-encryption-algorithm</name>
>   <value>SSE-KMS</value>
> </property>
> <property>
>   <name>fs.s3a.bucket.ireland-dev.server-side-encryption.key</name>
>   <value>arn:aws:kms:eu-west-1:98067faff834c:key/071a86ff-8881-4ba0-9230-95af6d01ca01</value>
> </property>
> {code}
> But sometimes we want to encrypt data in different paths with different keys
> within the same bucket. For example, a partitioned table might benefit from
> encrypting each partition with a different key when the partition represents
> a customer or a country.
> [S3 can already encrypt using different keys/configurations at the object
> level|https://aws.amazon.com/premiumsupport/knowledge-center/s3-encrypt-specific-folder/],
> so what we need to do in Hadoop is provide a way to map which key to use.
> One idea could be mapping them in the XML config:
>
> {code:xml}
> <property>
>   <name>fs.s3a.server-side-encryption.paths</name>
>   <value>s3://bucket/my_table/country=ireland,s3://bucket/my_table/country=uk,s3://bucket/my_table/country=germany</value>
> </property>
> <property>
>   <name>fs.s3a.server-side-encryption.path-keys</name>
>   <value>arn:aws:kms:eu-west-1:90ireland09:key/ireland-key,arn:aws:kms:eu-west-1:980uk0993c:key/uk-key,arn:aws:kms:eu-west-1:98germany089:key/germany-key</value>
> </property>
> {code}
> Or potentially fetch the mappings from the filesystem:
>
> {code:xml}
> <property>
>   <name>fs.s3a.server-side-encryption.mappings</name>
>   <value>s3://bucket/configs/encryption_mappings.json</value>
> </property>
> {code}
> where encryption_mappings.json could be something like this:
>
> {code:json}
> {
>   "path": "s3://bucket/customer_table/customerId=abc123",
>   "algorithm": "SSE-KMS",
>   "key": "arn:aws:kms:eu-west-1:933993746:key/abc123-key"
> }
> ...
> {
>   "path": "s3://bucket/customer_table/customerId=xyx987",
>   "algorithm": "SSE-KMS",
>   "key": "arn:aws:kms:eu-west-1:933993746:key/xyx987-key"
> }
> {code}
>
>
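On the paired-list idea in the quoted description: two parallel comma-separated properties only work if the lists stay aligned, so any loader would need to trim whitespace around each entry (XML values are easily wrapped across lines) and reject lists of different lengths. A minimal Python sketch of that check follows. It assumes the property names proposed in the issue, `fs.s3a.server-side-encryption.paths` and `fs.s3a.server-side-encryption.path-keys`, which are only a suggestion and not existing S3A options; `pair_paths_with_keys` and the short key names are hypothetical.

```python
from typing import Dict, List


def split_trimmed(value: str) -> List[str]:
    """Split a comma-separated config value, dropping whitespace and empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def pair_paths_with_keys(paths_value: str, keys_value: str) -> Dict[str, str]:
    """Pair the i-th path with the i-th key; fail fast if the lists differ in length."""
    paths = split_trimmed(paths_value)
    keys = split_trimmed(keys_value)
    if len(paths) != len(keys):
        raise ValueError(
            f"{len(paths)} paths but {len(keys)} keys: the two lists must align"
        )
    return dict(zip(paths, keys))


# Values as they might appear in the two proposed properties (keys shortened).
mapping = pair_paths_with_keys(
    "s3://bucket/my_table/country=ireland,s3://bucket/my_table/country=uk,\n"
    "  s3://bucket/my_table/country=germany",
    "ireland-key,uk-key,germany-key",
)
for path, key in mapping.items():
    print(path, "->", key)
```

The alignment requirement is the main fragility of this layout: removing one path without removing its key silently shifts every later pairing unless the lengths happen to differ. A single property holding path=key pairs, or the JSON mapping file, would avoid that.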