[jira] [Work logged] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK (S3-CSE)

ASF GitHub Bot (Jira) Mon, 19 Jul 2021 13:39:11 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-13887?focusedWorklogId=624631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-624631
 ]


ASF GitHub Bot logged work on HADOOP-13887:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Jul/21 20:38
            Start Date: 19/Jul/21 20:38
    Worklog Time Spent: 10m 
      Work Description: steveloughran commented on a change in pull request #2706:
URL: https://github.com/apache/hadoop/pull/2706#discussion_r672614095



##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/encryption.md
##########
@@ -70,6 +76,53 @@ by Amazon's Key Management Service, a key referenced by name in the uploading cl
 * SSE-C : the client specifies an actual base64 encoded AES-256 key to be used
 to encrypt and decrypt the data.
 
+Encryption options
+
+|  type | encryption | config on write | config on read |
+|-------|---------|-----------------|----------------|
+| `SSE-S3` | server side, AES256 | encryption algorithm | none |
+| `SSE-KMS` | server side, KMS key | key used to encrypt/decrypt | none |
+| `SSE-C` | server side, custom key | encryption algorithm and secret | encryption algorithm and secret |
+| `CSE-KMS` | client side, KMS key | encryption algorithm and key ID | encryption algorithm |
+
+With server-side encryption, the data is uploaded to S3 unencrypted (but wrapped by the HTTPS
+encryption channel).
+The data is dynamically encrypted and decrypted in the S3 Store, as needed.
+
+A server side algorithm can be enabled by default for a bucket, so that
+whenever data is uploaded unencrypted a default encryption algorithm is added.
+When data is encrypted with SSE-S3 or SSE-KMS it is transparent to all clients
+downloading the data.
+SSE-C is different in that every client must know the secret key needed to decrypt the data.
+
+Working with SSE-C data is hard because every client must be configured to use the
+algorithm and supply the key. In particular, it is very hard to mix SSE-C
+encrypted objects in the same S3 bucket with objects encrypted with other
+algorithms or unencrypted; the S3A client
+(and other applications) get very confused.
+
+KMS-based key encryption is powerful as access to a key can be restricted to
+specific users/IAM roles. However, use of the key is billed and can be
+throttled. Furthermore as a client seeks around a file, the KMS key *may* be

Review comment:
       We've not done anything with client-side keys. This is probably relevant on random IO, where we do many GETs across a single input stream, so the key could be cached there.
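
The caching idea in the comment above could be sketched roughly as follows. This is only an illustration, not anything in S3A or the AWS SDK: `PerStreamKeyCache`, `CachedKey` and the stand-in `kmsLookup` supplier are hypothetical names, and the "key" is a dummy byte array rather than a real KMS response. The point is that one input stream holds one cache, so repeated ranged GETs on that stream trigger a single key lookup.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class PerStreamKeyCache {

    /** Hypothetical helper: remembers the first result of an expensive key lookup. */
    static final class CachedKey<K> implements Supplier<K> {
        private final Supplier<K> loader;
        private K key;

        CachedKey(Supplier<K> loader) {
            this.loader = loader;
        }

        @Override
        public synchronized K get() {
            if (key == null) {
                // Only the first call reaches the (billed, throttled) key service.
                key = loader.get();
            }
            return key;
        }
    }

    public static void main(String[] args) {
        AtomicInteger kmsCalls = new AtomicInteger();

        // Stand-in for a KMS decrypt call; a real client would call the key service here.
        Supplier<byte[]> kmsLookup = () -> {
            kmsCalls.incrementAndGet();
            return new byte[32];
        };

        // One cache per input stream: every ranged GET on the stream reuses it.
        CachedKey<byte[]> perStream = new CachedKey<>(kmsLookup);
        for (int i = 0; i < 5; i++) {
            perStream.get();
        }
        System.out.println("ranged GETs: 5, key lookups: " + kmsCalls.get());
        // prints "ranged GETs: 5, key lookups: 1"
    }
}
```

Scoping the cache to a single stream keeps the key's lifetime short; whether that trade-off suits S3A is a design question the comment leaves open.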




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



Issue Time Tracking
-------------------

    Worklog Id:     (was: 624631)
    Time Spent: 9.5h  (was: 9h 20m)

> Encrypt S3A data client-side with AWS SDK (S3-CSE)
> --------------------------------------------------
>
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Igor Mazur
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch, 
> HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch, 
> HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch, 
> HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch, 
> HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch, 
> HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch, 
> HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
>
>          Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> Expose the client-side encryption option documented in Amazon S3 
> documentation  - 
> http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS 
> Java SDK, which Hadoop currently includes. It should be trivial to propagate 
> this as a parameter passed to the S3client used in S3AFileSystem.java
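
As a rough illustration of the write/read configuration split in the encryption options table quoted earlier, the sketch below builds the settings a reader would need for each mode. It uses plain `java.util.Properties` rather than Hadoop's `Configuration` so that it runs standalone. The option names are assumed to be S3A's existing server-side encryption options and may differ in the final patch or in later releases; `readerConfig` is a hypothetical helper.

```java
import java.util.Properties;

public class S3AEncryptionConfigSketch {
    // Assumption: the existing S3A server-side encryption option names are reused.
    static final String ALGORITHM = "fs.s3a.server-side-encryption-algorithm";
    static final String KEY = "fs.s3a.server-side-encryption.key";

    /** Hypothetical helper: settings a reading client needs, following the table. */
    static Properties readerConfig(String algorithm, String secret) {
        Properties p = new Properties();
        switch (algorithm) {
            case "SSE-C":
                // Every reader must supply the algorithm and the secret key.
                p.setProperty(ALGORITHM, algorithm);
                p.setProperty(KEY, secret);
                break;
            case "CSE-KMS":
                // Readers declare the algorithm; the table lists no key ID on read.
                p.setProperty(ALGORITHM, algorithm);
                break;
            default:
                // SSE-S3 and SSE-KMS: decryption is transparent, nothing to set.
                break;
        }
        return p;
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"SSE-S3", "SSE-KMS", "SSE-C", "CSE-KMS"}) {
            Properties p = readerConfig(mode, "base64-secret-placeholder");
            System.out.println(mode + " reader settings: " + p.stringPropertyNames());
        }
    }
}
```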


