[ 
https://issues.apache.org/jira/browse/HADOOP-13887?focusedWorklogId=621878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-621878
 ]

ASF GitHub Bot logged work on HADOOP-13887:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Jul/21 09:19
            Start Date: 13/Jul/21 09:19
    Worklog Time Spent: 10m 
      Work Description: mukund-thakur commented on a change in pull request 
#2706:
URL: https://github.com/apache/hadoop/pull/2706#discussion_r668460078



##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
##########
@@ -2029,3 +2029,51 @@ signer for all services.
 For a specific service, the service specific signer is looked up first.
 If that is not specified, the common signer is looked up. If this is
 not specified as well, SDK settings are used.
+
+### <a name="cse"></a> Amazon S3 Client Side Encryption
+Amazon S3 Client Side Encryption (CSE) uses the `AmazonS3EncryptionClientV2`
+AmazonS3 client. Encryption and decryption are done on the AWS SDK side. At
+the moment, only the KMS CSE method is supported.
+
+What needs to be set to enable CSE for a bucket?
+- Create an AWS KMS Key ID from the AWS console for your bucket, in the same
+  region as your bucket.
+- If already created, [view the KMS key ID by these
+  steps.](https://docs.aws.amazon.com/kms/latest/developerguide/find-cmk-id-arn.html)
+- Set `fs.s3a.cse.method=KMS`.
+- Set `fs.s3a.cse.kms.keyId=<KMS_KEY_ID>`.
+
+*Note:* If `fs.s3a.cse.method=KMS` is set, the
+`fs.s3a.cse.kms.keyId=<KMS_KEY_ID>` property must also be set for CSE to
+work.
+
+Limitations of CSE on S3A:
+
+- No multipart uploader API support.
+- Only CSE-enabled clients should be used to read the encrypted data.
+- No S3 Select support.
+- Multipart uploads through S3ABlockOutputStream would be serial, and the
+  part size would be a multiple of 16 bytes.
+- Performance hit.
+
+All data encrypted via CSE has a padding of 16 bytes. This causes
+inconsistencies in content length, and thus leads to errors in applications
+that rely on the content length before opening a file, or on the calculation
+of splits on a table, and so on. To rectify this inconsistency, 16 bytes are
+stripped from the content length when:
+
+- CSE is enabled.
+- A getFileStatus call detects that a client-side encryption algorithm was
+  used.
+- contentLength >= 16 bytes.

Review comment:
       When all three conditions are met right?
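For context, the two CSE settings described in the quoted hunk could be sketched as a `core-site.xml` fragment. This is only an illustration of the properties named above; the key ID value is a placeholder, not a real key:

```xml
<!-- Enable S3A client-side encryption via AWS KMS.
     Both properties must be set for CSE to work. -->
<property>
  <name>fs.s3a.cse.method</name>
  <value>KMS</value>
</property>
<property>
  <name>fs.s3a.cse.kms.keyId</name>
  <!-- Placeholder: substitute the KMS key ID for your bucket's region. -->
  <value><KMS_KEY_ID></value>
</property>
```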

##########
File path: 
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
##########
@@ -1682,3 +1682,105 @@ com.amazonaws.SdkClientException: Unable to execute 
HTTP request:
 
 When this happens, try to set `fs.s3a.connection.request.timeout` to a larger 
value or disable it
 completely by setting it to `0`.
+
+### <a name="client-side-encryption"></a> S3 Client Side Encryption

Review comment:
       Shouldn't we describe solutions to these problems as well?
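The timeout workaround mentioned in the quoted hunk context could be sketched as a configuration fragment. This is an illustration only; the property name comes from the quoted text, and the value shown is the "disable" option it describes:

```xml
<!-- Workaround for "Unable to execute HTTP request" timeouts:
     raise fs.s3a.connection.request.timeout, or set 0 to disable it. -->
<property>
  <name>fs.s3a.connection.request.timeout</name>
  <value>0</value>
</property>
```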




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 621878)
    Time Spent: 5h 10m  (was: 5h)

> Encrypt S3A data client-side with AWS SDK (S3-CSE)
> --------------------------------------------------
>
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Igor Mazur
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch, 
> HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch, 
> HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch, 
> HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch, 
> HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch, 
> HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch, 
> HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
>
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Expose the client-side encryption option documented in Amazon S3 
> documentation  - 
> http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS 
> Java SDK, which Hadoop currently includes. It should be trivial to propagate 
> this as a parameter passed to the S3client used in S3AFileSystem.java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
