mukund-thakur commented on a change in pull request #2706:
URL: https://github.com/apache/hadoop/pull/2706#discussion_r668460078



##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
##########
@@ -2029,3 +2029,51 @@ signer for all services.
 For a specific service, the service specific signer is looked up first.
 If that is not specified, the common signer is looked up. If this is
 not specified as well, SDK settings are used.
+
+### <a name="cse"></a> Amazon S3 Client Side Encryption
+Amazon S3 Client Side Encryption (CSE) uses the `AmazonS3EncryptionClientV2`
+AmazonS3 client. Encryption and decryption are done on the AWS SDK side.
+Currently, only the KMS CSE method is supported.
+
+What needs to be set to enable CSE for a bucket?
+- Create an AWS KMS key ID from the AWS console for your bucket, in the same
+  region as your bucket.
+- If one is already created, [view the KMS key ID by these steps.](https://docs.aws.amazon.com/kms/latest/developerguide/find-cmk-id-arn.html)
+- Set `fs.s3a.cse.method=KMS`.
+- Set `fs.s3a.cse.kms.keyId=<KMS_KEY_ID>`.
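
The two properties above could be set in `core-site.xml` roughly as follows
(a sketch; the key ID value is a placeholder to be replaced with your own):

```xml
<property>
  <name>fs.s3a.cse.method</name>
  <value>KMS</value>
</property>
<property>
  <name>fs.s3a.cse.kms.keyId</name>
  <!-- Placeholder: substitute your own KMS key ID or ARN. -->
  <value>EXAMPLE-KEY-ID</value>
</property>
```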
+
+*Note:* If `fs.s3a.cse.method=KMS` is set, the `fs.s3a.cse.kms.keyId=<KMS_KEY_ID>`
+property also needs to be set for CSE to work.
+
+Limitations of CSE on S3A:
+
+- No multipart uploader API support.
+- Encrypted data should only be read by CSE-enabled clients.
+- No S3 Select support.
+- Multipart uploads through S3ABlockOutputStream are serial, and the part
+  size must be a multiple of 16 bytes.
+- A performance penalty.
+
+All data encrypted via CSE has a padding of 16 bytes. This causes an
+inconsistency in the reported content length, and thus leads to errors in
+applications that rely on the content length before opening a file, or when
+calculating splits on a table, and so on. To rectify this inconsistency,
+16 bytes are stripped from the content length when:
+
+- CSE is enabled.
+- A getFileStatus call detects that a client side encryption algorithm was
+  used on the object.
+- The content length is >= 16 bytes.
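
The length adjustment described above can be sketched as follows. This is a
hypothetical illustration, not the actual S3A implementation; it assumes all
three conditions must hold, and the names and constant are illustrative:

```java
// Hypothetical sketch of the 16-byte CSE content-length adjustment.
public class CsePaddingExample {
  // CSE appends 16 bytes of padding to every encrypted object (assumed here).
  static final int CSE_PADDING_LENGTH = 16;

  static long unpaddedLength(boolean cseEnabled,
                             boolean objectIsCseEncrypted,
                             long reportedLength) {
    // Strip the padding only when all three conditions hold.
    if (cseEnabled && objectIsCseEncrypted
        && reportedLength >= CSE_PADDING_LENGTH) {
      return reportedLength - CSE_PADDING_LENGTH;
    }
    return reportedLength;
  }

  public static void main(String[] args) {
    System.out.println(unpaddedLength(true, true, 1040));   // padded object
    System.out.println(unpaddedLength(false, false, 1040)); // CSE disabled
  }
}
```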

Review comment:
       When all three conditions are met right?

##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
##########
@@ -1682,3 +1682,105 @@ com.amazonaws.SdkClientException: Unable to execute HTTP request:
 
 When this happens, try to set `fs.s3a.connection.request.timeout` to a larger value or disable it
 completely by setting it to `0`.
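
For reference, the `core-site.xml` form of that setting (a sketch; `0`
disables the request timeout entirely, as described above):

```xml
<property>
  <name>fs.s3a.connection.request.timeout</name>
  <!-- 0 disables the request timeout. -->
  <value>0</value>
</property>
```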
+
+### <a name="client-side-encryption"></a> S3 Client Side Encryption

Review comment:
       Shouldn't we describe solutions to these problems as well?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


