raphaelazzolini opened a new pull request, #7193:
URL: https://github.com/apache/hadoop/pull/7193

   Add the property fs.s3a.encryption.context that allows users to specify the AWS KMS encryption context to be used in S3A.
   
   The value of the encryption context is a key/value string that is Base64 encoded and passed to the S3 client in the ssekmsEncryptionContext parameter.
   
   Contributed by Raphael Azzolini
   
   ### Description of PR
   This code change adds a new property to S3A: fs.s3a.encryption.context
   
   The property's value is a set of key/value attributes to include in the object's KMS encryption context. The value is base64 encoded and passed to the S3 client in the ssekmsEncryptionContext parameter.
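   For illustration only, here is a minimal sketch of that mechanism using the AWS SDK v2 directly. This is not the S3A implementation; `parseContext`, `encodeContext`, the class name, and the bucket/key are placeholders I introduce for the example.
   
   ```
   // Hypothetical sketch (not the S3A code): turn "k=v, k=v" pairs into the
   // base64-encoded JSON that the AWS SDK v2 expects for ssekmsEncryptionContext.
   import java.nio.charset.StandardCharsets;
   import java.util.Base64;
   import java.util.Map;
   import java.util.TreeMap;
   import java.util.stream.Collectors;
   
   import software.amazon.awssdk.services.s3.model.PutObjectRequest;
   import software.amazon.awssdk.services.s3.model.ServerSideEncryption;
   
   public class EncryptionContextSketch {
   
     /** Parse "k1=v1, k2=v2" into a map (naive: no escaping of '=' or ','). */
     static Map<String, String> parseContext(String raw) {
       Map<String, String> ctx = new TreeMap<>();
       for (String pair : raw.split(",")) {
         String[] kv = pair.trim().split("=", 2);
         if (kv.length == 2) {
           ctx.put(kv[0].trim(), kv[1].trim());
         }
       }
       return ctx;
     }
   
     /** JSON-encode the map and base64 it (naive: assumes no quotes in keys/values). */
     static String encodeContext(Map<String, String> ctx) {
       String json = ctx.entrySet().stream()
           .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
           .collect(Collectors.joining(",", "{", "}"));
       return Base64.getEncoder().encodeToString(json.getBytes(StandardCharsets.UTF_8));
     }
   
     public static void main(String[] args) {
       String raw = "project=hadoop, jira=HADOOP-19197, component=fs/s3";
       PutObjectRequest request = PutObjectRequest.builder()
           .bucket("example-bucket")   // placeholder
           .key("example-key")         // placeholder
           .serverSideEncryption(ServerSideEncryption.AWS_KMS)
           .ssekmsEncryptionContext(encodeContext(parseContext(raw)))
           .build();
       System.out.println(request.ssekmsEncryptionContext());
     }
   }
   ```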
   
   This change was merged to trunk via https://github.com/apache/hadoop/pull/6874. This pull request backports it to branch-3.4.
   
   ### How was this patch tested?
   Tested in us-east-1 with `mvn -Dparallel-tests -DtestsThreadCount=16 clean 
verify`
   
   I added a new test `ITestS3AEncryptionSSEKMSWithEncryptionContext`. However, S3's head-object response doesn't include the encryption context. Therefore, I enabled CloudTrail data event logging on my bucket to verify that the tests were passing the encryption context in the requests.
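   
   To illustrate why CloudTrail was needed, here is a hedged sketch (assuming the AWS SDK v2; the class name, bucket and key are placeholders): a head-object call surfaces the SSE algorithm and the KMS key id, but not the encryption context.
   
   ```
   // Sketch only: head-object shows the SSE algorithm and KMS key id,
   // but (as noted above) not the encryption context, so CloudTrail data
   // events were used to confirm the context was actually sent.
   import software.amazon.awssdk.services.s3.S3Client;
   import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
   import software.amazon.awssdk.services.s3.model.HeadObjectResponse;
   
   public class HeadObjectSketch {
     public static void main(String[] args) {
       try (S3Client s3 = S3Client.create()) {
         HeadObjectResponse head = s3.headObject(HeadObjectRequest.builder()
             .bucket("example-bucket")      // placeholder
             .key("test/example-object")    // placeholder
             .build());
         System.out.println(head.serverSideEncryptionAsString()); // e.g. "aws:kms"
         System.out.println(head.ssekmsKeyId());                  // the KMS key ARN
       }
     }
   }
   ```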
   
   I added this property to `auth-keys.xml`:
   
   ```
   <property>
     <name>fs.s3a.encryption.context</name>
     <value>
       project=hadoop,
       jira=HADOOP-19197,
       component=fs/s3
     </value>
   </property>
   ```
   
   Then I executed the following tests:
   
   ```
   mvn clean verify -Dit.test=ITestS3AEncryption* -Dtest=none
   
   [INFO] -------------------------------------------------------
   [INFO]  T E S T S
   [INFO] -------------------------------------------------------
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSDefaultKeyWithEncryptionContext
   [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.10 
s -- in 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSDefaultKeyWithEncryptionContext
   [INFO] Running org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC
   [INFO] Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
48.17 s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEC
   [INFO] Running org.apache.hadoop.fs.s3a.ITestS3AEncryptionAlgorithmValidation
   [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 
s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionAlgorithmValidation
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSUserDefinedKeyWithEncryptionContext
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.575 
s -- in 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSUserDefinedKeyWithEncryptionContext
   [INFO] Running org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSDefaultKey
   [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.246 
s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSDefaultKey
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionWithDefaultS3Settings
   [WARNING] Tests run: 5, Failures: 0, Errors: 0, Skipped: 5, Time elapsed: 
2.600 s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionWithDefaultS3Settings
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSUserDefinedKey
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.414 
s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSEKMSUserDefinedKey
   [INFO] Running org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSES3
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.680 
s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionSSES3
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionDSSEKMSUserDefinedKeyWithEncryptionContext
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.538 
s -- in 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionDSSEKMSUserDefinedKeyWithEncryptionContext
   [INFO] Running 
org.apache.hadoop.fs.s3a.ITestS3AEncryptionDSSEKMSUserDefinedKey
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.425 
s -- in org.apache.hadoop.fs.s3a.ITestS3AEncryptionDSSEKMSUserDefinedKey
   [INFO]
   [INFO] Results:
   [INFO]
   [WARNING] Tests run: 53, Failures: 0, Errors: 0, Skipped: 6
   ```
   
   ```
   mvn clean verify -Dit.test=TestMarshalledCredentials -Dtest=none
   
   [INFO] -------------------------------------------------------
   [INFO]  T E S T S
   [INFO] -------------------------------------------------------
   [INFO] Running org.apache.hadoop.fs.s3a.auth.TestMarshalledCredentials
   [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.133 
s -- in org.apache.hadoop.fs.s3a.auth.TestMarshalledCredentials
   [INFO]
   [INFO] Results:
   [INFO]
   [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0
   ```
   
   ```
   mvn clean verify -Dit.test=ITestS3AHugeFilesEncryption -Dtest=none
   
   [INFO] -------------------------------------------------------
   [INFO]  T E S T S
   [INFO] -------------------------------------------------------
   [INFO] Running org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesEncryption
   [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
33.04 s -- in org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesEncryption
   [INFO]
   [INFO] Results:
   [INFO]
   [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0
   ```
   
   Finally, I verified in the CloudTrail logs that the encryption context was set for both `aws:kms` and `aws:kms:dsse`.
   
   ```
   (...)
       {
         "eventTime": "2024-06-08T03:49:49Z",
         "eventSource": "s3.amazonaws.com",
         "eventName": "PutObject",
         "awsRegion": "us-west-1",
         "userAgent": "[Hadoop 3.5.0-SNAPSHOT, aws-sdk-java/2.24.6 
Linux/5.10.217-183.860.amzn2int.x86_64 OpenJDK_64-Bit_Server_VM/25.412-b08 
Java/1.8.0_412 vendor/Private_Build io/sync http/Apache cfg/retry-mode/adaptive 
hll/cross-region ft/s3-transfer]",
         "requestParameters": {
           "bucketName": "************",
           "x-amz-server-side-encryption-aws-kms-key-id": 
"arn:aws:kms:us-west-1:************:key/************",
           "Host": "************.s3.us-west-1.amazonaws.com",
           "x-amz-server-side-encryption": "aws:kms:dsse",
           "x-amz-server-side-encryption-context": 
"eyJjb21wb25lbnQiOiJmcy9zMyIsInByb2plY3QiOiJoYWRvb3AiLCJqaXJhIjoiSEFET09QLTE5MTk3In0=",
           "key": "test/"
         },
   (...)
   ```
   ```
   (...)
         "awsRegion": "us-west-1",
         "sourceIPAddress": "204.246.162.39",
         "userAgent": "[Hadoop 3.5.0-SNAPSHOT, aws-sdk-java/2.24.6 
Linux/5.10.217-183.860.amzn2int.x86_64 OpenJDK_64-Bit_Server_VM/25.412-b08 
Java/1.8.0_412 vendor/Private_Build io/sync http/Apache cfg/retry-mode/adaptive 
hll/cross-region ft/s3-transfer]",
         "requestParameters": {
           "bucketName": "************",
           "x-amz-server-side-encryption-aws-kms-key-id": 
"arn:aws:kms:us-west-1:************:key/************",
           "Host": "************.s3.us-west-1.amazonaws.com",
           "x-amz-server-side-encryption": "aws:kms",
           "x-amz-server-side-encryption-context": 
"eyJjb21wb25lbnQiOiJmcy9zMyIsInByb2plY3QiOiJoYWRvb3AiLCJqaXJhIjoiSEFET09QLTE5MTk3In0=",
           "key": "test/testEncryptionOverRename-0400"
         },
   (...)
   ```
   ```
   echo 
eyJjb21wb25lbnQiOiJmcy9zMyIsInByb2plY3QiOiJoYWRvb3AiLCJqaXJhIjoiSEFET09QLTE5MTk3In0=
 | base64 --decode
   {"component":"fs/s3","project":"hadoop","jira":"HADOOP-19197"}%
   ```
   
   I also executed the test with the following statement added to my KMS key policy:
   ```
   {
        "Effect": "Deny",
        "Principal": {
            "AWS": "*"
        },
        "Action": "kms:Decrypt",
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {
                "kms:EncryptionContext:project": "hadoop"
            }
        }
   }
   ```
   
   With that statement in place, tests without an encryption context fail, and the new test passes only if the matching key/value pair is set in `fs.s3a.encryption.context`.
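   
   For reference, the matching test configuration can also be expressed programmatically. This is a sketch, not part of the change: the property names are the standard S3A keys plus the new one from this PR, and the key ARN is a placeholder.
   
   ```
   // Sketch: programmatic equivalent of the auth-keys.xml settings above.
   import org.apache.hadoop.conf.Configuration;
   
   public class EncryptionContextConfSketch {
     public static void main(String[] args) {
       Configuration conf = new Configuration();
       conf.set("fs.s3a.encryption.algorithm", "SSE-KMS");
       conf.set("fs.s3a.encryption.key",
           "arn:aws:kms:us-west-1:111122223333:key/EXAMPLE");   // placeholder ARN
       // Must include project=hadoop to satisfy the
       // kms:EncryptionContext:project condition in the key policy above.
       conf.set("fs.s3a.encryption.context",
           "project=hadoop, jira=HADOOP-19197, component=fs/s3");
     }
   }
   ```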
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [-] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [-] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   

