ivandika3 commented on code in PR #9294:
URL: https://github.com/apache/ozone/pull/9294#discussion_r2556047230


##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##########
@@ -1568,22 +1604,23 @@ private S3ChunkInputStreamInfo 
getS3ChunkInputStreamInfo(
     }
 
     // DigestInputStream is used for ETag calculation

Review Comment:
   This comment might need to be updated.



##########
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyDataStreamOutput.java:
##########
@@ -436,6 +436,13 @@ public void close() throws IOException {
     }
   }
 
+  /**
+   * Cleanup the incomplete multipart upload parts.
+   */

Review Comment:
   This seems inaccurate, `BlockDataStreamOutputEntryPool#cleanUp` does not do 
any MPU aborts.



##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##########
@@ -395,8 +425,9 @@ public Response put(
     } finally {
       // Reset the thread-local message digest instance in case of exception
       // and MessageDigest#digest is never called
-      if (digestInputStream != null) {
-        digestInputStream.getMessageDigest().reset();
+      if (multiDigestInputStream != null) {
+        multiDigestInputStream.getMessageDigest(OzoneConsts.MD5_HASH).reset();
+        multiDigestInputStream.getMessageDigest(OzoneConsts.FILE_HASH).reset();
       }

Review Comment:
   I think this can be merged into `MultDigestInputStream#resetDigests`? Same 
with the other instances.



##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/signature/StringToSignProducer.java:
##########
@@ -70,19 +80,47 @@ public final class StringToSignProducer {
       DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss'Z'")
           .withZone(ZoneOffset.UTC);
 
+  private static final Set<String> VALID_UNSIGNED_PAYLOADS;
+
+  static {
+    Set<String> set = new HashSet<>();
+    set.add(UNSIGNED_PAYLOAD);
+    set.add(STREAMING_UNSIGNED_PAYLOAD_TRAILER);
+    set.add(STREAMING_AWS4_HMAC_SHA256_PAYLOAD);
+    set.add(STREAMING_AWS4_HMAC_SHA256_PAYLOAD_TRAILER);
+    set.add(STREAMING_AWS4_ECDSA_P256_SHA256_PAYLOAD);
+    set.add(STREAMING_AWS4_ECDSA_P256_SHA256_PAYLOAD_TRAILER);
+    VALID_UNSIGNED_PAYLOADS = Collections.unmodifiableSet(set);
+  }
+
   private StringToSignProducer() {
   }
 
   public static String createSignatureBase(
       SignatureInfo signatureInfo,
       ContainerRequestContext context
   ) throws Exception {
-    return createSignatureBase(signatureInfo,
+    LowerCaseKeyStringMap lowerCaseKeyStringMap = 
LowerCaseKeyStringMap.fromHeaderMap(context.getHeaders());
+    String signatureBase = createSignatureBase(signatureInfo,
         context.getUriInfo().getRequestUri().getScheme(),
         context.getMethod(),
-        LowerCaseKeyStringMap.fromHeaderMap(context.getHeaders()),
+        lowerCaseKeyStringMap,
         fromMultiValueToSingleValueMap(
             context.getUriInfo().getQueryParameters()));
+
+    // validate x-amz-content-sha256
+    String payloadHash = getPayloadHash(lowerCaseKeyStringMap, 
!signatureInfo.isSignPayload());
+    if (!VALID_UNSIGNED_PAYLOADS.contains(payloadHash)) {
+      byte[] payload = IOUtils.toByteArray(context.getEntityStream());
+      context.setEntityStream(new ByteArrayInputStream(payload));

Review Comment:
   @hevinhsu Thanks for the continued work.
   
   I think removing try with resources will leak a lot of implementations to 
`ObjectEndpoint` (seem it requires nested finally). Is it possible to throw an 
exception in the `KeyOutputStream#close` without committing and handling it in 
the catch block if there is a mismatch? The cleanup can also be done before 
throwing an exception.
   
   Also left some implementation comments.
   
   



##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##########
@@ -1568,22 +1604,23 @@ private S3ChunkInputStreamInfo 
getS3ChunkInputStreamInfo(
     }
 
     // DigestInputStream is used for ETag calculation
-    DigestInputStream digestInputStream = new 
DigestInputStream(chunkInputStream, getMessageDigestInstance());
-    return new S3ChunkInputStreamInfo(digestInputStream, effectiveLength);
+    MultiDigestInputStream multiDigestInputStream =
+        new MultiDigestInputStream(chunkInputStream, 
getMessageDigestInstance(), getSha256DigestInstance());

Review Comment:
   We should only add the SHA-256 digest when needed (i.e. when 
"x-amz-content-sha256" header is actually a sha hash) to reduce unnecessary CPU 
overhead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to