ivandika3 commented on code in PR #8506:
URL: https://github.com/apache/ozone/pull/8506#discussion_r2106138338


##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/util/S3Utils.java:
##########
@@ -198,4 +200,11 @@ public static String generateCanonicalUserId(String input) {
     return DigestUtils.sha256Hex(input);
   }
 
+  public static boolean isEtagMisMatch(String encodedClientETag, String serverETag) {

Review Comment:
   Nit: rename to `isETagMismatch`
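
For reference, a minimal sketch of what a helper with the suggested name could look like. The body is not shown in the diff, so everything here is an assumption: it supposes the client value is a Base64-encoded digest (as in a `Content-MD5` style header) and the server ETag is a hex string, possibly quoted. The class name `ETagCheckSketch` and the `toHex` helper are hypothetical.

```java
import java.util.Base64;

public final class ETagCheckSketch {
  private ETagCheckSketch() {
  }

  /** Lowercase hex encoding of a byte array. */
  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  /**
   * Assumed semantics: returns true only when the client supplied a digest
   * and it differs from the server-side ETag.
   */
  public static boolean isETagMismatch(String encodedClientETag,
      String serverETag) {
    if (encodedClientETag == null || encodedClientETag.isEmpty()) {
      // No digest sent by the client, so there is nothing to verify.
      return false;
    }
    if (serverETag == null) {
      return true;
    }
    final byte[] decoded;
    try {
      decoded = Base64.getDecoder().decode(encodedClientETag);
    } catch (IllegalArgumentException e) {
      // Malformed Base64 is treated as a mismatch in this sketch.
      return true;
    }
    // ETags are often wrapped in double quotes; strip them before comparing.
    String normalizedServer = serverETag.replace("\"", "");
    return !toHex(decoded).equalsIgnoreCase(normalizedServer);
  }
}
```

Treating a missing header as "no mismatch" keeps the check optional for clients that do not send a digest; whether that matches the PR's intent is not confirmed by the diff.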



##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##########
@@ -896,15 +903,21 @@ public Response completeMultipartUpload(@PathParam("bucket") String bucket,
         LOG.debug("Parts map {}", partsMap);
       }
 
-      omMultipartUploadCompleteInfo = getClientProtocol()
+      OmMultipartUploadCompleteInfo omMultipartUploadCompleteInfo = getClientProtocol()
           .completeMultipartUpload(volume.getName(), bucket, key, uploadID,
               partsMap);
+      String serverEtag = omMultipartUploadCompleteInfo.getHash();
+      String encodedClientETag = headers.getHeaderString(CHECKSUM_HEADER);
+      if (S3Utils.isEtagMisMatch(encodedClientETag, serverEtag)) {
+        abortMultipartUpload(volume, bucket, key, uploadID);

Review Comment:
   Is this multipart upload abort specified in the AWS S3 spec? If not, I'd 
prefer not to abort the incomplete multipart upload. The user can abort it 
themselves, or we can let the background multipart upload cleanup service 
abort it after a while (default 7 days).



##########
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java:
##########
@@ -342,6 +343,13 @@ public Response put(
           output.getMetadata().put(ETAG, eTag);
         }
       }
+
+      String encodedClientETag = headers.getHeaderString(CHECKSUM_HEADER);
+      if (S3Utils.isEtagMisMatch(encodedClientETag, eTag)) {
+        delete(bucketName, keyPath, uploadID, null);

Review Comment:
   I'm a bit worried about this: if users create the key concurrently, this 
might delete another user's key instead. I would prefer that the ETag check be 
done before the key is committed (i.e. before `OzoneOutputStream` is closed). 
In the case of multipart uploads, the user can decide to upload the same part 
again instead of aborting the multipart upload (which might already contain 
many large parts).
   
   Since we have an open key cleanup service, the single / multipart upload 
open keys will be deleted eventually.
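
To illustrate the ordering suggested above, here is a rough, self-contained sketch of "verify first, commit second". It does not use the Ozone API: `FakeStore` and `FakeKeyStream` are hypothetical stand-ins (the latter for a stream that commits the key on `close()`), and MD5 is assumed as the digest. On a mismatch the stream is simply never closed, so nothing is committed and no delete is needed.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public final class VerifyBeforeCommitSketch {
  private VerifyBeforeCommitSketch() {
  }

  /** Hypothetical key store: a key is visible only once committed. */
  static final class FakeStore {
    final Map<String, byte[]> committed = new HashMap<>();
  }

  /** Hypothetical stream that commits the key when close() is called. */
  static final class FakeKeyStream extends ByteArrayOutputStream {
    private final FakeStore store;
    private final String key;

    FakeKeyStream(FakeStore store, String key) {
      this.store = store;
      this.key = key;
    }

    @Override
    public void close() {
      store.committed.put(key, toByteArray());
    }
  }

  /** MD5 digest of the data (assumed digest algorithm for this sketch). */
  static byte[] md5(byte[] data) {
    try {
      return MessageDigest.getInstance("MD5").digest(data);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /**
   * Writes the data, checks the digest, and commits only on a match.
   * Returns false when the write is rejected; a real endpoint would
   * raise an S3 error here instead of returning a boolean.
   */
  static boolean put(FakeStore store, String key, byte[] data,
      byte[] expectedMd5) {
    FakeKeyStream out = new FakeKeyStream(store, key);
    out.write(data, 0, data.length);
    if (expectedMd5 != null
        && !MessageDigest.isEqual(expectedMd5, md5(data))) {
      // Not closing the stream means the key is never committed; the
      // open key is left for a background cleanup to remove.
      return false;
    }
    out.close(); // commit happens only after the digest check passed
    return true;
  }
}
```

Because the rejected write never becomes visible, a concurrent writer's committed key cannot be removed by the failed request.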


