Copilot commented on code in PR #7853: URL: https://github.com/apache/hadoop/pull/7853#discussion_r2260273456
########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java: ########## @@ -1103,8 +1107,21 @@ public AbfsRestOperation flush(byte[] buffer, AbfsRestOperation op1 = getPathStatus(path, true, tracingContext, contextEncryptionAdapter); String metadataMd5 = op1.getResult().getResponseHeader(CONTENT_MD5); - if (blobMd5 != null && !blobMd5.equals(metadataMd5)) { - throw ex; + /* + * Validate the response by comparing the server's MD5 metadata against either: + * 1. The full blob content MD5 (if full blob checksum validation is enabled), or + * 2. The full block ID buffer MD5 (fallback if blob checksum validation is disabled) + */ + if (getAbfsConfiguration().isFullBlobChecksumValidationEnabled() && blobMd5 != null) { Review Comment: The configuration check `getAbfsConfiguration().isFullBlobChecksumValidationEnabled()` is performed inside the response validation logic, but this same check was already done earlier in the method. Consider storing the result in a variable to avoid redundant method calls. ```suggestion if (fullBlobChecksumValidationEnabled && blobMd5 != null) { ``` ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java: ########## @@ -1073,11 +1073,15 @@ public AbfsRestOperation flush(byte[] buffer, requestHeaders.add(new AbfsHttpHeader(CONTENT_LENGTH, String.valueOf(buffer.length))); requestHeaders.add(new AbfsHttpHeader(CONTENT_TYPE, APPLICATION_XML)); requestHeaders.add(new AbfsHttpHeader(IF_MATCH, eTag)); + String md5Hash = null; if (leaseId != null) { requestHeaders.add(new AbfsHttpHeader(X_MS_LEASE_ID, leaseId)); } - if (blobMd5 != null) { + if (isFullBlobChecksumValidationEnabled() && blobMd5 != null) { requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, blobMd5)); + } else { + md5Hash = computeMD5Hash(buffer, 0, buffer.length); + requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, md5Hash)); Review Comment: The variable `md5Hash` is declared but only used in the else branch. Consider declaring it within the else block to improve code clarity and reduce the scope of the variable. ########## hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/services/ITestAbfsOutputStream.java: ########## @@ -481,6 +481,8 @@ public void testResetCalledOnExceptionInRemoteFlush() throws Exception { //expected exception } // Verify that reset was called on the message digest - Mockito.verify(mockMessageDigest, Mockito.times(1)).reset(); + if (spiedClient.isChecksumValidationEnabled()) { Review Comment: The test is checking `isChecksumValidationEnabled()` but the code change suggests this should be checking `isFullBlobChecksumValidationEnabled()` since the MD5 reset behavior is now conditional on full blob checksum validation, not general checksum validation. ```suggestion if (spiedClient.isFullBlobChecksumValidationEnabled()) { ``` ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java: ########## @@ -1705,6 +1709,10 @@ public void setIsChecksumValidationEnabled(boolean isChecksumValidationEnabled) this.isChecksumValidationEnabled = isChecksumValidationEnabled; } + public boolean isFullBlobChecksumValidationEnabled() { Review Comment: The getter method `isFullBlobChecksumValidationEnabled()` lacks a corresponding setter method, which breaks the pattern established by other configuration properties like `isChecksumValidationEnabled`. This could cause issues for programmatic configuration changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org