[
https://issues.apache.org/jira/browse/HADOOP-18910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769965#comment-17769965
]
ASF GitHub Bot commented on HADOOP-18910:
-----------------------------------------
anujmodi2021 commented on code in PR #6069:
URL: https://github.com/apache/hadoop/pull/6069#discussion_r1339816363
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -861,6 +875,12 @@ private boolean checkUserError(int responseStatusCode) {
&& responseStatusCode < HttpURLConnection.HTTP_INTERNAL_ERROR);
}
+ private boolean isMd5ChecksumError(final AzureBlobFileSystemException e) {
Review Comment:
Added test to verify that this works and we return proper Exceptions to the
caller in case of MD5Mismatch.
Can you please elaborate why we need to make this static?
Added javadoc
> ABFS: Adding Support for MD5 Hash based integrity verification of the request
> content during transport
> -------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-18910
> URL: https://issues.apache.org/jira/browse/HADOOP-18910
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Reporter: Anuj Modi
> Assignee: Anuj Modi
> Priority: Major
> Labels: pull-request-available
>
> Azure Storage Supports Content-MD5 Request Headers in Both Read and Append
> APIs.
> Read: [Path - Read - REST API (Azure Storage Services) | Microsoft
> Learn|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/read]
> Append: [Path - Update - REST API (Azure Storage Services) | Microsoft
> Learn|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update]
> This change is to make client-side changes to support them. In Read request,
> we will send the appropriate header in response to which server will return
> the MD5 Hash of the data it sends back. On Client we will tally this with the
> MD5 hash computed from the data received.
> In Append request, we will compute the MD5 Hash of the data that we are
> sending to the server and specify that in appropriate header. Server on
> finding that header will tally this with the MD5 hash it will compute on the
> data received.
> This whole Checksum Validation Support is guarded behind a config, Config is
> by default disabled because with the use of "https" integrity of data is
> preserved anyways. This is introduced as an additional data integrity check
> which will have a performance impact as well.
> Users can decide if they want to enable this or not by setting the following
> config to *"true"* or *"false"* respectively. *Config:
> "fs.azure.enable.checksum.validation"*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]