Anuj Modi created HADOOP-18910:
----------------------------------
Summary: ABFS: Adding Support for MD5 Hash based integrity
verification of the request content during transport
Key: HADOOP-18910
URL: https://issues.apache.org/jira/browse/HADOOP-18910
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/azure
Reporter: Anuj Modi
Assignee: Anuj Modi
Azure Storage supports Content-MD5 request headers in both the Read and Append APIs.
Read: [Path - Read - REST API (Azure Storage Services) | Microsoft
Learn|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/read]
Append: [Path - Update - REST API (Azure Storage Services) | Microsoft
Learn|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update]
This change makes the client-side changes needed to support them. In a Read
request, we will send the appropriate header, in response to which the server
will return the MD5 hash of the data it sends back. On the client, we will
compare this with the MD5 hash computed from the data received.
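A minimal sketch of the client-side comparison on the Read path, using only the
JDK. The class and method names are hypothetical and are not taken from the
ABFS driver. It assumes the server returns the hash as a Base64-encoded MD5
digest in the Content-MD5 response header, and that the request header asking
for it is x-ms-range-get-content-md5, which is my reading of the linked REST
documentation rather than something this issue states.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the ABFS driver.
final class ReadMd5Sketch {

    // Returns true if the MD5 of buffer[offset, offset + length) equals the
    // Base64-encoded MD5 reported by the server (assumed to arrive in the
    // Content-MD5 response header).
    static boolean matchesServerMd5(byte[] buffer, int offset, int length,
                                    String serverMd5Base64) {
        byte[] remote;
        try {
            remote = Base64.getDecoder().decode(serverMd5Base64);
        } catch (IllegalArgumentException e) {
            // A malformed header value is treated as a mismatch.
            return false;
        }
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(buffer, offset, length);
            return MessageDigest.isEqual(md5.digest(), remote);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] received = "hello".getBytes(StandardCharsets.UTF_8);
        // Stand-in for the value the server would send back.
        String headerValue = "XUFAKrxLKna5cZ2REBfFkg==";
        System.out.println(
            matchesServerMd5(received, 0, received.length, headerValue));
    }
}
```

A real client would presumably fail the read with an exception on a mismatch
instead of returning a boolean; that choice is left open here.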
In an Append request, we will compute the MD5 hash of the data we are sending
to the server and specify it in the appropriate header. On finding that header,
the server will compare it with the MD5 hash it computes from the data received.
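A minimal sketch of how the hash for an Append request could be computed, using
only the JDK. The class and method names are hypothetical and are not taken
from the ABFS driver. It assumes the header is Content-MD5 and that its value
is the Base64 encoding of the raw 16-byte MD5 digest, which is the usual
convention for that header.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of the ABFS driver.
final class AppendMd5Sketch {

    // Computes the Base64-encoded MD5 of data[offset, offset + length),
    // i.e. only of the bytes that are actually sent in this request.
    static String computeMd5Base64(byte[] data, int offset, int length) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(data, offset, length);
            return Base64.getEncoder().encodeToString(md5.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        // The result would be set as the value of the (assumed) Content-MD5
        // request header before the append is sent.
        System.out.println(computeMd5Base64(payload, 0, payload.length));
    }
}
```

Hashing the exact offset and length that go on the wire matters when the write
buffer is larger than the payload; hashing the whole buffer would produce a
value the server cannot reproduce.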
This checksum validation support is guarded behind a config, which is disabled
by default because with "https" the integrity of data in transit is already
preserved. It is introduced as an additional data integrity check, and it has
a performance impact as well.
Users can decide whether to enable or disable this by setting the following
config to *"true"* or *"false"* respectively. *Config:
"fs.azure.enable.checksum.validation"*
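As an illustration, and assuming the key is read like any other Hadoop
property, enabling the check in core-site.xml might look like the fragment
below. Only the key name comes from this issue; the placement in core-site.xml
is an assumption.

```xml
<!-- Assumed placement: core-site.xml. The feature is disabled by default. -->
<property>
  <name>fs.azure.enable.checksum.validation</name>
  <value>true</value>
</property>
```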
--
This message was sent by Atlassian Jira
(v8.20.10#820010)