[ https://issues.apache.org/jira/browse/HADOOP-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011833#comment-18011833 ]
ASF GitHub Bot commented on HADOOP-19604:
-----------------------------------------

anmolanmol1234 opened a new pull request, #7853:
URL: https://github.com/apache/hadoop/pull/7853

   This PR adds a configuration flag to control the use of full blob MD5
   validation during the PutBlockList (flush) operation. The functionality
   to validate the MD5 hash of the entire blob already existed, but it could
   not be toggled. With this change, the feature is now configurable and is
   disabled by default.

   When the config is set to false, the system uses the default block ID
   hash for integrity checks. When set to true, it performs full blob MD5
   validation.

   This config has been introduced because full blob MD5 computation can
   lead to increased latency and higher CPU usage, especially for large
   blobs.


> ABFS: Fix WASB ABFS compatibility issues
> ----------------------------------------
>
>                 Key: HADOOP-19604
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19604
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.4.1
>            Reporter: Anmol Asrani
>            Assignee: Anmol Asrani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.1
>
>
> Fix WASB ABFS compatibility issues. Fix issues such as:-
> # BlockId computation to be consistent across clients for PutBlock and
> PutBlockList
> # Restrict url encoding of certain json metadata during setXAttr calls.
> # Maintain the md5 hash of whole block to validate data integrity during
> flush.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
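
For readers following the pull request description above, the toggle it describes can be sketched in plain Java. This is only an illustration under stated assumptions, not the ABFS implementation: the class name, the property key `example.abfs.full.blob.md5.enabled`, and the method names are all hypothetical, and the "block ID hash" is approximated here as an MD5 over the block ID strings because the message does not say how it is computed. Base64 output is likewise an arbitrary choice for the sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

/** Illustrative sketch only; none of these names come from the ABFS code base. */
public class FlushChecksumSketch {

    // Hypothetical property key; the real key is defined in the pull request.
    public static final String FULL_BLOB_MD5_KEY = "example.abfs.full.blob.md5.enabled";

    // Disabled by default, as stated in the PR description.
    public static final boolean FULL_BLOB_MD5_DEFAULT = false;

    private static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Full blob validation: every byte of every block is fed to the digest. */
    public static String fullBlobMd5(List<byte[]> blocks) {
        MessageDigest digest = md5();
        for (byte[] block : blocks) {
            digest.update(block);
        }
        return Base64.getEncoder().encodeToString(digest.digest());
    }

    /** Assumed stand-in for the block ID hash: only the IDs are hashed. */
    public static String blockIdHash(List<String> blockIds) {
        MessageDigest digest = md5();
        for (String id : blockIds) {
            digest.update(id.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(digest.digest());
    }

    /** Picks the integrity value to use at flush time based on the flag. */
    public static String checksumForFlush(boolean fullBlobMd5Enabled,
                                          List<String> blockIds,
                                          List<byte[]> blocks) {
        return fullBlobMd5Enabled ? fullBlobMd5(blocks) : blockIdHash(blockIds);
    }

    public static void main(String[] args) {
        // Read the hypothetical flag from a system property for demonstration.
        boolean enabled = Boolean.parseBoolean(
            System.getProperty(FULL_BLOB_MD5_KEY, String.valueOf(FULL_BLOB_MD5_DEFAULT)));

        List<String> ids = Arrays.asList("block-0001", "block-0002");
        List<byte[]> blocks = Arrays.asList(
            "hello ".getBytes(StandardCharsets.UTF_8),
            "world".getBytes(StandardCharsets.UTF_8));

        System.out.println("full blob MD5 enabled: " + enabled);
        System.out.println("flush checksum: " + checksumForFlush(enabled, ids, blocks));
    }
}
```

The sketch makes the cost difference visible: the full blob path touches every byte that was written, so its work grows with blob size, while the block ID path only hashes a short list of identifiers. That matches the latency and CPU concern given as the reason for making the feature optional.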