[ https://issues.apache.org/jira/browse/HADOOP-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939230#comment-17939230 ]
ASF GitHub Bot commented on HADOOP-15224:
-----------------------------------------

raphaelazzolini commented on PR #7396:
URL: https://github.com/apache/hadoop/pull/7396#issuecomment-2761379637

   > > > BTW, does setting the algorithm to md5 restore
   > > - compatibility with third party stores when sdk >= 2.30.0
   > > - AWS v4 signer to work? that seems to have broken on _all_ previous releases, even those which were working

   I am not sure... this code change didn't add MD5; the SDK supports CRC32, CRC32C, SHA1 and SHA256. MD5 is not an option through the SDK checksum methods, but it is still allowed by the old approach, where you calculate the digest yourself and provide it in the Content-MD5 header (a sketch of both approaches follows the quoted issue below).


> Add option to set checksum on S3 object uploads
> -----------------------------------------------
>
>                 Key: HADOOP-15224
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15224
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Raphael Azzolini
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>
> [~rdblue] reports sometimes he sees corrupt data on S3. Given the MD5 checks from
> upload to S3, it's likelier to have happened in VM RAM, HDD or nearby.
> If the MD5 checksum for each block was built up as data was written to it,
> and checked against the etag, RAM/HDD storage of the saved blocks could be
> removed as sources of corruption.
> The obvious place would be
> {{org.apache.hadoop.fs.s3a.S3ADataBlocks.DataBlock}}
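To make the two upload paths in the comment concrete, here is a minimal sketch using the AWS SDK for Java v2. It is illustrative only, not the code from PR #7396: the bucket and object key names are placeholders, credentials and region are assumed to come from the environment, and error handling is elided.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ChecksumAlgorithm;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class ChecksumUploadSketch {
  public static void main(String[] args) throws Exception {
    byte[] data = "example payload".getBytes(StandardCharsets.UTF_8);

    try (S3Client s3 = S3Client.create()) {
      // Approach 1: SDK-managed checksum. The SDK computes the digest and
      // attaches it to the request. The ChecksumAlgorithm enum offers
      // CRC32, CRC32C, SHA1 and SHA256; MD5 is not among them.
      s3.putObject(
          PutObjectRequest.builder()
              .bucket("example-bucket")          // placeholder
              .key("object-with-sdk-checksum")   // placeholder
              .checksumAlgorithm(ChecksumAlgorithm.SHA256)
              .build(),
          RequestBody.fromBytes(data));

      // Approach 2: the old MD5 route. Calculate the digest yourself and
      // supply it as the Content-MD5 header; the store verifies it on receipt.
      String md5Base64 = Base64.getEncoder().encodeToString(
          MessageDigest.getInstance("MD5").digest(data));
      s3.putObject(
          PutObjectRequest.builder()
              .bucket("example-bucket")          // placeholder
              .key("object-with-content-md5")    // placeholder
              .contentMD5(md5Base64)
              .build(),
          RequestBody.fromBytes(data));
    }
  }
}
```

Both requests should succeed against AWS S3 itself; third-party stores that reject the newer checksum headers may only accept the Content-MD5 route, which is the sdk >= 2.30.0 compatibility concern raised in the quoted question.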