Steve Loughran created HADOOP-15224:
---------------------------------------

             Summary: build up MD5 checksum as blocks are built in 
S3ABlockOutputStream; validate upload
                 Key: HADOOP-15224
                 URL: https://issues.apache.org/jira/browse/HADOOP-15224
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.0.0
            Reporter: Steve Loughran


[~rdblue] reports that he sometimes sees corrupt data on S3. Given the MD5 
checks on upload to S3, it's likelier to have happened in VM RAM, HDD or nearby.

If the MD5 checksum for each block were built up as data was written to it, and 
then checked against the etag returned by the upload, RAM/HDD storage of the 
saved blocks could be ruled out as a source of corruption.

The obvious place would be {{org.apache.hadoop.fs.s3a.S3ADataBlocks.DataBlock}}.
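
A minimal sketch of the idea, not the actual S3A code: the class {{Md5TrackingBlock}} and its methods are hypothetical stand-ins for a data block, and an in-memory buffer stands in for the block's RAM/HDD storage. It assumes the etag returned for an uploaded block is the hex MD5 of the bytes, possibly wrapped in double quotes; that assumption does not hold for SSE-KMS/SSE-C encrypted uploads or for the etag of a completed multipart object.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical stand-in for a data block; not the real S3ADataBlocks.DataBlock. */
final class Md5TrackingBlock {
  // Stands in for the block's RAM/HDD storage.
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private final MessageDigest md5;
  // Cached hex digest; non-null once the block has been finished.
  private String hex;

  Md5TrackingBlock() {
    try {
      md5 = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /** Update the digest from the caller's bytes before they reach block storage. */
  void write(byte[] data, int offset, int length) {
    if (hex != null) {
      throw new IllegalStateException("block already finished");
    }
    md5.update(data, offset, length);
    buffer.write(data, offset, length);
  }

  /** Bytes as read back from block storage; these are what would be uploaded. */
  byte[] storedBytes() {
    return buffer.toByteArray();
  }

  /** Finish the block and return the lowercase hex MD5 of everything written. */
  String md5Hex() {
    if (hex == null) {
      StringBuilder sb = new StringBuilder();
      for (byte b : md5.digest()) {
        sb.append(String.format("%02x", b & 0xff));
      }
      hex = sb.toString();
    }
    return hex;
  }

  /** Compare against an etag, assumed to be hex MD5 with optional double quotes. */
  boolean matchesEtag(String etag) {
    String stripped = etag;
    if (stripped.length() >= 2 && stripped.startsWith("\"") && stripped.endsWith("\"")) {
      stripped = stripped.substring(1, stripped.length() - 1);
    }
    return stripped.equalsIgnoreCase(md5Hex());
  }

  public static void main(String[] args) {
    Md5TrackingBlock block = new Md5TrackingBlock();
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
    block.write(data, 0, data.length);
    // In a real upload the etag would come from the store's response.
    String etag = "\"5d41402abc4b2a76b9719d911017c592\"";
    System.out.println("upload valid: " + block.matchesEtag(etag));
  }
}
```

Because the digest is updated from the caller's bytes at write time, a mismatch with the etag would point at corruption somewhere between the write call and the store, which includes the buffered copy in RAM or on disk.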



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
