Hi John,

I didn't actually calculate the MD5 explicitly. I traced the code to the
ServiceUtils.downloadObjectToFile method in the Amazon S3 SDK; my invocation
of S3Utils.getObject failed at the following code in ServiceUtils:
byte[] clientSideHash = null;
byte[] serverSideHash = null;
try {
    // Multipart Uploads don't have an MD5 calculated on the service side
    if (ServiceUtils.isMultipartUploadETag(s3Object.getObjectMetadata().getETag()) == false) {
        clientSideHash = Md5Utils.computeMD5Hash(new FileInputStream(destinationFile));
        serverSideHash = BinaryUtils.fromHex(s3Object.getObjectMetadata().getETag());
    }
} catch (Exception e) {
    log.warn("Unable to calculate MD5 hash to validate download: " + e.getMessage(), e);
}

if (performIntegrityCheck && clientSideHash != null && serverSideHash != null
        && !Arrays.equals(clientSideHash, serverSideHash)) {
    throw new AmazonClientException("Unable to verify integrity of data download. "
            + "Client calculated content hash didn't match hash calculated by Amazon S3. "
            + "The data stored in '" + destinationFile.getAbsolutePath()
            + "' may be corrupt.");
}

Some web discussion mentions that this is related to multipart copy:
http://sourceforge.net/p/s3tools/discussion/618865/thread/50a00c18. But the
resolution there does not work for me. Any advice?
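For reference, here is a small sketch of how I understand the multipart ETag
convention, in case it helps narrow things down. It assumes Riak CS follows
S3's usual scheme (the ETag of a multipart object is the MD5 of the
concatenated per-part MD5 digests, suffixed with "-<partCount>"); the class
and method names are just mine, and the partSize argument is an assumption
that must match whatever TransferManager actually used for the upload:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MultipartEtagCheck {

    // S3-style multipart ETags look like "<hex>-<partCount>", which is why
    // the SDK's isMultipartUploadETag() can detect them by the '-'.
    static boolean looksLikeMultipartEtag(String etag) {
        return etag.contains("-");
    }

    // Recompute the conventional multipart ETag for a local file: the MD5
    // of the concatenated per-part MD5 digests, plus "-<partCount>".
    // partSize is an assumption and must match the upload's part size.
    static String multipartEtag(String path, long partSize)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5OfPartMd5s = MessageDigest.getInstance("MD5");
        int partCount = 0;
        byte[] buf = new byte[8192];
        try (InputStream in = new FileInputStream(path)) {
            boolean eof = false;
            while (!eof) {
                MessageDigest partMd5 = MessageDigest.getInstance("MD5");
                long remaining = partSize;
                while (remaining > 0) {
                    int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                    if (n < 0) {
                        eof = true;
                        break;
                    }
                    partMd5.update(buf, 0, n);
                    remaining -= n;
                }
                if (remaining == partSize) {
                    break; // no bytes were read, so there is no further part
                }
                md5OfPartMd5s.update(partMd5.digest());
                partCount++;
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5OfPartMd5s.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString() + "-" + partCount;
    }
}

If the recomputed value matches the ETag the service returns, the mismatch
would just be the multipart ETag convention (the ETag is not a plain MD5 of
the whole object) rather than actual data corruption.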
Thanks,
-min

On 6/6/13 2:02 PM, "John Burwell" <jburw...@basho.com> wrote:

>Min,
>
>Are you calculating the MD5 or letting the Amazon client do it?
>
>Thanks,
>-John
>
>On Jun 6, 2013, at 4:54 PM, Min Chen <min.c...@citrix.com> wrote:
>
>> Thanks Tom. Indeed I have an S3 question that needs some advice from S3
>> experts. To support uploading objects > 5 GB, I used
>> TransferManager.upload to upload objects to S3; the upload went fine and
>> the objects were successfully put to S3. However, when I later use
>> "s3cmd get <object key>" to retrieve such an object, I always get this
>> exception:
>>
>> "MD5 signatures do not match: computed=Y, received=X"
>>
>> It seems that Amazon S3 keeps a different MD5 sum for multipart-uploaded
>> objects. We have been using Riak CS for our S3 testing. If I switch to
>> not using multipart upload and invoke S3 putObject directly, I do not
>> run into this issue. Have you seen this before?
>>
>> -min
>>
>> On 6/6/13 1:56 AM, "Thomas O'Dowd" <tpod...@cloudian.com> wrote:
>>
>>> Thanks Min. I've printed out the material and am reading the new
>>> threads. I can't comment much yet until I understand things a bit
>>> more.
>>>
>>> Meanwhile, feel free to hit me up with any S3 questions you have. I'm
>>> looking forward to playing with the object_store branch and testing it
>>> out.
>>>
>>> Tom.
>>>
>>> On Wed, 2013-06-05 at 16:14 +0000, Min Chen wrote:
>>>> Welcome Tom. You can check out this FS
>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+Backup+Object+Store+Plugin+Framework
>>>> for the secondary storage architectural work done in the object_store
>>>> branch. You may also check out the following recent threads regarding
>>>> the 3 major technical questions raised by the community, as well as
>>>> our answers and clarifications.
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3C77B337AF224FD84CBF8401947098DD87036A76%40SJCPEX01CL01.citrite.net%3E
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3CCDD22955.3DDDC%25min.chen%40citrix.com%3E
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3CCDD2300D.3DE0C%25min.chen%40citrix.com%3E
>>>>
>>>> That branch is mainly worked on by Edison and me, and we are in the
>>>> PST timezone.
>>>>
>>>> Thanks
>>>> -min
>>>
>>> --
>>> Cloudian KK - http://www.cloudian.com/get-started.html
>>> Fancy 100TB of full featured S3 Storage?
>>> Check out the Cloudian® Community Edition!