Hi John,
I wasn't actually calculating the MD5 explicitly. I traced the code to the
ServiceUtils.downloadObjectToFile method in the Amazon S3 SDK; my invocation
of S3Utils.getObject failed at the following code in ServiceUtils:
byte[] clientSideHash = null;
byte[] serverSideHash = null;
try {
    // Multipart Uploads don't have an MD5 calculated on the service side
    if (ServiceUtils.isMultipartUploadETag(s3Object.getObjectMetadata().getETag()) == false) {
        clientSideHash = Md5Utils.computeMD5Hash(new FileInputStream(destinationFile));
        serverSideHash = BinaryUtils.fromHex(s3Object.getObjectMetadata().getETag());
    }
} catch (Exception e) {
    log.warn("Unable to calculate MD5 hash to validate download: "
            + e.getMessage(), e);
}

if (performIntegrityCheck && clientSideHash != null && serverSideHash != null
        && !Arrays.equals(clientSideHash, serverSideHash)) {
    throw new AmazonClientException("Unable to verify integrity of data download. "
            + "Client calculated content hash didn't match hash calculated by Amazon S3. "
            + "The data stored in '" + destinationFile.getAbsolutePath()
            + "' may be corrupt.");
}
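For context, my understanding (from reading the SDK source, not from the
docs) is that for a multipart upload S3 does not store the whole-object MD5
as the ETag. It stores the MD5 of the concatenated part-level MD5 digests,
plus a "-<part count>" suffix, and that "-" suffix is what
isMultipartUploadETag keys on to skip the comparison above. So if an
S3-compatible store returns an ETag for a multipart object without that
suffix, the SDK treats it as a plain MD5 and the integrity check fires. A
rough sketch of the derivation (class and method names are mine, not SDK
code):

import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.util.List;

public class MultipartETagSketch {

    // Derive the ETag S3 reports for a multipart upload from the
    // binary MD5 digests of the individual parts.
    static String multipartETag(List<byte[]> partMd5s) throws Exception {
        // Concatenate the raw 16-byte MD5 digests of each part.
        ByteArrayOutputStream concatenated = new ByteArrayOutputStream();
        for (byte[] partMd5 : partMd5s) {
            concatenated.write(partMd5);
        }
        // MD5 the concatenation and hex-encode it.
        byte[] md5OfMd5s = MessageDigest.getInstance("MD5")
                .digest(concatenated.toByteArray());
        StringBuilder hex = new StringBuilder();
        for (byte b : md5OfMd5s) {
            hex.append(String.format("%02x", b));
        }
        // The "-<part count>" suffix is what
        // ServiceUtils.isMultipartUploadETag() detects.
        return hex + "-" + partMd5s.size();
    }
}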
Some web discussion mentioned that this is related to multi-part copy:
http://sourceforge.net/p/s3tools/discussion/618865/thread/50a00c18. But
the resolution there does not seem to work for me.
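If I read ServiceUtils right, the ETag/MD5 comparison only happens inside
downloadObjectToFile, so one workaround I am considering is to call plain
getObject and stream the content to disk myself. A rough sketch, with
placeholder credentials, endpoint, bucket, and key:

import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.S3Object;

public class PlainGetObject {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials and endpoint; point these at the Riak CS setup.
        AmazonS3 s3 = new AmazonS3Client(
                new BasicAWSCredentials("accessKey", "secretKey"));
        s3.setEndpoint("http://riak-cs.example.com:8080");

        // Placeholder bucket/key names.
        S3Object object = s3.getObject("my-bucket", "my-key");

        // Stream the content to a local file; unlike downloadObjectToFile,
        // no client-side MD5/ETag comparison happens on this path.
        try (InputStream in = object.getObjectContent();
             OutputStream out = new FileOutputStream("/tmp/my-key")) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}

That said, this drops the integrity check entirely, which is why I would
rather understand the ETag mismatch itself.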
Any advice?
Thanks
-min
On 6/6/13 2:02 PM, "John Burwell" <[email protected]> wrote:
>Min,
>
>Are you calculating the MD5 or letting the Amazon client do it?
>
>Thanks,
>-John
>
>On Jun 6, 2013, at 4:54 PM, Min Chen <[email protected]> wrote:
>
>> Thanks Tom. Indeed I have an S3 question that needs some advice from
>> some S3 experts. To support uploading objects > 5G, I have used
>> TransferManager.upload to upload objects to S3. The upload went fine
>> and the objects were successfully put to S3. However, later on when I
>> use "s3cmd get <object key>" to retrieve such an object, I always get
>> this exception:
>>
>> "MD5 signatures do not match: computed=Y, received="X"
>>
>> It seems that Amazon S3 keeps a different MD5 sum for a multi-part
>> uploaded object. We have been using Riak CS for our S3 testing. If I
>> switch away from multi-part upload and directly invoke S3 putObject,
>> I do not run into this issue. Have you seen this before?
>>
>> -min
>>
>> On 6/6/13 1:56 AM, "Thomas O'Dowd" <[email protected]> wrote:
>>
>>> Thanks Min. I've printed out the material and am reading new threads.
>>> Can't comment much yet until I understand things a bit more.
>>>
>>> Meanwhile, feel free to hit me up with any S3 questions you have. I'm
>>> looking forward to playing with the object_store branch and testing it
>>> out.
>>>
>>> Tom.
>>>
>>> On Wed, 2013-06-05 at 16:14 +0000, Min Chen wrote:
>>>> Welcome Tom. You can check out this FS,
>>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+Backup+Object+Store+Plugin+Framework,
>>>> for the secondary storage architectural work done in the object_store
>>>> branch. You may also check out the following recent threads regarding
>>>> 3 major technical questions raised by the community, as well as our
>>>> answers and clarifications.
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3C77B337AF224FD84CBF8401947098DD87036A76%40SJCPEX01CL01.citrite.net%3E
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3CCDD22955.3DDDC%25min.chen%40citrix.com%3E
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201306.mbox/%3CCDD2300D.3DE0C%25min.chen%40citrix.com%3E
>>>>
>>>>
>>>> That branch is mainly worked on by Edison and me, and we are in the
>>>> PST timezone.
>>>>
>>>> Thanks
>>>> -min
>>> --
>>> Cloudian KK - http://www.cloudian.com/get-started.html
>>> Fancy 100TB of full featured S3 Storage?
>>> Checkout the Cloudian® Community Edition!
>>>
>>
>