In AWS, the ETag of a multipart object is the MD5 hash of the concatenated 
MD5 digests of all parts, followed by a dash and the number of parts.
See: https://forums.aws.amazon.com/thread.jspa?messageID=456442
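To verify against a client-side hash, you can recompute that value locally. 
Here is a rough Python sketch of the scheme as it is commonly observed (it is 
not formally specified by AWS, so treat it as an assumption). The function 
name multipart_etag and the part_size parameter are made up for illustration, 
and part_size has to match the part size the uploader actually used:

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Recompute the ETag expected for `data` uploaded in `part_size` chunks.

    Assumes the commonly observed scheme: MD5 over the concatenated binary
    MD5 digests of the parts, then "-" and the number of parts.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    # Binary (not hex) MD5 digest of each part, in upload order.
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]

    # Hash of hashes, rendered as hex, plus the part count.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

For a real 5GB+ payload you would feed the parts from a stream instead of 
holding everything in memory; the sketch only shows the arithmetic.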

In general, the S3 documentation says the ETag is not necessarily a valid 
MD5 digest in a number of cases, including multipart uploads.
See ETag definition here: 
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
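If you end up parsing the returned value yourself, a check along these lines 
could tell the cases apart before you decide whether to validate. This is 
only a sketch: classify_etag and its return shape are hypothetical names, not 
part of jclouds or any AWS SDK, and anything that matches neither pattern is 
treated as opaque:

```python
import re

# 32 hex characters: looks like a plain MD5 digest.
_PLAIN_MD5 = re.compile(r"[0-9a-f]{32}")
# 32 hex characters, a dash, and a part count: multipart-style ETag.
_MULTIPART = re.compile(r"([0-9a-f]{32})-([0-9]+)")


def classify_etag(etag: str):
    """Return (kind, hex_value, part_count) for an ETag string.

    kind is "md5", "multipart", or "opaque"; part_count is None unless
    the ETag has the multipart form.
    """
    # ETags are often returned wrapped in double quotes.
    value = etag.strip().strip('"').lower()

    if _PLAIN_MD5.fullmatch(value):
        return ("md5", value, None)

    match = _MULTIPART.fullmatch(value)
    if match:
        return ("multipart", match.group(1), int(match.group(2)))

    return ("opaque", value, None)
```

Only the "md5" case can be compared directly against a hash of the whole 
payload; the other two need different handling or have to be skipped.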



On 9/22/2015 11:10 AM, Veit Guna wrote:
> Hi.
>  
> We're using jclouds 1.9.1 with the aws-s3 provider. Until now, we have used 
> the returned etag of blobStore.putBlob() to manually verify
> against a client provided hash. That worked quite well for us. But since we 
> are hitting the 5GB limit of S3, we switched to the multipart() upload
> that jclouds offers. But now, putBlob() returns something like 
> <md5-hash>-<number> e.g. 90644a2d0c7b74483f8d2036f3e29fc5-2 that of course
> fails with our validation.
>  
> I guess this is due to the fact, that each chunk is hashed separately and 
> sent to S3. So there is no complete hash over the whole payload that could
> be returned by putBlob() - is that correct?
>  
> During my research I stumbled across this:
>  
> https://github.com/jclouds/jclouds/commit/f2d897d9774c2c0225c199c7f2f46971637327d6
>  
> Now I'm wondering, what the contract of putBlob() is. Should it only return 
> valid etag/hashes otherwise return null?
>  
> I'm asking that, because otherwise, I would have to start parsing and 
> validating the returned value by myself and skip any
> validation when it isn't a normal md5 hash. My guess is, that this is the 
> hash from the last transferred chunk plus
> the chunk number?
>  
> Maybe someone can shed some light on this :).
>  
> Thanks
> Veit
>  
> 
