On Twitter, there was a conversation a few days ago that started:

@fbeausoleil: How do you ensure end-to-end integrity of the data you store
on #S3 ? I calculate a checksum, upload, download then compare. Thoughts?

That didn't sound right to me, and indeed, @rafaelrosafu pointed out
shortly thereafter:

@rafaelrosafu: @fbeausoleil and docs.aws.amazon.com/AmazonS3/lates…
under Content-MD5 header

But in fact, s3cmd wasn't issuing the Content-MD5 header, even when we
already knew what to put into it.  So S3 couldn't explicitly tell us when
an object was corrupted on upload (which should be a rarity anyhow). Who
needs end-to-end checking anyhow?
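
For anyone who hasn't looked at the header before, here is a minimal
sketch of how a Content-MD5 value is built.  The function name
content_md5 is made up for illustration and is not taken from the s3cmd
source; the one fact it relies on is that the header carries the
base64-encoded raw 128-bit MD5 digest of the body, not the hex string.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the base64-encoded MD5 digest of body (hypothetical helper)."""
    digest = hashlib.md5(body).digest()  # raw 16 bytes, not the hex form
    return base64.b64encode(digest).decode("ascii")


# The value would go out with the PUT so the server can check the body
# it received against the digest the client computed.
headers = {"Content-MD5": content_md5(b"hello world")}
```

With the header present, a body that doesn't match the digest should be
rejected by S3 rather than stored silently.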

So, a few patches later, we can.  I'd appreciate some more eyes on this
patch series before pulling it into master, but it feels about right.

https://github.com/mdomsch/s3cmd/commits/feature/content-md5

Matt Domsch (5):
      add Content-MD5 header to PUT objects (not multipart)
      add Content-MD5 header for each multipart chunk
      Don't double-calculate MD5s on multipart chunks
      add Content-MD5 on put, not just sync
      handle errors during multipart uploads
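
The per-chunk idea in the multipart patches can be sketched roughly as
below.  This is not the code from the series; iter_chunks_with_md5 and
its parameters are hypothetical names.  It only illustrates hashing each
chunk once as it is read, so the same digest can be reused for that
part's Content-MD5 header instead of being calculated a second time.

```python
import base64
import hashlib
import io


def iter_chunks_with_md5(stream, chunk_size):
    """Yield (part_number, chunk, content_md5), hashing each chunk once."""
    part_number = 1
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # One MD5 pass per chunk; callers reuse this value rather than
        # re-reading and re-hashing the same bytes.
        md5 = base64.b64encode(hashlib.md5(chunk).digest()).decode("ascii")
        yield part_number, chunk, md5
        part_number += 1


# Tiny in-memory example: 10 bytes split into 4-byte parts.
parts = list(iter_chunks_with_md5(io.BytesIO(b"abcdefghij"), 4))
```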


If I got this right, if S3 returns a 400 BadDigest, we retry (like we would
any other retry-able error).

I'll also note that we aren't explicitly capturing a 503 SlowDown anywhere
else (multipart does now, in this series) for use in future operations;
it's only caught and acted on during retries of the one operation on the
one object, not thereafter.  Maybe that's OK.  But I'm tempted to catch
SlowDown at the send_request() level rather than higher up, and retry
there (with exponential backoff rather than the linear backoff currently
being used).  Otherwise we have to scatter retry and backoff logic all
over the place.
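
To make the proposal concrete, here is a rough sketch of centralized
retry with exponential backoff.  Everything in it is an assumption for
discussion: S3Error, send_with_backoff, the RETRYABLE set and the default
values are invented names, and this is not how send_request() is written
today.

```python
import time

# (HTTP status, S3 error code) pairs treated as retryable in this sketch.
RETRYABLE = {(400, "BadDigest"), (503, "SlowDown")}


class S3Error(Exception):
    """Hypothetical exception carrying the HTTP status and S3 error code."""

    def __init__(self, status, code):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def send_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(); on a retryable error wait base_delay * 2**attempt, retry."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except S3Error as err:
            retryable = (err.status, err.code) in RETRYABLE
            if not retryable or attempt == max_retries:
                raise
            # Delays grow 1, 2, 4, 8, ... times base_delay instead of
            # growing by a fixed step as linear backoff would.
            sleep(base_delay * 2 ** attempt)
```

Every caller would then go through the one wrapper, so put, sync and
multipart wouldn't each need their own retry loop.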

I'm wondering if this isn't the source of the various "failed retry" errors
that are routinely posted to the bug list.

Thanks,
Matt
_______________________________________________
S3tools-general mailing list
S3tools-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/s3tools-general
