To be specific, the Content-MD5 header is always the MD5 of the response body, but this is not necessarily true for the ETag. If you do want the two to match, either use a content type that is not going to be compressed, or remove that content type from the compressible types in CouchDB's configuration.
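As a rough illustration of the point above, here is a minimal Python sketch of how a client could compute and check a Content-MD5 value itself. Per RFC 1864 the header carries the base64 encoding of the raw 16-byte MD5 digest of the body. The function names (content_md5, body_matches) are made up for this example, and the gzip step is only a stand-in showing that compressed bytes hash differently from the original; it does not reproduce how CouchDB derives its ETag or digest.

```python
import base64
import gzip
import hashlib


def content_md5(body: bytes) -> str:
    """Content-MD5 header value for `body`: base64 of the raw MD5 digest (RFC 1864)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_matches(body: bytes, header_value: str) -> bool:
    """Check a received body against the Content-MD5 header that came with it."""
    return content_md5(body) == header_value.strip()


if __name__ == "__main__":
    original = b"hello attachment\n" * 100

    # Value a client could send along with a PUT, or compare against a response header.
    print("Content-MD5:", content_md5(original))

    # Assumption for illustration only: a gzip-compressed copy of the same data.
    # Its MD5 differs from that of the uncompressed body, which is why a digest
    # taken over compressed bytes will not match the Content-MD5 of the response.
    compressed = gzip.compress(original)
    print("same digest after gzip?", content_md5(compressed) == content_md5(original))
```

Comparing the header against a locally computed value this way only depends on the bytes actually sent or received, so it is unaffected by whether the server compresses the attachment on disk.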
It is appropriate (w.r.t. RFC 2616) to depend on the Content-MD5 header. If you supply it when PUTting a standalone attachment, we'll even verify that it matches and return an error if it doesn't.

Jens, I'm not familiar with that optimization but, if it exists, it came after I exposed the MD5 in this manner. The only place I think the replicator is involved is that, by emitting this information, the replicator validates that attachments aren't corrupted in transit.

On 2 November 2012 00:05, Jens Alfke <[email protected]> wrote:

>
> On Nov 1, 2012, at 1:49 PM, "Mclean, Adam" <[email protected]> wrote:
>
> > The digest produced by the file upload is key to this working for me so
> > I'm not replacing files that are already the same in couch. I've been
>
> IMHO you should not try to interpret the contents of the attachment
> ‘digest’ property. It’s mostly meant as an optimization for the replicator,
> not as a user feature. Don’t assume that it consists of the string “md5-”
> followed by a hex MD5 digest of the actual attachment contents. As you’ve
> seen, this isn’t true for compressed attachments. It’s even more untrue for
> attachments on TouchDB, which uses a SHA1 digest instead.
>
> If you need to track the identities of attachments using a digest, it
> would be safer to add your own digest property to the document, so that you
> have control over how it’s generated.
>
> --Jens
