Thanks Jens,

I can backport that.

B.

On 5 April 2012 13:41, Jens Alfke <[email protected]> wrote:
> Documents stored in Cloudant databases aren't including MD5 digests of 
> attachment contents in the _attachments metadata. Here's an example:
>
>    "_attachments": {
>        "photo-15357DCF-9566-4DFD-9120-8A9164EE5873": {
>            "follows": true,
>            "length": 79608,
>            "content_type": "image/jpeg",
>            "revpos": 2
>        }
>    },
>
> Other servers do include the digest; I assume this is a difference between BigCouch 
> and CouchDB. Is this intentional? It's causing problems replicating databases 
> from Cloudant to TouchDB, and the workarounds I can think of for this in 
> TouchDB are either fairly ugly (basically involving writing a custom JSON 
> parser…) or involve performance regressions.
>
> Here's more detail on my problem:
> * For efficiency, the replicator in TouchDB (like CouchDB 1.2) fetches 
> documents in MIME multipart format, so that attachments are easily streamable 
> to disk and aren't base64-encoded.
> * This requires correlating the MIME bodies with the metadata objects in the 
> _attachments object.
> * CouchDB (and BigCouch) unfortunately don't add any headers to the MIME 
> bodies to identify what they are. I've already filed a bug report against 
> this.
> * TouchDB's replicator works around this by computing an MD5 digest of each 
> MIME body and then correlating those with the "digest" properties of the 
> attachment metadata objects.
> * …which fails with Cloudant/BigCouch because that "digest" property is 
> missing.
>
> The reason CouchDB itself doesn't have trouble correlating the attachments is 
> that it knows the MIME bodies are written in the same order as the 
> attachments appear in the _attachments object. However, key order is not 
> significant in JSON objects, and in most implementations the parser stores 
> the object contents in a hash table (like a Ruby Hash object or a Cocoa 
> NSDictionary), which means the ordering of the keys is lost. The only way for 
> me to determine the true order of the attachment keys would be to write my 
> own specialized JSON parser that could identify the keys and put the names 
> into an ordered structure like an array.
>
> —Jens
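
For reference, a minimal sketch of the order-based correlation described above, in Python rather than TouchDB's own code. It assumes the MIME bodies arrive in the same order as the keys of _attachments, that only entries with "follows": true have a body, and that a "digest" value, when present, has the form "md5-" plus the base64 of the raw MD5 hash. The function names are hypothetical. Python's json module can hand back keys in document order via object_pairs_hook, so no custom parser is needed there; other JSON libraries may not offer this.

```python
import base64
import hashlib
import json
from collections import OrderedDict


def md5_digest(body: bytes) -> str:
    # Assumed digest format: "md5-" followed by the base64 of the raw MD5 hash.
    return "md5-" + base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def pair_attachments(doc_json: str, bodies: list) -> dict:
    """Match MIME bodies to attachment names by their order in the JSON text."""
    # object_pairs_hook keeps each object's keys in the order they were parsed.
    doc = json.loads(doc_json, object_pairs_hook=OrderedDict)
    attachments = doc.get("_attachments", {})

    # Assumption: only attachments marked "follows": true have a MIME body.
    following = [(name, meta) for name, meta in attachments.items()
                 if meta.get("follows")]
    if len(following) != len(bodies):
        raise ValueError("attachment count does not match MIME body count")

    paired = {}
    for (name, meta), body in zip(following, bodies):
        # When the server did send a digest, use it as a sanity check.
        expected = meta.get("digest")
        if expected is not None and expected != md5_digest(body):
            raise ValueError(f"digest mismatch for {name!r}")
        paired[name] = body
    return paired
```

With this approach a missing "digest" is harmless, and a present one is still verified. It relies entirely on the server writing the bodies in key order, which is the behavior described in the message above and not something checked here.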
