Hi Jens,

It cherry-picked cleanly so it should turn up in next week's code push.

B.

On 5 April 2012 13:49, Robert Newson <[email protected]> wrote:
> Thanks Jens,
>
> I can backport that.
>
> B.
>
> On 5 April 2012 13:41, Jens Alfke <[email protected]> wrote:
>> Documents stored in Cloudant databases aren't including MD5 digests of 
>> attachment contents in the _attachments metadata. Here's an example:
>>
>>    "_attachments": {
>>        "photo-15357DCF-9566-4DFD-9120-8A9164EE5873": {
>>            "follows": true,
>>            "length": 79608,
>>            "content_type": "image/jpeg",
>>            "revpos": 2
>>        }
>>    },
>>
>> Other servers do include the digest; I assume this is a difference between BigCouch 
>> and CouchDB. Is this intentional? It's causing problems replicating 
>> databases from Cloudant to TouchDB, and the workarounds I can think of for 
>> this in TouchDB are either fairly ugly (basically involving writing a custom 
>> JSON parser…) or involve performance regressions.
>>
>> Here's more detail on my problem:
>> * For efficiency, the replicator in TouchDB (like CouchDB 1.2) fetches 
>> documents in MIME multipart format, so that attachments are easily 
>> streamable to disk and aren't base64-encoded.
>> * This requires correlating the MIME bodies with the metadata objects in the 
>> _attachments object.
>> * CouchDB (and BigCouch) unfortunately don't add any headers to the MIME 
>> bodies to identify what they are. I've already filed a bug report against 
>> this.
>> * TouchDB's replicator works around this by computing an MD5 digest of each 
>> MIME body and then correlating those with the "digest" properties of the 
>> attachment metadata objects.
>> * …which fails with Cloudant/BigCouch because that "digest" property is 
>> missing.
>>
>> The reason CouchDB itself doesn't have trouble correlating the attachments 
>> is that it knows the MIME bodies are written in the same order as the 
>> attachments appear in the _attachments object. However, key order is not 
>> significant in JSON objects, and in most implementations the parser stores 
>> the object contents in a hash table (like a Ruby Hash object or a Cocoa 
>> NSDictionary), which means the ordering of the keys is lost. The only way 
>> for me to determine the true order of the attachment keys would be to write 
>> my own specialized JSON parser that could identify the keys and put the 
>> names into an ordered structure like an array.
>>
>> —Jens
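
For readers following the thread, the digest-based correlation Jens describes can be sketched roughly as below. This is an illustration under stated assumptions, not TouchDB's actual code: the function names `couch_digest` and `match_bodies_by_digest` are hypothetical, and the sketch assumes the "digest" property has the form "md5-" followed by the base64-encoded MD5 of the attachment body, as CouchDB emits it.

```python
import base64
import hashlib


def couch_digest(body: bytes) -> str:
    """Format an MD5 digest as "md5-" + base64 (assumed CouchDB metadata format)."""
    return "md5-" + base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def match_bodies_by_digest(attachments: dict, bodies: list) -> dict:
    """Map each attachment name to its MIME body by comparing MD5 digests.

    `attachments` is the parsed _attachments object; `bodies` holds the raw
    bytes of each MIME part, in whatever order they arrived.
    """
    # Index the bodies by their computed digest. Two bodies with identical
    # content share a digest, which is harmless because the bytes are the same.
    by_digest = {couch_digest(body): body for body in bodies}

    matched = {}
    for name, meta in attachments.items():
        digest = meta.get("digest")
        if digest is None:
            # This is the failure described above: without a digest there is
            # nothing to correlate against.
            raise ValueError(f"attachment {name!r} has no digest property")
        matched[name] = by_digest[digest]
    return matched
```

Fed the metadata from the example at the top of Jens's message, which has no "digest" key, `match_bodies_by_digest` raises `ValueError`.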

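The other workaround Jens mentions, recovering the true key order of _attachments, does not always need a hand-written parser. Where the JSON library can hand back key/value pairs in source order, a sketch might look like the following. It uses Python's standard `json.loads` with `object_pairs_hook`. The helper names are hypothetical, and the code assumes that only attachments marked "follows": true have a MIME body and that the bodies arrive in the same order as those keys.

```python
import json


def attachment_names_in_order(doc_json: str) -> list:
    """Return names of attachments with "follows": true, in document order."""
    # With this hook every JSON object becomes a list of (key, value) tuples,
    # so the order in which keys appeared in the text is preserved.
    top = json.loads(doc_json, object_pairs_hook=lambda pairs: pairs)
    for key, value in top:
        if key == "_attachments":
            return [
                name
                for name, meta in value
                if dict(meta).get("follows") is True
            ]
    return []


def match_bodies_by_order(doc_json: str, bodies: list) -> dict:
    """Pair each MIME body with an attachment name purely by position."""
    names = attachment_names_in_order(doc_json)
    if len(names) != len(bodies):
        raise ValueError("number of MIME bodies does not match the metadata")
    return dict(zip(names, bodies))
```

This is less robust than matching on digests, since it relies on the server writing the bodies in key order, but it works when the digest is absent.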