I've figured this out, thanks to Robert Newson looking at a TCP dump Pieter van 
der Eems sent him. It turns out to be an issue with CouchDB that I already knew 
about but had forgotten would bite in this particular circumstance. 
Specifically, CouchDB isn't associating the MIME bodies with the attachments 
correctly; it gets them mixed up. As a result it gets confused about the 
lengths and blows up.

The issue is with CouchDB's multipart support, specifically the way in which it 
matches MIME bodies to attachment names. The IMHO correct way to do this would 
be to look at the filename in the Content-Disposition header, and this is in 
fact what TouchDB generates:
        Content-Disposition: attachment; filename="20120808-092628.png"
But CouchDB ignores this header. Instead it assumes that the order in which the 
MIME bodies appear matches the order in which the attachment objects appear in 
the _attachments object.
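
To make the filename-based matching concrete, here is a minimal sketch in Python using the standard library's email parser. It is not what CouchDB or TouchDB does; the helper name bodies_by_filename and the sample message are made up for illustration. It shows that once each part carries a Content-Disposition filename, the order of the parts no longer matters:

```python
from email import message_from_bytes


def bodies_by_filename(raw: bytes) -> dict:
    """Map each attachment filename to its body, ignoring part order.

    Hypothetical helper for illustration only.
    """
    msg = message_from_bytes(raw)
    result = {}
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        # get_filename() reads filename= from the Content-Disposition header
        name = part.get_filename()
        if name is not None:
            result[name] = part.get_payload(decode=True)
    return result


# Sample message (made up): the bodies arrive in the opposite order
# from the keys of the _attachments object.
RAW = (
    b'Content-Type: multipart/related; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"_attachments": {"a.txt": {"follows": true}, "b.txt": {"follows": true}}}\r\n'
    b"--XYZ\r\n"
    b'Content-Disposition: attachment; filename="b.txt"\r\n'
    b"\r\n"
    b"second\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: attachment; filename="a.txt"\r\n'
    b"\r\n"
    b"first\r\n"
    b"--XYZ--\r\n"
)

matched = bodies_by_filename(RAW)
```

The JSON part has no filename, so it is skipped; each attachment body is looked up by name rather than by position.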

The problem with this is that JSON objects (dictionaries) are _not_ ordered 
collections. I know that Erlang's implementation of them (as linked lists of 
key/value pairs) happens to be ordered, and I think some JavaScript 
implementations have the side effect of preserving order; but in many languages 
these are implemented as hash tables and genuinely unordered.

So when TouchDB serializes the NSDictionary object representing the 
attachments, it has _no idea_ in what order the JSON encoder will write the 
keys. This means it can't comply with CouchDB's ordering requirement because it 
doesn't know the order in which to write out the attachments. I believe I am 
going to have to work around this by using a custom JSON encoder that I can 
make write out dictionary entries in a known (sorted?) order.
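
The workaround amounts to picking one deterministic key order and using it in both places. A rough sketch of the idea in Python (not the actual TouchDB code, which is Objective-C; the function name ordered_parts and the sample data are assumptions for illustration):

```python
import json


def ordered_parts(doc: dict, bodies: dict):
    """Return (json_text, [(name, body), ...]) sharing one sorted key order.

    Hypothetical helper: `doc` is the document including _attachments,
    `bodies` maps attachment names to their raw bytes.
    """
    # sort_keys=True makes the encoder write every object's keys in sorted order
    json_text = json.dumps(doc, sort_keys=True)
    # Emit the MIME bodies in the same sorted order as the JSON keys
    names = sorted(doc.get("_attachments", {}))
    return json_text, [(name, bodies[name]) for name in names]


doc = {
    "_id": "example",
    "_attachments": {
        "b.png": {"follows": True},
        "a.png": {"follows": True},
    },
}
bodies = {"b.png": b"BBB", "a.png": b"AAA"}

json_text, parts = ordered_parts(doc, bodies)
```

Because the encoder and the body writer both sort the same set of names the same way, the positional matching on the receiving side lines up regardless of how the dictionary happens to be stored in memory.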

I've filed this as COUCHDB-1521. As I said, I can work around it, but I really 
think this should be fixed as it's a hurdle for interoperability.

(Ironically I ran into the flip side of this issue last year and filed a bug on 
it (COUCHDB-1368): when _receiving_ a multipart body from CouchDB, it's 
difficult to match attachments with their MIME bodies because CouchDB doesn't 
put any headers into the MIME bodies to indicate filenames; the only clue is 
the ordering of the entries in the _attachments dictionary, and that ordering 
is lost when Cocoa's JSON parser converts it into an NSDictionary object.)
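
For the receiving direction, the ordering survives only if the JSON parser exposes keys in document order. As an illustration of what that requires (in Python rather than Cocoa; the function name attachment_names_in_order and the sample text are made up), a parser hook that returns key/value pairs instead of a dictionary keeps the order available:

```python
import json


def attachment_names_in_order(json_text: str) -> list:
    """Return the _attachments keys in the order they appear in the text.

    Hypothetical helper for illustration only.
    """
    # object_pairs_hook=list turns every JSON object into a list of
    # (key, value) pairs, so document order is kept explicitly.
    pairs = json.loads(json_text, object_pairs_hook=list)
    for key, value in pairs:
        if key == "_attachments":
            return [name for name, _meta in value]
    return []


text = '{"_id": "x", "_attachments": {"b.png": {"follows": true}, "a.png": {"follows": true}}}'
names = attachment_names_in_order(text)

# The n-th MIME body would then be paired with the n-th name:
received_bodies = [b"body-1", b"body-2"]  # placeholder data
paired = list(zip(names, received_bodies))
```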

—Jens
