Why does CouchDb need attachment length?

Nick North Sat, 16 Nov 2013 10:12:57 -0800

I'm working with CouchDb documents with multiple attachments, submitted
using MIME multipart/related requests. In this case the document JSON has
to have an "_attachments" property specifying each attachment's name,
content type and length as described
here<http://wiki.apache.org/couchdb/HTTP_Document_API#Multiple_Attachments>.
The document and attachments are MIME-encoded and submitted in a single
request.
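
For illustration, here is a minimal Python sketch of how such a request
body is put together. The helper name build_multipart_related and its
argument layout are made up for this example, and the "follows": true
marker comes from the CouchDB documentation rather than from anything
above. Note that len(data) only works here because each attachment is
already fully in memory.

```python
import json
import uuid


def build_multipart_related(doc, attachments, boundary=None):
    """Build headers and body for a multipart/related document PUT.

    doc:         dict holding the document fields (hypothetical input shape)
    attachments: dict mapping name -> (content_type, bytes)
    """
    boundary = boundary or uuid.uuid4().hex

    # Every attachment has to be described up front, including its length.
    doc = dict(doc)
    doc["_attachments"] = {
        name: {"follows": True, "content_type": ctype, "length": len(data)}
        for name, (ctype, data) in attachments.items()
    }

    delimiter = b"--" + boundary.encode("ascii")

    # First part: the document JSON.
    parts = [
        delimiter,
        b"Content-Type: application/json",
        b"",
        json.dumps(doc).encode("utf-8"),
    ]

    # Remaining parts: the attachment bodies, in the order listed in the JSON.
    for name, (ctype, data) in attachments.items():
        parts += [delimiter, b"Content-Type: " + ctype.encode("ascii"), b"", data]

    parts.append(delimiter + b"--")  # closing delimiter

    body = b"\r\n".join(parts)
    headers = {"Content-Type": 'multipart/related; boundary="%s"' % boundary}
    return headers, body
```

The returned headers and body would then be sent in a single PUT to the
document URL with whatever HTTP client is in use.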


Although this works, programming against it is awkward because each
attachment's length must be known in advance in order to populate the
_attachments property. Attachments often arrive as streams, and finding a
stream's length means reading it through to the end. You then have to read
the same data a second time when submitting the HTTP request. (In some
languages I suspect the only way to do this is to buffer the entire stream
contents in memory.)
If the length did not have to be put into the initial JSON object, then the
stream could just be passed straight through to the HTTP request with no
need for reading twice or buffering in memory.
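
The double handling described above looks roughly like the following in
Python. This is only a sketch of the workaround, not anything CouchDb
provides: spool_and_measure is a hypothetical helper, and the size
thresholds are arbitrary. It copies a non-seekable stream into a spooled
temporary file (memory first, disk once it grows past a limit) purely to
learn the length, then rewinds so the data can be read again for the
request.

```python
import io
import tempfile


def spool_and_measure(stream, max_in_memory=1024 * 1024, chunk_size=64 * 1024):
    """Copy a binary stream into a spooled temp file and count its bytes.

    Returns (spool, length). The spool is rewound to the start so it can be
    read a second time when the HTTP request body is written.
    """
    # Held in memory up to max_in_memory bytes, then rolled over to disk.
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    length = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        spool.write(chunk)
        length += len(chunk)
    spool.seek(0)
    return spool, length


# First pass: read everything just to find the length.
spool, length = spool_and_measure(io.BytesIO(b"attachment data"))
# Second pass: the same bytes are read again to build the request.
data = spool.read()
```

If the length were not required in the JSON, the first pass and the
temporary copy would both disappear.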

So my question is: why does CouchDb require the length to be supplied? It
is definitely required at present: I have tried giving the wrong length,
and no length at all, and in both cases the request fails. But a quick look
at the Erlang source suggests that the length is not used when parsing the
request, and presumably that parsing process could calculate each
attachment's length for use later on if it's needed.

If, in principle, the length could be dropped when submitting requests,
then I'd be interested in trying to modify the code to make that possible.
But, if there is a good reason why it has to be supplied, then I don't want
to waste time working out what's going on in the Erlang. So any advice on
why attachments were designed as they are would be very welcome. Many
thanks,

Nick
