janl commented on issue #745: Replication with attachments never completes, {mp_parser_died,noproc} error

URL: https://github.com/apache/couchdb/issues/745#issuecomment-369905803

Great repro Joan. I played with it and came up with this:

The python script uses the standalone attachment API: `/db/doc/att`. The handler for this request does NOT apply `max_http_request_size` (that check happens in [`chttpd:body/2`](https://github.com/apache/couchdb/blob/master/src/chttpd/src/chttpd.erl#L628-L643) or [`couch_httpd:check_max_request_length()`](https://github.com/apache/couchdb/blob/master/src/couch/src/couch_httpd.erl#L452-L460), neither of which is used by the standalone attachment API).

The twist now is that the replicator uses multipart requests, not standalone attachment requests, and multipart requests *are* subject to the `max_http_request_size` limit. This leads to the observed behaviour: you can create an attachment in one db, but can NOT replicate that attachment to another db on the same CouchDB node (or to another node with the same `max_http_request_size` limit).

Applying `max_http_request_size` in the standalone attachment API is trivial[1], but it leads to the next unfortunate behaviour: say you create a doc with two attachments, each with a length just under `max_http_request_size`. Each individual attachment write will succeed, but replicating the doc to another db will, again, produce a multipart request that is, overall, larger than `max_http_request_size`.

I haven't checked this, but a conflicting doc with one attachment smaller than `max_http_request_size`, where the attachment data is conflicted, might also produce a multipart http request larger than `max_http_request_size`, since both conflicting revisions and attachment bodies have to be replicated.

This leads us to having to decide:

1. Is `max_http_request_size` a truly hard limit, or do we accept requests larger than that if they are multipart http requests?
   - If yes, do we apply `max_document_size` and `max_attachment_size` to the individual chunks of the multipart request?
2. If not 1., do we need to rewrite the replicator so it does not produce requests larger than `max_http_request_size`, potentially sending attachments individually?

References:

[1]:
```diff
--- a/src/chttpd/src/chttpd_db.erl
+++ b/src/chttpd/src/chttpd_db.erl
@@ -1218,6 +1218,7 @@ db_attachment_req(#httpd{method=Method, user_ctx=Ctx}=Req, Db, DocId, FileNamePa
         undefined -> <<"application/octet-stream">>;
         CType -> list_to_binary(CType)
     end,
+    couch_httpd:check_max_request_length(Req),
     Data = fabric:att_receiver(Req, chttpd:body_length(Req)),
     ContentLen = case couch_httpd:header_value(Req,"Content-Length") of
         undefined -> undefined;
```
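For anyone wanting to poke at the asymmetry described above, here is a minimal sketch of it against a local node. The database names, credentials, and the 64 MiB limit value are illustrative assumptions, not values from the issue or from Joan's repro script: the standalone attachment PUT just under the limit is accepted, while the replicator's multipart request for the same doc ends up larger than the limit and is rejected.

```python
# Illustrative sketch only -- names, credentials, and the 64 MiB limit value
# are assumptions for this example, not taken from the issue.
import requests

COUCH = "http://admin:password@127.0.0.1:5984"
LIMIT = 64 * 1024 * 1024  # assume max_http_request_size is configured to 64 MiB

requests.put(f"{COUCH}/source")
requests.put(f"{COUCH}/target")

# Standalone attachment API (/db/doc/att): this handler does not apply
# max_http_request_size, so a body just under the limit is accepted.
att = b"x" * (LIMIT - 1024)
r = requests.put(
    f"{COUCH}/source/doc1/payload.bin",
    data=att,
    headers={"Content-Type": "application/octet-stream"},
)
r.raise_for_status()

# Replicating the doc sends a single multipart/related request containing the
# doc JSON plus the attachment; the multipart overhead pushes it over the
# limit, the target rejects it, and the replication never completes
# ({mp_parser_died,noproc}, per the issue title).
r = requests.post(
    f"{COUCH}/_replicate",
    json={"source": f"{COUCH}/source", "target": f"{COUCH}/target"},
)
print(r.status_code, r.json())
```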