Nick Vatamaniuc created COUCHDB-3291:
----------------------------------------

             Summary: Excessively long document IDs prevent replicator from 
making progress
                 Key: COUCHDB-3291
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3291
             Project: CouchDB
          Issue Type: Bug
            Reporter: Nick Vatamaniuc


Currently there is no protection in CouchDB against creating document IDs which 
are too long. Large IDs hit various implicit limits, which usually results in 
unpredictable failure modes.

One such implicit limit is hit in the replicator code. The replicator usually 
handles documents in bulk-style calls: it gets IDs via the changes feed, 
computes revs_diff with a POST, and inserts documents with bulk_docs. The 
exception is when it fetches open_revs, where it uses a single GET request. 
That request fails because of a bug / limitation in the HTTP parser: the first 
GET line of the HTTP request has to fit in the receive buffer of the receiving 
socket.
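
For illustration, the document ID ends up on that first request line, so the 
line grows with the ID (the query parameters here are only indicative of what 
the replicator sends):

{code}
GET /dbname/<very-long-docid>?revs=true&open_revs=[...]&latest=true HTTP/1.1
{code}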

Increasing that buffer allows larger HTTP request lines through. In the 
configuration options it can be set as 
{code}
 chttpd.server_options="[...,{recbuf, 32768},...]"
{code}
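
The same setting can also go into the ini files; assuming the {{[chttpd]}} 
section, it would look roughly like:

{code}
[chttpd]
server_options = [{recbuf, 32768}]
{code}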

Steve Vinoski mentions a possible bug in the HTTP packet parser code as well:

http://erlang.org/pipermail/erlang-questions/2011-June/059567.html

Tracing this a bit, I see that a proper mochiweb request is never even created 
and the request simply hangs, which confirms this further. It seems that in the 
code here:

https://github.com/apache/couchdb-mochiweb/blob/bd6ae7cbb371666a1f68115056f7b30d13765782/src/mochiweb_http.erl#L90

the timeout clause is hit. Adding a catch-all clause, I get a 
{tcp_error,#Port<0.40682>,emsgsize} message which we don't handle. That seems 
like a sane place to return a 413 or similar.
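
A minimal sketch of what handling that message could look like (this is not 
the actual mochiweb code; the function name and response shape are 
illustrative only):

{code}
%% Illustrative only: the shape of the receive loop around mochiweb_http.erl#L90,
%% with an extra clause for the emsgsize error instead of falling through to
%% the timeout clause.
wait_for_request_line(Socket, Timeout) ->
    receive
        {tcp_error, _Port, emsgsize} ->
            %% The request line did not fit in the socket's recbuf; answer
            %% with 413 and close instead of hanging until the timeout fires.
            mochiweb_socket:send(Socket,
                <<"HTTP/1.1 413 Request Entity Too Large\r\n"
                  "Connection: close\r\n\r\n">>),
            mochiweb_socket:close(Socket),
            exit(normal)
    after Timeout ->
        mochiweb_socket:close(Socket),
        exit(normal)
    end.
{code}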

There are probably multiple ways to address the issue:

 * Increase the mochiweb listener buffer to fit larger doc IDs. However, that 
is a workaround for a separate bug, and using it to control document ID size 
during replication is not reliable. Moreover, it would allow larger IDs to 
propagate through the system during replication, and then every future 
replication source would have to be configured with the same maximum recbuf 
value.

 * Introduce a validation step in {{couch_doc:validate_docid}}. Currently that 
code doesn't read from config files and is in the hot path, so adding a config 
read there might reduce performance. If enabled, it would stop new documents 
with large IDs from being created, but we'd have to decide how to handle 
already existing IDs which are larger than the limit. A rough sketch of such a 
check is shown after this list.

 * Introduce a validation/bypass in the replicator. Specifically targeting the 
replicator might help prevent propagation of large IDs during replication. 
There is already a similar case of skipping large attachments or large 
documents (which exceed the request size) and bumping {{doc_write_failures}}.
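
For the {{couch_doc:validate_docid}} option, a rough sketch of what the check 
could look like (the config key name and default below are hypothetical, and 
the real function performs other checks as well):

{code}
%% Illustrative sketch only, not existing code. The "max_document_id_length"
%% config key is hypothetical.
validate_docid(Id) when is_binary(Id) ->
    %% Note: validate_docid/1 currently reads no config and sits in the hot
    %% path, so the cost of this lookup would need to be measured.
    MaxLen = config:get_integer("couchdb", "max_document_id_length", 512),
    case byte_size(Id) =< MaxLen of
        true -> ok;
        false -> throw({bad_request, <<"Document id is too long">>})
    end.
{code}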




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
