GitHub user tonysun83 opened a pull request:

    https://github.com/apache/couchdb-chttpd/pull/156

    Introduce max_http_request_size to replace max_document_size

    This PR serves to consolidate ideas from 
https://github.com/apache/couchdb-chttpd/pull/114 (the discussion got a little 
long and convoluted) and to finalize implementation details.
    
    **Background**
    Sometimes users want to limit the actual document size of various requests. 
Currently, our ```max_document_size``` is a misnomer: it actually limits the 
size of the HTTP request body. So ```_bulk_docs``` requests with multiple docs, 
or docs with attachments, are all counted against this ```max_document_size``` 
parameter.
    
    **Requirements**
    This feature must work within the following requirements:
    
    R1) Use config parameter names that actually reflect their intentions.
    R2) Prevent DDoS attacks.
    R3) Do not allow loopholes for users to bypass the restriction and thereby 
create weird scenarios. We see this mostly in replication where attachments use 
multi-part requests.
    R4) Do not reject a document that is actually valid. We see this scenario 
in replication as well. Replication needs its own section, so I will discuss 
it further down.
    
    **Proposal**
    P1) Replace ```max_document_size``` with ```max_http_request_size```. This 
initial change will serve the same purpose as before except with a different 
name. It meets requirements R1 and R2 above.
    P2) Actually use ```max_document_size``` for update requests. So for 
PUT/POST requests that may or may not include an attachment, we only look at 
the document itself. For ```_bulk_docs```, if a document exceeds 
```max_document_size```, we return an error with the document id, and the 
```_bulk_docs``` request is rejected. For ```_update``` handlers, the same 
restriction is applied so that the updated document cannot exceed the limit. 
This meets most of requirement R3, but multipart requests in replication 
remain a problem (more on that in the Replication section).
    P3) Possibly introduce a ```use_max_document_size``` parameter so that we 
don't always execute document size computations.
    
    **Replication**
    
    The new restriction has a big impact on replication for many reasons. 
    
    1) Before the following fix:
    https://github.com/apache/couchdb-couch-replicator/pull/49/files, 
replications would crash when the request limit was set too low. Now that we're 
changing the meaning to use actual document size instead of request size, we 
may have to revisit that fix.
    
    2) Currently, a rare scenario exists during replication when a customer has 
documents very close to the old, misnamed ```max_document_size```. When we add 
extra query parameters or the doc is wrapped in ```_bulk_docs```, the request 
size exceeds the old ```max_document_size``` config, and replication fails. By 
changing the name and actually looking at the document itself, we should be 
able to avoid the scenario described in R4.
    
    3) Replication uses multipart requests when attachments are included with 
the document, which means the request is streamed. We therefore can't use 
document size as the restriction directly, because we would have to wait for 
the stream to finish before extracting the document from the entire request. 
@davisp suggested we read from the socket and fail once the number of bytes 
read from the stream exceeds ```max_document_size```. However, the details are 
still unclear to me. I need to look at how this socket restriction will work 
with attachments and headers. Bottom line is that we need to get replication 
working such that the user can't use attachments to bypass the restriction, 
and at the same time, only the actual document is counted against the limit.
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-chttpd 
64299-add-new-request-parameter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-chttpd/pull/156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #156
    

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
