Hi, devs.
I'd like to discuss input validation.
I like the validate_doc_update() system; however, I think it lacks some
of the information it needs to approve or reject a potential update. Two
examples:
1. A document was updated in a valid way, but this is the 10th time in one
second the update was made.
2. A form was posted to CouchDB but there is no CAPTCHA information to
validate.
Note that these updates, particularly #1, can amount to a denial-of-service
attack against CouchDB, because a client can make "valid" updates until the
disk is full.
I have a couple of ideas and would like to hear thoughts.
## reCAPTCHA support
I am thinking about some sort of /_config setting, where you can input some
reCAPTCHA settings (I have not used reCAPTCHA yet so forgive the ignorance.)
Next, maybe there is a /_recaptcha URL or namespace where CouchDB serves
whatever it needs in order to add a captcha to forms (maybe saving state in
memory or in a config or database). When a form is submitted, Couch checks
the captcha and sends the pass/fail value to validate_doc_update, either
inside userCtx, secObj, or a new argument. (None seem ideal; I would go
with userCtx or a new argument).
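
To make the userCtx option concrete, here is a minimal sketch of what a
validation function might look like. Everything CAPTCHA-related in it is an
assumption: CouchDB has no `captcha_passed` field in userCtx today, and the
"form_post" document type is only an example.

```javascript
// Hypothetical: assumes CouchDB has already checked the CAPTCHA and put
// the result in userCtx.captcha_passed. No such field exists today.
function validate_doc_update(newDoc, oldDoc, userCtx, secObj) {
  // Only guard form submissions; other document types pass through.
  // "form_post" is an example type, not a CouchDB convention.
  if (newDoc.type === "form_post" && userCtx.captcha_passed !== true) {
    throw {forbidden: "CAPTCHA check failed or missing"};
  }
}
```

Checking for `!== true` means a missing value is treated the same as a
failed check, so a form posted with no CAPTCHA information is rejected.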
## Rate limiting
CouchDB could keep state about how frequently "clients" are making updates.
(A "client" is not defined. It could be an IP address, or an {IPAddr,
Username} tuple, or something else.)
Then in validate_doc_update, the rate of updates is passed in somehow.
(Again, in userCtx, or maybe a new argument.) Maybe it looks like this:
update_rate:
{ "second": 0
, "minute": 3
, "hour": 5
, "day": 8
, "week": 10
, "year": 50
}
Maybe week and year are overkill. But if your policy is only 10 updates per
minute, then that's easy to enforce now:
if(userCtx.update_rate.minute > 10)
  throw {forbidden: "Too many updates in the past minute"};
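
For illustration, here is one way the server side could produce those
numbers, followed by the policy above as a complete validation function.
This is only a sketch: `UpdateRateTracker`, `WINDOWS`, and the
`userCtx.update_rate` field are all hypothetical names, the "client" key is
just a string such as an IP address, and week and year are left out to keep
it short.

```javascript
// Window lengths in milliseconds, named after the proposed fields.
const WINDOWS = {
  second: 1000,
  minute: 60 * 1000,
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000
};

// Hypothetical per-client tracker; not an existing CouchDB component.
class UpdateRateTracker {
  constructor() {
    this.timestamps = new Map(); // client key -> array of update times (ms)
  }

  // Record one update by `client` at time `now` (ms since epoch).
  // Timestamps older than the longest window are discarded.
  record(client, now) {
    const longest = Math.max(...Object.values(WINDOWS));
    const kept = (this.timestamps.get(client) || [])
      .filter((t) => now - t < longest);
    kept.push(now);
    this.timestamps.set(client, kept);
  }

  // Count the client's updates that fall inside each window ending at `now`.
  rates(client, now) {
    const times = this.timestamps.get(client) || [];
    const result = {};
    for (const [name, length] of Object.entries(WINDOWS)) {
      result[name] = times.filter((t) => t <= now && now - t < length).length;
    }
    return result;
  }
}

// The 10-updates-per-minute policy, assuming the server passes the
// tracker's output to the function as userCtx.update_rate.
function validate_doc_update(newDoc, oldDoc, userCtx, secObj) {
  if (userCtx.update_rate.minute > 10) {
    throw {forbidden: "Too many updates in the past minute"};
  }
}
```

Keeping raw timestamps is the simplest approach but costs memory per
client; fixed-size counters per window would be cheaper if that matters.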
## Banning
This is a more distant plan. I have always loved fail2ban.
Maybe Couch has a /_config setting to ban "clients" (as defined above) if
they fail validation too often. Bans could have several features:
* Temporary, expires automatically after some time
* Perhaps immediately reject the client for all writes, and maybe reads too
* Maybe some kind of "bog" where responses take 29 seconds to return
* Maybe a "hellban" where we return 200 but silently reject the update
* Maybe some whitelists so that trusted clients can still replicate to you
regardless of validation failures. (Some people may use validation as a
server-side replication filter.)
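
As a rough sketch of the temporary-ban and whitelist ideas (not the bog or
the hellban), something like the following could sit in front of
validation. The `BanList` class, its option names, and its thresholds are
all made up for illustration; none of this exists in CouchDB.

```javascript
// Hypothetical ban list: all names and thresholds are illustrative.
class BanList {
  constructor({maxFailures, failureWindowMs, banMs, whitelist = []}) {
    this.maxFailures = maxFailures;         // failures that trigger a ban
    this.failureWindowMs = failureWindowMs; // how long a failure counts
    this.banMs = banMs;                     // how long a ban lasts
    this.whitelist = new Set(whitelist);    // clients that are never banned
    this.failures = new Map();              // client -> failure times (ms)
    this.bannedUntil = new Map();           // client -> ban expiry (ms)
  }

  // Call when a client's update fails validation at time `now` (ms).
  recordFailure(client, now) {
    if (this.whitelist.has(client)) return; // trusted clients are exempt
    const recent = (this.failures.get(client) || [])
      .filter((t) => now - t < this.failureWindowMs);
    recent.push(now);
    this.failures.set(client, recent);
    if (recent.length >= this.maxFailures) {
      this.bannedUntil.set(client, now + this.banMs);
      this.failures.delete(client);
    }
  }

  // True while a ban is in force; expired bans are removed on lookup.
  isBanned(client, now) {
    const until = this.bannedUntil.get(client);
    if (until === undefined) return false;
    if (now >= until) {
      this.bannedUntil.delete(client);
      return false;
    }
    return true;
  }
}
```

A server would check `isBanned` before accepting a write (and maybe a
read), and call `recordFailure` whenever validation throws.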