Re: Input validation and limits

Robert Newson Mon, 25 Mar 2013 01:12:00 -0700

This is a great topic and one that goes to the heart of CouchDB's twin
roles as database and web server.

Does CouchDB need to directly support every feature that a web server
ought to support? Or does CouchDB, by virtue of speaking HTTP, get to
stay lean, providing only what must be provided by an origin server in
the modern Web, and rely on other, hopefully solid and focused tools,
for everything else? Supporting CAPTCHA, in whatever form, seems quite
reasonable. It's an extension of our auth model in many respects and
something that can't easily be externalized.

CouchDB's strength is that it's a database that speaks HTTP. In my
mind, it does that for one reason - to integrate with other things
that also speak HTTP. That obviously includes browsers but it also
includes load balancers, caching proxies, and so on.

On the topic at hand, I feel that rate limiting and IP blocking are
best done externally, just as I feel about virtual hosting
and URL rewriting. Are our log files rich enough to power fail2ban
itself? Could they be enhanced if not? Would an iptables approach to
rate limiting be preferable? Can we, as the CouchDB developer
community, really support and maintain all the extra features if we
decided CouchDB-as-a-web-server means it ought to do all these things?
Will we work to make a clustered CouchDB work without external load
balancers or DNS failover services, to pick just two examples? Will we
add an HTTP caching layer?

I sound opinionated and entrenched when I ask too many questions in a
row, but they are sincere questions; it's not my intention to bludgeon
the proposal into the ground with them. I do want to explicitly reject
an accusation of "stop energy" before it's made, though. That phrase
is easily invoked, although I do see that it has often been true in
the past, of me and of other developers.

Adding this kind of statefulness seems inappropriate to me, but it's
hard to argue the case when we already have URL rewriting and virtual
hosting built in. A separate conversation is looming about virtual
hosting because the Nebraska merge that brings clustering will not
bring virtual hosting with it. BigCouch has never supported virtual
hosting natively; it's provided by HAProxy instead.

I would love a broader discussion about where CouchDB ends and other
software begins. Is there a crisp line? I'd argue there could be,
though it's not crisp today. For me, as I've said, CouchDB is a
database that you talk to over HTTP. I'm for keeping that as lean as
possible; that's a big enough task already.

B.

On 25 March 2013 05:01, Jason Smith <[email protected]> wrote:
> On Mon, Mar 25, 2013 at 4:14 AM, Alexander Shorin <[email protected]> wrote:
>
>> Hi Jason!
>>
>> On Mon, Mar 25, 2013 at 7:22 AM, Jason Smith <[email protected]> wrote:
>> > ## reCAPTCHA support
>> > ...
>> > ## Rate limiting
>>
>> Wouldn't these things break bulk updates and replications? Both of
>> them trigger validate_doc_update heavily, and letting them fail
>> halfway just because they hit the update rate limit wouldn't be nice.
>>
>
> Good point. This is why I wanted to have the discussion.
>
> I think the feature should be disabled by default. So upgrading CouchDB
> would not change how updates validate.
>
> I did mention a whitelist in my ideas, however I am not sure how it would
> work. That is why I identified "clients" rather than "users" or "remote ip
> addresses." Do you whitelist users or ip addresses or {user,ip} tuples? Or
> something else? And I don't think we want to start leaking client
> information into validate_doc_update. (Note that the req object is not
> passed to it, I think intentionally.)
>
> Anyway, yes, I agree, a client which fails validation very often
>
> Perhaps instead of a config, there could be a new flag when validation
> fails. So legacy code could never trigger banning.
>
>     throw {"forbidden":"Not allowed", "fail":true}
>
> So we might ban on "fail" frequency, not just anything thrown.
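
A minimal sketch of how the proposed flag might look inside a
validate_doc_update function. The "fail" property is only the proposal
quoted above, not something CouchDB supports today. The "title" field,
MAX_TITLE_LENGTH, countsTowardBan and rejectionFor are hypothetical
names invented for illustration.

```javascript
// Sketch only: the "fail" flag is the proposal from this thread, not an
// existing CouchDB feature. The "title" field and MAX_TITLE_LENGTH are
// hypothetical, chosen just to have something to validate.
var MAX_TITLE_LENGTH = 200;

// Same argument list CouchDB passes to validate_doc_update functions.
function validateDocUpdate(newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) {
    return; // allow deletions in this sketch
  }
  if (typeof newDoc.title !== "string") {
    // Ordinary rejection, as legacy code would throw it: no "fail" flag.
    throw { forbidden: "title must be a string" };
  }
  if (newDoc.title.length > MAX_TITLE_LENGTH) {
    // Rejection the design doc author explicitly marks as ban-worthy.
    throw { forbidden: "Not allowed", fail: true };
  }
}

// Hypothetical server-side check: only flagged rejections would count
// toward a client's ban threshold.
function countsTowardBan(err) {
  return Boolean(err) && err.fail === true;
}

// Helper for trying the function outside CouchDB: returns the thrown
// object, or null if the document was accepted.
function rejectionFor(doc) {
  try {
    validateDocUpdate(doc, null, { name: null, roles: [] }, {});
    return null;
  } catch (err) {
    return err;
  }
}
```

Under these assumptions an existing design doc that throws a plain
{"forbidden": ...} never contributes to banning, which is the
backwards-compatibility property being asked for.
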
>
>
>> P.S. Currently, these questions could be solved via nginx in front of
>> CouchDB + fail2ban. Maybe it would be better to integrate with existing
>> tools? For example, by providing an auth.log of successful and failed
>> authentication attempts; fail2ban would be happy with this. Currently
>> you have to live with verbose logs (or configure per-module logging,
>> thanks to Jan!), which seems like overkill if you're only interested
>> in auth problems.
>>
>
> I have not looked at fail2ban for a while. However I am encouraged by my
> javascript.log work. I would love to make a fail2ban-friendly auth.log file.
>
> --
> Iris Couch
