Thanks, all, for the feedback. Status and follow-up questions are inline.

On Thu, Aug 25, 2016 at 10:02 PM, William A Rowe Jr <wr...@rowe-clan.net>
wrote:

> A couple key questions now that the full refactoring of legacy vs. strict
> is mostly complete (there remain potential issues with some of the 3-4 yr
> old changes on trunk which I'll raise in other posts.) But speaking only to
> the request line and request header parsing...
>
> 1. Does it make sense to emit these parsing failures at the info level? Or
> debug level (or in trunk/2.4, only at the trace level?) Granted some
> legitimate internal diagnostics may be required, so it needs to have some
> potential visibility, but the vast majority of such traffic is abusive and
> doesn't need a place in most error logs.
>
This is now changed: all of these now log at DEBUG, except where one of the
admin's Limit* settings is exceeded (since this is a relatively common case
where an admin sets an arbitrarily short limit without understanding how
large fields such as User-Agent or Cookie can be).
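
For illustration, roughly the distinction I mean, using the usual 2.4
ap_log_rerror() pattern; the helper and message text here are just a sketch,
not the committed change:

    #include "httpd.h"
    #include "http_log.h"

    /* Sketch only: ordinary protocol-parse rejections drop to DEBUG, while
     * a request that tripped a LimitRequest* directive stays at a higher
     * level so the admin can spot the misconfiguration.
     */
    static void log_parse_reject(request_rec *r, int hit_limit,
                                 const char *what)
    {
        ap_log_rerror(APLOG_MARK, hit_limit ? APLOG_INFO : APLOG_DEBUG, 0, r,
                      "request rejected: %s", what);
    }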

Discussion item:

Limit*-based failures could also become DEBUG. Thoughts?

> 2. Should we ban \r\n\v\f unequivocally from request and request header
> fields altogether, or is there a legitimate need to support these? Or
> should these follow the UnsafeWhitespace toggle and be permitted?
>
I am changing this right now. Specifically (a rough sketch of the
request-line rules follows the list below):

 * Leading whitespace on the request line is prohibited (and always was)
 * \r \n \v \f, as well as \t, always trigger a 400 in the request line
   (new behavior)
 * Multiple or trailing spaces are only permitted in Unsafe mode
   (previously always permitted)
 * Text trailing the spaces that follow the protocol is always a 400
   (new behavior)

 * Spaces within or trailing a request field name are always prohibited
   (new, per RFC 7230 section 3.2.4)
 * \r \n \v \f always trigger a 400 in a request header field (new behavior)
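
The promised sketch of the Strict request-line rules (a standalone
illustration, not the actual protocol.c parser; it ignores the HTTP/0.9
no-protocol case):

    #include <string.h>

    /* Under Strict: no leading whitespace, no \t \v \f or bare \r anywhere,
     * exactly one SP between method, request-target and protocol, and
     * nothing (not even spaces) after the protocol.
     */
    static int strict_request_line_ok(const char *line)
    {
        const char *sp1, *sp2;

        if (line[0] == ' ' || strpbrk(line, "\t\v\f\r"))
            return 0;                   /* leading SP or banned whitespace */

        sp1 = strchr(line, ' ');        /* method / request-target gap */
        if (!sp1 || sp1[1] == ' ')
            return 0;

        sp2 = strchr(sp1 + 1, ' ');     /* request-target / protocol gap */
        if (!sp2 || sp2[1] == ' ' || strchr(sp2 + 1, ' '))
            return 0;                   /* doubled, trailing or extra spaces */

        return 1;
    }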

Discussion items:

I'm working on the patch to require CRLF line termination within
ap_rgetline_core(). It has to happen there because any CR is eaten before
the function returns. It's not clear how this can be done trivially without
negative impact on other consumers of the function, but since the current
'fold' flag, which we don't use here, is a bool (0 or 1), adding a second
control bit-flag of value 2 for CR-required seems the most obvious way to
go, with a minor MMN bump. Thoughts?

Two choices: either CRLF is always required, or enforcing it becomes a
byproduct of toggling Strict (vs. Unsafe) mode. Thoughts?
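
Roughly what I have in mind; the flag names and helper below are
placeholders, since the real names and the ap_rgetline_core() plumbing would
be settled in the patch itself:

    #include <stddef.h>

    /* Hypothetical flag values for illustration.  The existing 'fold'
     * argument is effectively a bool (0 or 1), so strict line termination
     * can ride in bit 2 without disturbing current callers.
     */
    #define AP_GETLINE_FOLD 1  /* existing: unfold obs-fold continuations */
    #define AP_GETLINE_CRLF 2  /* proposed: line must end in CR LF, not
                                  a bare LF */

    /* Because the CR is stripped before ap_rgetline_core() returns, the
     * check has to happen inside the read loop; conceptually it reduces to
     * this test on the raw buffer before the terminator is trimmed.
     */
    static int line_ends_in_crlf(const char *raw, size_t len)
    {
        return len >= 2 && raw[len - 2] == '\r' && raw[len - 1] == '\n';
    }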

> 3. Do we need multiple layers of 'Strict'ness, or should there be a single
> toggle, or no toggle, no tolerant input at all in the next 2.2/2.4 releases?
>
This is now changed for whitespace: there is no longer a separate toggle; it
falls under Strict|Unsafe.

Discussion item:

I am not sold that StrictURI can be collapsed into this flag. Right now, not
even httpd itself promises to correctly encode the URIs it generates, AIUI.
Until we have our own house in order, it seems we need to remain flexible
here. The \t \v \r \f and \0 characters are now always prohibited, so this
is considerably safer than before. Strict further bans all unencoded control
characters in the URI, and StrictURI takes it one step further, banning all
unencoded obs-text along with SP / '"' / '<' / '>' / '\' / '^' / '`' / '{' /
'|' / '}'.
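
For illustration, a standalone sketch of these character classes as
described above; this is not httpd's actual lookup-table implementation:

    #include <string.h>

    /* \t \v \r \f and NUL are rejected unconditionally; Strict additionally
     * rejects any other unencoded control character; StrictURI further
     * rejects unencoded obs-text (0x80-0xFF) and the ASCII punctuation that
     * RFC 3986 leaves no room for in a request-target.
     */
    static int uri_char_allowed(unsigned char c, int strict, int strict_uri)
    {
        if (c == '\0' || strchr("\t\v\r\f", c))
            return 0;
        if (strict && (c < 0x20 || c == 0x7f))
            return 0;
        if (strict_uri && (c >= 0x80 || strchr(" \"<>\\^`{|}", c)))
            return 0;
        return 1;
    }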

Since it's expected that a number of sites will have to relax to UnsafeURI
because of these encoding issues, even for URIs generated by httpd servers
themselves, and will have to do so on *public facing* interfaces, I strongly
believe this flag needs to remain distinct. Otherwise we will end up with
lots of servers doing entirely unsafe parsing, rather than limiting their
exposure to accepting bad URIs. Thoughts?

> 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain
> permissive (Unsafe) and allow the user to toggle these to Strict(...
> Whitespace... URI)?
>
> Real world direct observation especially appreciated from actual
> deployments.
>
Strict (and StrictURI) remain the default. Allow0.9 and LenientMethods also
remain the defaults.

Discussion item:

RegisteredMethods was always wrong for the proxy, CGI and other cases, is
nonsense per the spec, and cannot become a default. Strict mode now requires
all methods to conform to 'token' text, so we have significant added
protection here.
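
For reference, a sketch of what 'token' conformance means here (RFC 7230
section 3.2.6); this is not httpd's actual table-driven check:

    #include <ctype.h>
    #include <string.h>

    /* An RFC 7230 'token' is one or more tchar: ALPHA / DIGIT / the
     * punctuation listed below.  No registry of method names is consulted.
     */
    static int method_is_token(const char *method)
    {
        static const char tchar_punct[] = "!#$%&'*+-.^_`|~";

        if (!*method)
            return 0;
        for (; *method; ++method) {
            if (!isalnum((unsigned char)*method)
                && !strchr(tchar_punct, *method))
                return 0;
        }
        return 1;
    }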

Do folks believe we want to ship Require1.0 as the out-of-the-box default,
even in the 2.2.x/2.4.x backports? It seems the RFC is correct on this:
nearly all of these requests will be badly formed HTTP/1.x requests.
Thoughts?
