Thanks all for the feedback. Status and follow-up questions inline On Thu, Aug 25, 2016 at 10:02 PM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
> A couple key questions now that the full refactoring of legacy vs. strict > is mostly complete (there remain potential issues with some of the 3-4 yr > old changes on trunk which I'll raise in other posts.) But speaking only to > the request line and request header parsing... > > 1. Does it make sense to emit these parsing failures at the info level? Or > debug level (or in trunk/2.4, only at the trace level?) Granted some > legitimate internal diagnostics may be required, so it needs to have some > potential visibility, but the vast majority of such traffic is abusive and > doesn't need a place in most error logs. > This is now changed, all are DEBUG except where the admin toggles for Limit's are broken (since this is a relatively common case where an admin sets some arbitrarily short limit without understanding fields such as User-Agent or Cookies.) Discussion item: Limit* based failures could also become debug. Thoughts? 2. Should we ban \r\n\v\f unequivocally from request and request header > fields altogether, or is there a legitimate need to support these? Or > should these follow the UnsafeWhitespace toggle and be permitted? > I am changing this right now. Specifically, * Leading whitespace on the URI line prohibited (and always was) * \r\n\v\f as well as \t always trigger 400 in the request line (new behavior) * multiple or trailing spaces only permitted in Unsafe mode (always permitted before) * trailing text following trailing spaces after protocol is always 400 (new behavior) * Spaces in or trailing the request field name always prohibited (new per RFC7230 3.2.4) * \r\n\v\f always triggers 400 in the request line (new behavior) Discussion items: I'm working on the patch to require CRLF line termination within ap_rgetline_core(). This is necessary because any CR is eaten before this function completes. It's not clear how this can be done trivially without negative impacts on other consumers of the function. But since current 'fold' flag, which we don't use, was a bool (0 or 1), adding a second control bit-flag of value 2 for CR-required seems the most obvious way to go, with a minor MMN bump. Thoughts? Two choices, CRLF is always required, or verifying this becomes a byproduct of toggling the Strict (vs Unsafe) mode. Thoughts? 3. Do we need multiple > layers of 'Strict'ness, or should there be a single toggle, or no toggle, > no tolerant input at all in the next 2.2/2.4 releases? > This is now changed for whitespace, no such toggle anymore, it falls under Strict|Unsafe. Discussion item: I am not sold that StrictURI can be collapsed into this flag. Right now, not even httpd itself promises to correctly encode resulting URI's, AIUI. Until we have our own house in order, it seems we need to remain flexible about this. The \t\v\r\f\0 characters are always now prohibited, so it's considerably more safe. Strict further bans all unencoded ctrl's in the URL. So StrictURI takes this one step further, and bans all unencoded obs-text along with SP / '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' Since it's expected that a number of sites will have to relax UnsafeURI due to these encoding issues, even with the resulting URI's generated by httpd servers, and will have to do so for *public facing* interfaces, I strongly believe that this flag needs to remain distinct, or we will have lots of servers with entirely unsafe parsing, not with only limited exposure by accepting bad URIs. Thoughts? 4. Should the next 2.4/2.2 releases default to Strict at all? Or remain > permissive (Unsafe) and allow the user to toggle these to Strict(... > Whitespace... URI)? > > Real world direct observation especially appreciated from actual > deployments. > Strict (and StrictURI) remain the default. The Allow0.9 and LenientMethods remain the defaults. Discussion item: RegisteredMethods was always wrong for proxy, CGI and other cases, was nonsense per the spec, and cannot become a default. The Strict mode now requires all methods to conform to 'token' text, so we have significant added protection here. Do folks believe we want to ship with Require1.0 out of the box as the default, even with the 2.2.x/2.4.x backports? It seems the RFC is correct on this, that nearly all of these will be badly formed HTTP/1.x requests. Thoughts?