On Tue, Nov 8, 2016 at 1:39 PM, Ruediger Pluem <rpl...@apache.org> wrote:

>
> On 11/04/2016 03:20 PM, wr...@apache.org wrote:
> > Author: wrowe
> > Date: Fri Nov  4 14:20:16 2016
> > New Revision: 1768036
> >
> > URL: http://svn.apache.org/viewvc?rev=1768036&view=rev
> > Log:
> > Add an option to enforce stricter HTTP conformance
> >
> > This is a first stab, the checks will likely have to be revised.
> > For now, we check
> >
> >  * if the request line contains control characters
> >  * if the request uri has fragment or username/password
> >  * that the request method is standard or registered with
> RegisterHttpMethod
> >  * that the request protocol is of the form HTTP/[1-9]+.[0-9]+,
> >    or missing for 0.9
> >  * if there is garbage in the request line after the protocol
> >  * if any request header contains control characters
> >  * if any request header has an empty name
> >  * for the host name in the URL or Host header:
> >    - if an IPv4 dotted decimal address: Reject octal or hex values,
> require
> >      exactly four parts
> >    - if a DNS host name: Reject non-alphanumeric characters besides '.'
> and
> >      '-'. As a side effect, this rejects multiple Host headers.
> >  * if any response header contains control characters
> >  * if any response header has an empty name
> >  * that the Location response header (if present) has a valid scheme and
> is
> >    absolute
> >
> > If we have a host name both from the URL and the Host header, we replace
> the
> > Host header with the value from the URL to enforce RFC conformance.
> >
> > There is a log-only mode, but the loglevels of the logged messages need
> some
> > thought/work. Currently, the  checks for incoming data log for 'core'
> and the
> > checks for outgoing data log for 'http'. Maybe we need a way to
> configure the
> > loglevels separately from the core/http loglevels.
> >
> > change protocol number parsing in strict mode according to HTTPbis draft
> > - only accept single digit version components
> > - don't accept white-space after protocol specification
> >
> > Clean up comment, fix log tags.
> > Submitted by: sf
> > Backports: r1426877, r1426879, r1426988, r1426992
> > Modified: httpd/httpd/branches/2.4.x-merge-http-strict/server/vhost.c
> > URL: http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x-
> merge-http-strict/server/vhost.c?rev=1768036&r1=
> 1768035&r2=1768036&view=diff
> > ============================================================
> ==================
> > --- httpd/httpd/branches/2.4.x-merge-http-strict/server/vhost.c
> (original)
> > +++ httpd/httpd/branches/2.4.x-merge-http-strict/server/vhost.c Fri
> Nov  4 14:20:16 2016
> > +    if (conf->http_conformance & AP_HTTP_CONFORMANCE_STRICT) {
> > +        /*
> > +         * If we have both hostname from an absoluteURI and a Host
> header,
> > +         * we must ignore the Host header (RFC 2616 5.2).
> > +         * To enforce this, we reset the Host header to the value from
> the
> > +         * request line.
> > +         */
> > +        if (have_hostname_from_url && host_header != NULL) {
> > +            const char *info = "Would replace";
> > +            const char *new = construct_host_header(r, is_v6literal);
> > +            if (!(conf->http_conformance &
> AP_HTTP_CONFORMANCE_LOGONLY)) {
> > +                apr_table_set(r->headers_in, "Host", r->hostname);
>
> Hm, why don't we use "new" here instead of r->hostname
>
> > +                info = "Replacing";
> > +            }
> > +            ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(02417)
> > +                          "%s Host header '%s' with host from request
> uri: "
> > +                          "'%s'", info, host_header, new);
> > +        }
> > +    }


Good question, with LOGONLY no longer an option, all that logic got simpler.

Here's why I think the whole logic is busted and the preserve r->hostname is
the right thing to do for the outer request (not a child/client request to
any
backend host)...

I believe this logic to be busted. Given this request;

GET http://distant-host.com/ HTTP/1.1
Host: proxy-host

we would now fail to evaluate the proxy-host virtual host rules.

This seems like a breaking change to our config. mod_proxy already
follows this rule of RFC7230 section 5.4;

   When a proxy receives a request with an absolute-form of
   request-target, the proxy MUST ignore the received Host header field
   (if any) and instead replace it with the host information of the
   request-target.  A proxy that forwards such a request MUST generate a
   new Host field-value based on the received request-target rather than
   forward the received Host field-value.

Section 5.5 of RFC7230 has this to say;

   Once the effective request URI has been constructed, an origin server
   needs to decide whether or not to provide service for that URI via
   the connection in which the request was received.  For example, the
   request might have been misdirected, deliberately or accidentally,
   such that the information within a received request-target or Host
   header field differs from the host or port upon which the connection
   has been made.  If the connection is from a trusted gateway, that
   inconsistency might be expected; otherwise, it might indicate an
   attempt to bypass security filters, trick the server into delivering
   non-public content, or poison a cache.  See Section 9 for security
   considerations regarding message routing.

Section 5.3.1 states;

   To allow for transition to the absolute-form for all requests in some
   future version of HTTP, a server MUST accept the absolute-form in
   requests, even though HTTP/1.1 clients will only send them in
   requests to proxies.

It seems to me we should simply trust the Host: header and dump this whole
mess. If we want to reject requests in absolute form after the proxy modules
have had a chance to accept them, that wouldn't be a bad solution.

Reply via email to