Hi Stefan,

Thanks for this work, but I don't consider HTTP conformance to be
an option.  These are checks we should be making while parsing
the received message, not as a separate pass, and in many cases
they are required to result in a 400, 500, or 502 response.

I am trying to get HTTPbis ready for last call this week.
After that, I will be looking into making the changes in httpd,
and I won't be using a configurable option.  I suggest we just
remove that part and iterate on these checks as we go.

The current HTTP/1.1 drafts are at

  http://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/

and the message parsing requirements are in p1-messaging.html

BTW, the protocol version is now restricted to uppercase and
one digit per major and minor, so we can simplify that check
to a very specific "HTTP/[0-9].[0-9]".
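
As a minimal sketch of that simplified check, written in Python for
brevity rather than httpd's C: the function name is illustrative only,
and the dot is escaped here on the assumption that only a literal "."
should match.

```python
import re

# Assumption: the "." is escaped so it matches only a literal dot;
# the pattern quoted in the message is written informally.
HTTP_VERSION_RE = re.compile(r"HTTP/[0-9]\.[0-9]")


def is_valid_http_version(token: str) -> bool:
    """Return True if token is exactly 'HTTP/' DIGIT '.' DIGIT.

    The match is case-sensitive (uppercase "HTTP" only) and allows a
    single ASCII digit for each of major and minor.
    """
    # fullmatch() requires the whole token to match, so trailing
    # garbage after the version is rejected as well.
    return HTTP_VERSION_RE.fullmatch(token) is not None
```

With this sketch, "HTTP/1.1" is accepted, while "http/1.1", "HTTP/1.10"
and "HTTP/11.0" are all rejected.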

And, yes, it is possible to have a valid empty Host because
HTTP can be used with a proxy for any URI (including URNs).

....Roy

On Dec 29, 2012, at 5:23 PM, s...@apache.org wrote:

> Author: sf
> Date: Sun Dec 30 01:23:24 2012
> New Revision: 1426877
> 
> URL: http://svn.apache.org/viewvc?rev=1426877&view=rev
> Log:
> Add an option to enforce stricter HTTP conformance
> 
> This is a first stab, the checks will likely have to be revised.
> For now, we check
> 
> * if the request line contains control characters
> * if the request uri has fragment or username/password
> * that the request method is standard or registered with RegisterHttpMethod
> * that the request protocol is of the form HTTP/[1-9]+.[0-9]+,
>   or missing for 0.9
> * if there is garbage in the request line after the protocol
> * if any request header contains control characters
> * if any request header has an empty name
> * for the host name in the URL or Host header:
>   - if an IPv4 dotted decimal address: Reject octal or hex values, require
>     exactly four parts
>   - if a DNS host name: Reject non-alphanumeric characters besides '.' and
>     '-'. As a side effect, this rejects multiple Host headers.
> * if any response header contains control characters
> * if any response header has an empty name
> * that the Location response header (if present) has a valid scheme and is
>   absolute
> 
> If we have a host name both from the URL and the Host header, we replace the
> Host header with the value from the URL to enforce RFC conformance.
> 
> There is a log-only mode, but the loglevels of the logged messages need some
> thought/work. Currently, the checks for incoming data log for 'core' and the
> checks for outgoing data log for 'http'. Maybe we need a way to configure the
> loglevels separately from the core/http loglevels.
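
For reference, the IPv4 dotted-decimal rule in the log above (reject
octal or hex values, require exactly four parts) could be sketched
roughly as follows. This is Python for brevity, not the committed C
code. The function name is hypothetical, and two points are
assumptions: a leading zero is treated as octal notation, and each
part is limited to 0..255.

```python
def is_strict_ipv4(host: str) -> bool:
    """Return True if host is a strict dotted-decimal IPv4 address."""
    parts = host.split(".")
    # Require exactly four parts, so shorthand like "127.1" is rejected.
    if len(parts) != 4:
        return False
    for part in parts:
        # ASCII decimal digits only: rejects empty parts and hex
        # values such as "0x7f".
        if not part or not all(c in "0123456789" for c in part):
            return False
        # Assumption: a leading zero (e.g. "0177") denotes octal.
        if len(part) > 1 and part[0] == "0":
            return False
        # Assumption: each part must fit in one octet.
        if int(part) > 255:
            return False
    return True
```

Here "127.0.0.1" passes, while "0177.0.0.1", "0x7f.0.0.1" and "127.1"
are rejected.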

