Re: Request line parsing

2020-03-23 Thread Filip Hanik
+1

Thorough and clear write up

On Mon, Mar 23, 2020 at 06:01 Mark Thomas  wrote:

> Hi,
>
> I am currently looking at the request line parsing. I'll try and set out
> each issue in turn.
>
> End of line parsing
> ===
>
> Prior to the recent changes, Tomcat allowed CRLF or LF to mark the end
> of a line. The unwanted side effect was that CR could appear in the
> header value. This caused problems and was tightened up to only allow
> CRLF as a line terminator.
>
> Currently Tomcat requires CRLF everywhere apart from the end of the
> request line for a HTTP 0.9 request where it also allows LF.
>
> This requirement to accept just LF as a line terminator first emerged in
> the W3C spec [1]. RFC 1945 [2] and RFC 2616 [3] retained this as a
> recommendation for all line terminators, RFC 7230 [4] no longer includes
> this recommendation.
>
> RFC 7230 also removes the expectation that a server that supports
> HTTP/1.1 will support HTTP 0.9.
>
> Arguably the current spec for HTTP/0.9 is [3].
>
> The Servlet spec references RFC 7230 and RFC 1945 so arguably HTTP/0.9
> support is expected.
>
>
> SP vs whitespace
> 
>
> Tomcat currently accepts any combination of SP and HTAB where RFC 7230
> calls for a single SP. This stems from a recommendation in RFC 2616
> which is no longer present in RFC 7230.
>
>
> I think we have three options.
>
> 1. No changes.
>CRLF is required everywhere apart from HTTP/0.9 where LF is also
>accepted.
>Any combination of SP/HTAB is accepted where SP is required.
>
> 2. Tighten up as per RFC 7230
>a) Require CRLF for all line endings
>b) Require SP where specified
>c) Drop HTTP/0.9 support
>
> 3. Relax the recent changes to allow CRLF or LF as a line terminator
>everywhere without allowing CR to appear in a request header.
>
> I think we should follow 1) for Tomcat 7, 8 & 9.
>
> I'm leaning towards 1 for 10.0.x as well with a view to discussing 2 in
> the Servlet project. i.e. explicitly dropping HTTP 0.9 support and the
> "Tolerant applications" requirements of RFC 1945 for Jakarta EE 10
> (Tomcat 10.1.x).
>
> In short this means largely do nothing apart from may be adding a few
> more tests to explicitly check behaviour for various edge cases.
>
> I'll note that the regressions reported with the recent change to
> requiring CRLF as a line terminator caused issues with valid HTTP/0.9
> requests but that this should now be resolved.
>
> We have had one user issue reported where a custom client was using LFLF
> as a line terminator and requests are now being rejected. Given that was
> never valid, I'm OK with that.
>
> Thoughts?
>
> Mark
>
>
>
> [1] https://www.w3.org/Protocols/HTTP/AsImplemented.html
> [2] https://tools.ietf.org/html/rfc1945
> [3] https://tools.ietf.org/html/rfc2616
> [4] https://tools.ietf.org/html/rfc7230
>
>
>
>
>
> With all of the above in mind I propose:
>
> - Doing nothing! I think Tomcat is striking the right balance here.
>
> This means:
> GET /CRLF   -> processed as HTTP/0.9
> GET /LF -> processed as HTTP/0.9
> GET / CRLF  -> processed as HTTP/1.1 and rejected as invalid
> GET / LF-> processed as HTTP/1.1 and rejected as invalid
>
> I want to write some tests to check this is behaving as expected but I'm
> not expecting any changes to the parsing at this point.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
>
>


Re: Request line parsing

2020-03-23 Thread Mark Thomas
On 23/03/2020 17:33, Christopher Schultz wrote:
> On 3/23/20 11:35, Mark Thomas wrote:



> Sounds good. I entirely missed your actual proposal, which was below
> your signature and after your references:

Sorry about that. I was editing and re-organising and got distracted.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Request line parsing

2020-03-23 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Mark,

On 3/23/20 11:35, Mark Thomas wrote:
> On 23/03/2020 14:59, Christopher Schultz wrote:
>
> 
>
>> My only concern here is that request line + header-processing
>> really has to match whatever reverse proxy servers are doing as
>> well, and that's really not something we can know for sure. I
>> don't think there is a single safe implementation that will make
>> everyone happy.
>
> I think everything is, slowly, getting stricter. Generally as a
> result of request smuggling vulnerabilities and the like.
>
>> Is there a way to make this kind of thing pluggable with a few
>> obvious implementations so that users can select which one makes
>> the most sense for their environment? Similar to
>> cookie-proessing, we could have a super-spec-compliant one which
>> always requires CRLF line endings (which I guess means dropping
>> support for HTTP 0.9), and is super strict about everything
>> else.
>
> Yes, it is possible. I'm not sure if truly pluggable or
> configurable makes the most sense. I haven't looked at it too
> closely.
>
>> Another implementation could work the way Tomcat currently
>> behaves, being mostly spec-compliant and a little tolerant of
>> some sloppiness.
>>
>> This will allow an environment where e.g. httpd is in use to use
>> one implementation while Squid, IIS, nginx, etc. can make
>> different choices depending upon how the proxy will interpret
>> things.
>>
>> I realize that means more code. :( But if the proxy and origin
>> server disagree about how to interpret things, Bad Things can
>> happen. If we take a very strict interpretation of everything, I
>> think we can actually make the origin server safe, but we may
>> break environments which are relying on non-strict behavior.
>
> So far, the biggest breakage was caused by a regression in
> Tomcat's parsing of valid HTTP/0.9 requests. The approach you
> describe wouldn't address that sort of issue.
>
> We have had one additional report of new breakage with a client
> that uses LFLF as the line terminator but has never been valid and
> it was pure luck that Tomcat used to allow it - it was never
> intended to be allowed.
>
> My preference at this point is to concentrate on improving the
> unit tests to reduce the chances of regressions as and when we do
> make changes to the parsing code.
>
> I'm not completely against making request line (and header)
> parsing pluggable / configurable but relaxing parsing goes against
> the current direction we have been heading in and I'd need a lot of
> convincing to support such an approach.

Sounds good. I entirely missed your actual proposal, which was below
your signature and after your references:

On 3/23/20 09:01, Mark Thomas wrote:
> With all of the above in mind I propose:
>
> - Doing nothing! I think Tomcat is striking the right balance
> here.
>
> This means: GET /CRLF   -> processed as HTTP/0.9 GET /LF ->
> processed as HTTP/0.9 GET / CRLF  -> processed as HTTP/1.1 and
> rejected as invalid GET / LF-> processed as HTTP/1.1 and
> rejected as invalid
>
> I want to write some tests to check this is behaving as expected
> but I'm not expecting any changes to the parsing at this point.
I think I like the above very much.

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl548uMACgkQHPApP6U8
pFgZRQ/+KCX+EgM6+5FLDacDw2dy0mFSdCkjnWKcq2WbgwZe6GbNeGra0Egme9GF
CsJmVwrHnv34y+VKuvugoKVxDMyvsrkIrLyYPyW+VJg+iualK74BEI/28lvMGt2M
4RuPofhNQNmNxbWP1cvcWGb1CEinVo97AU3lHjRHRCtWWL5jQhiO9te9GWk2Q9z9
EmLJZo/tS+PSYLx73ewFEN0BxH7exQAu3MsYNxtq/be7pEDj4kJF5tXY6vnSXbmX
upZ14tBpw/loy6aC/Ay56Nx76q/t0+J4ZNsYgsNf3OeowCYnWFw8orbM5rCVFuY2
CshC40F+rhc1IZfcVd4F34SqRGs3W47eA4/CHnKBHBMA8R6Oj76gJkJvn3izRLYH
y5L8kHc3umXXDYQkDYgytGzGDaaXx0+OKaZHLywV6rXw30DvANh6xhsW+7GKqqI6
8FckjrQeCrBzBg+tSz+ObeHJxZvhpD5CYg5vjH8YxXKL6hi65k+IaaOgydmjq2SQ
lnoMIfdmydxRYnmhmzfCcJdgxkctvg3AGyidG2zEWE8EYbAfzixb9PVDZEhb4lMk
wBE5wj2IUmgsmFV/MdUfQjZNFn8dEoZbjebrrIyUlsLJng7AIObXss3myp6DZJXt
OWSjnMGVcsvXzbpfs/nw9kFnkdEMFbzDnVVGBDf93KygLQqEgCM=
=zQMS
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Request line parsing

2020-03-23 Thread Mark Thomas
On 23/03/2020 13:28, Rémy Maucherat wrote:
> On Mon, Mar 23, 2020 at 2:01 PM Mark Thomas  > wrote:



> With all of the above in mind I propose:
> 
> - Doing nothing! I think Tomcat is striking the right balance here.
> 
> This means:
> GET /CRLF   -> processed as HTTP/0.9
> GET /LF     -> processed as HTTP/0.9
> GET / CRLF  -> processed as HTTP/1.1 and rejected as invalid
> GET / LF    -> processed as HTTP/1.1 and rejected as invalid
> 
> I want to write some tests to check this is behaving as expected but I'm
> not expecting any changes to the parsing at this point.
> 
> 
> +1, that sounds really good !

I wrote too soon :)

GET / CRLF was being parsed as malformed HTTP/0.9 so there are going to
be changes to the parsing to make "GET / LF" and "GET / CRLF" consistent
(invalid HTTP/1.1). Treating both as HTTP/1.1 I think I can see a way to
make the parsing less hacky and have the code be a little clearer.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Request line parsing

2020-03-23 Thread Mark Thomas
On 23/03/2020 14:59, Christopher Schultz wrote:



> My only concern here is that request line + header-processing really
> has to match whatever reverse proxy servers are doing as well, and
> that's really not something we can know for sure. I don't think there
> is a single safe implementation that will make everyone happy.

I think everything is, slowly, getting stricter. Generally as a result
of request smuggling vulnerabilities and the like.

> Is there a way to make this kind of thing pluggable with a few obvious
> implementations so that users can select which one makes the most
> sense for their environment? Similar to cookie-proessing, we could
> have a super-spec-compliant one which always requires CRLF line
> endings (which I guess means dropping support for HTTP 0.9), and is
> super strict about everything else.

Yes, it is possible. I'm not sure if truly pluggable or configurable
makes the most sense. I haven't looked at it too closely.

> Another implementation could work the way Tomcat currently behaves,
> being mostly spec-compliant and a little tolerant of some sloppiness.
> 
> This will allow an environment where e.g. httpd is in use to use one
> implementation while Squid, IIS, nginx, etc. can make different
> choices depending upon how the proxy will interpret things.
> 
> I realize that means more code. :( But if the proxy and origin server
> disagree about how to interpret things, Bad Things can happen. If we
> take a very strict interpretation of everything, I think we can
> actually make the origin server safe, but we may break environments
> which are relying on non-strict behavior.

So far, the biggest breakage was caused by a regression in Tomcat's
parsing of valid HTTP/0.9 requests. The approach you describe wouldn't
address that sort of issue.

We have had one additional report of new breakage with a client that
uses LFLF as the line terminator but has never been valid and it was
pure luck that Tomcat used to allow it - it was never intended to be
allowed.

My preference at this point is to concentrate on improving the unit
tests to reduce the chances of regressions as and when we do make
changes to the parsing code.

I'm not completely against making request line (and header) parsing
pluggable / configurable but relaxing parsing goes against the current
direction we have been heading in and I'd need a lot of convincing to
support such an approach.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Request line parsing

2020-03-23 Thread Michael Osipov

Am 2020-03-23 um 14:01 schrieb Mark Thomas:

Hi,

I am currently looking at the request line parsing. I'll try and set out
each issue in turn.

End of line parsing
===

Prior to the recent changes, Tomcat allowed CRLF or LF to mark the end
of a line. The unwanted side effect was that CR could appear in the
header value. This caused problems and was tightened up to only allow
CRLF as a line terminator.

Currently Tomcat requires CRLF everywhere apart from the end of the
request line for a HTTP 0.9 request where it also allows LF.

This requirement to accept just LF as a line terminator first emerged in
the W3C spec [1]. RFC 1945 [2] and RFC 2616 [3] retained this as a
recommendation for all line terminators, RFC 7230 [4] no longer includes
this recommendation.

RFC 7230 also removes the expectation that a server that supports
HTTP/1.1 will support HTTP 0.9.

Arguably the current spec for HTTP/0.9 is [3].

The Servlet spec references RFC 7230 and RFC 1945 so arguably HTTP/0.9
support is expected.


SP vs whitespace


Tomcat currently accepts any combination of SP and HTAB where RFC 7230
calls for a single SP. This stems from a recommendation in RFC 2616
which is no longer present in RFC 7230.


I think we have three options.

1. No changes.
CRLF is required everywhere apart from HTTP/0.9 where LF is also
accepted.
Any combination of SP/HTAB is accepted where SP is required.

2. Tighten up as per RFC 7230
a) Require CRLF for all line endings
b) Require SP where specified
c) Drop HTTP/0.9 support

3. Relax the recent changes to allow CRLF or LF as a line terminator
everywhere without allowing CR to appear in a request header.

I think we should follow 1) for Tomcat 7, 8 & 9.

I'm leaning towards 1 for 10.0.x as well with a view to discussing 2 in
the Servlet project. i.e. explicitly dropping HTTP 0.9 support and the
"Tolerant applications" requirements of RFC 1945 for Jakarta EE 10
(Tomcat 10.1.x).


Makes sense for <= 9 and the evaluation for 10

M

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Request line parsing

2020-03-23 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Mark,

On 3/23/20 09:01, Mark Thomas wrote:
> Hi,
>
> I am currently looking at the request line parsing. I'll try and
> set out each issue in turn.
>
> End of line parsing ===
>
> Prior to the recent changes, Tomcat allowed CRLF or LF to mark the
> end of a line. The unwanted side effect was that CR could appear in
> the header value. This caused problems and was tightened up to only
> allow CRLF as a line terminator.
>
> Currently Tomcat requires CRLF everywhere apart from the end of
> the request line for a HTTP 0.9 request where it also allows LF.
>
> This requirement to accept just LF as a line terminator first
> emerged in the W3C spec [1]. RFC 1945 [2] and RFC 2616 [3] retained
> this as a recommendation for all line terminators, RFC 7230 [4] no
> longer includes this recommendation.
>
> RFC 7230 also removes the expectation that a server that supports
> HTTP/1.1 will support HTTP 0.9.
>
> Arguably the current spec for HTTP/0.9 is [3].
>
> The Servlet spec references RFC 7230 and RFC 1945 so arguably
> HTTP/0.9 support is expected.
>
>
> SP vs whitespace 
>
> Tomcat currently accepts any combination of SP and HTAB where RFC
> 7230 calls for a single SP. This stems from a recommendation in RFC
> 2616 which is no longer present in RFC 7230.
>
>
> I think we have three options.
>
> 1. No changes. CRLF is required everywhere apart from HTTP/0.9
> where LF is also accepted. Any combination of SP/HTAB is accepted
> where SP is required.
>
> 2. Tighten up as per RFC 7230 a) Require CRLF for all line endings
> b) Require SP where specified c) Drop HTTP/0.9 support
>
> 3. Relax the recent changes to allow CRLF or LF as a line
> terminator everywhere without allowing CR to appear in a request
> header.
>
> I think we should follow 1) for Tomcat 7, 8 & 9.

+1

> I'm leaning towards 1 for 10.0.x as well with a view to discussing
> 2 in the Servlet project. i.e. explicitly dropping HTTP 0.9 support
> and the "Tolerant applications" requirements of RFC 1945 for
> Jakarta EE 10 (Tomcat 10.1.x).
>
> In short this means largely do nothing apart from may be adding a
> few more tests to explicitly check behaviour for various edge
> cases.
>
> I'll note that the regressions reported with the recent change to
> requiring CRLF as a line terminator caused issues with valid
> HTTP/0.9 requests but that this should now be resolved.
>
> We have had one user issue reported where a custom client was using
> LFLF as a line terminator and requests are now being rejected.
> Given that was never valid, I'm OK with that.

+1

My only concern here is that request line + header-processing really
has to match whatever reverse proxy servers are doing as well, and
that's really not something we can know for sure. I don't think there
is a single safe implementation that will make everyone happy.

Is there a way to make this kind of thing pluggable with a few obvious
implementations so that users can select which one makes the most
sense for their environment? Similar to cookie-proessing, we could
have a super-spec-compliant one which always requires CRLF line
endings (which I guess means dropping support for HTTP 0.9), and is
super strict about everything else.

Another implementation could work the way Tomcat currently behaves,
being mostly spec-compliant and a little tolerant of some sloppiness.

This will allow an environment where e.g. httpd is in use to use one
implementation while Squid, IIS, nginx, etc. can make different
choices depending upon how the proxy will interpret things.

I realize that means more code. :( But if the proxy and origin server
disagree about how to interpret things, Bad Things can happen. If we
take a very strict interpretation of everything, I think we can
actually make the origin server safe, but we may break environments
which are relying on non-strict behavior.

- -chris
-BEGIN PGP SIGNATURE-
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl54zucACgkQHPApP6U8
pFglJQ//WFBVYUlgHIXU86AgtpquhrMcl7eTR/AAnyJ5K5SGEUxS9sKgDS+K5Xsu
p1j5FpRBxpA5RY3xzn4X30dj1ps17sIFfntcIydHqZDePtFB78J8THMIeonaMaQE
SO1enP7HH2unowYewBZGjKvdEkLyCnFWz9JXJDjIvnNeqSapEdOttpVhMs+/sSjZ
8yPoOVhTAmYhnlJAH856KvXb/JVfhbBuvYfppuGGnV7aj8iy83BgYrXgOk8DJN1Z
7VowjCyabcSqZGMgCaXcpph3UTaYkuKjV30PYEssgTqT4xtibA5cGt75m4OVAnIH
ahVIKohGs+j+t+aNYJZsXLk2AtMF051ZygdbhwyXl2+/eS2X4mmISu3gpswIS1u3
Sz36boLhLBkTii67SXXXwIe+0e2WyYl4UA3sD/xzTKFDUMXvn5bjATxSHsCsUAO7
fYXlYxzE2ZyK0EyZ9Qox3K+d72bC/y8CjJvLZo1BHvUtt0QrvhsA0dgS6gXV/qjZ
hPgL3kodFF3dm5QQKEMYSzG1b64zPbw76DQ7nc7DcwNnlEJEPmk6NfxNlLE9SoCA
/9s1RJwiY+LVDqocElWZp1nwHWVAbFsPh9mr3nT5T5P1PMKGTIySTevfSHeBxJBw
7ua5N9SUg6KDLm6EwnfPL9fqSDMJerDn+b4oMu19TjQoPBjTXQ8=
=VY+W
-END PGP SIGNATURE-

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tom

Re: Request line parsing

2020-03-23 Thread Rémy Maucherat
On Mon, Mar 23, 2020 at 2:01 PM Mark Thomas  wrote:

> Hi,
>
> I am currently looking at the request line parsing. I'll try and set out
> each issue in turn.
>
> End of line parsing
> ===
>
> Prior to the recent changes, Tomcat allowed CRLF or LF to mark the end
> of a line. The unwanted side effect was that CR could appear in the
> header value. This caused problems and was tightened up to only allow
> CRLF as a line terminator.
>
> Currently Tomcat requires CRLF everywhere apart from the end of the
> request line for a HTTP 0.9 request where it also allows LF.
>
> This requirement to accept just LF as a line terminator first emerged in
> the W3C spec [1]. RFC 1945 [2] and RFC 2616 [3] retained this as a
> recommendation for all line terminators, RFC 7230 [4] no longer includes
> this recommendation.
>
> RFC 7230 also removes the expectation that a server that supports
> HTTP/1.1 will support HTTP 0.9.
>
> Arguably the current spec for HTTP/0.9 is [3].
>
> The Servlet spec references RFC 7230 and RFC 1945 so arguably HTTP/0.9
> support is expected.
>
>
> SP vs whitespace
> 
>
> Tomcat currently accepts any combination of SP and HTAB where RFC 7230
> calls for a single SP. This stems from a recommendation in RFC 2616
> which is no longer present in RFC 7230.
>
>
> I think we have three options.
>
> 1. No changes.
>CRLF is required everywhere apart from HTTP/0.9 where LF is also
>accepted.
>Any combination of SP/HTAB is accepted where SP is required.
>
> 2. Tighten up as per RFC 7230
>a) Require CRLF for all line endings
>b) Require SP where specified
>c) Drop HTTP/0.9 support
>
> 3. Relax the recent changes to allow CRLF or LF as a line terminator
>everywhere without allowing CR to appear in a request header.
>
> I think we should follow 1) for Tomcat 7, 8 & 9.
>
> I'm leaning towards 1 for 10.0.x as well with a view to discussing 2 in
> the Servlet project. i.e. explicitly dropping HTTP 0.9 support and the
> "Tolerant applications" requirements of RFC 1945 for Jakarta EE 10
> (Tomcat 10.1.x).
>
> In short this means largely do nothing apart from may be adding a few
> more tests to explicitly check behaviour for various edge cases.
>
> I'll note that the regressions reported with the recent change to
> requiring CRLF as a line terminator caused issues with valid HTTP/0.9
> requests but that this should now be resolved.
>
> We have had one user issue reported where a custom client was using LFLF
> as a line terminator and requests are now being rejected. Given that was
> never valid, I'm OK with that.
>
> Thoughts?
>
> Mark
>
>
>
> [1] https://www.w3.org/Protocols/HTTP/AsImplemented.html
> [2] https://tools.ietf.org/html/rfc1945
> [3] https://tools.ietf.org/html/rfc2616
> [4] https://tools.ietf.org/html/rfc7230
>
>
>
>
>
> With all of the above in mind I propose:
>
> - Doing nothing! I think Tomcat is striking the right balance here.
>
> This means:
> GET /CRLF   -> processed as HTTP/0.9
> GET /LF -> processed as HTTP/0.9
> GET / CRLF  -> processed as HTTP/1.1 and rejected as invalid
> GET / LF-> processed as HTTP/1.1 and rejected as invalid
>
> I want to write some tests to check this is behaving as expected but I'm
> not expecting any changes to the parsing at this point.
>

+1, that sounds really good !

Rémy


>
> -
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
>
>