On 03/02/2019 23:58, Garret Wilson wrote:
> On 2/3/2019 3:34 PM, Mark Thomas wrote:
>> ...
>> There is an open question what Tomcat should do with %2F sequences.
> 
> "What Tomcat should do" in what context? The servlet and JAX-RS specs
> may be clear about whether decoded or "raw" APIs should be returned from
> the various API methods. But I guess the issue here is /not/ whether
> JAX-RS should interpret a path segment as decoded or encoded. The issue
> is whether Tomcat has already fiddled with the URI itself to /change
> what constitutes the path segment/.

The Servlet spec is not always clear whether a URI or path that is
returned should be:
- %nn decoded or not
- normalized or not

This gets interesting because if servlet mappings, filter mappings,
security constraints (and all other URI pattern / path based
configuration) don't use a canonical form (i.e. always decoded, always
normalized) then you open up all sorts of issues such as security
constraint bypass.

e.g. if
/private

is protected by security constraints, a request to

/foo/../private
or
/priv%61te

should be subject to the same constraints. Hence you need to normalise
and decode before mapping the request. The question then becomes what to
return for getServletPath(), getPathInfo() and friends?
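
To make the "decode, then normalise" step concrete, here is a minimal
sketch in plain Java. It is not Tomcat's actual mapping code; the class
and method names are made up for illustration, and it simplifies by
collapsing empty segments. It relies only on java.net.URI.getPath(),
which returns the path with %nn escapes decoded.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, for illustration only (not a Tomcat class).
final class CanonicalPath {

    // Decode %nn escapes first, then resolve "." and ".." segments, so that
    // an encoded ".." (%2e%2e) cannot slip past the normalisation step.
    static String canonicalize(String rawPath) {
        // URI.getPath() returns the path with escaped octets decoded.
        String decoded = URI.create(rawPath).getPath();

        Deque<String> segments = new ArrayDeque<>();
        for (String segment : decoded.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue;                 // skip empty and "." segments
            }
            if (segment.equals("..")) {
                segments.pollLast();      // step up one level, if possible
                continue;
            }
            segments.addLast(segment);
        }
        return "/" + String.join("/", segments);
    }
}
```

With this sketch, "/foo/../private" and "/priv%61te" both come out as
"/private", so a constraint on /private would apply to each. Note that
"/a%2Fb" also comes out as "/a/b", i.e. two segments rather than one,
which is exactly the %2F question under discussion.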

Tomcat takes the view that since only getRequestURI() states that the
return value is not decoded, all other return values are decoded. Tomcat
also normalises those values.

> Unless an EE specification says to muck around with the URI like this, I
> don't see how the server has any business changing the content of the
> URI. If the escaped path delimiters are decoded early on, then the
> downstream APIs will get different path segments altogether: some will
> have characters missing, and there will moreover be additional path
> segments than intended. It would seem to me that "trying to be helpful
> without being asked" in this case (as in most cases) would probably
> raise security issues, too.
> 
> Further downstream, whether each API method returns encoded or decoded
> information would depend on what the API contracts say, for better or
> for worse.
> 
> 
>> It
>> currently decodes them. Arguably, it should leave them alone.
> 
> That sounds right to me.

The problem is 15+ years of doing something else. Every time we make a
change to this sort of thing - even if it is 100% backed by specs that
have not changed during the lifetime of Tomcat - it ends up breaking
something for some users that rely on the incorrect behaviour.

I'm hoping to get clarification from the Servlet EG for the next release
of the Servlet spec.
https://github.com/eclipse-ee4j/servlet-api/issues/18

My current thinking (assuming no movement from the Servlet EG) is that
we add an option to Tomcat to control the %nn decoding of reserved
characters, with it defaulting to "decode" in 9.0.x (for backwards
compatibility) and changing to "not decode" for 10.0.x onwards.
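
As a rough sketch of what such an option could mean in practice, the
snippet below decodes %nn sequences but can be told to leave %2F
untouched. Everything here is an assumption for illustration: the class,
the enum and its values are hypothetical and are not an existing Tomcat
setting, and only "/" is treated as reserved to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Tomcat code or configuration.
final class SelectiveDecoder {

    // Hypothetical option values: decode %2F, or pass it through untouched.
    enum EncodedSlash { DECODE, PASS_THROUGH }

    // Decodes %nn escapes in an ASCII raw path. When mode is PASS_THROUGH,
    // an escaped "/" (%2F or %2f) is copied to the output unchanged.
    static String decode(String raw, EncodedSlash mode) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '%') {
                out.write(c);             // assumes ASCII input
                continue;
            }
            if (i + 2 >= raw.length()) {
                throw new IllegalArgumentException("Truncated escape at index " + i);
            }
            int hi = Character.digit(raw.charAt(i + 1), 16);
            int lo = Character.digit(raw.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Invalid escape at index " + i);
            }
            int value = hi * 16 + lo;
            if (value == '/' && mode == EncodedSlash.PASS_THROUGH) {
                // Keep the escape exactly as the client sent it.
                out.write(raw.charAt(i));
                out.write(raw.charAt(i + 1));
                out.write(raw.charAt(i + 2));
            } else {
                out.write(value);         // emit the decoded octet
            }
            i += 2;                       // skip the two hex digits
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

For "/a%2Fb/priv%61te", DECODE gives "/a/b/private" (an extra path
segment appears), while PASS_THROUGH gives "/a%2Fb/private" (the segment
boundaries the client sent are preserved).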

Mark
