On 03/02/2019 23:58, Garret Wilson wrote:
> On 2/3/2019 3:34 PM, Mark Thomas wrote:
>> ...
>> There is an open question what Tomcat should do with %2F sequences.
>
> "What Tomcat should do" in what context? The servlet and JAX-RS specs
> may be clear about whether decoded or "raw" APIs should be returned from
> the various API methods. But I guess the issue here is /not/ whether
> JAX-RS should interpret a path segment as decoded or encoded. The issue
> is whether Tomcat has already fiddled with the URI itself to /change
> what constitutes the path segment/.

The Servlet spec is not always clear whether a URI or path that is
returned should be:
- %nn decoded or not
- normalized or not

This gets interesting because if servlet mappings, filter mappings,
security constraints and all other URI pattern / path based
configuration don't use a canonical form (i.e. always decoded, always
normalized) then you open up all sorts of issues such as security
constraint bypass. E.g. if /private is protected by security
constraints, a request to /foo/../private or /priv%61te should be
subject to the same constraints. Hence you need to normalise and decode
before mapping the request.

The question then becomes what to return for getServletPath(),
getPathInfo() and friends. Tomcat takes the view that since only
getRequestURI() states that the return value is not decoded, all other
return values are decoded. Tomcat also normalises those values.

> Unless an EE specification says to muck around with the URI like this, I
> don't see how the server has any business changing the content of the
> URI. If the escaped path delimiters are decoded early on, then the
> downstream APIs will get different path segments altogether: some will
> have characters missing, and there will moreover be additional path
> segments than intended. It would seem to be that "trying to be helpful
> without being asked" in this case (as in most cases) would probably
> raise security issues, too.
>
> Further downstream, whether each API method returns encoded or decoded
> information would depend on what the API contracts say, for better or
> for worse.
>
>
>> It
>> currently decodes them. Arguably, it should leave them alone.
>
> That sounds right to me.

The problem is 15+ years of doing something else. Every time we make a
change to this sort of thing - even if it is 100% backed by specs that
have not changed during the lifetime of Tomcat - it ends up breaking
something for some users that rely on the incorrect behaviour.
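
To make the two points above concrete (canonicalise before mapping, and
the effect of decoding %2F early), here is a minimal sketch in Java. It
is not Tomcat's implementation: the class PathCanonicalizer and all of
its methods are hypothetical names invented for this illustration, and
the /private check only stands in for a real security constraint.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not Tomcat code.
public class PathCanonicalizer {

    // Decode %nn sequences as UTF-8 bytes. Malformed sequences are kept as-is.
    public static String percentDecode(String s) {
        byte[] in = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < in.length; i++) {
            int hi = (in[i] == '%' && i + 2 < in.length)
                    ? Character.digit((char) in[i + 1], 16) : -1;
            int lo = (hi >= 0) ? Character.digit((char) in[i + 2], 16) : -1;
            if (lo >= 0) {
                out.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else {
                out.write(in[i]);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Split on "/" and drop empty segments.
    public static List<String> split(String path) {
        List<String> segments = new ArrayList<>();
        for (String s : path.split("/")) {
            if (!s.isEmpty()) {
                segments.add(s);
            }
        }
        return segments;
    }

    // Canonical form: decode first, then resolve "." and ".." segments.
    public static String canonicalize(String rawPath) {
        Deque<String> stack = new ArrayDeque<>();
        for (String segment : split(percentDecode(rawPath))) {
            if (segment.equals(".")) {
                continue;
            }
            if (segment.equals("..")) {
                stack.pollLast(); // no-op when already at the root
            } else {
                stack.addLast(segment);
            }
        }
        return "/" + String.join("/", stack);
    }

    // Stand-in for a security constraint on /private (an assumption).
    public static boolean isProtected(String rawPath) {
        String path = canonicalize(rawPath);
        return path.equals("/private") || path.startsWith("/private/");
    }

    // Decoding the whole path before splitting: %2F becomes a delimiter.
    public static List<String> segmentsDecodeFirst(String rawPath) {
        return split(percentDecode(rawPath));
    }

    // Splitting the raw path first: %2F stays inside its segment.
    public static List<String> segmentsDecodeLast(String rawPath) {
        List<String> segments = new ArrayList<>();
        for (String s : split(rawPath)) {
            segments.add(percentDecode(s));
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(isProtected("/foo/../private"));   // true
        System.out.println(isProtected("/priv%61te"));        // true
        System.out.println(segmentsDecodeFirst("/a%2Fb/c"));  // [a, b, c]
        System.out.println(segmentsDecodeLast("/a%2Fb/c"));   // [a/b, c]
    }
}
```

Both /foo/../private and /priv%61te end up under the same check only
because the comparison is done on the canonical form. The last two calls
show the %2F problem: decoding before splitting turns two path segments
into three.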

I'm hoping to get clarification from the Servlet EG for the next release
of the Servlet spec.

https://github.com/eclipse-ee4j/servlet-api/issues/18

My current thinking (assuming no movement from the Servlet EG) is that
we add an option to Tomcat to control the %nn decoding of reserved
characters, with it defaulting to "decode" in 9.0.x (for backwards
compatibility) and changing to "not decode" for 10.0.x onwards.

Mark