It works!

I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ 😀
customer
This covers
 - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü
 - The / and \ slash characters are ASCII, but are encoded due to special
meaning in a URL
 - Cyrillic characters are 2 bytes: Ѐӿ
 - Chinese and Japanese characters are 3 bytes: 中さ
 - emoji characters are 4 bytes: 😀

A client can request a URL path with the following:
"/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer"

Then Tomcat processes the URL to the following and passes it into the
servlet.
"/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer"

So we are now able to save this test customer to the database and read it
back out again. I tried this with a few other paths that are a little
structurally different (with/without query string, escaped string in a
middle path segment instead of just the end) and it all worked as desired.

Thank you so much for working on this!
-James

On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote:

> James,
>
> I've added attributes (encodedReverseSolidusHandling and
> encodedSolidusHandling) to the Context to provide control of how the
> RequestDispatcher paths are processed.
>
> Snapshots built after 12.00 UTC today should include the new attributes.
>
> As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat 10
> and Tomcat 9 are in the queue. All should be complete in a couple of hours.
>
> Mark
>
>
> On 22/01/2025 09:30, Mark Thomas wrote:
> > This is going to be fun.
> >
> > The RequestDispatcher processing currently does not take account of
> > encodedSolidusHandling or encodedReverseSolidusHandling.
> > RequestDsipatcher URLs are processed using the defaults.
> >
> > I have a test case that demonstrates various forms of this problem. I am
> > currently working on a fix.
> >
> > Mark
> >
> >
> > On 21/01/2025 23:10, James Matlik wrote:
> >> It got further, but ran into another security check within the
> >> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35-
> >> SNAPSHOT
> >> version as that is closest to my current version, and easiest to test
> >> as a
> >> result. This is the block of code having trouble now (linking into
> >> 10.1.34):
> >> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/
> >> catalina/core/ApplicationContext.java#L386-L396
> >>
> >> This code:
> >> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC)
> >> 2. decodes it (e.g. /customers/customer/A/B\C)
> >> 3. normalizes the decoded url with the replaceBackSlash parameter
> >> hardcoded
> >> to true, thus forcing the decoded '\' character to become a '/'
> character
> >> (e.g. /customers/customer/A/B/C)
> >> 4. then it compares the decoded URL to the normalized URL, and if they
> >> are
> >> not equal, an IllegalArgumentException is created and a 500 response is
> >> returned.
> >>
> >> It seems that this area of the code would need to be aware of the
> >> encodedReverseSolidusHandling configuration as well.
> >>
> >> Thank you,
> >> James
> >>
> >> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org> wrote:
> >>
> >>> On 21/01/2025 14:15, James Matlik wrote:
> >>>> Hello Mark,
> >>>>
> >>>> Yes, I would be available to test a snapshot if this gets implemented.
> >>>
> >>> Latest (as I type this) Tomcat 12 build with the new feature:
> >>>
> >>> https://repository.apache.org/content/groups/snapshots/org/apache/
> >>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0-
> >>> M1-20250121.171115-315.tar.gz
> >>>
> >>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building.
> >>> Generally, you'll find them under:
> >>>
> >>>
> >>> https://repository.apache.org/content/groups/snapshots/org/apache/
> >>> tomcat/tomcat/
> >>>
> >>> Anything built after 1700 UTC today should have the fix. Make sure you
> >>> look for the snapshot for the current dev version for each release
> >>> branch.
> >>>
> >>> The Connector attribute is called encodedReverseSolidusHandling
> >>>
> >>> Let the list know how you get on.
> >>>
> >>> Mark
> >>>
> >>>
> >>>>
> >>>> -James
> >>>>
> >>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org> wrote:
> >>>>
> >>>>> On 18/01/2025 16:18, James Matlik wrote:
> >>>>>> I agree with everything you have said. As the config options stand
> >>> today,
> >>>>>> the allowBackslash seems to implement part of encodeSolidusHandling.
> >>>>>>
> >>>>>> While encodeSolidusHandling supports:
> >>>>>> * REJECT - Return 400 on encoded /
> >>>>>> * DECODE - Decodes the /
> >>>>>> * PASS_THROUGH - Passes the encoded / as is
> >>>>>>
> >>>>>> The allowBackslash supports two mostly analogous settings:
> >>>>>> * false - REJECT
> >>>>>> * true - DECODE, but it Decodes as / instead of \
> >>>>>
> >>>>> Not quite. Decoding is separate.
> >>>>>
> >>>>> Currently, %5C always gets decoded to \. allowBackslash then
> >>>>> determines
> >>>>> whether any \ (originally encoded or not) characters in the URL
> >>>>> will be
> >>>>> permitted and converted to / or the URL rejected.
> >>>>>
> >>>>> The decoding and conversion are partially independent. For example,
> >>>>> while it may be required that an encoded sequence %5C passed
> >>>>> through to
> >>>>> the application, whether an unencoded \ is rejected or converted
> >>>>> to / is
> >>>>> an independent choice.
> >>>>>
> >>>>>    From a security point of view, it is necessary to always convert
> >>>>> \ to
> >>> /
> >>>>> - at least on Windows anyway - else all sorts of nasty security
> things
> >>>>> could happen such as bypassing security constraints.
> >>>>>
> >>>>> We should be able to fix this for the next round of releases (early
> >>>>> Feb). I am currently looking at whether a new option to effectively
> >>>>> replace allowBackslash is sufficient or whether an new option is
> >>>>> needed
> >>>>> in addition to allowBackslash.
> >>>>>
> >>>>> My current thinking is that we should add
> encodeReverseSolidusHandling
> >>>>> with the same options as encodeSolidusHandling and leave the
> >>>>> convert/reject decision for unencoded \ to allowBackslash.
> >>>>>
> >>>>>> IMO, an ideal implementation would introduce
> >>> encodeReverseSolidusHandling
> >>>>>> to behave the same as encodeSolidusHandling, but maybe add a
> >>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the
> >>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or REJECT
> >>>>> (true
> >>>>>> vs false) for backward compatibility, but should probably be
> >>> deprecated.
> >>>>>> The code that normalizes the URL can remove any special handling of
> >>> the \
> >>>>>> character because all enforcement can be applied within UDecoder
> >>>>>> as is
> >>>>> done
> >>>>>> with the / character today.
> >>>>>
> >>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could
> >>>>> always be
> >>>>> added as an additional option later if necessary.
> >>>>>
> >>>>>> This would make the handling of \ more consistent with /, and I
> >>>>>> imagine
> >>>>> the
> >>>>>> security concerns between the two remain about equal. Though, like
> >>>>>> you
> >>>>>> said, others on the team have already thought deeply about this, so
> I
> >>> may
> >>>>>> be missing something important.
> >>>>>>
> >>>>>> If I were to pursue getting a change like this made to Tomcat, where
> >>>>> should
> >>>>>> I start?
> >>>>>
> >>>>> This is the way to start the process. A discussion on the users list
> >>>>> that clearly explains the problem and what a solution might look
> like.
> >>>>>
> >>>>>> I know this would be a slow process.
> >>>>>
> >>>>> There is a good reason for the feature, the implementation looks to
> be
> >>>>> backwards compatible and not too invasive, there isn't simple
> >>>>> extension
> >>>>> point you can use to implement and one or more committers are
> >>>>> interested
> >>>>> in solving the problem then there is a good chance it will be in the
> >>>>> next round of releases (early Feb).
> >>>>>
> >>>>>> Or, if I'm better off working around the core functionality, would
> >>>>>> you
> >>>>> have
> >>>>>> any suggestions on how? I see the UDecoder recently changed to
> >>>>>> support
> >>>>>> encoded % characters. I considered using a double encoded \ hack to
> >>>>>> effectively pass through, but that won't work now. Plus, I wasn't
> >>>>>> able
> >>> to
> >>>>>> figure out how to add custom logic before the URL decode and
> >>>>>> normalize
> >>> to
> >>>>>> add my work around encoding on the server side, as doing so in the
> >>> client
> >>>>>> side isn't feasible.
> >>>>>>
> >>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat
> >>>>> indefinitely.
> >>>>>
> >>>>> There isn't an easy (or any) extension point to implement this. It
> >>>>> would
> >>>>> have to be a custom Tomcat build.
> >>>>>
> >>>>> Are you able to test some snapshot builds if this gets implemented?
> >>>>>
> >>>>> Mark
> >>>>>
> >>>>>>
> >>>>>> Thanks, James
> >>>>>>
> >>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz <
> >>>>>> ch...@christopherschultz.net> wrote:
> >>>>>>
> >>>>>>> James,
> >>>>>>>
> >>>>>>> On 1/17/25 8:04 AM, James Matlik wrote:
> >>>>>>>> When I'm talking about path parameters, it is in the context of
> how
> >>>>> Open
> >>>>>>>> API/Swagger defined them:
> >>>>>>>> https://swagger.io/docs/specification/v3_0/describing-parameters/
> >>>>>>>
> >>>>>>> Okay, that helps clear things up. In the URL specification
> >>>>>>> (inherited
> >>> by
> >>>>>>> HTTP) defines them differently:
> >>>>>>>
> >>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3
> >>>>>>>
> >>>>>>> The term "path parameter" you are using is actually a "path
> segment"
> >>> in
> >>>>>>> URL/HTTP terms.
> >>>>>>>
> >>>>>>> The problem you are running into is that encoded slashes are often
> >>> used
> >>>>>>> to attack servers by performing path-traversals or avoiding
> security
> >>>>>>> constraints. Let's say you are on Windows and there is a security
> >>>>>>> constraint that says "only admin users can access files in
> >>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to physical-D:\docroot.
> >>>>>>> (Admittedly this is very contrived).
> >>>>>>>
> >>>>>>> The attacker says "well, %29 is a backslash, so lemmie just check
> to
> >>> see
> >>>>>>> if I can slip one past the security system" and requests
> >>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the server
> is
> >>>>>>> fooled, the attacker gets to request that resource in that
> >>>>>>> directory.
> >>>>>>>
> >>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \,
> attempting
> >>> to
> >>>>>>> "sanitize" it, and converting it to a / because that's the local
> >>>>>>> filesystem path separator. This makes security constraints much
> more
> >>>>>>> difficult to bypass.
> >>>>>>>
> >>>>>>> My naïve reading of your desire to have
> >>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \
> seems
> >>>>>>> reasonable if a misnomer. I say "naïve" because others on this team
> >>> have
> >>>>>>> spent countless hours of their lives mulling these things over and
> >>>>>>> getting the logic "just right".
> >>>>>>>
> >>>>>>> I say it's a misnomer because "solidus" means literally / and not
> \.
> >>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling?
> >>>>>>>
> >>>>>>> The encodeSolidusHandling configuration setting is there so we can
> >>> have
> >>>>>>> a very secure default-configuration that prevents attacks on naive
> >>>>>>> applications. In your case, you have an application that
> >>>>>>> (presumably)
> >>>>>>> knows exactly what it is doing, and therefore you have the option
> to
> >>>>>>> pass-through those solidus...es. Solidii? Whatever.
> >>>>>>>
> >>>>>>> I think that actually having a separate configuration setting for
> >>>>>>> antisolidii handling makes more sense than having that existing
> >>> setting
> >>>>>>> perform double-duty, since each one of those you allow to pass-
> >>>>>>> through
> >>>>>>> opens the door slightly wider to an attacker.
> >>>>>>>
> >>>>>>> -chris
> >>>>>>>
> >>>>>>>
> >>>>>>>
> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>> For additional commands, e-mail: users-h...@tomcat.apache.org
> >>>
> >>>
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>

Reply via email to