Maxim and Chris,

LOL! Yes, I am definitely an ignorant westerner. Thanks for pointing out my
mistake. I'll be updating my unit test comments accordingly.

-James

On Fri, Jan 24, 2025, 9:57 PM Maxim Solodovnik <solomax...@gmail.com> wrote:

> Chris,
>
> from mobile (sorry for typos ;)
>
>
> On Sat, Jan 25, 2025, 05:35 Christopher Schultz <
> ch...@christopherschultz.net> wrote:
>
> > Maxim,
> >
> > On 1/23/25 9:31 PM, Maxim Solodovnik wrote:
> > > from mobile (sorry for typos ;)
> > >
> > >
> > > On Thu, Jan 23, 2025, 23:00 James Matlik <james.mat...@gmail.com>
> wrote:
> > >
> > >> It works!
> > >>
> > >> I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ
> 😀
> > >> customer
> > >> This covers
> > >>   - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü
> > >>   - The / and \ slash characters are ASCII, but are encoded due to
> > special
> > >> meaning in a URL
> > >>   - Cyrillic characters are 2 bytes: Ѐӿ
> > >>
> > >
> > > These two are not Cyrillic :))
> >
> > Unicode says the lowercase version of Ѐ[1] is Cyrillic[2], and we are
> > ignorant westerners :)
> >
> > Also kha seems definitely to be Cyrillic[3].
> >
>
> My bad :)
> These two are not in russian, ukrainian or bulgarian
>
>
> >  > These are:  Вау :))
> >
> > LOL
> >
>
> More extraordinary are: ЩЖЦЪ
> I guess :)
>
>
> > -chris
> >
> > [1] https://www.compart.com/en/unicode/U+00C8
> > [2] https://www.compart.com/en/unicode/U+0450
> > [3] https://www.compart.com/en/unicode/U+04FF
> >
> > >
> > >   - Chinese and Japanese characters are 3 bytes: 中さ
> > >>   - emoji characters are 4 bytes: 😀
> > >>
> > >> A client can request a URL path with the following:
> > >>
> > >>
> >
> "/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer"
> > >>
> > >> Then Tomcat processes the URL to the following and passes it into the
> > >> servlet.
> > >> "/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer"
> > >>
> > >> So we are now able to save this test customer to the database and read
> > it
> > >> back out again. I tried this with a few other paths that are a little
> > >> structurally different (with/without query string, escaped string in a
> > >> middle path segment instead of just the end) and it all worked as
> > desired.
> > >>
> > >> Thank you so much for working on this!
> > >> -James
> > >>
> > >> On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote:
> > >>
> > >>> James,
> > >>>
> > >>> I've added attributes (encodedReverseSolidusHandling and
> > >>> encodedSolidusHandling) to the Context to provide control of how the
> > >>> RequestDispatcher paths are processed.
> > >>>
> > >>> Snapshots built after 12.00 UTC today should include the new
> > attributes.
> > >>>
> > >>> As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat
> 10
> > >>> and Tomcat 9 are in the queue. All should be complete in a couple of
> > >> hours.
> > >>>
> > >>> Mark
> > >>>
> > >>>
> > >>> On 22/01/2025 09:30, Mark Thomas wrote:
> > >>>> This is going to be fun.
> > >>>>
> > >>>> The RequestDispatcher processing currently does not take account of
> > >>>> encodedSolidusHandling or encodedReverseSolidusHandling.
> > >>>> RequestDsipatcher URLs are processed using the defaults.
> > >>>>
> > >>>> I have a test case that demonstrates various forms of this problem.
> I
> > >> am
> > >>>> currently working on a fix.
> > >>>>
> > >>>> Mark
> > >>>>
> > >>>>
> > >>>> On 21/01/2025 23:10, James Matlik wrote:
> > >>>>> It got further, but ran into another security check within the
> > >>>>> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35-
> > >>>>> SNAPSHOT
> > >>>>> version as that is closest to my current version, and easiest to
> test
> > >>>>> as a
> > >>>>> result. This is the block of code having trouble now (linking into
> > >>>>> 10.1.34):
> > >>>>> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/
> > >>>>> catalina/core/ApplicationContext.java#L386-L396
> > >>>>>
> > >>>>> This code:
> > >>>>> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC)
> > >>>>> 2. decodes it (e.g. /customers/customer/A/B\C)
> > >>>>> 3. normalizes the decoded url with the replaceBackSlash parameter
> > >>>>> hardcoded
> > >>>>> to true, thus forcing the decoded '\' character to become a '/'
> > >>> character
> > >>>>> (e.g. /customers/customer/A/B/C)
> > >>>>> 4. then it compares the decoded URL to the normalized URL, and if
> > they
> > >>>>> are
> > >>>>> not equal, an IllegalArgumentException is created and a 500
> response
> > >> is
> > >>>>> returned.
> > >>>>>
> > >>>>> It seems that this area of the code would need to be aware of the
> > >>>>> encodedReverseSolidusHandling configuration as well.
> > >>>>>
> > >>>>> Thank you,
> > >>>>> James
> > >>>>>
> > >>>>> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org>
> > wrote:
> > >>>>>
> > >>>>>> On 21/01/2025 14:15, James Matlik wrote:
> > >>>>>>> Hello Mark,
> > >>>>>>>
> > >>>>>>> Yes, I would be available to test a snapshot if this gets
> > >> implemented.
> > >>>>>>
> > >>>>>> Latest (as I type this) Tomcat 12 build with the new feature:
> > >>>>>>
> > >>>>>>
> https://repository.apache.org/content/groups/snapshots/org/apache/
> > >>>>>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0-
> > >>>>>> M1-20250121.171115-315.tar.gz
> > >>>>>>
> > >>>>>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building.
> > >>>>>> Generally, you'll find them under:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> https://repository.apache.org/content/groups/snapshots/org/apache/
> > >>>>>> tomcat/tomcat/
> > >>>>>>
> > >>>>>> Anything built after 1700 UTC today should have the fix. Make sure
> > >> you
> > >>>>>> look for the snapshot for the current dev version for each release
> > >>>>>> branch.
> > >>>>>>
> > >>>>>> The Connector attribute is called encodedReverseSolidusHandling
> > >>>>>>
> > >>>>>> Let the list know how you get on.
> > >>>>>>
> > >>>>>> Mark
> > >>>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>> -James
> > >>>>>>>
> > >>>>>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org>
> > >> wrote:
> > >>>>>>>
> > >>>>>>>> On 18/01/2025 16:18, James Matlik wrote:
> > >>>>>>>>> I agree with everything you have said. As the config options
> > stand
> > >>>>>> today,
> > >>>>>>>>> the allowBackslash seems to implement part of
> > >> encodeSolidusHandling.
> > >>>>>>>>>
> > >>>>>>>>> While encodeSolidusHandling supports:
> > >>>>>>>>> * REJECT - Return 400 on encoded /
> > >>>>>>>>> * DECODE - Decodes the /
> > >>>>>>>>> * PASS_THROUGH - Passes the encoded / as is
> > >>>>>>>>>
> > >>>>>>>>> The allowBackslash supports two mostly analogous settings:
> > >>>>>>>>> * false - REJECT
> > >>>>>>>>> * true - DECODE, but it Decodes as / instead of \
> > >>>>>>>>
> > >>>>>>>> Not quite. Decoding is separate.
> > >>>>>>>>
> > >>>>>>>> Currently, %5C always gets decoded to \. allowBackslash then
> > >>>>>>>> determines
> > >>>>>>>> whether any \ (originally encoded or not) characters in the URL
> > >>>>>>>> will be
> > >>>>>>>> permitted and converted to / or the URL rejected.
> > >>>>>>>>
> > >>>>>>>> The decoding and conversion are partially independent. For
> > example,
> > >>>>>>>> while it may be required that an encoded sequence %5C passed
> > >>>>>>>> through to
> > >>>>>>>> the application, whether an unencoded \ is rejected or converted
> > >>>>>>>> to / is
> > >>>>>>>> an independent choice.
> > >>>>>>>>
> > >>>>>>>>     From a security point of view, it is necessary to always
> > convert
> > >>>>>>>> \ to
> > >>>>>> /
> > >>>>>>>> - at least on Windows anyway - else all sorts of nasty security
> > >>> things
> > >>>>>>>> could happen such as bypassing security constraints.
> > >>>>>>>>
> > >>>>>>>> We should be able to fix this for the next round of releases
> > (early
> > >>>>>>>> Feb). I am currently looking at whether a new option to
> > effectively
> > >>>>>>>> replace allowBackslash is sufficient or whether an new option is
> > >>>>>>>> needed
> > >>>>>>>> in addition to allowBackslash.
> > >>>>>>>>
> > >>>>>>>> My current thinking is that we should add
> > >>> encodeReverseSolidusHandling
> > >>>>>>>> with the same options as encodeSolidusHandling and leave the
> > >>>>>>>> convert/reject decision for unencoded \ to allowBackslash.
> > >>>>>>>>
> > >>>>>>>>> IMO, an ideal implementation would introduce
> > >>>>>> encodeReverseSolidusHandling
> > >>>>>>>>> to behave the same as encodeSolidusHandling, but maybe add a
> > >>>>>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the
> > >>>>>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or
> > >> REJECT
> > >>>>>>>> (true
> > >>>>>>>>> vs false) for backward compatibility, but should probably be
> > >>>>>> deprecated.
> > >>>>>>>>> The code that normalizes the URL can remove any special
> handling
> > >> of
> > >>>>>> the \
> > >>>>>>>>> character because all enforcement can be applied within
> UDecoder
> > >>>>>>>>> as is
> > >>>>>>>> done
> > >>>>>>>>> with the / character today.
> > >>>>>>>>
> > >>>>>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could
> > >>>>>>>> always be
> > >>>>>>>> added as an additional option later if necessary.
> > >>>>>>>>
> > >>>>>>>>> This would make the handling of \ more consistent with /, and I
> > >>>>>>>>> imagine
> > >>>>>>>> the
> > >>>>>>>>> security concerns between the two remain about equal. Though,
> > like
> > >>>>>>>>> you
> > >>>>>>>>> said, others on the team have already thought deeply about
> this,
> > >> so
> > >>> I
> > >>>>>> may
> > >>>>>>>>> be missing something important.
> > >>>>>>>>>
> > >>>>>>>>> If I were to pursue getting a change like this made to Tomcat,
> > >> where
> > >>>>>>>> should
> > >>>>>>>>> I start?
> > >>>>>>>>
> > >>>>>>>> This is the way to start the process. A discussion on the users
> > >> list
> > >>>>>>>> that clearly explains the problem and what a solution might look
> > >>> like.
> > >>>>>>>>
> > >>>>>>>>> I know this would be a slow process.
> > >>>>>>>>
> > >>>>>>>> There is a good reason for the feature, the implementation looks
> > to
> > >>> be
> > >>>>>>>> backwards compatible and not too invasive, there isn't simple
> > >>>>>>>> extension
> > >>>>>>>> point you can use to implement and one or more committers are
> > >>>>>>>> interested
> > >>>>>>>> in solving the problem then there is a good chance it will be in
> > >> the
> > >>>>>>>> next round of releases (early Feb).
> > >>>>>>>>
> > >>>>>>>>> Or, if I'm better off working around the core functionality,
> > would
> > >>>>>>>>> you
> > >>>>>>>> have
> > >>>>>>>>> any suggestions on how? I see the UDecoder recently changed to
> > >>>>>>>>> support
> > >>>>>>>>> encoded % characters. I considered using a double encoded \
> hack
> > >> to
> > >>>>>>>>> effectively pass through, but that won't work now. Plus, I
> wasn't
> > >>>>>>>>> able
> > >>>>>> to
> > >>>>>>>>> figure out how to add custom logic before the URL decode and
> > >>>>>>>>> normalize
> > >>>>>> to
> > >>>>>>>>> add my work around encoding on the server side, as doing so in
> > the
> > >>>>>> client
> > >>>>>>>>> side isn't feasible.
> > >>>>>>>>>
> > >>>>>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat
> > >>>>>>>> indefinitely.
> > >>>>>>>>
> > >>>>>>>> There isn't an easy (or any) extension point to implement this.
> It
> > >>>>>>>> would
> > >>>>>>>> have to be a custom Tomcat build.
> > >>>>>>>>
> > >>>>>>>> Are you able to test some snapshot builds if this gets
> > implemented?
> > >>>>>>>>
> > >>>>>>>> Mark
> > >>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Thanks, James
> > >>>>>>>>>
> > >>>>>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz <
> > >>>>>>>>> ch...@christopherschultz.net> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> James,
> > >>>>>>>>>>
> > >>>>>>>>>> On 1/17/25 8:04 AM, James Matlik wrote:
> > >>>>>>>>>>> When I'm talking about path parameters, it is in the context
> of
> > >>> how
> > >>>>>>>> Open
> > >>>>>>>>>>> API/Swagger defined them:
> > >>>>>>>>>>>
> > >> https://swagger.io/docs/specification/v3_0/describing-parameters/
> > >>>>>>>>>>
> > >>>>>>>>>> Okay, that helps clear things up. In the URL specification
> > >>>>>>>>>> (inherited
> > >>>>>> by
> > >>>>>>>>>> HTTP) defines them differently:
> > >>>>>>>>>>
> > >>>>>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3
> > >>>>>>>>>>
> > >>>>>>>>>> The term "path parameter" you are using is actually a "path
> > >>> segment"
> > >>>>>> in
> > >>>>>>>>>> URL/HTTP terms.
> > >>>>>>>>>>
> > >>>>>>>>>> The problem you are running into is that encoded slashes are
> > >> often
> > >>>>>> used
> > >>>>>>>>>> to attack servers by performing path-traversals or avoiding
> > >>> security
> > >>>>>>>>>> constraints. Let's say you are on Windows and there is a
> > security
> > >>>>>>>>>> constraint that says "only admin users can access files in
> > >>>>>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to
> > >> physical-D:\docroot.
> > >>>>>>>>>> (Admittedly this is very contrived).
> > >>>>>>>>>>
> > >>>>>>>>>> The attacker says "well, %29 is a backslash, so lemmie just
> > check
> > >>> to
> > >>>>>> see
> > >>>>>>>>>> if I can slip one past the security system" and requests
> > >>>>>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the
> > server
> > >>> is
> > >>>>>>>>>> fooled, the attacker gets to request that resource in that
> > >>>>>>>>>> directory.
> > >>>>>>>>>>
> > >>>>>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \,
> > >>> attempting
> > >>>>>> to
> > >>>>>>>>>> "sanitize" it, and converting it to a / because that's the
> local
> > >>>>>>>>>> filesystem path separator. This makes security constraints
> much
> > >>> more
> > >>>>>>>>>> difficult to bypass.
> > >>>>>>>>>>
> > >>>>>>>>>> My naïve reading of your desire to have
> > >>>>>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \
> > >>> seems
> > >>>>>>>>>> reasonable if a misnomer. I say "naïve" because others on this
> > >> team
> > >>>>>> have
> > >>>>>>>>>> spent countless hours of their lives mulling these things over
> > >> and
> > >>>>>>>>>> getting the logic "just right".
> > >>>>>>>>>>
> > >>>>>>>>>> I say it's a misnomer because "solidus" means literally / and
> > not
> > >>> \.
> > >>>>>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling?
> > >>>>>>>>>>
> > >>>>>>>>>> The encodeSolidusHandling configuration setting is there so we
> > >> can
> > >>>>>> have
> > >>>>>>>>>> a very secure default-configuration that prevents attacks on
> > >> naive
> > >>>>>>>>>> applications. In your case, you have an application that
> > >>>>>>>>>> (presumably)
> > >>>>>>>>>> knows exactly what it is doing, and therefore you have the
> > option
> > >>> to
> > >>>>>>>>>> pass-through those solidus...es. Solidii? Whatever.
> > >>>>>>>>>>
> > >>>>>>>>>> I think that actually having a separate configuration setting
> > for
> > >>>>>>>>>> antisolidii handling makes more sense than having that
> existing
> > >>>>>> setting
> > >>>>>>>>>> perform double-duty, since each one of those you allow to
> pass-
> > >>>>>>>>>> through
> > >>>>>>>>>> opens the door slightly wider to an attacker.
> > >>>>>>>>>>
> > >>>>>>>>>> -chris
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>> ---------------------------------------------------------------------
> > >>>>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > >>>>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >> ---------------------------------------------------------------------
> > >>>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > >>>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > ---------------------------------------------------------------------
> > >>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > >>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> ---------------------------------------------------------------------
> > >>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > >>>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > >>>>
> > >>>
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > >>> For additional commands, e-mail: users-h...@tomcat.apache.org
> > >>>
> > >>>
> > >>
> > >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
> >
>

Reply via email to