It works! I've been able to test with a customer name of: ÀËÌÑàëíñøü / \ Ѐӿ 中さ 😀 customer This covers - Latin-1 Supplement characters are 1 byte: ÀËÌÑàëíñøü - The / and \ slash characters are ASCII, but are encoded due to special meaning in a URL - Cyrillic characters are 2 bytes: Ѐӿ - Chinese and Japanese characters are 3 bytes: 中さ - emoji characters are 4 bytes: 😀
A client can request a URL path with the following: "/customers/customer/%C3%80%C3%8B%C3%8C%C3%91%C3%A0%C3%AB%C3%AD%C3%B1%C3%B8%C3%BC%20%2F%20%5C%20%D0%80%D3%BF%20%E4%B8%AD%E3%81%95%20%F0%9F%98%80%20customer" Then Tomcat processes the URL to the following and passes it into the servlet. "/customers/customer/ÀËÌÑàëíñøü %2F %5C Ѐӿ 中さ 😀 customer" So we are now able to save this test customer to the database and read it back out again. I tried this with a few other paths that are a little structurally different (with/without query string, escaped string in a middle path segment instead of just the end) and it all worked as desired. Thank you so much for working on this! -James On Thu, Jan 23, 2025 at 8:20 AM Mark Thomas <ma...@apache.org> wrote: > James, > > I've added attributes (encodedReverseSolidusHandling and > encodedSolidusHandling) to the Context to provide control of how the > RequestDispatcher paths are processed. > > Snapshots built after 12.00 UTC today should include the new attributes. > > As I type, Tomcat 12 is available, Tomcat 11 is building and Tomcat 10 > and Tomcat 9 are in the queue. All should be complete in a couple of hours. > > Mark > > > On 22/01/2025 09:30, Mark Thomas wrote: > > This is going to be fun. > > > > The RequestDispatcher processing currently does not take account of > > encodedSolidusHandling or encodedReverseSolidusHandling. > > RequestDsipatcher URLs are processed using the defaults. > > > > I have a test case that demonstrates various forms of this problem. I am > > currently working on a fix. > > > > Mark > > > > > > On 21/01/2025 23:10, James Matlik wrote: > >> It got further, but ran into another security check within the > >> org.apache.catalina.core.ApplicationContext. I'm using the 10.1.35- > >> SNAPSHOT > >> version as that is closest to my current version, and easiest to test > >> as a > >> result. This is the block of code having trouble now (linking into > >> 10.1.34): > >> https://github.com/apache/tomcat/blob/10.1.34/java/org/apache/ > >> catalina/core/ApplicationContext.java#L386-L396 > >> > >> This code: > >> 1. takes the URL (e.g. /customers/customer/A%2FB%5CC) > >> 2. decodes it (e.g. /customers/customer/A/B\C) > >> 3. normalizes the decoded url with the replaceBackSlash parameter > >> hardcoded > >> to true, thus forcing the decoded '\' character to become a '/' > character > >> (e.g. /customers/customer/A/B/C) > >> 4. then it compares the decoded URL to the normalized URL, and if they > >> are > >> not equal, an IllegalArgumentException is created and a 500 response is > >> returned. > >> > >> It seems that this area of the code would need to be aware of the > >> encodedReverseSolidusHandling configuration as well. > >> > >> Thank you, > >> James > >> > >> On Tue, Jan 21, 2025 at 1:20 PM Mark Thomas <ma...@apache.org> wrote: > >> > >>> On 21/01/2025 14:15, James Matlik wrote: > >>>> Hello Mark, > >>>> > >>>> Yes, I would be available to test a snapshot if this gets implemented. > >>> > >>> Latest (as I type this) Tomcat 12 build with the new feature: > >>> > >>> https://repository.apache.org/content/groups/snapshots/org/apache/ > >>> tomcat/tomcat/12.0.0-M1-SNAPSHOT/tomcat-12.0.0- > >>> M1-20250121.171115-315.tar.gz > >>> > >>> The snapshots for the 11.0.x, 10.1.x and 9.0.x are still building. > >>> Generally, you'll find them under: > >>> > >>> > >>> https://repository.apache.org/content/groups/snapshots/org/apache/ > >>> tomcat/tomcat/ > >>> > >>> Anything built after 1700 UTC today should have the fix. Make sure you > >>> look for the snapshot for the current dev version for each release > >>> branch. > >>> > >>> The Connector attribute is called encodedReverseSolidusHandling > >>> > >>> Let the list know how you get on. > >>> > >>> Mark > >>> > >>> > >>>> > >>>> -James > >>>> > >>>> On Tue, Jan 21, 2025 at 8:17 AM Mark Thomas <ma...@apache.org> wrote: > >>>> > >>>>> On 18/01/2025 16:18, James Matlik wrote: > >>>>>> I agree with everything you have said. As the config options stand > >>> today, > >>>>>> the allowBackslash seems to implement part of encodeSolidusHandling. > >>>>>> > >>>>>> While encodeSolidusHandling supports: > >>>>>> * REJECT - Return 400 on encoded / > >>>>>> * DECODE - Decodes the / > >>>>>> * PASS_THROUGH - Passes the encoded / as is > >>>>>> > >>>>>> The allowBackslash supports two mostly analogous settings: > >>>>>> * false - REJECT > >>>>>> * true - DECODE, but it Decodes as / instead of \ > >>>>> > >>>>> Not quite. Decoding is separate. > >>>>> > >>>>> Currently, %5C always gets decoded to \. allowBackslash then > >>>>> determines > >>>>> whether any \ (originally encoded or not) characters in the URL > >>>>> will be > >>>>> permitted and converted to / or the URL rejected. > >>>>> > >>>>> The decoding and conversion are partially independent. For example, > >>>>> while it may be required that an encoded sequence %5C passed > >>>>> through to > >>>>> the application, whether an unencoded \ is rejected or converted > >>>>> to / is > >>>>> an independent choice. > >>>>> > >>>>> From a security point of view, it is necessary to always convert > >>>>> \ to > >>> / > >>>>> - at least on Windows anyway - else all sorts of nasty security > things > >>>>> could happen such as bypassing security constraints. > >>>>> > >>>>> We should be able to fix this for the next round of releases (early > >>>>> Feb). I am currently looking at whether a new option to effectively > >>>>> replace allowBackslash is sufficient or whether an new option is > >>>>> needed > >>>>> in addition to allowBackslash. > >>>>> > >>>>> My current thinking is that we should add > encodeReverseSolidusHandling > >>>>> with the same options as encodeSolidusHandling and leave the > >>>>> convert/reject decision for unencoded \ to allowBackslash. > >>>>> > >>>>>> IMO, an ideal implementation would introduce > >>> encodeReverseSolidusHandling > >>>>>> to behave the same as encodeSolidusHandling, but maybe add a > >>>>>> DECODE_AS_SOLIDUS. Then the allowBackslash setter would set the > >>>>>> encodeReverseSolidusHandling variable to DECODE_AS_SOLIDUS or REJECT > >>>>> (true > >>>>>> vs false) for backward compatibility, but should probably be > >>> deprecated. > >>>>>> The code that normalizes the URL can remove any special handling of > >>> the \ > >>>>>> character because all enforcement can be applied within UDecoder > >>>>>> as is > >>>>> done > >>>>>> with the / character today. > >>>>> > >>>>> I don't think DECODE_AS_SOLIDUS is necessary although it could > >>>>> always be > >>>>> added as an additional option later if necessary. > >>>>> > >>>>>> This would make the handling of \ more consistent with /, and I > >>>>>> imagine > >>>>> the > >>>>>> security concerns between the two remain about equal. Though, like > >>>>>> you > >>>>>> said, others on the team have already thought deeply about this, so > I > >>> may > >>>>>> be missing something important. > >>>>>> > >>>>>> If I were to pursue getting a change like this made to Tomcat, where > >>>>> should > >>>>>> I start? > >>>>> > >>>>> This is the way to start the process. A discussion on the users list > >>>>> that clearly explains the problem and what a solution might look > like. > >>>>> > >>>>>> I know this would be a slow process. > >>>>> > >>>>> There is a good reason for the feature, the implementation looks to > be > >>>>> backwards compatible and not too invasive, there isn't simple > >>>>> extension > >>>>> point you can use to implement and one or more committers are > >>>>> interested > >>>>> in solving the problem then there is a good chance it will be in the > >>>>> next round of releases (early Feb). > >>>>> > >>>>>> Or, if I'm better off working around the core functionality, would > >>>>>> you > >>>>> have > >>>>>> any suggestions on how? I see the UDecoder recently changed to > >>>>>> support > >>>>>> encoded % characters. I considered using a double encoded \ hack to > >>>>>> effectively pass through, but that won't work now. Plus, I wasn't > >>>>>> able > >>> to > >>>>>> figure out how to add custom logic before the URL decode and > >>>>>> normalize > >>> to > >>>>>> add my work around encoding on the server side, as doing so in the > >>> client > >>>>>> side isn't feasible. > >>>>>> > >>>>>> Ideally, I wouldn't need to maintain a custom build of Tomcat > >>>>> indefinitely. > >>>>> > >>>>> There isn't an easy (or any) extension point to implement this. It > >>>>> would > >>>>> have to be a custom Tomcat build. > >>>>> > >>>>> Are you able to test some snapshot builds if this gets implemented? > >>>>> > >>>>> Mark > >>>>> > >>>>>> > >>>>>> Thanks, James > >>>>>> > >>>>>> On Fri, Jan 17, 2025, 10:00 AM Christopher Schultz < > >>>>>> ch...@christopherschultz.net> wrote: > >>>>>> > >>>>>>> James, > >>>>>>> > >>>>>>> On 1/17/25 8:04 AM, James Matlik wrote: > >>>>>>>> When I'm talking about path parameters, it is in the context of > how > >>>>> Open > >>>>>>>> API/Swagger defined them: > >>>>>>>> https://swagger.io/docs/specification/v3_0/describing-parameters/ > >>>>>>> > >>>>>>> Okay, that helps clear things up. In the URL specification > >>>>>>> (inherited > >>> by > >>>>>>> HTTP) defines them differently: > >>>>>>> > >>>>>>> https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3 > >>>>>>> > >>>>>>> The term "path parameter" you are using is actually a "path > segment" > >>> in > >>>>>>> URL/HTTP terms. > >>>>>>> > >>>>>>> The problem you are running into is that encoded slashes are often > >>> used > >>>>>>> to attack servers by performing path-traversals or avoiding > security > >>>>>>> constraints. Let's say you are on Windows and there is a security > >>>>>>> constraint that says "only admin users can access files in > >>>>>>> D:\docroot\(ON\QC)". You have mapped URL-/ to physical-D:\docroot. > >>>>>>> (Admittedly this is very contrived). > >>>>>>> > >>>>>>> The attacker says "well, %29 is a backslash, so lemmie just check > to > >>> see > >>>>>>> if I can slip one past the security system" and requests > >>>>>>> http://localhost:8080/group/%28ON%5CQC%29%20LOCAL. If the server > is > >>>>>>> fooled, the attacker gets to request that resource in that > >>>>>>> directory. > >>>>>>> > >>>>>>> My guess is that Tomcat's "URL cleaner" is seeing that \, > attempting > >>> to > >>>>>>> "sanitize" it, and converting it to a / because that's the local > >>>>>>> filesystem path separator. This makes security constraints much > more > >>>>>>> difficult to bypass. > >>>>>>> > >>>>>>> My naïve reading of your desire to have > >>>>>>> encodeSolidusHandling=PASS_THROUGH behave the same for / and \ > seems > >>>>>>> reasonable if a misnomer. I say "naïve" because others on this team > >>> have > >>>>>>> spent countless hours of their lives mulling these things over and > >>>>>>> getting the logic "just right". > >>>>>>> > >>>>>>> I say it's a misnomer because "solidus" means literally / and not > \. > >>>>>>> encodeReverseSolidusHandling? encodeAntiSolidusHandling? > >>>>>>> > >>>>>>> The encodeSolidusHandling configuration setting is there so we can > >>> have > >>>>>>> a very secure default-configuration that prevents attacks on naive > >>>>>>> applications. In your case, you have an application that > >>>>>>> (presumably) > >>>>>>> knows exactly what it is doing, and therefore you have the option > to > >>>>>>> pass-through those solidus...es. Solidii? Whatever. > >>>>>>> > >>>>>>> I think that actually having a separate configuration setting for > >>>>>>> antisolidii handling makes more sense than having that existing > >>> setting > >>>>>>> perform double-duty, since each one of those you allow to pass- > >>>>>>> through > >>>>>>> opens the door slightly wider to an attacker. > >>>>>>> > >>>>>>> -chris > >>>>>>> > >>>>>>> > >>>>>>> > --------------------------------------------------------------------- > >>>>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> --------------------------------------------------------------------- > >>>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>>>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>>>> > >>>>> > >>>> > >>> > >>> > >>> --------------------------------------------------------------------- > >>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > >>> For additional commands, e-mail: users-h...@tomcat.apache.org > >>> > >>> > >> > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > > For additional commands, e-mail: users-h...@tomcat.apache.org > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org > For additional commands, e-mail: users-h...@tomcat.apache.org > >