On 5/9/23 8:01 PM, Eric Covener wrote:
> On Tue, May 9, 2023 at 11:51 AM Ruediger Pluem <rpl...@apache.org> wrote:
>>
>>
>>
>> On 5/9/23 4:33 PM, Yann Ylavic wrote:
>>> On Tue, May 9, 2023 at 2:10 PM Yann Ylavic <ylavic....@gmail.com> wrote:
>>>>
>>>> On Tue, May 9, 2023 at 12:55 PM Ruediger Pluem <rpl...@apache.org> wrote:
>>>>>
>>>>> On 5/9/23 12:16 PM, Eric Covener wrote:
>>>>>> Still getting feedback in the PR about breakage. Any thoughts on options 
>>>>>> here, like allowing spaces or encoding rather than failing?
>>>>>
>>>>> Allowing spaces is out of question for me as it creates an invalid 
>>>>> request and opens the door to response splitting. At best we
>>>>> could encode automatically. It is really a good question if we could not 
>>>>> make BCTLS the default.
>>>>
>>>> BCTLS by default looks fine to me, it changes the behaviour for users
>>>> that (used to) expect/handle decoded spaces in the query-string in
>>>> their scripts, but it's blocked now anyway..
>>>
>>> Hm, actually we don't really know where the backref is placed (either
>>> in the uri-path or in the query-string), so escaping unconditionally

Good point. Hence I think the only options left are that people fix their 
configurations or
that we scan r->args and encode all spaces. But this leaves the question open: 
Encode to what? '+' or %20?
Maybe we can put this job somehow in splitout_queryargs? It would have access 
to the RewriteRule flags and
we could decide based on if RULEFLAG_ESCAPENOPLUS is set or not.

Something along the following (partly pseudocode, not optimized):

Index: modules/mappers/mod_rewrite.c
===================================================================
--- modules/mappers/mod_rewrite.c       (revision 1909373)
+++ modules/mappers/mod_rewrite.c       (working copy)
@@ -854,6 +854,31 @@
            }
         }

+        if (global_option_encode_spaces) {
+            if (flags & RULEFLAG_ESCAPENOPLUS) {
+                char *p1, p2;
+
+                p1 = apr_palloc(r->pool, strlen(r->args) * 3 + 1);
+                for (p2 = r->args; *p2; p2++) {
+                    if (*p2 == ' ') {
+                        strcpy(p1, "%20");
+                        p1 += 3;
+                    }
+                    else {
+                        *p1++ = *p2;
+                    }
+                }
+                *p1 = '\0';
+            }
+            else {
+                for (char *pos = r->args; *pos; pos++) {
+                    if (*pos == ' ') {
+                        *pos = '+';
+                    }
+                }
+            }
+        }
+
         rewritelog(r, 3, NULL, "split uri=%s -> uri=%s, args=%s", olduri,
                    r->filename, r->args ? r->args : "<none>");
     }

>>> might lead to double-escaping in the uri-path. Maybe it's simpler to
>>> remove the check and leave it to mod_proxy only..
>>
>> Do we have an example config where this currently breaks that does not 
>> forward it to a proxy backend?
> 
>>From some of the GH activity, I think there is some mod_proxy_fcgi
> sending to FPM and PHP. In this case the query is in an environment
> variable.
> 
> Are spaces over proxy_http really that much of a smoking gun? Wouldn't
> any smuggling/splitting/desynch already be fatal if it's in the
> request line? It can't be terminated early if we only allow spaces.

You mean if there are just spaces and no control characters? I am not sure if 
you can do smuggling without \r and/or \n, but
the HTTP RFC is strict about what is allowed in the request URL and literal 
spaces are definitely not allowed.
If we allow them we send a non RFC compliant HTTP request to the backend. I 
think we must not do this.

Regards

Rüdiger

Reply via email to