On Tue, May 9, 2023 at 3:14 PM Ruediger Pluem <rpl...@apache.org> wrote:
>
>
>
> On 5/9/23 8:01 PM, Eric Covener wrote:
> > On Tue, May 9, 2023 at 11:51 AM Ruediger Pluem <rpl...@apache.org> wrote:
> >>
> >>
> >>
> >> On 5/9/23 4:33 PM, Yann Ylavic wrote:
> >>> On Tue, May 9, 2023 at 2:10 PM Yann Ylavic <ylavic....@gmail.com> wrote:
> >>>>
> >>>> On Tue, May 9, 2023 at 12:55 PM Ruediger Pluem <rpl...@apache.org> wrote:
> >>>>>
> >>>>> On 5/9/23 12:16 PM, Eric Covener wrote:
> >>>>>> Still getting feedback in the PR about breakage. Any thoughts on 
> >>>>>> options here, like allowing spaces or encoding rather than failing?
> >>>>>
> >>>>> Allowing spaces is out of question for me as it creates an invalid 
> >>>>> request and opens the door to response splitting. At best we
> >>>>> could encode automatically. It is really a good question if we could 
> >>>>> not make BCTLS the default.
> >>>>
> >>>> BCTLS by default looks fine to me, it changes the behaviour for users
> >>>> that (used to) expect/handle decoded spaces in the query-string in
> >>>> their scripts, but it's blocked now anyway..
> >>>
> >>> Hm, actually we don't really know where the backref is placed (either
> >>> in the uri-path or in the query-string), so escaping unconditionally
>
> Good point. Hence I think the only options left are that people fix their 
> configurations or
> that we scan r->args and encode all spaces. But this leaves the question 
> open: Encode to what? '+' or %20?
> Maybe we can put this job somehow in splitout_queryargs? It would have access 
> to the RewriteRule flags and
> we could decide based on if RULEFLAG_ESCAPENOPLUS is set or not.
>
> Something along the following (partly pseudocode, not optimized):
>
> Index: modules/mappers/mod_rewrite.c
> ===================================================================
> --- modules/mappers/mod_rewrite.c       (revision 1909373)
> +++ modules/mappers/mod_rewrite.c       (working copy)
> @@ -854,6 +854,31 @@
>             }
>          }
>
> +        if (global_option_encode_spaces) {
> +            if (flags & RULEFLAG_ESCAPENOPLUS) {
> +                char *p1, p2;
> +
> +                p1 = apr_palloc(r->pool, strlen(r->args) * 3 + 1);
> +                for (p2 = r->args; *p2; p2++) {
> +                    if (*p2 == ' ') {
> +                        strcpy(p1, "%20");
> +                        p1 += 3;
> +                    }
> +                    else {
> +                        *p1++ = *p2;
> +                    }
> +                }
> +                *p1 = '\0';
> +            }
> +            else {
> +                for (char *pos = r->args; *pos; pos++) {
> +                    if (*pos == ' ') {
> +                        *pos = '+';
> +                    }
> +                }
> +            }
> +        }
> +
>          rewritelog(r, 3, NULL, "split uri=%s -> uri=%s, args=%s", olduri,
>                     r->filename, r->args ? r->args : "<none>");
>      }
>
> >>> might lead to double-escaping in the uri-path. Maybe it's simpler to
> >>> remove the check and leave it to mod_proxy only..
> >>
> >> Do we have an example config where this currently breaks that does not 
> >> forward it to a proxy backend?
> >
> >>From some of the GH activity, I think there is some mod_proxy_fcgi
> > sending to FPM and PHP. In this case the query is in an environment
> > variable.
> >
> > Are spaces over proxy_http really that much of a smoking gun? Wouldn't
> > any smuggling/splitting/desynch already be fatal if it's in the
> > request line? It can't be terminated early if we only allow spaces.
>
> You mean if there are just spaces and no control characters? I am not sure if 
> you can do smuggling without \r and/or \n, but
> the HTTP RFC is strict about what is allowed in the request URL and literal 
> spaces are definitely not allowed.
> If we allow them we send a non RFC compliant HTTP request to the backend. I 
> think we must not do this.

I was assuming the 403s are coming from mod_rewrite, not the proxy
modules, and that we'd only change the former.   But it's not 100%
clear for the people following up in the PR.

If it's true, the proxy modules would then still return an error
before putting unencoded/decoded spaces into the query.

Reply via email to