On Thu, Aug 1, 2024 at 9:53 PM Eric Covener <cove...@gmail.com> wrote:
>
> On Thu, Aug 1, 2024 at 2:47 PM Yann Ylavic <ylavic....@gmail.com> wrote:
> >
> > On Thu, Aug 1, 2024 at 7:57 PM Eric Covener <cove...@gmail.com> wrote:
> > >
> > > On Thu, Aug 1, 2024 at 1:37 PM Yann Ylavic <ylavic....@gmail.com> wrote:
> > > >
> > > > On Thu, Aug 1, 2024 at 5:51 PM Eric Covener <cove...@gmail.com> wrote:
> > > > >
> > > > > But does it leave the splitting problem with decoded %3F?
> > > >
> > > > Yeah but I'm not sure that it's _our_ problem, a "proxy:" r->filename
> > > > does never contain the query-string in the first place, so any '?' in
> > > > there (hence in SCRIPT_FILENAME) is part of the actual file path
> > > > (which we'd encode for proxying any other scheme than fcgi). And the
> > > > '?' will be in SCRIPT_NAME/PATH_INFO/etc too. If the scripts want the
> > > > decoded uri-path they have to be consistent and consider that
> > > > SCRIPT_FILENAME is nothing else than a path (no query-string, which is
> > > > in ... QUERY_STRING).
> > >
> > > Just to recap, FPM doesn't want to find the query it in
> > > SCRIPT_FILENAME, it wants to toss it away because it used to
> > > accidentally end up in there (via mod_rewrite?) But this is where the
> > > mismatch between what we've walked/mapped/authorized and what will be
> > > executed is.
> >
> > If FPM wants a decoded SCRIPT_FILENAME but no '?' character?
> > Decoding a path with %3f will inevitably give '?', even though it's
> > still the path, why would FPM decode it as a URL and find a query
> > string in there?
>
> I think as a workaround for what we can (or used to?) send:
> https://github.com/php/php-src/blob/master/sapi/fpm/fpm/fpm_main.c#L1043
> It also means an actual file with a literal question mark cannot be
> run through php-fpm.

OK, so php-fpm will parse the given "proxy:" SCRIPT_FILENAME as a URL,
trim the query-string, and determine that "apache_was_there". So
there's never been a way to pass a decoded path with '?' to php-fpm
using SetHandler (which before the latest changes never re-encoded the
filename). So we can probably just forbid '?', like control characters.

But reading that code (wow..), php-fpm is also able to differentiate
ProxyPass vs SetHandler and do things like:
https://github.com/php/php-src/blob/master/sapi/fpm/fpm/fpm_main.c#L1172
which %-decodes the PATH_INFO, supposedly extracted from
SCRIPT_FILENAME.

So we should probably keep encoding r->filename with ProxyPass, and
come back to my previous patch which skipped it only for SetHandler?
Possibly only for FCGI_MAY_BE_FPM() too, because for
"ProxyFCGIBackendType GENERIC" we don't send the "proxy:scheme://host"
part and SCRIPT_NAME/FILENAME are supposed to be the real decoded
paths?

But it's going to be an endless issue if we can't fix or align
ProxyPass and SetHandler because of the workarounds there: we have to
remain bug-compatible. At some point we'll have to coordinate with
them to remove that "apache_was_there".
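
To make the '?' problem concrete, here is a rough sketch (in Python, for
brevity) of the trimming as I understand it from the discussion above. It
is not php-fpm's actual code: the function name, the "proxy:fcgi://"
prefix handling and the example host/paths are assumptions for
illustration only.

```python
from urllib.parse import unquote

# Assumed prefix that ends up in front of the path in SCRIPT_FILENAME
# when the request goes through mod_proxy_fcgi.
PROXY_FCGI_PREFIX = "proxy:fcgi://"


def fpm_like_script_filename(script_filename):
    """Hypothetical model of the trimming: returns (path, apache_seen)."""
    if not script_filename.lower().startswith(PROXY_FCGI_PREFIX):
        # Not a "proxy:" value: left untouched in this sketch.
        return script_filename, False

    rest = script_filename[len(PROXY_FCGI_PREFIX):]
    slash = rest.find("/")
    if slash == -1:
        # No path after host[:port]; only the '?' trimming applies.
        return script_filename.split("?", 1)[0], False

    # Skip host[:port], keep the path, then drop everything from the
    # first '?' on, as if it were a query-string.
    path = rest[slash:].split("?", 1)[0]
    return path, True


backend = "proxy:fcgi://127.0.0.1:9000"

# A query-string that leaked into the filename is tossed away.
print(fpm_like_script_filename(backend + "/var/www/a.php?x=1"))

# A decoded %3F is indistinguishable from a query-string delimiter,
# so a file literally named "what?.php" gets truncated.
decoded = unquote("/var/www/what%3F.php")
print(fpm_like_script_filename(backend + decoded))

# Left encoded, the path survives the trimming but is no longer the
# real on-disk name.
print(fpm_like_script_filename(backend + "/var/www/what%3F.php"))
```

The second case is the one that matters here: once the path is decoded,
nothing on the receiving side can tell a '?' that belongs to the file
name from one that starts a query-string.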