On Mon, Jun 22, 2020 at 8:28 AM Yann Ylavic <[email protected]> wrote:
>
> > Allow for URI-path pre_translate_name before (and/or instead of) decoding.
> >
> > Only if no hook takes "ownership" of the URI (returning OK), apply
> > percent decoding for the rest of request handling. Otherwise r->uri remains
> > encoded meaning that further location/directory/file/if/.. sections (walks)
> > should that into account.
>
> This change allows a module to avoid URI decoding in
> pre_translate_name, such that r->uri remains encoded for the whole
> request handling.
>
> For instance (same series), mod_proxy will use that hook when
> "ProxyMappingDecoded off" is configured, and return OK if a ProxyPass
> matches. The decoding was already disabled (before this commit) for
> the forward-proxy case (ProxyRequests on) since r->proxyreq was set at
> an early stage, but now it "works" for the reverse-proxy case too.
>
> This means that when "ProxyMappingDecoded off" is configured (the
> default is still "on", though), for instance a <Location URI-path, or
> a fixup RewriteRule or a non-early RequestHeader rule will have to be
> expressed with the encoded form since reserved characters will not be
> decoded as before (note that unreserved characters will always be
> decoded anyway).
>
> I don't know how much it can break existing configurations, and wonder
> if we should have a core directive that still controls whether r->uri
> should be decoded, regardless of ProxyMappingDecoded or any further
> module pre_translate_name hook with its own directives.
>
> For mod_proxy this is not really an issue to have r->uri modified
> after the (pre_)translate phase, because the handler will later work
> on r->filename regardless (the "proxy:..." set by proxy_detect or
> proxy_trans), so maybe we should simply always decode r->uri after the
> pre_translate hooks have run. Though I find it easier in a proxy
> context to match a URI-path (LocationMatch, RewriteRule..) in its
> encoded form, and not worry about original vs decoded separators. That
> may be just my preference, but a new core directive looks sane to me..
>
> Thoughts?

I have not followed too closely so take with a grain of salt / for the
sake of argument:

>From an externals perspective (maybe being detached here is a benefit)
it might be better to have something like "ProxyUseOriginalURL" that
has the following caveats:
1. it's partially decoded
2. it is required for whatever the servlet mapping option is (iiuc?)
3. it will affect the string used in comparisons/substitutions for
<Location*>, Rewrite, <If>, (*CAUTION*)
   - Some details is needed here about the partial decoding, for
example %2f or multiple leading slashes so "simple" context root
configs can be written without too much paranoia
4. If there are risks of re-injecting URL's with [PT] or per-dir
rewrites try to explain that it's a complex combination to worry
about.

I would not worry too much about "existing configs" as this is all
opt-in and it's probably a lost cause anyway.

I would instead give people new built-in vars in the important places
(rewrite, expr) to be able to match the partially decoded URI that the
proxy will be working on so a savvy user can safely interoperate
without monstrous regexes.

Reply via email to