On Fri, Oct 7, 2022 at 9:14 PM Ruediger Pluem <rpl...@apache.org> wrote:
>
> On 10/7/22 7:11 PM, Stefan Eissing via dev wrote:
> >
> >
> >> Am 07.10.2022 um 18:45 schrieb Yann Ylavic <ylavic....@gmail.com>:
> >>
> >
> > Thanks, Yann, for the detailed explanation on how this works and that the 
> > default does the right thing.
>
> +1. I missed the default of not reusing the connection in these cases but 
> having the possibility to override it like the user did.
> But shouldn't we default to not reusing the connection in case of a $ inside 
> the port as well?

We default to disabling connection reuse for any $ substitution
already, the compat issue is for an explicit enablereuse=on which used
to "work" because it was ignored..

How about a patch like the attached one now, which disables connection
reuse definitively (regardless of enablereuse) if there is a $
substitution in the hostname or port part of the URL?
The patch uses the existing "is_address_reusable" flag (set to false
initially in this case), and since it's not configurable by the user
we are safe from a connection reuse point of vue in any case.

>
> >
> > My guess is that such configurations are pretty rare, as their security is 
> > not good in general. The only thing that comes to mind is logging a warning 
> > for such cases.
>
> +1

Done in the patch too, along with some docs update regarding this use case.


Regards;
Yann.
Index: modules/proxy/proxy_util.c
===================================================================
--- modules/proxy/proxy_util.c	(revision 1903576)
+++ modules/proxy/proxy_util.c	(working copy)
@@ -1851,6 +1851,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
     proxy_worker_shared *wshared;
     const char *ptr = NULL, *sockpath = NULL, *pdollars = NULL;
     apr_port_t port_of_scheme;
+    int disable_reuse = 0;
     apr_uri_t uri;
 
     /*
@@ -1879,6 +1880,12 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
          * to fail (e.g. "ProxyPassMatch ^/(a|b)(/.*)? http://host:port$2";).
          * So we trim all the $n from the :port and prepend them in uri.path
          * afterward for apr_uri_unparse() to restore the original URL below.
+         * If a dollar substitution is found in the hostname[:port] part of
+         * the URL, reusing address and connections in the same worker is not
+         * possible (the current implementation of active connections cache
+         * handles/assumes a single origin server:port per worker only), so
+         * we set disable_reuse here during parsing to take that into account
+         * in the worker settings below.
          */
 #define IS_REF(x) (x[0] == '$' && apr_isdigit(x[1]))
         const char *pos = ap_strstr_c(ptr, "://");
@@ -1885,6 +1892,9 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
         if (pos) {
             pos += 3;
             while (*pos && *pos != ':' && *pos != '/') {
+                if (*pos == '$') {
+                    disable_reuse = 1;
+                }
                 pos++;
             }
             if (*pos == ':') {
@@ -1904,6 +1914,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
                     vec[1].iov_base = (void *)path;
                     vec[1].iov_len = strlen(path);
                     ptr = apr_pstrcatv(p, vec, 2, NULL);
+                    disable_reuse = 1;
                 }
             }
         }
@@ -1999,7 +2010,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
     wshared->port = (uri.port) ? uri.port : port_of_scheme;
     wshared->flush_packets = flush_off;
     wshared->flush_wait = PROXY_FLUSH_WAIT;
-    wshared->is_address_reusable = 1;
+    wshared->is_address_reusable = !disable_reuse;
     wshared->lbfactor = 100;
     wshared->passes = 1;
     wshared->fails = 1;
@@ -2008,7 +2019,31 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
     wshared->hash.def = ap_proxy_hashfunc(wshared->name_ex, PROXY_HASHFUNC_DEFAULT);
     wshared->hash.fnv = ap_proxy_hashfunc(wshared->name_ex, PROXY_HASHFUNC_FNV);
     wshared->was_malloced = (mask & AP_PROXY_WORKER_IS_MALLOCED) != 0;
-    wshared->is_name_matchable = 0;
+    if (mask & AP_PROXY_WORKER_IS_MATCH) {
+        wshared->is_name_matchable = 1;
+
+        /* Before AP_PROXY_WORKER_IS_MATCH (< 2.4.47), a regex worker with
+         * dollar substitution was never matched against any actual URL, thus
+         * the requests fell through the generic worker. Now if a ProyPassMatch
+         * matches, a worker (and its parameters) is always used to determine
+         * the properties of the connection with the origin server. So for
+         * instance the same "timeout=" will be enforced for all the requests
+         * matched by the same ProyPassMatch worker, which is an improvement
+         * compared to the global/vhost [Proxy]Timeout applied by the generic
+         * worker. Likewise, address and connection reuse is the default for
+         * a ProyPassMatch worker with no dollar substitution, just like a
+         * "normal" worker. However to avoid DNS and connection reuse compat
+         * issues, connection reuse is disabled by default if there is any
+         * substitution in the uri-path (an explicit enablereuse=on can still
+         * opt-in), and reuse is even disabled definitively for substitutions
+         * happening in the hostname[:port] (disable_reuse was set above so
+         * address reuse is also disabled which will prevent enablereuse=on
+         * to apply anyway).
+         */
+        if (disable_reuse || ap_strchr_c(wshared->name_ex, '$')) {
+            wshared->disablereuse = 1;
+        }
+    }
     if (sockpath) {
         if (PROXY_STRNCPY(wshared->uds_path, sockpath) != APR_SUCCESS) {
             return apr_psprintf(p, "worker uds path (%s) too long", sockpath);
@@ -2028,20 +2063,6 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap
     (*worker)->balancer = balancer;
     (*worker)->s = wshared;
 
-    if (mask & AP_PROXY_WORKER_IS_MATCH) {
-        (*worker)->s->is_name_matchable = 1;
-        if (ap_strchr_c((*worker)->s->name_ex, '$')) {
-            /* Before AP_PROXY_WORKER_IS_MATCH (< 2.4.47), a regex worker
-             * with dollar substitution was never matched against the actual
-             * URL thus the request fell through the generic worker. To avoid
-             * dns and connection reuse compat issues, let's disable connection
-             * reuse by default, it can still be overwritten by an explicit
-             * enablereuse=on.
-             */
-            (*worker)->s->disablereuse = 1;
-        }
-    }
-
     return NULL;
 }
 
@@ -2127,12 +2148,21 @@ PROXY_DECLARE(apr_status_t) ap_proxy_initialize_wo
         if (!worker->s->retry_set) {
             worker->s->retry = apr_time_from_sec(PROXY_WORKER_DEFAULT_RETRY);
         }
-        /* By default address is reusable unless DisableReuse is set */
+        /* Consistently set address and connection reusabilty: when the
+         * address is known already to not be reusable for this worker (in
+         * any case, thus ignore DisableReuse), or when reuse is disabled by
+         * configuration.
+         */
         if (worker->s->disablereuse) {
             worker->s->is_address_reusable = 0;
         }
-        else {
-            worker->s->is_address_reusable = 1;
+        else if (!worker->s->is_address_reusable) {
+            if (worker->s->disablereuse_set && !worker->s->disablereuse) {
+                ap_log_error(APLOG_MARK, APLOG_WARN, 0, s, APLOGNO()
+                             "enablereuse=on ignored for worker %s",
+                             ap_proxy_worker_name(p, worker));
+            }
+            worker->s->disablereuse = 1;
         }
 
         /*
Index: docs/manual/mod/mod_proxy.xml
===================================================================
--- docs/manual/mod/mod_proxy.xml	(revision 1903576)
+++ docs/manual/mod/mod_proxy.xml	(working copy)
@@ -856,7 +856,7 @@ expressions</description>
 <directivesynopsis>
 <name>ProxyPass</name>
 <description>Maps remote servers into the local server URL-space</description>
-<syntax>ProxyPass [<var>path</var>] !|<var>url</var> [<var>key=value</var>
+<syntax>ProxyPass [<var>path</var>] !|<var>URL</var> [<var>key=value</var>
   <var>[key=value</var> ...]] [nocanon] [interpolate] [noquery]</syntax>
 <contextlist><context>server config</context><context>virtual host</context>
 <context>directory</context>
@@ -869,7 +869,7 @@ expressions</description>
     proxy in the conventional sense but appears to be a mirror of the
     remote server. The local server is often called a <dfn>reverse
     proxy</dfn> or <dfn>gateway</dfn>. The <var>path</var> is the name of
-    a local virtual path; <var>url</var> is a partial URL for the
+    a local virtual path; <var>URL</var> is a partial URL for the
     remote server and cannot include a query string.</p>
 
     <note>It is strongly suggested to review the concept of a
@@ -1126,10 +1126,10 @@ ProxyPass "/example" "http://backend.example.com";
         <td>Off</td>
         <td><p>This parameter should be used when you have a firewall between your
     Apache httpd and the backend server, which tends to drop inactive connections.
-    This flag will tell the Operating System to send <code>KEEP_ALIVE</code>
-    messages on inactive connections and thus prevent the firewall from dropping
+    This flag will tell the Operating System to send TCP <code>KEEP_ALIVE</code>
+    probes on inactive connections and thus prevent the firewall from dropping
     the connection.
-    To enable keepalive, set this property value to <code>On</code>. </p>
+    To enable TCP keepalive probes, set this property value to <code>On</code>.</p>
     <p>The frequency of initial and subsequent TCP keepalive probes
     depends on global OS settings, and may be as high as 2 hours. To be useful,
     the frequency configured in the OS must be smaller than the threshold used
@@ -1243,7 +1243,7 @@ ProxyPass "/example" "http://backend.example.com";
     </td></tr>
     <tr><td>mapping</td>
         <td>-</td>
-        <td><p>Type of mapping between the <var>path</var> and the <var>url</var>.
+        <td><p>Type of mapping between the <var>path</var> and the <var>URL</var>.
         This determines the normalization and/or (non-)decoding that <module>mod_proxy</module>
         will apply to the requested <var>uri-path</var> before matching the <var>path</var>. If
         a mapping matches, it's committed to the <var>uri-path</var> such that all the directory
@@ -1314,7 +1314,7 @@ ProxyPass "/example" "http://backend.example.com";
         like <code>JSESSIONID</code> or <code>PHPSESSIONID</code>,
         and it depends on the backend application server that support sessions.
         If the backend application server uses different name for cookies
-        and url encoded id (like servlet containers) use | to separate them.
+        and url-encoded id (like servlet containers) use | to separate them.
         The first part is for the cookie the second for the path.<br />
         Available in Apache HTTP Server 2.4.4 and later.
     </td></tr>
@@ -1467,18 +1467,21 @@ ProxyPassReverse  "/mirror/foo/" "https://backend.
 <directivesynopsis>
 <name>ProxyPassMatch</name>
 <description>Maps remote servers into the local server URL-space using regular expressions</description>
-<syntax>ProxyPassMatch [<var>regex</var>] !|<var>url</var> [<var>key=value</var>
+<syntax>ProxyPassMatch [<var>regex</var>] !|<var>URL</var> [<var>key=value</var>
 	<var>[key=value</var> ...]]</syntax>
 <contextlist><context>server config</context><context>virtual host</context>
 <context>directory</context>
 </contextlist>
+<compatibility>Since 2.4.47, the <var>key=value</var> Parameters are honored
+when the <var>URL</var> parameter contains backreference(s) (see note below).
+</compatibility>
 
 <usage>
     <p>This directive is equivalent to <directive module="mod_proxy">ProxyPass</directive>
        but makes use of regular expressions instead of simple prefix matching. The
-       supplied regular expression is matched against the <var>url</var>, and if it
+       supplied regular expression is matched against the <var>URL</var>, and if it
        matches, the server will substitute any parenthesized matches into the given
-       string and use it as a new <var>url</var>.</p>
+       string and use it as a new <var>URL</var>.</p>
 
     <note><strong>Note: </strong>This directive cannot be used within a
     <code>&lt;Directory&gt;</code> context.</note>
@@ -1493,20 +1496,7 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam
     <p>will cause a local request for
     <code>http://example.com/foo/bar.gif</code> to be internally converted
     into a proxy request to <code>http://backend.example.com/foo/bar.gif</code>.</p>
-    <note><title>Note</title>
-      <p>The URL argument must be parsable as a URL <em>before</em> regexp
-      substitutions (as well as after).  This limits the matches you can use.
-      For instance, if we had used</p>
-      <highlight language="config">
-ProxyPassMatch "^(/.*\.gif)$" "http://backend.example.com:8000$1";
-      </highlight>
-      <p>in our previous example, it would fail with a syntax error
-      at server startup.  This is a bug (PR 46665 in the ASF bugzilla),
-      and the workaround is to reformulate the match:</p>
-      <highlight language="config">
-ProxyPassMatch "^/(.*\.gif)$" "http://backend.example.com:8000/$1";
-      </highlight>
-    </note>
+
     <p>The <code>!</code> directive is useful in situations where you don't want
     to reverse-proxy a subdirectory.</p>
 
@@ -1521,10 +1511,24 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam
 
     <note>
       <title>Default Substitution</title>
-      <p>When the URL parameter doesn't use any backreferences into the regular
-      expression, the original URL will be appended to the URL parameter.
+      <p>When the <var>URL</var> parameter doesn't use any backreferences into the regular
+      expression, the original URL will be appended to the <var>URL</var> parameter.
       </p>
     </note>
+    <note>
+      <title><var><code>key=value</code> Parameters versus URL</var> with backreference(s)</title>
+      <p>Since Apache HTTP Server 2.4.47, the <code>key=value</code> Parameters
+      are no longer ignored in a <directive>ProxyPassMatch</directive> using
+      an <var>URL</var> with backreference(s). However to keep the existing
+      behavior regarding reuse/keepalive of backend connections (which were
+      never reused before for these URLs), the parameter <var>enablereuse</var>
+      (or <var>disablereuse</var>) default to <code>off</code> (resp. <code>on</code>)
+      in this case. Setting <code>enablereuse=on</code> explicitely allows to
+      reuse connections <strong>unless</strong> some backreference(s) belong in
+      the <code>authority</code> part (hostname and/or port) of the <var>URL</var>
+      (this condition is enforced since Apache HTTP Server 2.4.55, and produces
+      a warning at startup because these URLs are not reusable per se).</p>
+    </note>
 
     <note type="warning">
       <title>Security Warning</title>
@@ -1541,7 +1545,7 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam
 <name>ProxyPassReverse</name>
 <description>Adjusts the URL in HTTP response headers sent from a reverse
 proxied server</description>
-<syntax>ProxyPassReverse [<var>path</var>] <var>url</var>
+<syntax>ProxyPassReverse [<var>path</var>] <var>URL</var>
 [interpolate]</syntax>
 <contextlist><context>server config</context><context>virtual host</context>
 <context>directory</context>
@@ -1563,7 +1567,7 @@ proxied server</description>
     match the proxy, you must load and enable <module>mod_proxy_html</module>.
     </p>
 
-    <p><var>path</var> is the name of a local virtual path; <var>url</var> is a
+    <p><var>path</var> is the name of a local virtual path; <var>URL</var> is a
     partial URL for the remote server. 
     These parameters are used the same way as for the
     <directive module="mod_proxy">ProxyPass</directive> directive.</p>

Reply via email to