On Fri, Oct 7, 2022 at 9:14 PM Ruediger Pluem <rpl...@apache.org> wrote: > > On 10/7/22 7:11 PM, Stefan Eissing via dev wrote: > > > > > >> Am 07.10.2022 um 18:45 schrieb Yann Ylavic <ylavic....@gmail.com>: > >> > > > > Thanks, Yann, for the detailed explanation on how this works and that the > > default does the right thing. > > +1. I missed the default of not reusing the connection in these cases but > having the possibility to override it like the user did. > But shouldn't we default to not reusing the connection in case of a $ inside > the port as well?
We default to disabling connection reuse for any $ substitution already, the compat issue is for an explicit enablereuse=on which used to "work" because it was ignored.. How about a patch like the attached one now, which disables connection reuse definitively (regardless of enablereuse) if there is a $ substitution in the hostname or port part of the URL? The patch uses the existing "is_address_reusable" flag (set to false initially in this case), and since it's not configurable by the user we are safe from a connection reuse point of vue in any case. > > > > > My guess is that such configurations are pretty rare, as their security is > > not good in general. The only thing that comes to mind is logging a warning > > for such cases. > > +1 Done in the patch too, along with some docs update regarding this use case. Regards; Yann.
Index: modules/proxy/proxy_util.c =================================================================== --- modules/proxy/proxy_util.c (revision 1903576) +++ modules/proxy/proxy_util.c (working copy) @@ -1851,6 +1851,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap proxy_worker_shared *wshared; const char *ptr = NULL, *sockpath = NULL, *pdollars = NULL; apr_port_t port_of_scheme; + int disable_reuse = 0; apr_uri_t uri; /* @@ -1879,6 +1880,12 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap * to fail (e.g. "ProxyPassMatch ^/(a|b)(/.*)? http://host:port$2"). * So we trim all the $n from the :port and prepend them in uri.path * afterward for apr_uri_unparse() to restore the original URL below. + * If a dollar substitution is found in the hostname[:port] part of + * the URL, reusing address and connections in the same worker is not + * possible (the current implementation of active connections cache + * handles/assumes a single origin server:port per worker only), so + * we set disable_reuse here during parsing to take that into account + * in the worker settings below. */ #define IS_REF(x) (x[0] == '$' && apr_isdigit(x[1])) const char *pos = ap_strstr_c(ptr, "://"); @@ -1885,6 +1892,9 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap if (pos) { pos += 3; while (*pos && *pos != ':' && *pos != '/') { + if (*pos == '$') { + disable_reuse = 1; + } pos++; } if (*pos == ':') { @@ -1904,6 +1914,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap vec[1].iov_base = (void *)path; vec[1].iov_len = strlen(path); ptr = apr_pstrcatv(p, vec, 2, NULL); + disable_reuse = 1; } } } @@ -1999,7 +2010,7 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap wshared->port = (uri.port) ? uri.port : port_of_scheme; wshared->flush_packets = flush_off; wshared->flush_wait = PROXY_FLUSH_WAIT; - wshared->is_address_reusable = 1; + wshared->is_address_reusable = !disable_reuse; wshared->lbfactor = 100; wshared->passes = 1; wshared->fails = 1; @@ -2008,7 +2019,31 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap wshared->hash.def = ap_proxy_hashfunc(wshared->name_ex, PROXY_HASHFUNC_DEFAULT); wshared->hash.fnv = ap_proxy_hashfunc(wshared->name_ex, PROXY_HASHFUNC_FNV); wshared->was_malloced = (mask & AP_PROXY_WORKER_IS_MALLOCED) != 0; - wshared->is_name_matchable = 0; + if (mask & AP_PROXY_WORKER_IS_MATCH) { + wshared->is_name_matchable = 1; + + /* Before AP_PROXY_WORKER_IS_MATCH (< 2.4.47), a regex worker with + * dollar substitution was never matched against any actual URL, thus + * the requests fell through the generic worker. Now if a ProyPassMatch + * matches, a worker (and its parameters) is always used to determine + * the properties of the connection with the origin server. So for + * instance the same "timeout=" will be enforced for all the requests + * matched by the same ProyPassMatch worker, which is an improvement + * compared to the global/vhost [Proxy]Timeout applied by the generic + * worker. Likewise, address and connection reuse is the default for + * a ProyPassMatch worker with no dollar substitution, just like a + * "normal" worker. However to avoid DNS and connection reuse compat + * issues, connection reuse is disabled by default if there is any + * substitution in the uri-path (an explicit enablereuse=on can still + * opt-in), and reuse is even disabled definitively for substitutions + * happening in the hostname[:port] (disable_reuse was set above so + * address reuse is also disabled which will prevent enablereuse=on + * to apply anyway). + */ + if (disable_reuse || ap_strchr_c(wshared->name_ex, '$')) { + wshared->disablereuse = 1; + } + } if (sockpath) { if (PROXY_STRNCPY(wshared->uds_path, sockpath) != APR_SUCCESS) { return apr_psprintf(p, "worker uds path (%s) too long", sockpath); @@ -2028,20 +2063,6 @@ PROXY_DECLARE(char *) ap_proxy_define_worker_ex(ap (*worker)->balancer = balancer; (*worker)->s = wshared; - if (mask & AP_PROXY_WORKER_IS_MATCH) { - (*worker)->s->is_name_matchable = 1; - if (ap_strchr_c((*worker)->s->name_ex, '$')) { - /* Before AP_PROXY_WORKER_IS_MATCH (< 2.4.47), a regex worker - * with dollar substitution was never matched against the actual - * URL thus the request fell through the generic worker. To avoid - * dns and connection reuse compat issues, let's disable connection - * reuse by default, it can still be overwritten by an explicit - * enablereuse=on. - */ - (*worker)->s->disablereuse = 1; - } - } - return NULL; } @@ -2127,12 +2148,21 @@ PROXY_DECLARE(apr_status_t) ap_proxy_initialize_wo if (!worker->s->retry_set) { worker->s->retry = apr_time_from_sec(PROXY_WORKER_DEFAULT_RETRY); } - /* By default address is reusable unless DisableReuse is set */ + /* Consistently set address and connection reusabilty: when the + * address is known already to not be reusable for this worker (in + * any case, thus ignore DisableReuse), or when reuse is disabled by + * configuration. + */ if (worker->s->disablereuse) { worker->s->is_address_reusable = 0; } - else { - worker->s->is_address_reusable = 1; + else if (!worker->s->is_address_reusable) { + if (worker->s->disablereuse_set && !worker->s->disablereuse) { + ap_log_error(APLOG_MARK, APLOG_WARN, 0, s, APLOGNO() + "enablereuse=on ignored for worker %s", + ap_proxy_worker_name(p, worker)); + } + worker->s->disablereuse = 1; } /* Index: docs/manual/mod/mod_proxy.xml =================================================================== --- docs/manual/mod/mod_proxy.xml (revision 1903576) +++ docs/manual/mod/mod_proxy.xml (working copy) @@ -856,7 +856,7 @@ expressions</description> <directivesynopsis> <name>ProxyPass</name> <description>Maps remote servers into the local server URL-space</description> -<syntax>ProxyPass [<var>path</var>] !|<var>url</var> [<var>key=value</var> +<syntax>ProxyPass [<var>path</var>] !|<var>URL</var> [<var>key=value</var> <var>[key=value</var> ...]] [nocanon] [interpolate] [noquery]</syntax> <contextlist><context>server config</context><context>virtual host</context> <context>directory</context> @@ -869,7 +869,7 @@ expressions</description> proxy in the conventional sense but appears to be a mirror of the remote server. The local server is often called a <dfn>reverse proxy</dfn> or <dfn>gateway</dfn>. The <var>path</var> is the name of - a local virtual path; <var>url</var> is a partial URL for the + a local virtual path; <var>URL</var> is a partial URL for the remote server and cannot include a query string.</p> <note>It is strongly suggested to review the concept of a @@ -1126,10 +1126,10 @@ ProxyPass "/example" "http://backend.example.com" <td>Off</td> <td><p>This parameter should be used when you have a firewall between your Apache httpd and the backend server, which tends to drop inactive connections. - This flag will tell the Operating System to send <code>KEEP_ALIVE</code> - messages on inactive connections and thus prevent the firewall from dropping + This flag will tell the Operating System to send TCP <code>KEEP_ALIVE</code> + probes on inactive connections and thus prevent the firewall from dropping the connection. - To enable keepalive, set this property value to <code>On</code>. </p> + To enable TCP keepalive probes, set this property value to <code>On</code>.</p> <p>The frequency of initial and subsequent TCP keepalive probes depends on global OS settings, and may be as high as 2 hours. To be useful, the frequency configured in the OS must be smaller than the threshold used @@ -1243,7 +1243,7 @@ ProxyPass "/example" "http://backend.example.com" </td></tr> <tr><td>mapping</td> <td>-</td> - <td><p>Type of mapping between the <var>path</var> and the <var>url</var>. + <td><p>Type of mapping between the <var>path</var> and the <var>URL</var>. This determines the normalization and/or (non-)decoding that <module>mod_proxy</module> will apply to the requested <var>uri-path</var> before matching the <var>path</var>. If a mapping matches, it's committed to the <var>uri-path</var> such that all the directory @@ -1314,7 +1314,7 @@ ProxyPass "/example" "http://backend.example.com" like <code>JSESSIONID</code> or <code>PHPSESSIONID</code>, and it depends on the backend application server that support sessions. If the backend application server uses different name for cookies - and url encoded id (like servlet containers) use | to separate them. + and url-encoded id (like servlet containers) use | to separate them. The first part is for the cookie the second for the path.<br /> Available in Apache HTTP Server 2.4.4 and later. </td></tr> @@ -1467,18 +1467,21 @@ ProxyPassReverse "/mirror/foo/" "https://backend. <directivesynopsis> <name>ProxyPassMatch</name> <description>Maps remote servers into the local server URL-space using regular expressions</description> -<syntax>ProxyPassMatch [<var>regex</var>] !|<var>url</var> [<var>key=value</var> +<syntax>ProxyPassMatch [<var>regex</var>] !|<var>URL</var> [<var>key=value</var> <var>[key=value</var> ...]]</syntax> <contextlist><context>server config</context><context>virtual host</context> <context>directory</context> </contextlist> +<compatibility>Since 2.4.47, the <var>key=value</var> Parameters are honored +when the <var>URL</var> parameter contains backreference(s) (see note below). +</compatibility> <usage> <p>This directive is equivalent to <directive module="mod_proxy">ProxyPass</directive> but makes use of regular expressions instead of simple prefix matching. The - supplied regular expression is matched against the <var>url</var>, and if it + supplied regular expression is matched against the <var>URL</var>, and if it matches, the server will substitute any parenthesized matches into the given - string and use it as a new <var>url</var>.</p> + string and use it as a new <var>URL</var>.</p> <note><strong>Note: </strong>This directive cannot be used within a <code><Directory></code> context.</note> @@ -1493,20 +1496,7 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam <p>will cause a local request for <code>http://example.com/foo/bar.gif</code> to be internally converted into a proxy request to <code>http://backend.example.com/foo/bar.gif</code>.</p> - <note><title>Note</title> - <p>The URL argument must be parsable as a URL <em>before</em> regexp - substitutions (as well as after). This limits the matches you can use. - For instance, if we had used</p> - <highlight language="config"> -ProxyPassMatch "^(/.*\.gif)$" "http://backend.example.com:8000$1" - </highlight> - <p>in our previous example, it would fail with a syntax error - at server startup. This is a bug (PR 46665 in the ASF bugzilla), - and the workaround is to reformulate the match:</p> - <highlight language="config"> -ProxyPassMatch "^/(.*\.gif)$" "http://backend.example.com:8000/$1" - </highlight> - </note> + <p>The <code>!</code> directive is useful in situations where you don't want to reverse-proxy a subdirectory.</p> @@ -1521,10 +1511,24 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam <note> <title>Default Substitution</title> - <p>When the URL parameter doesn't use any backreferences into the regular - expression, the original URL will be appended to the URL parameter. + <p>When the <var>URL</var> parameter doesn't use any backreferences into the regular + expression, the original URL will be appended to the <var>URL</var> parameter. </p> </note> + <note> + <title><var><code>key=value</code> Parameters versus URL</var> with backreference(s)</title> + <p>Since Apache HTTP Server 2.4.47, the <code>key=value</code> Parameters + are no longer ignored in a <directive>ProxyPassMatch</directive> using + an <var>URL</var> with backreference(s). However to keep the existing + behavior regarding reuse/keepalive of backend connections (which were + never reused before for these URLs), the parameter <var>enablereuse</var> + (or <var>disablereuse</var>) default to <code>off</code> (resp. <code>on</code>) + in this case. Setting <code>enablereuse=on</code> explicitely allows to + reuse connections <strong>unless</strong> some backreference(s) belong in + the <code>authority</code> part (hostname and/or port) of the <var>URL</var> + (this condition is enforced since Apache HTTP Server 2.4.55, and produces + a warning at startup because these URLs are not reusable per se).</p> + </note> <note type="warning"> <title>Security Warning</title> @@ -1541,7 +1545,7 @@ ProxyPassMatch "^/(.*\.gif)$" "http://backend.exam <name>ProxyPassReverse</name> <description>Adjusts the URL in HTTP response headers sent from a reverse proxied server</description> -<syntax>ProxyPassReverse [<var>path</var>] <var>url</var> +<syntax>ProxyPassReverse [<var>path</var>] <var>URL</var> [interpolate]</syntax> <contextlist><context>server config</context><context>virtual host</context> <context>directory</context> @@ -1563,7 +1567,7 @@ proxied server</description> match the proxy, you must load and enable <module>mod_proxy_html</module>. </p> - <p><var>path</var> is the name of a local virtual path; <var>url</var> is a + <p><var>path</var> is the name of a local virtual path; <var>URL</var> is a partial URL for the remote server. These parameters are used the same way as for the <directive module="mod_proxy">ProxyPass</directive> directive.</p>