On Fri, Jun 30, 2023 at 4:08 PM Yann Ylavic <ylavic....@gmail.com> wrote:
>
> On Fri, Jun 30, 2023 at 2:35 PM Ruediger Pluem <rpl...@apache.org> wrote:
> >
> > On 6/29/23 5:08 PM, Yann Ylavic wrote:
> > > On Tue, Apr 25, 2023 at 1:57 PM <rpl...@apache.org> wrote:
> > >>
> > >> --- httpd/httpd/trunk/modules/proxy/proxy_util.c (original)
> > >> +++ httpd/httpd/trunk/modules/proxy/proxy_util.c Tue Apr 25 11:57:22 2023
> > >> @@ -2790,7 +2790,11 @@ ap_proxy_determine_connection(apr_pool_t
> > >>                       * The single DNS lookup is used once per worker.
> > >>                       * If dynamic change is needed then set the addr to 
> > >> NULL
> > >>                       * inside dynamic config to force the lookup.
> > >> +                     *
> > >> +                     * Clear the dns_pool before to avoid a memory leak 
> > >> in case
> > >> +                     * we did the lookup again.
> > >>                       */
> > >> +                    apr_pool_clear(worker->cp->dns_pool);
> > >>                      err = apr_sockaddr_info_get(&addr,
> > >>                                                  conn->hostname, 
> > >> APR_UNSPEC,
> > >>                                                  conn->port, 0,
> > >
> > > I'm not sure we can clear the dns_pool, some conn->addr allocated on
> > > it may be still in use by other threads.
> > > While reviewing your backport proposal, I noticed that
> > > worker->cp->addr was used concurrently in several places in mod_proxy
> > > with no particular care, so I started to code a follow up, but it
> > > needs that apr_pool_clear() too somewhere to avoid the leak and
> > > figured it may not be safe (like the above).
> > > I attach the patch here to show what I mean by insecure
> > > worker->cp->addr usage if we start to modify it concurrently, though
> > > more work is needed it seems..
> > >
> > > If I am correct we need to refcount the worker->cp->addr (or rather a
> > > worker->address struct). I had a patch which does that to handle a
> > > proxy address_ttl parameter (above which address is renewed), I can
> > > try to revive and mix it with the attached one, in a PR. WDYT?
> >
> > Thanks for the good catch. With regards to mod_proxy_ajp I think we should
> > use conn->addr instead of worker->cp->addr. We cannot be sure that 
> > worker->cp->addr is really set.
> > I guess it did not cause any problems so far as in the AJP case we always 
> > reuse connections by default
> > and hence worker->cp->addr is set.
> >
> > I think using a refcount could be burdensome as we need to ensure correct 
> > increase and decrease of the refcount and
> > we might have 3rd party modules that do not play well. Hence I propose a 
> > copy approach (but thinking of it, it might
> > fail in the same or other ways with 3rd party modules using 
> > worker->cp->addr directly in a similar way like mod_proxy_ajp does today)
>
> Let me look at what I can come up with and compare with what you
> propose below (which does not look trivial either from a quick look).
> That is to say, I need a bit more time to have an opinion here :)

So I went with https://github.com/apache/httpd/pull/367 (sorry, not
much detailed commits yet), and specifically the first commit for the
reuse / ttl / force-expiring of the worker address(es).

The new function is (named after ap_proxy_determine_connection):

/*
 * Resolve an address, reusing the one of the worker if any.
 * @param proxy_function calling proxy scheme (http, ajp, ...)
 * @param conn proxy connection the address is used for
 * @param hostname host to resolve (should be the worker's if reusable)
 * @param hostport port to resolve (should be the worker's if reusable)
 * @param r current request (if any)
 * @param s current server (or NULL if r != NULL and ap_proxyerror()
 *                          should be called on error)
 * @return APR_SUCCESS or an error
 */
PROXY_DECLARE(apr_status_t) ap_proxy_determine_address(const char
*proxy_function,
                                                       proxy_conn_rec *conn,
                                                       const char *hostname,
                                                       apr_port_t hostport,
                                                       request_rec *r,
                                                       server_rec *s);

and it takes a proxy_conn_rec only, which simplifies things and which
we can provide everywhere after all (see the changes in mod_proxy_ftp
and mod_proxy_hcheck).

A new struct proxy_address (refcounted, TTL'ed) is defined to store an
address (when needed only, i.e. worker->is_address_reusable). It's
allocated as a subpool of worker->cp->dns_pool and looks like this:
struct proxy_address {
    apr_sockaddr_t *addr;       /* Remote address info */
    const char *hostname;       /* Remote host name */
    apr_port_t hostport;        /* Remote host port */
    apr_uint32_t refcount;      /* Number of conns and/or worker using it */
    apr_uint32_t expiry;        /* Expiry timestamp (seconds to
proxy_start_time) */
};
Each "reusable" proxy_worker and proxy_conn_rec owns a refcount to its
current address such that the last owner destroys the old one when
replacing it with the new one, which all happens in
ap_proxy_determine_address().

There are 2 ways an address can expire, either its TTL is over
(configured in the worker's addressTTL= parameter), or by forcing
(conn->)address->expiry = 0 (e.g. when ap_proxy_connect_backend()
tries to reconnect with a new DNS lookup, per your changes in
r1909451). There is a proxy_address_set_expired() helper for the
latter case to synchronize and avoid lookup storms.

The address expiry time is shared between all the children processes
in worker->s->address_expiry, even though there is a "copy" in each
address->expiry field to allow for forcing the DNS lookup and
synchronizing per child.
Because I didn't want to introduce the first use of the apr_atomic_*64
API in httpd (not sure which first APR version introduced it and only
the latest versions are reliable anyway), the expiry is 32bit
timestamp in seconds relative to some new proxy_start_time epoch
(initialized at httpd startup). The timestamps are taken from
apr_time_now() which is not a monotonic clock so it's not ideal (don't
change your system's time while httpd is running!), I think we need an
ap_monotonic_now() but that's another story (which I drafted in PR
#294 for mpm_event FWIW).

Since conn->addr and conn->hostname (i.e. conn->address->*) always
point to some valid memory (can be NULL before
ap_proxy_determine_address() still), they can be used safely during
connection processing regardless of worker->address changing
underneath, the eventual change will be taken into account in the
proxy_conn_rec the next time it's reused (closing the "old" socket if
any).

Regarding worker->s->addr, it's now legacy and set by
ap_proxy_determine_address() only if it's NULL (that is at startup or
if the user explicitly NULLed it). The first address' refcount is set
to 2 such that worker->s->addr is never deallocated, users can
continue to use it although it will not necessarily be the latest/real
address in use. That way we do not break users, and they can switch to
using conn->addr instead if needed.
Note that *worker->address is never safe to use outside a
PROXY_THREAD_[UN]LOCK() critical section, but changing *conn->address
is and does the same if it's the current address, which is useful for
the safety/correctness (hopefully) of ap_proxy_determine_address()
BTW.

There are other changes in the first commit for the lifetime of the
conn->uds_path and conn->forward, but the other commits are not really
related (I happened to make these changes while walking the code..).

WDYT?


Regards;
Yann.

Reply via email to