Hi Alexander,

On Fri, Nov 10, 2023 at 08:44:31PM +0000, Stephan, Alexander wrote:
> > I don't see how this is possible:
> >
> >        list_for_each_entry(srv_tlv, &src->pp_tlvs, list) {
> >                if (srv_tlv == NULL)
> >                       break;
> 
> > For srv_tlv to be NULL, it would mean that you visited (NULL)->list at some
> > point, and apparently pp_tlvs is always manipulated with
> > LIST_APPEND() only, so I don't see how you could end up with something like
> > last->next = (NULL)->list. Are you sure it was not a side effect of a
> > debugging session with some temporary code in it ?
> > I'd be interested in knowing if you can reproduce it so that we can find
> > the root cause (and hopefully address it).
> 
> I finally got back to this again. You should be able to reproduce it quite
> easily. After removing the 'break' above, you can run the regression tests.
(...)
> It should crash immediately with the following seg fault:
> 
> Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault.
> srv_settings_cpy (srv=srv@entry=0xbed7e0, src=src@entry=0xbebce0, 
> srv_tmpl=srv_tmpl@entry=0) at src/server.c:2526
> 2526                    new_srv_tlv->fmt_string = strdup(srv_tlv->fmt_string);
> 
> Overall, I am not too confident with the order of invocations in the server.
> Maybe we have some initializations mixed up and unintuitive some
> dependencies.

Thanks! I could indeed reproduce it and the cause is indeed a missing
initialization, because the very first element is NULL ;-)  I found it,
there's still an ugly manual initialization of the default server made
in proxy_preset_defaults(). One day we'll have a function to reinitialize
a server... This patch fixes it, I'm going to merge it and get rid of
the test in the loop:

diff --git a/src/proxy.c b/src/proxy.c
index 7ff087190..544c22f82 100644
--- a/src/proxy.c
+++ b/src/proxy.c
@@ -1469,6 +1469,7 @@ void proxy_preset_defaults(struct proxy *defproxy)
        defproxy->defsrv.onerror = DEF_HANA_ONERR;
        defproxy->defsrv.consecutive_errors_limit = DEF_HANA_ERRLIMIT;
        defproxy->defsrv.uweight = defproxy->defsrv.iweight = 1;
+       LIST_INIT(&defproxy->defsrv.pp_tlvs);
 
        defproxy->email_alert.level = LOG_ALERT;
        defproxy->load_server_state_from_file = PR_SRV_STATE_FILE_UNSPEC;

> > Another point, in make_proxy_line_v2() I'm seeing that you've placed
> > everything under "if (strm)". I understand why, it's because you didn't
> > want to call build_logline() without a stream. That made me think "hmmm
> > we already have the ability to do that", and I found it, you can use
> > sess_build_logline() instead, that takes the session separately, and
> > supports an optional stream. That would be useful for health checks for
> > example, since they don't have streams. But there's a catch: in function
> > make_proxy_line_v2() we don't have the session at the moment, and
> > build_logline() extracts it from the stream. Normally during a session
> > establishment, the session is present in connection->owner, thus
> > remote->owner in make_proxy_line_v2(). I suggest that you give this a
> > try to confirm that it works for you, this way we'd be sure that health
> > checks can also send the ppv2 fields. But that's not critical given that
> > there's no bad behavior right now, it's just that fields will be ignored,
> > so this can be done in a subsequent patch.
> 
> I looked into it again. I did not see an obvious way to retrieve the current
> or pass the session as parameter.
> 
> I mean, the easy option, as you said, would be to use remote->owner, which
> would require replacing if (stream) with if (remote) to prevent the NULL
> dereference.
> But we still want to send out TLVs when there is no remote.

Hmm you're right, I misanalyzed the situation there. Actually the
conn_send_proxy() function does have the current connection we're 
working on, but the functions it calls (make_proxy_line()) doesn't
have it. However we pass either the remote connection when there is
a stream, or the local connection when there is no stream, hence
remote->owner should always be valid, and apparently it's done
precisely for checks:

    /* The target server expects a PROXY line to be sent first.
     * If the send_proxy_ofs is negative, it corresponds to the
     * offset to start sending from then end of the proxy string
     * (which is recomputed every time since it's constant). If
     * it is positive, it means we have to send from the start.
     * We can only send a "normal" PROXY line when the connection
     * is attached to a stream connector. Otherwise we can only
     * send a LOCAL line (eg: for use with health checks).
     */

    if (sc && sc_strm(sc)) {
            ret = make_proxy_line(trash.area, trash.size,
                                  objt_server(conn->target),
                                  sc_conn(sc_opposite(sc)),
                                  __sc_strm(sc));
    }
    else {
            /* The target server expects a LOCAL line to be sent first. 
Retrieving
             * local or remote addresses may fail until the connection is 
established.
             */
            if (!conn_get_src(conn) || !conn_get_dst(conn))
                    goto out_wait;

            ret = make_proxy_line(trash.area, trash.size,
                                  objt_server(conn->target), conn,
                                  NULL);
    }

Thus the only case where the remote conn does not exist is when being
established from an applet (e.g. Lua or httpclient).

> So maybe I could
> change it such that it uses remote->owner when available and otherwise, falls
> back to using the session from the stream?
> Maybe something along the lines of
>        struct session *sess = (remote != NULL) ? remote->owner : strm->sess;
>        replace->data = sess_build_logline(sess, strm, replace->area, 
> replace->size, &srv_tlv->fmt);
> ?
> It seems to achieve the behavior you described regarding the health checks,
> right? If that's alright, I will send the corresponding patch.

I'd do it slightly differently but the principle remains the same: if
we have a stream we always have a session, otherwise the remote's owner
should be used:

     struct session *sess = strm ? strm->sess : remote->owner;
     replace->data = sess_build_logline(sess, strm, replace->area, 
replace->size, &srv_tlv->fmt);

Please let me know if that works for you!

Thanks,
Willy

Reply via email to