Re: HAProxy 1.8.X crashing

Olivier Houchard Fri, 13 Apr 2018 06:57:16 -0700

Hi,

On Thu, Apr 12, 2018 at 11:33:43PM +0200, Willy Tarreau wrote:
> That may be the problem. But if a mux fails to initialize, then we can't
> destroy them either ? Is cs_destroy() the problem here maybe ? If so, I
> suspect that this approach would be more robust and would better match
> the validity tests on conn->mux that are performed at many places in the
> connection code :
> 
>       if (cs->conn->mux) {
>                   cs->conn->mux->detach(cs);
>       } else {
>             conn_stop_tracking(cs->conn);
>             conn_xprt_close(cs->conn);
>             conn_free(cs->conn);
>         }
>         cs_free(cs);
> 
> > >   - one above the return SF_ERR_INTERNAL above in connect_server() to
> > >     handle the case where the connection already exists but not the mux.
> > 
> > I'm not sure how this can happen, and what we should do if that happens.
> 
> At least in the current situation we pre-create a connection just to start
> to fill some information (source/destination address), without knowing the
> mux yet. It's an initialization process, it may look beautiful or ugly
> depending on the perspective, but in any case we must ensure that the
> code overall remains reliable in this situation. And to me it looks less
> ugly than picking a possibly wrong mux just to have a valid pointer
> avoiding a segfault.
> 
> But if it turns out that stabilizing the destroy path (or error path) is
> not sufficient and that it breaks somewhere else, we can well adopt this
> solution for the short term and see how to address it for the long term;
> we already know that the outgoing connection code is severely limited to
> the point of currently preventing us from using H2 on the backend, and
> that's exactly why we're currently working on it.


Ok, here is a patch that does exactly what you suggest. I'm not entirely
happy with it, but it'll do the job, as a stopgap. I want this crash
fixed :)

Olivier

>From 84f50217a88b2c8bb5152045e912a0e39f12d204 Mon Sep 17 00:00:00 2001
From: Olivier Houchard <[email protected]>
Date: Fri, 13 Apr 2018 15:50:27 +0200
Subject: [PATCH] BUG/MINOR: connection: Make sure we have a mux before calling
 detach().

In some cases, we call cs_destroy() very early, so early the connection
doesn't yet have a mux, so we can't call mux->detach(). In this case,
just destroy the associated connection.

This should be backported to 1.8.
---
 include/proto/connection.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/proto/connection.h b/include/proto/connection.h
index bc8d88484..8566736fd 100644
--- a/include/proto/connection.h
+++ b/include/proto/connection.h
@@ -699,7 +699,20 @@ static inline void conn_free(struct connection *conn)
 /* Release a conn_stream, and kill the connection if it was the last one */
 static inline void cs_destroy(struct conn_stream *cs)
 {
-       cs->conn->mux->detach(cs);
+       if (cs->conn->mux)
+               cs->conn->mux->detach(cs);
+       else {
+               /* It's too early to have a mux, let's just destroy
+                * the connection
+                */
+               struct connection *conn = cs->conn;
+
+               conn_stop_tracking(conn);
+               conn_full_close(conn);
+               if (conn->destroy_cb)
+                       conn->destroy_cb(conn);
+               conn_free(conn);
+       }
        cs_free(cs);
 }
 
-- 
2.14.3

Re: HAProxy 1.8.X crashing

Reply via email to