Hi,

On Sat, Mar 18, 2017 at 01:12:09PM +0100, Willy Tarreau wrote:
> OK here's a temporary patch. It includes a revert of the previous one and
> adds a condition for the wake-up. At least it passes all my tests, including
> those involving synchronous connection reports.
> 
> I'm not merging it yet as I'm wondering whether a reliable definitive
> solution should be done once for all (and backported) by addressing the
> root cause instead of constantly working around its consequences.

And here come two patches as a replacement for this temporary one. They
are safer and have been done after throrough code review. I spotted a
small tens of dirty corner cases having accumulated over the years due
to the unclear meaning of the CO_FL_CONNECTED flag. They'll have to be
addressed, but the current patches protect against these corner cases.
They survived all tests involving delayed connections and checks with
and without all handshake combinations, with tcp (immediate and delayed
requests and responses) and http (immediate, delayed requests and responses
and pipelining).

I'm resending the first one you already got Dmitry to make things easier
to follow for everyone. These three are to be applied on top of 1.7.3. I
still have a few other issues to deal with regarding 1.7 before doing a
new release (hopefully by the beginning of this week).

Thanks,
Willy
>From afbf56b951967e8fa4d509e423fdcb11c27d40e2 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Tue, 14 Mar 2017 20:19:29 +0100
Subject: BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the data
 layer
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4

Matthias Fechner reported a regression in 1.7.3 brought by the backport
of commit 819efbf ("BUG/MEDIUM: tcp: don't poll for write when connect()
succeeds"), causing some connections to fail to establish once in a while.
While this commit itself was a fix for a bad sequencing of connection
events, it in fact unveiled a much deeper bug going back to the connection
rework era in v1.5-dev12 : 8f8c92f ("MAJOR: connection: add a new
CO_FL_CONNECTED flag").

It's worth noting that in a lab reproducing a similar environment as
Matthias' about only 1 every 19000 connections exhibit this behaviour,
making the issue not so easy to observe. A trick to make the problem
more observable consists in disabling non-blocking mode on the socket
before calling connect() and re-enabling it later, so that connect()
always succeeds. Then it becomes 100% reproducible.

The problem is that this CO_FL_CONNECTED flag is tested after deciding to
call the data layer (typically the stream interface but might be a health
check as well), and that the decision to call the data layer relies on a
change of one of the flags covered by the CO_FL_CONN_STATE set, which is
made of CO_FL_CONNECTED among others.

Before the fix above, this bug couldn't appear with TCP but it could
appear with Unix sockets. Indeed, connect() was always considered
blocking so the CO_FL_WAIT_L4_CONN connection flag was always set, and
polling for write events was always enabled. This used to guarantee that
the conn_fd_handler() could detect a change among the CO_FL_CONN_STATE
flags.

Now with the fix above, if a connect() immediately succeeds for non-ssl
connection with send-proxy enabled, and no data in the buffer (thus TCP
mode only), the CO_FL_WAIT_L4_CONN flag is not set, the lack of data in
the buffer doesn't enable polling flags for the data layer, the
CO_FL_CONNECTED flag is not set due to send-proxy still being pending,
and once send-proxy is done, its completion doesn't cause the data layer
to be woken up due to the fact that CO_FL_CONNECT is still not present
and that the CO_FL_SEND_PROXY flag is not watched in CO_FL_CONN_STATE.

Then no progress is made when data are received from the client (and
attempted to be forwarded), because a CF_WRITE_NULL (or CF_WRITE_PARTIAL)
flag is needed for the stream-interface state to turn from SI_ST_CON to
SI_ST_EST, allowing ->chk_snd() to be called when new data arrive. And
the only way to set this flag is to call the data layer of course.

After the connect timeout, the connection gets killed and if in the mean
time some data have accumulated in the buffer, the retry will succeed.

This patch fixes this situation by simply placing the update of
CO_FL_CONNECTED where it should have been, before the check for a flag
change needed to wake up the data layer and not after.

This fix must be backported to 1.7, 1.6 and 1.5. Versions not having
the patch above are still affected for unix sockets.

Special thanks to Matthias Fechner who provided a very detailed bug
report with a bisection designating the faulty patch, and to Olivier
Houchard for providing full access to a pretty similar environment where
the issue could first be reproduced.
(cherry picked from commit 7bf3fa3c23f6a1b7ed1212783507ac50f7e27544)
---
 src/connection.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/connection.c b/src/connection.c
index 26fc5f6..1e4c9aa 100644
--- a/src/connection.c
+++ b/src/connection.c
@@ -131,6 +131,13 @@ void conn_fd_handler(int fd)
        }
 
  leave:
+       /* Verify if the connection just established. The CO_FL_CONNECTED flag
+        * being included in CO_FL_CONN_STATE, its change will be noticed by
+        * the next block and be used to wake up the data layer.
+        */
+       if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED))))
+               conn->flags |= CO_FL_CONNECTED;
+
        /* The wake callback may be used to process a critical error and abort 
the
         * connection. If so, we don't want to go further as the connection will
         * have been released and the FD destroyed.
@@ -140,10 +147,6 @@ void conn_fd_handler(int fd)
            conn->data->wake(conn) < 0)
                return;
 
-       /* Last check, verify if the connection just established */
-       if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED))))
-               conn->flags |= CO_FL_CONNECTED;
-
        /* remove the events before leaving */
        fdtab[fd].ev &= FD_POLL_STICKY;
 
-- 
2.8.0.rc2.1.gbe9624a

>From 8ca791a60b4ebd8f5b5042c6e0bf09fe9d5e3308 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Sat, 18 Mar 2017 17:11:37 +0100
Subject: BUG/MAJOR: stream-int: do not depend on connection flags to detect
 connection
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4

Recent fix 7bf3fa3 ("BUG/MAJOR: connection: update CO_FL_CONNECTED before
calling the data layer") marked an end to a fragile situation where the
absence of CO_FL_{CONNECTED,L4,L6}* flags is used to mark the completion
of a connection setup. The problem is that by setting the CO_FL_CONNECTED
flag earlier, we can indeed call the ->wake() function from conn_fd_handler
but the stream-interface's wake function needs to see CO_FL_CONNECTED unset
to detect that a connection has just been established, so if there's no
pending data in the buffer, the connection times out. The other ->wake()
functions (health checks and idle connections) don't do this though.

So instead of trying to detect a subtle change in connection flags,
let's simply rely on the stream-interface's state and validate that the
connection is properly established and that handshakes are completed
before reporting the WRITE_NULL indicating that a pending connection was
just completed.

This patch passed all tests of handshake and non-handshake combinations,
with synchronous and asynchronous connect() and should be safe for backport
to 1.7, 1.6 and 1.5 when the fix above is already present.

(cherry picked from commit 52821e27376f89b41167565b01d975f47266284c)
---
 src/stream_interface.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/stream_interface.c b/src/stream_interface.c
index 758aec7..651340b 100644
--- a/src/stream_interface.c
+++ b/src/stream_interface.c
@@ -563,7 +563,8 @@ static int si_conn_wake_cb(struct connection *conn)
        if (conn->flags & CO_FL_ERROR)
                si->flags |= SI_FL_ERR;
 
-       if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED)))) {
+       if ((si->state < SI_ST_EST) &&
+           (conn->flags & (CO_FL_CONNECTED | CO_FL_HANDSHAKE)) == 
CO_FL_CONNECTED) {
                si->exp = TICK_ETERNITY;
                oc->flags |= CF_WRITE_NULL;
        }
-- 
2.8.0.rc2.1.gbe9624a

>From 9a4e166382fb90a166ae91ad3071842cdef9cc27 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Sun, 19 Mar 2017 07:54:28 +0100
Subject: BUG/MEDIUM: connection: ensure to always report the end of handshakes
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4

Despite the previous commit working fine on all tests, it's still not
sufficient to completely address the problem. If the connection handler
is called with an event validating an L4 connection but some handshakes
remain (eg: accept-proxy), it will still wake the function up, which
will not report the activity, and will not detect a change once the
handshake it complete so it will not notify the ->wake() handler.

In fact the only reason why the ->wake() handler is still called here
is because after dropping the last handshake, we try to call ->recv()
and ->send() in turn and change the flags in order to detect a data
activity. But if for any reason the data layer is not interested in
reading nor writing, it will not get these events.

A cleaner way to address this is to call the ->wake() handler only
on definitive status changes (shut, error), on real data activity,
and on a complete connection setup, measured as CONNECTED with no
more handshake pending.

It could be argued that the handshake flags have to be made part of
the condition to set CO_FL_CONNECTED but that would currently break
a part of the health checks. Also a handshake could appear at any
moment even after a connection is established so we'd lose the
ability to detect a second end of handshake.

For now the situation around CO_FL_CONNECTED is not clean :
  - session_accept() only sets CO_FL_CONNECTED if there's no pending
    handshake ;

  - conn_fd_handler() will set it once L4 and L6 are complete, which
    will do what session_accept() above refrained from doing even if
    an accept_proxy handshake is still pending ;

  - ssl_sock_infocbk() and ssl_sock_handshake() consider that a
    handshake performed with CO_FL_CONNECTED set is a renegociation ;
    => they should instead filter on CO_FL_WAIT_L6_CONN

  - all ssl_fc_* sample fetch functions wait for CO_FL_CONNECTED before
    accepting to fetch information
    => they should also get rid of any pending handshake

  - smp_fetch_fc_rcvd_proxy() uses !CO_FL_CONNECTED instead of
    CO_FL_ACCEPT_PROXY

  - health checks (standard and tcp-checks) don't check for HANDSHAKE
    and may report a successful check based on CO_FL_CONNECTED while
    not yet done (eg: send buffer full on send_proxy).

This patch aims at solving some of these side effects in a backportable
way before this is reworked in depth :
  - we need to call ->wake() to report connection success, measure
    connection time, notify that the data layer is ready and update
    the data layer after activity ; this has to be done either if
    we switch from pending {L4,L6}_CONN to nothing with no handshakes
    left, or if we notice some handshakes were pending and are now
    done.

  - we document that CO_FL_CONNECTED exactly means "L4 connection
    setup confirmed at least once, L6 connection setup confirmed
    at least once or not necessary, all this regardless of any
    possibly remaining handshakes or future L6 negociations".

This patch also renames CO_FL_CONN_STATUS to the more explicit
CO_FL_NOTIFY_DATA, and works around the previous flags trick consiting
in setting an impossible combination of flags to notify the data layer,
by simply clearing the current flags.

This fix should be backported to 1.7, 1.6 and 1.5.

(cherry picked from commit 3c0cc49d30968cf839a1d3a747de6adda18d26db)
---
 include/types/connection.h | 10 +++++-----
 src/connection.c           | 41 ++++++++++++++++++++++++++---------------
 2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/include/types/connection.h b/include/types/connection.h
index b65ad27..02eac93 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -95,15 +95,15 @@ enum {
        CO_FL_SOCK_RD_SH    = 0x00040000,  /* SOCK layer was notified about 
shutr/read0 */
        CO_FL_SOCK_WR_SH    = 0x00080000,  /* SOCK layer asked for shutw */
 
-       /* flags used to report connection status and errors */
+       /* flags used to report connection errors or other closing conditions */
        CO_FL_ERROR         = 0x00100000,  /* a fatal error was reported     */
-       CO_FL_CONNECTED     = 0x00200000,  /* the connection is now established 
*/
+       CO_FL_NOTIFY_DATA   = 0x001F0000,  /* any shut/error flags above needs 
to be reported */
+
+       /* flags used to report connection status updates */
+       CO_FL_CONNECTED     = 0x00200000,  /* L4+L6 now ready ; extra 
handshakes may or may not exist */
        CO_FL_WAIT_L4_CONN  = 0x00400000,  /* waiting for L4 to be connected */
        CO_FL_WAIT_L6_CONN  = 0x00800000,  /* waiting for L6 to be connected 
(eg: SSL) */
 
-       /* synthesis of the flags above */
-       CO_FL_CONN_STATE    = 0x00FF0000,  /* all shut/connected flags */
-
        /*** All the flags below are used for connection handshakes. Any new
         * handshake should be added after this point, and CO_FL_HANDSHAKE
         * should be updated.
diff --git a/src/connection.c b/src/connection.c
index 1e4c9aa..3629094 100644
--- a/src/connection.c
+++ b/src/connection.c
@@ -99,19 +99,21 @@ void conn_fd_handler(int fd)
         */
        if (conn->xprt && fd_recv_ready(fd) &&
            ((conn->flags & 
(CO_FL_DATA_RD_ENA|CO_FL_WAIT_ROOM|CO_FL_ERROR|CO_FL_HANDSHAKE)) == 
CO_FL_DATA_RD_ENA)) {
-               /* force detection of a flag change : it's impossible to have 
both
-                * CONNECTED and WAIT_CONN so we're certain to trigger a change.
+               /* force reporting of activity by clearing the previous flags :
+                * we'll have at least ERROR or CONNECTED at the end of an I/O,
+                * both of which will be detected below.
                 */
-               flags = CO_FL_WAIT_L4_CONN | CO_FL_CONNECTED;
+               flags = 0;
                conn->data->recv(conn);
        }
 
        if (conn->xprt && fd_send_ready(fd) &&
            ((conn->flags & 
(CO_FL_DATA_WR_ENA|CO_FL_WAIT_DATA|CO_FL_ERROR|CO_FL_HANDSHAKE)) == 
CO_FL_DATA_WR_ENA)) {
-               /* force detection of a flag change : it's impossible to have 
both
-                * CONNECTED and WAIT_CONN so we're certain to trigger a change.
+               /* force reporting of activity by clearing the previous flags :
+                * we'll have at least ERROR or CONNECTED at the end of an I/O,
+                * both of which will be detected below.
                 */
-               flags = CO_FL_WAIT_L4_CONN | CO_FL_CONNECTED;
+               flags = 0;
                conn->data->send(conn);
        }
 
@@ -129,21 +131,30 @@ void conn_fd_handler(int fd)
                if (!tcp_connect_probe(conn))
                        goto leave;
        }
-
  leave:
-       /* Verify if the connection just established. The CO_FL_CONNECTED flag
-        * being included in CO_FL_CONN_STATE, its change will be noticed by
-        * the next block and be used to wake up the data layer.
-        */
+       /* Verify if the connection just established. */
        if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED))))
                conn->flags |= CO_FL_CONNECTED;
 
-       /* The wake callback may be used to process a critical error and abort 
the
-        * connection. If so, we don't want to go further as the connection will
-        * have been released and the FD destroyed.
+       /* The wake callback is normally used to notify the data layer about
+        * data layer activity (successful send/recv), connection establishment,
+        * shutdown and fatal errors. We need to consider the following
+        * situations to wake up the data layer :
+        *  - change among the CO_FL_NOTIFY_DATA flags :
+        *      {DATA,SOCK}_{RD,WR}_SH, ERROR,
+        *  - absence of any of {L4,L6}_CONN and CONNECTED, indicating the
+        *    end of handshake and transition to CONNECTED
+        *  - raise of CONNECTED with HANDSHAKE down
+        *  - end of HANDSHAKE with CONNECTED set
+        *  - regular data layer activity
+        *
+        * Note that the wake callback is allowed to release the connection and
+        * the fd (and return < 0 in this case).
         */
        if ((conn->flags & CO_FL_WAKE_DATA) &&
-           ((conn->flags ^ flags) & CO_FL_CONN_STATE) &&
+           (((conn->flags ^ flags) & CO_FL_NOTIFY_DATA) ||
+            ((flags & (CO_FL_CONNECTED|CO_FL_HANDSHAKE)) != CO_FL_CONNECTED &&
+             (conn->flags & (CO_FL_CONNECTED|CO_FL_HANDSHAKE)) == 
CO_FL_CONNECTED)) &&
            conn->data->wake(conn) < 0)
                return;
 
-- 
2.8.0.rc2.1.gbe9624a

Reply via email to