On Fri, Mar 17, 2017 at 12:10:43PM +0300, Dmitry Sivachenko wrote:
> 
> > On 17 Mar 2017, at 12:04, Willy Tarreau <[email protected]> wrote:
> > 
> > Hi Dmitry,
> > 
> > On Wed, Mar 15, 2017 at 12:45:54AM +0300, Dmitry Sivachenko wrote:
> >> I committed your patch to FreeBSD ports.
> > 
> > I was just reported an undesired side effect of this patch with smtp
> > in clear without proxy-proto :-(
> > 
> > [...]
> 
> 
> 
> Okay, thanks for information.  I have no other complains so far, so I wait a 
> bit for an update.

OK here's a temporary patch. It includes a revert of the previous one and
adds a condition for the wake-up. At least it passes all my tests, including
those involving synchronous connection reports.

I'm not merging it yet as I'm wondering whether a reliable definitive
solution should be done once for all (and backported) by addressing the
root cause instead of constantly working around its consequences.

Willy
>From 42121fa95cf77be0384ab6ce597a88c2fe6e2cd6 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Sat, 18 Mar 2017 12:04:38 +0100
Subject: WIP/BUG/MAJOR: connection: also call ->wake() after end of handshakes
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4

We have an issue with the relation between CO_FL_CONNECTED and the
various handshake flags which was uncovered by recent fix 7bf3fa3
("BUG/MAJOR: connection: update CO_FL_CONNECTED before calling the
data layer").

The flag was normally set *after* calling ->wake() to indicate that
both the control layer and the transport layer are ready, and that
the connection is ready to exchange data. It is important that the
->wake() call sees the transition with both CO_FL_{L4,L6}_CONN cleared
and CO_FL_CONNECTED not yet set, to detect the status change.

But we have an issue with handshakes like the proxy protocol. These
ones have to be completed before the data layer may be used, and
after completion polling for data may be enabled. The problem comes
from the relation between CO_FL_CONNECT and the handshakes. In fact,
the SSL handshake is used to complete the transport layer setup, so
there's a direct 1-to-1 relation between L6_CONN and the SSL handshake.
It will be broken if we later decide to implement upgradable SSL by
the way. Some handshakes like accepting the proxy protocol or sending
it area easily hidden behind the SSL handshake (which has to be
performed on top of it), so they're done with CO_FL_CONNECTED unset.
But if we only have the proxy protocol and no SSL, we end up in a
situation where CO_FL_CONNECTED might already be set and where the
removal of the handshake flag doesn't constitute a sufficient
condition to notify the data layer about the availability of the
connection, which fails each time the connection succeeds
synchronously without any data to send.

It could be argued that the handshake flags have to be made part of
the condition to et CO_FL_CONNECTED but that would currently break a
part of the health checks. Also a handshake could appear at any
moment even after a connection is established so we'd lose the
ability to detect a second end of handshake.

For now the situation is not clean :
  - session_accept() only sets CO_FL_CONNECTED if there's no pending
    handshake ;

  - conn_fd_handler() will set it once L4 and L6 are complete, which
    will do what session_accept() above refrained from doing even if
    an accept_proxy handshake is still pending ;

  - ssl_sock_info_cbk() and ssl_sock_handshake() consider that a
    handshake performed with CO_FL_CONNECTED set is a renegociation ;

  - all ssl_fc_* sample fetch functions wait for CO_FL_CONNECTED before
    accepting to fetch information

Only health checks validate the end of handshakes and the connected flag
separately.

This patch aims at solving the side effects in a backportable way
before this is reworked in depth :
  - we need to call ->wake() to report connection success, measure
    connection time, and notify that the data layer is ready ; this
    has to be done either if we switch from pending {L4,L6}_CONN to
    nothing, or if we notice some handshakes were pending and are
    now done.

  - we must only set CO_FL_CONNECTED after calling ->wake() because
    this one requires all flags down to detect the transition ; this
    rolls back previous change ;

  - we must ensure that we'll see CO_FL_{CONNECTED,L4,L6} cleared at
    least once in each connection's lifetime to ensure no wake up
    event may be lost.

  - we document that CO_FL_CONNECTED exactly means "L4 connection
    setup confirmed at least once, L6 connection setup confirmed
    at least once or not necessary, all this regardless of any
    possibly remaining handshakes or future L6 negociations".

For later, si_conn_wake_cb() has to be fixed not to consider the
connection's flags but the stream-interface's and channel status
instead, making it agnostic to the change on the CO_FL_CONNECTED flag,
and other data layers (checks) will likely have to do something
similar.
---
 include/types/connection.h |  2 +-
 src/connection.c           | 25 ++++++++++++++++++-------
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/include/types/connection.h b/include/types/connection.h
index c644dd5..6ec0e4f 100644
--- a/include/types/connection.h
+++ b/include/types/connection.h
@@ -98,7 +98,7 @@ enum {
 
        /* flags used to report connection status and errors */
        CO_FL_ERROR         = 0x00100000,  /* a fatal error was reported     */
-       CO_FL_CONNECTED     = 0x00200000,  /* the connection is now established 
*/
+       CO_FL_CONNECTED     = 0x00200000,  /* L4+L6 now ready ; extra 
handshakes may or may not exist */
        CO_FL_WAIT_L4_CONN  = 0x00400000,  /* waiting for L4 to be connected */
        CO_FL_WAIT_L6_CONN  = 0x00800000,  /* waiting for L6 to be connected 
(eg: SSL) */
 
diff --git a/src/connection.c b/src/connection.c
index bc34ae7..ab5fcd4 100644
--- a/src/connection.c
+++ b/src/connection.c
@@ -132,21 +132,32 @@ void conn_fd_handler(int fd)
        }
 
  leave:
-       /* Verify if the connection just established. The CO_FL_CONNECTED flag
-        * being included in CO_FL_CONN_STATE, its change will be noticed by
-        * the next block and be used to wake up the data layer.
+       /* The wake callback is normally used to notify about connection 
establishment
+        * and critical errors. The called functions need to detect the change 
so the
+        * CO_FL_CONNECTED flag must not have been updated yet. However we need 
to
+        * consider the following situations to wake up the data layer :
+        *  - change among the CO_FL_CONN_STATE flags :
+        *      {DATA,SOCK}_{RD,WR}_SH, ERROR, CONNECTED, WAIT_L4_CONN, 
WAIT_L6_CONN
+        *  - absence of any of {L4,L6}_CONN and CONNECTED, indicating the
+        *    end of handshake and transition to CONNECTED
+        *  - end of handshakes
         */
-       if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED))))
-               conn->flags |= CO_FL_CONNECTED;
 
        /* The wake callback may be used to process a critical error and abort 
the
         * connection. If so, we don't want to go further as the connection will
         * have been released and the FD destroyed.
         */
        if ((conn->flags & CO_FL_WAKE_DATA) &&
-           ((conn->flags ^ flags) & CO_FL_CONN_STATE) &&
-           conn->data->wake(conn) < 0)
+           (((conn->flags ^ flags) & CO_FL_CONN_STATE) ||
+            ((flags & CO_FL_HANDSHAKE) && !(conn->flags & CO_FL_HANDSHAKE))) &&
+           conn->data->wake(conn) < 0) {
                return;
+       }
+
+       /* Now set the CO_FL_CONNECTED flag if the connection was just 
established. */
+
+       if (unlikely(!(conn->flags & (CO_FL_WAIT_L4_CONN | CO_FL_WAIT_L6_CONN | 
CO_FL_CONNECTED))))
+               conn->flags |= CO_FL_CONNECTED;
 
        /* remove the events before leaving */
        fdtab[fd].ev &= FD_POLL_STICKY;
-- 
2.8.0.rc2.1.gbe9624a

Reply via email to