The branch, master has been updated
       via  4569c652881 ctdb-scripts: Add configuration variable CTDB_KILLTCP_USE_SS_KILL
       via  19e65f4012f ctdb-scripts: Factor out function kill_tcp_summarise()
       via  590a86dbe4a ctdb-scripts: Track connections for all ports for public IPs
       via  c3695722b63 ctdb-scripts: Get connections after tickle list
       via  9683bb3ac2b ctdb-scripts: Move connection tracking to 10.interface
       via  d39a1cc1d4f ctdb-server: Use ctdb_connection_same() to simplify
       via  1b1fd5c2280 ctdb: Don't leak a pointer on talloc_realloc failure
       via  e080add68ab ctdb: Save a few lines with talloc_zero()
       via  762f5f5ca63 ctdb-server: Remove duplicate logic
       via  5af8627feb8 ctdb-server: Handle pre-existing connection first
       via  9838b4d0d6c ctdb-server: Drop an unnecessary variable
       via  f4a8f84328c ctdb-server: Drop a log message to DEBUG level
       via  3c19c8df778 ctdb-server: Clean up connection tracking functions
       via  0505d06b12a ctdb-scripts: Use ss -H option to simplify
       via  32e4f786601 ctdb-scripts: Remove superseded compatibility code
       via  b3e2c69ad92 ctdb-scripts: update_tickles() should use the public IPs cache
       via  1a4a6c46f1c ctdb-scripts: Don't list connections when not hosting IPs
       via  3410eddd932 ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
       via  025bd34dfcf ctdb-doc: Improve 10.interface documentation and comments
       via  60067e2a74d ctdb-tests: Fix ss -a not supported
       via  4817e32c1da ctdb-tests: Drop unsupported long options from ss stub usage
       via  557b0342002 ctdb-tests: Ensure ss stub handles square brackets around addresses
      from  982042115b1 libndr: specialise ndr_token_find() for key pointer comparison

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 4569c65288177969ca1e4d9bd6badec60552beb9
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Tue Aug 22 12:13:44 2023 +1000

    ctdb-scripts: Add configuration variable CTDB_KILLTCP_USE_SS_KILL
    
    This allows CTDB to be configured to use "ss -K" to reset TCP
    connections on "releaseip".  This is only supported when the kernel is
    configured with CONFIG_INET_DIAG_DESTROY enabled.
    
    From the documentation:
    
       ss -K has been supported in ss since iproute 4.5 in March 2016 and
       in the Linux kernel since 4.4 in December 2015.  However, the
       required kernel configuration item CONFIG_INET_DIAG_DESTROY is
       disabled by default.  Although enabled in Debian kernels since
       ~2017 and in Ubuntu since at least 18.04, this has only recently
       been enabled in distributions such as RHEL.  There seems to be no
       way, including running ss -K, to determine if this is supported, so
       use of this feature needs to be configurable.  When available, it
       should be the fastest, most reliable way of killing connections.
    
    For RHEL and derivatives, this was enabled as follows:
    
    * RHEL 8 via https://bugzilla.redhat.com/show_bug.cgi?id=2230213,
      arriving in version kernel-4.18.0-513.5.1.el8_9
    
    * RHEL 9 via https://issues.redhat.com/browse/RHEL-212, arriving in
      kernel-5.14.0-360.el9
    
    Enabling this option results in a small behaviour change because ss -K
    always does a 2-way kill (i.e. it also sends a RST to the client).
    Only a 1-way kill is done for SMB connections when ctdb_killtcp is
    used - the reasons for this are shrouded in history and the 2-way kill
    seems to work fine.
    
    For the summary that is logged, when CTDB_KILLTCP_USE_SS_KILL is "yes"
    or "try", always log the method used, even the fallback to
    ctdb_killtcp.  However, when set to "no", maintain the existing
    output.
    
    The decision to use -K rather than --kill is because short options are
    trivial to implement in test stubs.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>
    
    Autobuild-User(master): Martin Schwenke <mart...@samba.org>
    Autobuild-Date(master): Thu Nov  7 00:12:34 UTC 2024 on atb-devel-224
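
Illustration (not part of the commit): the three values of CTDB_KILLTCP_USE_SS_KILL described above can be sketched as a small decision function. This is a minimal sketch under stated assumptions. The function name choose_kill_methods is hypothetical, and it only prints the methods that would be attempted, in order, rather than running ss or ctdb_killtcp.

```shell
#!/bin/sh

# Hypothetical helper (not in CTDB): print the kill methods that
# would be attempted, in order, for a given setting.
choose_kill_methods()
{
	case "$1" in
	yes)
		# Rely on ss -K only, no fallback
		echo "ss -K"
		;;
	try)
		# Attempt ss -K, then fall back to ctdb_killtcp
		echo "ss -K"
		echo "ctdb_killtcp"
		;;
	*)
		# "no" or unset: existing behaviour
		echo "ctdb_killtcp"
		;;
	esac
}

choose_kill_methods "try"
```

With "try", both methods are listed, which matches the commit's note that the summary logs the method used even for the fallback.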

commit 19e65f4012f286b279dbefeae74500d867592a27
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Fri Aug 25 10:00:57 2023 +1000

    ctdb-scripts: Factor out function kill_tcp_summarise()
    
    This will be used in a slightly different context in a subsequent
    commit.  In that case, the number of killed connections will be passed
    instead of the total number of connections, so support this here via
    different modes instead of churning later.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>
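
Illustration (not part of the commit): the two modes differ only in which quantity is passed in and which is derived. The sketch below isolates that arithmetic. The name summarise_counts is hypothetical, and the remaining connection count is passed as an argument here, whereas the real function obtains it by listing connections.

```shell
#!/bin/sh

# Hypothetical helper (not in CTDB): given a mode, a count and the
# number of connections still remaining, print "killed/total".
summarise_counts()
{
	_mode="$1"
	_count="$2"
	_remaining="$3"

	case "$_mode" in
	total)
		# Count is the number of connections found beforehand
		_total="$_count"
		_killed=$((_total - _remaining))
		;;
	killed)
		# Count is the number of connections reported killed
		_killed="$_count"
		_total=$((_killed + _remaining))
		;;
	*)
		echo "unknown mode: ${_mode}" >&2
		return 1
		;;
	esac

	echo "${_killed}/${_total}"
}

summarise_counts total 5 2
summarise_counts killed 3 2
```

Both calls describe the same situation (3 of 5 connections killed, 2 remaining) from the two different starting points.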

commit 590a86dbe4adf45ac8d15497934e25ea98148034
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Oct 23 14:17:36 2023 +1100

    ctdb-scripts: Track connections for all ports for public IPs
    
    Currently TCP ports like NFS lock manager are not tracked.  It is
    easier to track all connections than to add a configuration system to
    try to track specified ports, so do that.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit c3695722b6316b624aa6c44cad4f44279303d1b1
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 10:50:00 2024 +1000

    ctdb-scripts: Get connections after tickle list
    
    Running ss to get current connections before running ctdb gettickles
    means the ss output might be out of date when the 2 lists are
    compared.  Some tickles might have been added after ss was run by some
    other means (e.g. SMB tickles, added internally) and they would be
    deleted according to the stale ss output.
    
    This isn't currently a problem because update_tickles() is currently
    only called with port 2049, so all tickles are managed by this code.
    That will change in a subsequent commit.
    
    Changing the order means the reverse problem can occur, where
    update_tickles() attempts to delete an already deleted tickle.  That
    may happen occasionally but is harmless because it doesn't result in
    missing information.  It (currently) just causes a message to be
    logged at DEBUG level.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>
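
Illustration (not part of the commit): the comparison of the two sorted lists can be sketched with comm. The addresses are made up. The diff in this mail shows comm -23 feeding "ctdb addtickle"; the deletion side is cut off by the truncation, so the use of comm -13 for it below is an assumption.

```shell
#!/bin/sh

# Illustrative data only: these addresses are made up
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Tickles currently registered, sorted
printf '%s\n' \
	"192.168.1.10:2049 10.0.0.1:1001" \
	"192.168.1.10:445 10.0.0.2:1002" |
	sort >"${tmpdir}/tickles"

# Connections currently established, sorted
printf '%s\n' \
	"192.168.1.10:2049 10.0.0.1:1001" \
	"192.168.1.10:2049 10.0.0.3:1003" |
	sort >"${tmpdir}/connections"

# Connections without a tickle: candidates to add
to_add=$(comm -23 "${tmpdir}/connections" "${tmpdir}/tickles")
# Tickles without a connection: candidates to delete (assumed)
to_delete=$(comm -13 "${tmpdir}/connections" "${tmpdir}/tickles")

echo "add: ${to_add}"
echo "delete: ${to_delete}"
```

A tickle that is missing from the connection list is a deletion candidate, which is why a connection list older than the tickle list could cause a freshly added tickle to be removed.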

commit 9683bb3ac2bbdf0e83c3be3681f9d1c8ee7cc327
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Oct 23 14:05:21 2023 +1100

    ctdb-scripts: Move connection tracking to 10.interface
    
    This should really be done for all connections to public IP addresses.
    Leave the port number there for now - this is just the first step.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit d39a1cc1d4f874e398f87a6778a868ec1f9178eb
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:21:59 2024 +1000

    ctdb-server: Use ctdb_connection_same() to simplify
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 1b1fd5c2280ee7f5a3caba0779bf5208c11359db
Author: Volker Lendecke <v...@samba.org>
Date:   Wed Nov 6 11:51:04 2024 +0100

    ctdb: Don't leak a pointer on talloc_realloc failure
    
    We should not directly overwrite the pointer we are realloc'ing.
    
    Signed-off-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit e080add68ab748a290533ec1fcb97c6aef319418
Author: Volker Lendecke <v...@samba.org>
Date:   Wed Nov 6 11:49:36 2024 +0100

    ctdb: Save a few lines with talloc_zero()
    
    Signed-off-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 762f5f5ca6350cc0b93c71f06abc963e13793e0e
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:40:57 2024 +1000

    ctdb-server: Remove duplicate logic
    
    Initialise the pointer to NULL and fall through to let
    talloc_realloc() do the allocation.  talloc_realloc() does the right
    thing with a NULL pointer...
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 5af8627feb805b65b9bf28a295f2f7f81f5f8826
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:37:57 2024 +1000

    ctdb-server: Handle pre-existing connection first
    
    This is cheap when tcparray is NULL and lets the code that now
    follows be simplified.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 9838b4d0d6cfdcf87c5aa6eac2252dd1579173cf
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:34:18 2024 +1000

    ctdb-server: Drop an unnecessary variable
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit f4a8f84328c5e692ce63bec05bb71fcb469a3e9c
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:30:13 2024 +1000

    ctdb-server: Drop a log message to DEBUG level
    
    This is harmless, so it doesn't generally need to be logged.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 3c19c8df778070705485b3c993e695ca1636bfa7
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 30 12:22:46 2024 +1000

    ctdb-server: Clean up connection tracking functions
    
    Apply README.Coding, modernise logging, pre-render connection as a
    string for logging, switch terminology from "tickle" to "connection",
    tidy up comments.
    
    No changes in functionality.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 0505d06b12a04a5c5e813fb3f4799278f9e5b7eb
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 16 12:26:53 2024 +1000

    ctdb-scripts: Use ss -H option to simplify
    
    This option has been available since ~2018 and has been implemented in
    the stub since then.  I guess we didn't use it because CentOS 7?
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>
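
Illustration (not part of the commit): the simplification is that header-skipping logic such as "NR > 1" or "tail -n +2" is no longer needed once ss suppresses the header itself. The sketch below uses simulated output rather than calling ss, and the column layout is illustrative, not captured from a real system.

```shell
#!/bin/sh

# Simulated "ss -tn state established" output, header included
with_header()
{
	cat <<EOF
Recv-Q Send-Q Local Address:Port Peer Address:Port
0      0      192.168.1.10:2049  10.0.0.1:1001
EOF
}

# The same output as it would look with -H: no header line
without_header()
{
	with_header | tail -n +2
}

# Old style: skip the header in awk
old=$(with_header | awk 'NR > 1 {print $3, $4}')
# New style: nothing to skip
new=$(without_header | awk '{print $3, $4}')

echo "$old"
echo "$new"
```

Both pipelines yield the same local and peer address pair.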

commit 32e4f786601712e57992ce4c8f46e5d38620a5dd
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Oct 23 14:23:45 2023 +1100

    ctdb-scripts: Remove superseded compatibility code
    
    Since commit 224e99804efef960ef4ce2ff2f4f6dced1e74146, square brackets
    have been parsed by daemon and tool code, so drop the compatibility
    code from here.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit b3e2c69ad92c0d20bb10146d2dd6d0d475455298
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Thu Sep 19 14:32:46 2024 +1000

    ctdb-scripts: update_tickles() should use the public IPs cache
    
    This avoids duplicating logic.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 1a4a6c46f1cdabfea67c264d6576a597a70c3007
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Thu Sep 19 13:52:48 2024 +1000

    ctdb-scripts: Don't list connections when not hosting IPs
    
    With an empty IP filter, all incoming connections to port 2049 will be
    listed, not just those to public IP addresses.  This causes error
    messages like the following to be logged:
    
      ctdb-eventd[...]: 60.nfs: Failed to add 1 tickles
    
    since the connection being added seems to be for a random NFS mount
    that doesn't use a public IP address.
    
    This has been a problem for a long time (probably since commit
    04fe9e20749985c71fef1bce7f6e4c439fe11c81 in 2015).  It isn't currently
    a huge deal because it only affects NFS connections.  However, this
    code will soon be used to track connections to public IP addresses on
    all ports.  This would result in a constant stream of log messages,
    since there will always be some active connections.
    
    The theory behind the fix is that if a node hosts no public IPs then
    it should have no relevant connections and has no business changing
    the list of registered tickles.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>
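
Illustration (not part of the commit): the fix amounts to refusing to build an ss filter when the list of hosted public IPs is empty. The sketch below follows the filter construction visible in the diff, but the function name build_ip_filter and the temporary cache file are hypothetical stand-ins.

```shell
#!/bin/sh

# Hypothetical helper (not in CTDB): build an ss filter from a file
# with one public IP per line.  Prints nothing and returns 1 if the
# file is empty or missing.
build_ip_filter()
{
	_cache="$1"

	# Not hosting any public IPs, so no relevant connections
	if [ ! -s "$_cache" ]; then
		return 1
	fi

	_ip_filter=""
	while read -r _ip; do
		_ip_filter="${_ip_filter}${_ip_filter:+ || }src [${_ip}]"
	done <"$_cache"

	# Parentheses keep the alternatives grouped
	echo "( ${_ip_filter} )"
}

cache=$(mktemp)
printf '%s\n' "192.168.1.10" "192.168.1.11" >"$cache"
build_ip_filter "$cache"
rm -f "$cache"
```

Without the emptiness check, the filter would be empty and every established connection would match, which is the situation the commit message describes.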

commit 3410eddd932b430acc687c81a5dc6e62a0a420a6
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Fri Sep 13 16:21:24 2024 +1000

    ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
    
    Massage a couple of lines manually so they're formatted sanely given
    the new indentation.   Re-run shfmt to ensure no further changes.
    
    Best reviewed with "git show -w".
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 025bd34dfcf790d06080501f0263667506137736
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Tue Aug 22 12:12:50 2023 +1000

    ctdb-doc: Improve 10.interface documentation and comments
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 60067e2a74d58d9b31a5eef657ec33fbdc7ec514
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 16 12:32:02 2024 +1000

    ctdb-tests: Fix ss -a not supported
    
    This is currently just a series of typos.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 4817e32c1da5e9d6f0e3594e67f1d2bed66463ac
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Mon Sep 16 12:19:00 2024 +1000

    ctdb-tests: Drop unsupported long options from ss stub usage
    
    These have not been supported since commit
    896c77df1ce2645c6dd7898b59ea802e204dc7d9 in 2018.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

commit 557b034200269aadb5c23d53207a988fc313c97f
Author: Martin Schwenke <mschwe...@ddn.com>
Date:   Fri Oct 27 11:06:23 2023 +1100

    ctdb-tests: Ensure ss stub handles square brackets around addresses
    
    It isn't unreasonable for unit test cases to use square brackets in
    their input.
    
    Signed-off-by: Martin Schwenke <mschwe...@ddn.com>
    Reviewed-by: Volker Lendecke <v...@samba.org>
    Reviewed-by: Jerry Heyman <jhey...@ddn.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events/legacy/10.interface.script    | 124 ++++++++++----------
 ctdb/config/events/legacy/60.nfs.script          |   1 -
 ctdb/config/functions                            | 118 +++++++++++--------
 ctdb/doc/ctdb-script.options.5.xml               |  94 ++++++++++++++-
 ctdb/server/ctdb_takeover.c                      | 141 ++++++++++++-----------
 ctdb/tests/UNIT/eventscripts/10.interface.020.sh |  27 +++++
 ctdb/tests/UNIT/eventscripts/10.interface.021.sh |  32 +++++
 ctdb/tests/UNIT/eventscripts/10.interface.022.sh |  35 ++++++
 ctdb/tests/UNIT/eventscripts/10.interface.023.sh |  40 +++++++
 ctdb/tests/UNIT/eventscripts/10.interface.030.sh |  27 +++++
 ctdb/tests/UNIT/eventscripts/10.interface.031.sh |  35 ++++++
 ctdb/tests/UNIT/eventscripts/10.interface.032.sh |  40 +++++++
 ctdb/tests/UNIT/eventscripts/10.interface.033.sh |  52 +++++++++
 ctdb/tests/UNIT/eventscripts/stubs/ss            |  37 ++++--
 14 files changed, 615 insertions(+), 188 deletions(-)
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.020.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.021.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.022.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.023.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.030.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.031.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.032.sh
 create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.033.sh


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events/legacy/10.interface.script b/ctdb/config/events/legacy/10.interface.script
index 9aa067b4a61..8d2d6968a1d 100755
--- a/ctdb/config/events/legacy/10.interface.script
+++ b/ctdb/config/events/legacy/10.interface.script
@@ -1,11 +1,9 @@
 #!/bin/sh
 
-#################################
-# interface event script for ctdb
-# this adds/removes IPs from your 
-# public interface
+# Handle public IP address release and takeover, as well as monitoring
+# interfaces used by public IP addresses.
 
-[ -n "$CTDB_BASE" ] || \
+[ -n "$CTDB_BASE" ] ||
        CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD")
 
 . "${CTDB_BASE}/functions"
@@ -13,7 +11,7 @@
 load_script_options
 
 if ! have_public_addresses; then
-       if [ "$1" = "init" ] ; then
+       if [ "$1" = "init" ]; then
                echo "No public addresses file found"
        fi
        exit 0
@@ -32,8 +30,8 @@ monitor_interfaces()
        #
        # public_ifaces set by get_public_ifaces() above
        # shellcheck disable=SC2154
-       for _iface in $public_ifaces ; do
-               if interface_monitor "$_iface" ; then
+       for _iface in $public_ifaces; do
+               if interface_monitor "$_iface"; then
                        up_interfaces_found=true
                        $CTDB setifacelink "$_iface" up >/dev/null 2>&1
                else
@@ -42,11 +40,11 @@ monitor_interfaces()
                fi
        done
 
-       if ! $down_interfaces_found ; then
+       if ! $down_interfaces_found; then
                return 0
        fi
 
-       if ! $up_interfaces_found ; then
+       if ! $up_interfaces_found; then
                return 1
        fi
 
@@ -58,63 +56,61 @@ monitor_interfaces()
 }
 
 # Sets: iface, ip, maskbits
-get_iface_ip_maskbits ()
+get_iface_ip_maskbits()
 {
-    _iface_in="$1"
-    ip="$2"
-    _maskbits_in="$3"
-
-    # Intentional word splitting here
-    # shellcheck disable=SC2046
-    set -- $(ip_maskbits_iface "$ip")
-    if [ -n "$1" ] ; then
-       maskbits="$1"
-       iface="$2"
-
-       if [ "$iface" != "$_iface_in" ] ; then
-           printf \
-               'WARNING: Public IP %s hosted on interface %s but VNN says %s\n' \
-               "$ip" "$iface" "$_iface_in"
-       fi
-       if [ "$maskbits" != "$_maskbits_in" ] ; then
-           printf \
-               'WARNING: Public IP %s has %s bit netmask but VNN says %s\n' \
-                   "$ip" "$maskbits" "$_maskbits_in"
+       _iface_in="$1"
+       ip="$2"
+       _maskbits_in="$3"
+
+       # Intentional word splitting here
+       # shellcheck disable=SC2046
+       set -- $(ip_maskbits_iface "$ip")
+       if [ -n "$1" ]; then
+               maskbits="$1"
+               iface="$2"
+
+               if [ "$iface" != "$_iface_in" ]; then
+                       printf 'WARNING: Public IP %s hosted on interface %s but VNN says %s\n' \
+                               "$ip" "$iface" "$_iface_in"
+               fi
+               if [ "$maskbits" != "$_maskbits_in" ]; then
+                       printf 'WARNING: Public IP %s has %s bit netmask but VNN says %s\n' \
+                               "$ip" "$maskbits" "$_maskbits_in"
+               fi
+       else
+               die "ERROR: Unable to determine interface for IP ${ip}"
        fi
-    else
-       die "ERROR: Unable to determine interface for IP ${ip}"
-    fi
 }
 
-ip_block ()
+ip_block()
 {
        _ip="$1"
        _iface="$2"
 
        case "$_ip" in
        *:*) _family="inet6" ;;
-       *)   _family="inet"  ;;
+       *) _family="inet" ;;
        esac
 
        # Extra delete copes with previously killed script
        iptables_wrapper "$_family" \
-                        -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
+               -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
        iptables_wrapper "$_family" \
-                        -I INPUT -i "$_iface" -d "$_ip" -j DROP
+               -I INPUT -i "$_iface" -d "$_ip" -j DROP
 }
 
-ip_unblock ()
+ip_unblock()
 {
        _ip="$1"
        _iface="$2"
 
        case "$_ip" in
        *:*) _family="inet6" ;;
-       *)   _family="inet"  ;;
+       *) _family="inet" ;;
        esac
 
        iptables_wrapper "$_family" \
-                        -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
+               -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
 }
 
 ctdb_check_args "$@"
@@ -122,11 +118,11 @@ ctdb_check_args "$@"
 case "$1" in
 init)
        _promote="sys/net/ipv4/conf/all/promote_secondaries"
-       get_proc "$_promote" >/dev/null 2>&1 || \
-           die "Public IPs only supported if promote_secondaries is available"
+       get_proc "$_promote" >/dev/null 2>&1 ||
+               die "Public IPs only supported if promote_secondaries is available"
 
-       # make sure we drop any ips that might still be held if
-       # previous instance of ctdb got killed with -9 or similar
+       # Make sure we drop any IPs that might still be held if
+       # previous instance of ctdbd got killed with -9 or similar
        drop_all_public_ips
        ;;
 
@@ -146,7 +142,7 @@ takeip)
        update_my_public_ip_addresses "takeip" "$ip"
 
        add_ip_to_iface "$iface" "$ip" "$maskbits" || {
-               exit 1;
+               exit 1
        }
 
        # In case a previous "releaseip" for this IP was killed...
@@ -156,12 +152,15 @@ takeip)
        ;;
 
 releaseip)
-       # releasing an IP is a bit more complex than it seems. Once the IP
-       # is released, any open tcp connections to that IP on this host will end
-       # up being stuck. Some of them (such as NFS connections) will be unkillable
-       # so we need to use the killtcp ctdb function to kill them off. We also
-       # need to make sure that no new connections get established while we are
-       # doing this! So what we do is this:
+       # Releasing an IP is a bit more complex than it seems. Once
+       # the IP is released, any open TCP connections to that IP on
+       # this host will end up being stuck. Some of them (such as NFS
+       # connections) will be unkillable so we need to terminate
+       # them. We also need to make sure that no new connections get
+       # established while we are doing this.
+       #
+       # The steps are:
+       #
        # 1) firewall this IP, so no new external packets arrive for it
        # 2) find existing connections, and kill them
        # 3) remove the IP from the interface
@@ -186,17 +185,20 @@ releaseip)
        ;;
 
 updateip)
-       # moving an IP is a bit more complex than it seems.
-       # First we drop all traffic on the old interface.
-       # Then we try to add the ip to the new interface and before
-       # we finally remove it from the old interface.
+       # Moving an IP is a bit more complex than it seems.  First we
+       # drop all traffic on the old interface.  Then we try to
+       # remove the IP from the old interface and add it to the new
+       # interface.
+       #
+       # The steps are:
        #
        # 1) firewall this IP, so no new external packets arrive for it
        # 2) remove the IP from the old interface (and new interface, to be sure)
        # 3) add the IP to the new interface
        # 4) remove the firewall rule
        # 5) use ctdb gratarp to propagate the new mac address
-       # 6) use netstat -tn to find existing connections, and tickle them
+       # 6) send tickle ACKs for existing connections, so dropped
+       #    packets are resent
        _oiface=$2
        niface=$3
        _ip=$4
@@ -207,7 +209,7 @@ updateip)
 
        # Could check maskbits too.  However, that should never change
        # so we want to notice if it does.
-       if [ "$oiface" = "$niface" ] ; then
+       if [ "$oiface" = "$niface" ]; then
                echo "Redundant \"updateip\" - ${ip} already on ${niface}"
                exit 0
        fi
@@ -226,10 +228,10 @@ updateip)
 
        flush_route_cache
 
-       # propagate the new mac address
+       # Propagate the new MAC address
        $CTDB gratarp "$ip" "$niface"
 
-       # tickle all existing connections, so that dropped packets
+       # Tickle all existing connections, so that dropped packets
        # are retransmitted and the tcp streams work
        tickle_tcp_connections "$ip"
        ;;
@@ -241,6 +243,8 @@ ipreallocated)
 
 monitor)
        monitor_interfaces || exit 1
+
+       update_tickles
        ;;
 esac
 
diff --git a/ctdb/config/events/legacy/60.nfs.script b/ctdb/config/events/legacy/60.nfs.script
index bc5be241f67..b797ada9370 100755
--- a/ctdb/config/events/legacy/60.nfs.script
+++ b/ctdb/config/events/legacy/60.nfs.script
@@ -352,7 +352,6 @@ monitor)
                        exit $?
        fi
 
-       update_tickles 2049
        nfs_update_lock_info
 
        nfs_check_services
diff --git a/ctdb/config/functions b/ctdb/config/functions
index f8f539ad53f..1ca3cebbbca 100755
--- a/ctdb/config/functions
+++ b/ctdb/config/functions
@@ -499,7 +499,7 @@ ctdb_check_unix_socket()
                return 1
        fi
 
-       _out=$(ss -l -x "src ${_sockpath}" | tail -n +2)
+       _out=$(ss -l -xH "src ${_sockpath}")
        if [ -z "$_out" ]; then
                echo "ERROR: ${service_name} not listening on ${_sockpath}"
                return 1
@@ -509,6 +509,43 @@ ctdb_check_unix_socket()
 ################################################
 # kill off any TCP connections with the given IP
 ################################################
+
+kill_tcp_summarise()
+{
+       _mode="$1"
+       _count="$2"
+       _method="$3"
+
+       _connections=$(get_tcp_connections_for_ip "$_ip")
+       if [ -z "$_connections" ]; then
+               _remaining=0
+       else
+               _remaining=$(echo "$_connections" | wc -l)
+       fi
+
+       case "$_mode" in
+       total)
+               _total="$_count"
+               _killed=$((_total - _remaining))
+               ;;
+       killed)
+               _killed="$_count"
+               _total=$((_killed + _remaining))
+               ;;
+       esac
+
+       _t="${_killed}/${_total}"
+       _m=""
+       if [ -n "$_method" ]; then
+               _m=", using ${_method}"
+       fi
+       echo "Killed ${_t} TCP connections to released IP ${_ip}${_m}"
+       if [ -n "$_connections" ]; then
+               echo "Remaining connections:"
+               echo "$_connections" | sed -e 's|^|  |'
+       fi
+}
+
 kill_tcp_connections()
 {
        _iface="$1"
@@ -519,6 +556,16 @@ kill_tcp_connections()
                _oneway=true
        fi
 
+       case "$CTDB_KILLTCP_USE_SS_KILL" in
+       yes | try)
+               _killcount=$(ss -K -tnH state established src "$_ip" | wc -l)
+               kill_tcp_summarise "killed" "$_killcount" "ss -K"
+               if [ "$CTDB_KILLTCP_USE_SS_KILL" = "yes" ]; then
+                       return
+               fi
+               ;;
+       esac
+
        get_tcp_connections_for_ip "$_ip" | {
                _killcount=0
                _connections=""
@@ -556,22 +603,11 @@ kill_tcp_connections()
                        return
                }
 
-               _connections=$(get_tcp_connections_for_ip "$_ip")
-               if [ -z "$_connections" ]; then
-                       _remaining=0
-               else
-                       _remaining=$(echo "$_connections" | wc -l)
-               fi
-
-               _actually_killed=$((_killcount - _remaining))
-
-               _t="${_actually_killed}/${_killcount}"
-               echo "Killed ${_t} TCP connections to released IP $_ip"
-
-               if [ -n "$_connections" ]; then
-                       echo "Remaining connections:"
-                       echo "$_connections" | sed -e 's|^|  |'
+               _method=""
+               if [ "$CTDB_KILLTCP_USE_SS_KILL" = "try" ]; then
+                       _method="ctdb_killtcp"
                fi
+               kill_tcp_summarise "total" "$_killcount" "$_method"
        }
 }
 
@@ -602,7 +638,7 @@ get_tcp_connections_for_ip()
 {
        _ip="$1"
 
-       ss -tn state established "src [$_ip]" | awk 'NR > 1 {print $3, $4}'
+       ss -tnH state established "src [$_ip]" | awk '{print $3, $4}'
 }
 
 ########################################################
@@ -1181,49 +1217,39 @@ nfs_callout()
 
 update_tickles()
 {
-       _port="$1"
-
        tickledir="${CTDB_SCRIPT_VARDIR}/tickles"
        mkdir -p "$tickledir"
 
-       # What public IPs do I hold?
-       _pnn=$(ctdb_get_pnn)
-       _ips=$($CTDB -X ip | awk -F'|' -v pnn="$_pnn" '$3 == pnn {print $2}')
+       # If not hosting any public IPs then can't have any connections...
+       if [ ! -s "$CTDB_MY_PUBLIC_IPS_CACHE" ]; then
+               return
+       fi
 
-       # IPs and port as ss filters
+       # IPs ss filter
        _ip_filter=""
-       for _ip in $_ips; do
+       while read -r _ip; do
                _ip_filter="${_ip_filter}${_ip_filter:+ || }src [${_ip}]"
-       done
-       _port_filter="sport == :${_port}"
+       done <"$CTDB_MY_PUBLIC_IPS_CACHE"
+
+       # Record our current tickles in a temporary file
+       _my_tickles="${tickledir}/all.tickles.$$"
+       while read -r _i; do
+               $CTDB -X gettickles "$_i" |
+                       awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'
+       done <"$CTDB_MY_PUBLIC_IPS_CACHE" |
+               sort >"$_my_tickles"
 
        # Record connections to our public IPs in a temporary file.
        # This temporary file is in CTDB's private state directory and
        # $$ is used to avoid a very rare race involving CTDB's script
        # debugging.  No security issue, nothing to see here...
-       _my_connections="${tickledir}/${_port}.connections.$$"
-       # Parentheses are needed around the filters for precedence but
+       _my_connections="${tickledir}/all.connections.$$"
+       # Parentheses are needed around the IP filter for precedence but
        # the parentheses can't be empty!
-       #
-       # Recent versions of ss print square brackets around IPv6
-       # addresses.  While it is desirable to update CTDB's address
-       # parsing and printing code, something needs to be done here
-       # for backward compatibility, so just delete the brackets.
-       ss -tn state established \
-               "${_ip_filter:+( ${_ip_filter} )}" \
-               "${_port_filter:+( ${_port_filter} )}" |
-               awk 'NR > 1 {print $4, $3}' |
-               tr -d '][' |
+       ss -tnH state established "${_ip_filter:+( ${_ip_filter} )}" |
+               awk '{print $4, $3}' |
                sort >"$_my_connections"
 
-       # Record our current tickles in a temporary file
-       _my_tickles="${tickledir}/${_port}.tickles.$$"
-       for _i in $_ips; do
-               $CTDB -X gettickles "$_i" "$_port" |
-                       awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'
-       done |
-               sort >"$_my_tickles"
-
        # Add tickles for connections that we haven't already got tickles for
        comm -23 "$_my_connections" "$_my_tickles" |
                $CTDB addtickle
diff --git a/ctdb/doc/ctdb-script.options.5.xml b/ctdb/doc/ctdb-script.options.5.xml
index 11597097a04..9298f9f3498 100644
--- a/ctdb/doc/ctdb-script.options.5.xml
+++ b/ctdb/doc/ctdb-script.options.5.xml
@@ -105,12 +105,102 @@
       <title>10.interface</title>
 
       <para>
-       This event script handles monitoring of interfaces using by
-       public IP addresses.
+       This event script handles public IP address release and
+       takeover, as well as monitoring interfaces used by public IP
+       addresses.
       </para>
 
       <variablelist>
 
+       <varlistentry>
+         <term>
+           CTDB_KILLTCP_USE_SS_KILL=yes|try|no
+         </term>
+         <listitem>
+           <para>
+             Whether to use <command>ss -K/--kill</command> to reset
+             incoming TCP connections to public IP addresses during
+             <command>releaseip</command>.
+           </para>
+
+           <para>
+             CTDB's standard method of resetting incoming TCP
+             connections during <command>releaseip</command> is via
+             its custom <command>ctdb_killtcp</command> command.
+             This uses network trickery to reset each connection:
+             send a "tickle ACK", capture the reply to extract the
+             TCP sequence number, send a reset (containing the
+             correct sequence number).
+           </para>
+
+           <para>
+             <command>ss -K</command> has been supported in
+             <command>ss</command> since iproute 4.5 in March 2016
+             and in the Linux kernel since 4.4 in December 2015.
+             However, the required kernel configuration item
+             <code>CONFIG_INET_DIAG_DESTROY</code> is disabled by
+             default.  Although enabled in Debian kernels since ~2017
+             and in Ubuntu since at least 18.04, this has only
+             recently been enabled in distributions such as RHEL.
+             There seems to be no way, including running <command>ss
+             -K</command>, to determine if this is supported, so use
+             of this feature needs to be configurable.  When
+             available, it should be the fastest, most reliable way
+             of killing connections.
+           </para>
+
+           <para>
+             Supported values are:
+             <variablelist>
+               <varlistentry>
+                 <term>


-- 
Samba Shared Repository
