On Wed, Oct 4, 2023 at 12:23 AM Dumitru Ceara <dce...@redhat.com> wrote:
>
> Commit 9deb000536e0 ("northd: Remove potential duplicates in SB
> Load_Balancer table.") added code to northd to ensure that SB duplicates
> get removed.  Quoting from that commit message:
>
>     The Southbound Load_Balancer table (like the Northbound one) doesn't
>     define an index (e.g., by name).  This essentially means that there can
>     be duplicate records in the database.  Moreover, the OVSDB RAFT
>     implementation ensures "at-least-once" consistency [0] making it
>     possible for such duplicates to "spontaneously" appear.
>
> That is still the case, however, since then ovn-northd started to
> incrementally process (NB) load balancer updates and self-created
> SB.Load_Balancer records trigger a recompute of the "en_sync_to_sb_lb"
> node.  At scale (e.g., an OpenShift cluster with 750 nodes and 64K load
> balancers) this is significant:
>   inc_proc_eng|INFO|node: sync_to_sb_lb, recompute (missing handler for input 
> SB_load_balancer) took 524ms
>
> All we actually care about is whether the transaction to create the SB
> record created a spurious clone.  Instead of triggering a recompute, add
> a new handler for this type of SB update and check for duplicates.  With
> the database from the same scale test as above the duplicate check is
> almost instantaneous.
>
> Signed-off-by: Dumitru Ceara <dce...@redhat.com>

Hi Dumitru,

it looks good, there is just one minor thing below.

> ---
>  northd/en-sync-sb.c      | 17 +++++++++++++++++
>  northd/en-sync-sb.h      |  2 ++
>  northd/inc-proc-northd.c |  3 ++-
>  northd/northd.c          | 21 +++++++++++++++++++++
>  northd/northd.h          |  1 +
>  5 files changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/northd/en-sync-sb.c b/northd/en-sync-sb.c
> index aae396a43d..8c07a71ee8 100644
> --- a/northd/en-sync-sb.c
> +++ b/northd/en-sync-sb.c
> @@ -248,6 +248,23 @@ sync_to_sb_lb_northd_handler(struct engine_node *node, 
> void *data OVS_UNUSED)
>      return true;
>  }
>
> +bool
> +sync_to_sb_lb_sb_load_balancer(struct engine_node *node, void *data 
> OVS_UNUSED)
> +{
> +    const struct sbrec_load_balancer_table *sb_load_balancer_table =
> +        EN_OVSDB_GET(engine_get_input("SB_load_balancer", node));
> +
> +    /* The only reason to handle SB.Load_Balancer updates is to detect
> +     * spurious records being created in clustered databases due to
> +     * lack of indexing on the SB.Load_Balancer table.  All other changes
> +     * are valid and performed by northd, the only write-client for
> +     * this table. */
> +    if (check_sb_lb_duplicates(sb_load_balancer_table)) {
> +        return false;
> +    }
> +    return true;
> +}
> +
>  /* sync_to_sb_pb engine node functions.
>   * This engine node syncs the SB Port Bindings (partly).
>   * en_northd engine create the SB Port binding rows and
> diff --git a/northd/en-sync-sb.h b/northd/en-sync-sb.h
> index f08565eee1..3bcbb82594 100644
> --- a/northd/en-sync-sb.h
> +++ b/northd/en-sync-sb.h
> @@ -21,6 +21,8 @@ void *en_sync_to_sb_lb_init(struct engine_node *, struct 
> engine_arg *);
>  void en_sync_to_sb_lb_run(struct engine_node *, void *data);
>  void en_sync_to_sb_lb_cleanup(void *data);
>  bool sync_to_sb_lb_northd_handler(struct engine_node *, void *data 
> OVS_UNUSED);
> +bool sync_to_sb_lb_sb_load_balancer(struct engine_node *,
> +                                    void *data OVS_UNUSED);
>
>  void *en_sync_to_sb_pb_init(struct engine_node *, struct engine_arg *);
>  void en_sync_to_sb_pb_run(struct engine_node *, void *data);
> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> index 8b08171179..04df0b06c2 100644
> --- a/northd/inc-proc-northd.c
> +++ b/northd/inc-proc-northd.c
> @@ -230,7 +230,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>
>      engine_add_input(&en_sync_to_sb_lb, &en_northd,
>                       sync_to_sb_lb_northd_handler);
> -    engine_add_input(&en_sync_to_sb_lb, &en_sb_load_balancer, NULL);
> +    engine_add_input(&en_sync_to_sb_lb, &en_sb_load_balancer,
> +                     sync_to_sb_lb_sb_load_balancer);
>
>      engine_add_input(&en_sync_to_sb_pb, &en_northd,
>                       sync_to_sb_pb_northd_handler);
> diff --git a/northd/northd.c b/northd/northd.c
> index 528027d07b..ead337219d 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -4660,6 +4660,27 @@ sync_lbs(struct ovsdb_idl_txn *ovnsb_txn,
>      }
>  }
>
> +bool
> +check_sb_lb_duplicates(const struct sbrec_load_balancer_table *table)
> +{
> +    struct sset existing_nb_lb_uuids =
> +        SSET_INITIALIZER(&existing_nb_lb_uuids);
> +    const struct sbrec_load_balancer *sbrec_lb;
> +    bool duplicates = false;
> +
> +    SBREC_LOAD_BALANCER_TABLE_FOR_EACH (sbrec_lb, table) {
> +        const char *nb_lb_uuid = smap_get(&sbrec_lb->external_ids, "lb_id");
> +        if (nb_lb_uuid && !sset_add(&existing_nb_lb_uuids,
> +                                    CONST_CAST(void *, nb_lb_uuid))) {

We don't need CONST_CAST, "nb_lb_uuid" can be used directly.

> +            duplicates = true;
> +            break;
> +        }
> +    }
> +
> +    sset_destroy(&existing_nb_lb_uuids);
> +    return duplicates;
> +}
> +
>  /* Syncs the SB port binding for the ovn_port 'op'.  Caller should make sure
>   * that the OVN SB IDL txn is not NULL.  Presently it only syncs the nat
>   * column of port binding corresponding to the 'op->nbsp' */
> diff --git a/northd/northd.h b/northd/northd.h
> index 05f4140d77..b30e005cf1 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -365,6 +365,7 @@ const struct ovn_datapath *northd_get_datapath_for_port(
>      const struct hmap *ls_ports, const char *port_name);
>  void sync_lbs(struct ovsdb_idl_txn *, const struct sbrec_load_balancer_table 
> *,
>                struct ovn_datapaths *ls_datapaths, struct hmap *lbs);
> +bool check_sb_lb_duplicates(const struct sbrec_load_balancer_table *);
>
>  void sync_pbs(struct ovsdb_idl_txn *, struct hmap *ls_ports);
>  bool sync_pbs_for_northd_ls_changes(struct tracked_ls_changes *);
> --
> 2.39.3
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


With that addressed:

Acked-by: Ales Musil <amu...@redhat.com>

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA

amu...@redhat.com

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to