On Mon, Oct 6, 2025 at 11:52 AM Ilya Maximets <[email protected]> wrote:

> On 10/3/25 10:06 PM, Mark Michelson via dev wrote:
> > Thanks Lucas and Numan.
> >
> > I merged this to main.
> >
> > On Thu, Sep 25, 2025 at 10:25 AM Numan Siddique <[email protected]> wrote:
> >>
> >> On Thu, Sep 25, 2025 at 8:16 AM Lucas Vargas Dias via dev
> >> <[email protected]> wrote:
> >>>
> >>> Change the logic to save sbflow uuid and just update if
> >>> the lflow is reused. Otherwise, it's removed.
> >>> Also, reduce sbflow searching with uuidset instead of
> >>> searching through all lflow table.
> >>> Add lflow states:
> >>> LFLOW_STALE - Lflow is not relevant
> >>> LFLOW_TO_SYNC - Lflow needs to be synced with SB DB
> >>> LFLOW_SYNCED - Lflow is synced with SB SB
> >>>
> >>> It generates the following results in a scenario with:
> >>> - LSPs: 56548
> >>> - LRPs: 27304
> >>> - LRs: 7922
> >>> - LSs: 28602
> >>>
> >>> Without the commit:
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
> >>> Time spent on processing nb_cfg 275438:
> >>>         ovn-northd delay before processing:     16069ms
> >>>         ovn-northd completion:                  32828ms
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
> >>> Time spent on processing nb_cfg 275439:
> >>>         ovn-northd delay before processing:     15019ms
> >>>         ovn-northd completion:                  33207ms
> >>>
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
> rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
> >>> Time spent on processing nb_cfg 275440:
> >>>         ovn-northd delay before processing:     14784ms
> >>>         ovn-northd completion:                  33577ms
> >>>
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1
> bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
> options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
> >>> Time spent on processing nb_cfg 275441:
> >>>         ovn-northd delay before processing:     14598ms
> >>>         ovn-northd completion:                  31942ms
> >>>
> >>> With the commit:
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
> >>> Time spent on processing nb_cfg 275401:
> >>>         ovn-northd delay before processing:     12602ms
> >>>         ovn-northd completion:                  26103ms
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
> >>> Time spent on processing nb_cfg 275402:
> >>>         ovn-northd delay before processing:     12639ms
> >>>         ovn-northd completion:                  26759ms
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
> rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
> >>> Time spent on processing nb_cfg 275403:
> >>>         ovn-northd delay before processing:     11874ms
> >>>         ovn-northd completion:                  29733ms
> >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1
> bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
> options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
> >>> Time spent on processing nb_cfg 275404:
> >>>         ovn-northd delay before processing:     4058ms
> >>>         ovn-northd completion:                  17323ms
> >>>
> >>> Signed-off-by: Lucas Vargas Dias <[email protected]>
> >>
> >> Thanks for addressing the comments and for the great improvements.
> >>
> >> Acked-by: Numan Siddique <[email protected]>
> >>
> >> Numan
> >>
> >>> ---
> >>>  northd/en-lflow.c   |   2 +-
> >>>  northd/lflow-mgr.c  | 108 ++++++++++++++++++++++++--------------------
> >>>  northd/lflow-mgr.h  |  10 +++-
> >>>  northd/northd.c     |   8 ++--
> >>>  tests/ovn-northd.at |  32 +++++++++++++
> >>>  5 files changed, 106 insertions(+), 54 deletions(-)
> >>>
> >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c
> >>> index 50570b611..13c5e3119 100644
> >>> --- a/northd/en-lflow.c
> >>> +++ b/northd/en-lflow.c
> >>> @@ -122,7 +122,7 @@ en_lflow_run(struct engine_node *node, void *data)
> >>>      stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
> >>>
> >>>      struct lflow_data *lflow_data = data;
> >>> -    lflow_table_clear(lflow_data->lflow_table);
> >>> +    lflow_table_clear(lflow_data->lflow_table, false);
> >>>      lflow_reset_northd_refs(&lflow_input);
> >>>      lflow_ref_clear(lflow_input.igmp_lflow_ref);
> >>>
> >>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
> >>> index 6a66a9718..5a989fe19 100644
> >>> --- a/northd/lflow-mgr.c
> >>> +++ b/northd/lflow-mgr.c
> >>> @@ -26,6 +26,7 @@
> >>>  #include "lflow-mgr.h"
> >>>  #include "lib/ovn-parallel-hmap.h"
> >>>  #include "lib/ovn-util.h"
> >>> +#include "lib/uuidset.h"
> >>>
> >>>  VLOG_DEFINE_THIS_MODULE(lflow_mgr);
> >>>
> >>> @@ -37,7 +38,8 @@ static void ovn_lflow_init(struct ovn_lflow *,
> struct ovn_datapath *od,
> >>>                             uint16_t priority, char *match,
> >>>                             char *actions, char *io_port,
> >>>                             char *ctrl_meter, char *stage_hint,
> >>> -                           const char *where, const char *flow_desc);
> >>> +                           const char *where, const char *flow_desc,
> >>> +                           struct uuid sbuuid);
> >>>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
> >>>                                          enum ovn_stage stage,
> >>>                                          uint16_t priority, const char
> *match,
> >>> @@ -147,6 +149,13 @@ static struct ovs_mutex
> lflow_hash_locks[LFLOW_HASH_LOCK_MASK + 1];
> >>>   */
> >>>  extern struct ovs_mutex fake_hash_mutex;
> >>>
> >>> +
> >>> +enum ovn_lflow_state {
> >>> +    LFLOW_STALE,
> >>> +    LFLOW_TO_SYNC,
> >>> +    LFLOW_SYNCED,
> >>> +};
> >>> +
> >>>  /* Represents a logical ovn flow (lflow).
> >>>   *
> >>>   * A logical flow with match 'M' and actions 'A' - L(M, A) is created
> >>> @@ -181,14 +190,7 @@ struct ovn_lflow {
> >>>      struct hmap dp_refcnts_map; /* Maintains the number of times this
> ovn_lflow
> >>>                                   * is referenced by a given datapath.
> >>>                                   * Contains 'struct dp_refcnt' in the
> map. */
> >>> -};
> >>> -
> >>> -/* Logical flow table. */
> >>> -struct lflow_table {
> >>> -    struct hmap entries; /* hmap of lflows. */
> >>> -    struct hmap ls_dp_groups; /* hmap of logical switch dp groups. */
> >>> -    struct hmap lr_dp_groups; /* hmap of logical router dp groups. */
> >>> -    ssize_t max_seen_lflow_size;
> >>> +    enum ovn_lflow_state sync_state;
> >>>  };
> >>>
> >>>  struct lflow_table *
> >>> @@ -210,10 +212,14 @@ lflow_table_init(struct lflow_table *lflow_table)
> >>>  }
> >>>
> >>>  void
> >>> -lflow_table_clear(struct lflow_table *lflow_table)
> >>> +lflow_table_clear(struct lflow_table *lflow_table, bool destroy_all)
> >>>  {
> >>>      struct ovn_lflow *lflow;
> >>>      HMAP_FOR_EACH_SAFE (lflow, hmap_node, &lflow_table->entries) {
> >>> +        if (!destroy_all) {
> >>> +            lflow->sync_state = LFLOW_STALE;
> >>> +            continue;
> >>> +        }
> >>>          ovn_lflow_destroy(lflow_table, lflow);
> >>>      }
> >>>
> >>> @@ -224,7 +230,7 @@ lflow_table_clear(struct lflow_table *lflow_table)
> >>>  void
> >>>  lflow_table_destroy(struct lflow_table *lflow_table)
> >>>  {
> >>> -    lflow_table_clear(lflow_table);
> >>> +    lflow_table_clear(lflow_table, true);
> >>>      hmap_destroy(&lflow_table->entries);
> >>>      ovn_dp_groups_destroy(&lflow_table->ls_dp_groups);
> >>>      ovn_dp_groups_destroy(&lflow_table->lr_dp_groups);
> >>> @@ -257,16 +263,42 @@ lflow_table_sync_to_sb(struct lflow_table
> *lflow_table,
> >>>                         const struct sbrec_logical_flow_table
> *sb_flow_table,
> >>>                         const struct sbrec_logical_dp_group_table
> *dpgrp_table)
> >>>  {
> >>> +    struct uuidset sb_uuid_set = UUIDSET_INITIALIZER(&sb_uuid_set);
> >>>      struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp);
> >>>      struct hmap *lflows = &lflow_table->entries;
> >>>      struct ovn_lflow *lflow;
> >>> +    const struct sbrec_logical_flow *sbflow;
> >>>
> >>>      fast_hmap_size_for(&lflows_temp,
> >>>                         lflow_table->max_seen_lflow_size);
> >>>
> >>> +    HMAP_FOR_EACH_SAFE (lflow, hmap_node, lflows) {
> >>> +        if (lflow->sync_state == LFLOW_STALE) {
> >>> +            ovn_lflow_destroy(lflow_table, lflow);
> >>> +            continue;
> >>> +        }
> >>> +        sbflow = NULL;
> >>> +        if (!uuid_is_zero(&lflow->sb_uuid)) {
> >>> +            sbflow =
> sbrec_logical_flow_table_get_for_uuid(sb_flow_table,
> >>> +
>  &lflow->sb_uuid);
> >>> +        }
> >>> +        sync_lflow_to_sb(lflow, ovnsb_txn, lflow_table, ls_datapaths,
> >>> +                         lr_datapaths, ovn_internal_version_changed,
> >>> +                         sbflow, dpgrp_table);
> >>> +        uuidset_insert(&sb_uuid_set, &lflow->sb_uuid);
> >>> +        hmap_remove(lflows, &lflow->hmap_node);
> >>> +        hmap_insert(&lflows_temp, &lflow->hmap_node,
> >>> +                    hmap_node_hash(&lflow->hmap_node));
> >>> +    }
>
> Hi, Lucas, Numan, Mark, others.
>
> This change introduces a significant performance issue when northd
> is restarted or when one northd goes on standby and the other becomes
> active.
>
> The problem is that the code above is using lflow->sb_uuid that is
> always zero for logical flows not created by the currently active
> northd process.  So, if the flows existed in the database before the
> current process became active, it will remove all of these flows and
> re-create anew.  The issue can be easily reproduced in the sanbdbox
> by adding some resources, then stopping northd and restarting it.
> Checking the database file we can see all the logical flows re-created.
>
> In ovn-heater setup this behavior is causing a significant iteration
> time spike for up to 140 seconds in 500-node cluster-density test
> every time SB leader changes and a new northd becomes active.  It
> overwrites about 200 MB of logical flows, making Sb ovsdb-server send
> 200 x 150-ish (clients) MB of data momentarily consuming 30 GB of RAM.
> All ovn-controllers also spend 20-30 seconds processing the new update.
>
> We still need a way to search for logical flows in the southbound
> database directly, as not every lflow was created by the current process.
> We need to find a way to do that on top of this change or revert it.
> I assume, revert might be easier in the short term as serching Sb is
> exacly what this change was trying to avoid.
>
> Best regards, Ilya Maximets.
>
>
Hi,

based on the above I have posted a revert for now. I think it will be best
to unblock CI and focus on the development outside for now.

Thanks,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to