Hi,

I'll work on this issue reported when northd is restarted or when one
northd goes on standby and the other becomes
active.

Best,
Lucas

Em seg., 6 de out. de 2025 às 07:19, Ales Musil via dev <
[email protected]> escreveu:

> On Mon, Oct 6, 2025 at 11:52 AM Ilya Maximets <[email protected]> wrote:
>
> > On 10/3/25 10:06 PM, Mark Michelson via dev wrote:
> > > Thanks Lucas and Numan.
> > >
> > > I merged this to main.
> > >
> > > On Thu, Sep 25, 2025 at 10:25 AM Numan Siddique <[email protected]>
> wrote:
> > >>
> > >> On Thu, Sep 25, 2025 at 8:16 AM Lucas Vargas Dias via dev
> > >> <[email protected]> wrote:
> > >>>
> > >>> Change the logic to save sbflow uuid and just update if
> > >>> the lflow is reused. Otherwise, it's removed.
> > >>> Also, reduce sbflow searching with uuidset instead of
> > >>> searching through all lflow table.
> > >>> Add lflow states:
> > >>> LFLOW_STALE - Lflow is not relevant
> > >>> LFLOW_TO_SYNC - Lflow needs to be synced with SB DB
> > >>> LFLOW_SYNCED - Lflow is synced with SB SB
> > >>>
> > >>> It generates the following results in a scenario with:
> > >>> - LSPs: 56548
> > >>> - LRPs: 27304
> > >>> - LRs: 7922
> > >>> - LSs: 28602
> > >>>
> > >>> Without the commit:
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
> > >>> Time spent on processing nb_cfg 275438:
> > >>>         ovn-northd delay before processing:     16069ms
> > >>>         ovn-northd completion:                  32828ms
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
> > >>> Time spent on processing nb_cfg 275439:
> > >>>         ovn-northd delay before processing:     15019ms
> > >>>         ovn-northd completion:                  33207ms
> > >>>
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
> > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
> > >>> Time spent on processing nb_cfg 275440:
> > >>>         ovn-northd delay before processing:     14784ms
> > >>>         ovn-northd completion:                  33577ms
> > >>>
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1
> > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
> > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
> > >>> Time spent on processing nb_cfg 275441:
> > >>>         ovn-northd delay before processing:     14598ms
> > >>>         ovn-northd completion:                  31942ms
> > >>>
> > >>> With the commit:
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lr-add lr9-1
> > >>> Time spent on processing nb_cfg 275401:
> > >>>         ovn-northd delay before processing:     12602ms
> > >>>         ovn-northd completion:                  26103ms
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb ls-add bar9-1
> > >>> Time spent on processing nb_cfg 275402:
> > >>>         ovn-northd delay before processing:     12639ms
> > >>>         ovn-northd completion:                  26759ms
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lrp-add lr9-1
> > rp9-1-bar 00:01:ff:22:00:01 192.168.10.1/24 2801:80:3eaf:4401::1/64
> > >>> Time spent on processing nb_cfg 275403:
> > >>>         ovn-northd delay before processing:     11874ms
> > >>>         ovn-northd completion:                  29733ms
> > >>> ovn-nbctl --no-leader-only --print-wait-time --wait=sb lsp-add bar9-1
> > bar-rp9-1 -- set Logical_Switch_Port bar-rp9-1 type=router
> > options:router-port=rp9-1-bar -- lsp-set-addresses bar-rp9-1 router
> > >>> Time spent on processing nb_cfg 275404:
> > >>>         ovn-northd delay before processing:     4058ms
> > >>>         ovn-northd completion:                  17323ms
> > >>>
> > >>> Signed-off-by: Lucas Vargas Dias <[email protected]>
> > >>
> > >> Thanks for addressing the comments and for the great improvements.
> > >>
> > >> Acked-by: Numan Siddique <[email protected]>
> > >>
> > >> Numan
> > >>
> > >>> ---
> > >>>  northd/en-lflow.c   |   2 +-
> > >>>  northd/lflow-mgr.c  | 108
> ++++++++++++++++++++++++--------------------
> > >>>  northd/lflow-mgr.h  |  10 +++-
> > >>>  northd/northd.c     |   8 ++--
> > >>>  tests/ovn-northd.at |  32 +++++++++++++
> > >>>  5 files changed, 106 insertions(+), 54 deletions(-)
> > >>>
> > >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c
> > >>> index 50570b611..13c5e3119 100644
> > >>> --- a/northd/en-lflow.c
> > >>> +++ b/northd/en-lflow.c
> > >>> @@ -122,7 +122,7 @@ en_lflow_run(struct engine_node *node, void
> *data)
> > >>>      stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
> > >>>
> > >>>      struct lflow_data *lflow_data = data;
> > >>> -    lflow_table_clear(lflow_data->lflow_table);
> > >>> +    lflow_table_clear(lflow_data->lflow_table, false);
> > >>>      lflow_reset_northd_refs(&lflow_input);
> > >>>      lflow_ref_clear(lflow_input.igmp_lflow_ref);
> > >>>
> > >>> diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c
> > >>> index 6a66a9718..5a989fe19 100644
> > >>> --- a/northd/lflow-mgr.c
> > >>> +++ b/northd/lflow-mgr.c
> > >>> @@ -26,6 +26,7 @@
> > >>>  #include "lflow-mgr.h"
> > >>>  #include "lib/ovn-parallel-hmap.h"
> > >>>  #include "lib/ovn-util.h"
> > >>> +#include "lib/uuidset.h"
> > >>>
> > >>>  VLOG_DEFINE_THIS_MODULE(lflow_mgr);
> > >>>
> > >>> @@ -37,7 +38,8 @@ static void ovn_lflow_init(struct ovn_lflow *,
> > struct ovn_datapath *od,
> > >>>                             uint16_t priority, char *match,
> > >>>                             char *actions, char *io_port,
> > >>>                             char *ctrl_meter, char *stage_hint,
> > >>> -                           const char *where, const char
> *flow_desc);
> > >>> +                           const char *where, const char *flow_desc,
> > >>> +                           struct uuid sbuuid);
> > >>>  static struct ovn_lflow *ovn_lflow_find(const struct hmap *lflows,
> > >>>                                          enum ovn_stage stage,
> > >>>                                          uint16_t priority, const
> char
> > *match,
> > >>> @@ -147,6 +149,13 @@ static struct ovs_mutex
> > lflow_hash_locks[LFLOW_HASH_LOCK_MASK + 1];
> > >>>   */
> > >>>  extern struct ovs_mutex fake_hash_mutex;
> > >>>
> > >>> +
> > >>> +enum ovn_lflow_state {
> > >>> +    LFLOW_STALE,
> > >>> +    LFLOW_TO_SYNC,
> > >>> +    LFLOW_SYNCED,
> > >>> +};
> > >>> +
> > >>>  /* Represents a logical ovn flow (lflow).
> > >>>   *
> > >>>   * A logical flow with match 'M' and actions 'A' - L(M, A) is
> created
> > >>> @@ -181,14 +190,7 @@ struct ovn_lflow {
> > >>>      struct hmap dp_refcnts_map; /* Maintains the number of times
> this
> > ovn_lflow
> > >>>                                   * is referenced by a given
> datapath.
> > >>>                                   * Contains 'struct dp_refcnt' in
> the
> > map. */
> > >>> -};
> > >>> -
> > >>> -/* Logical flow table. */
> > >>> -struct lflow_table {
> > >>> -    struct hmap entries; /* hmap of lflows. */
> > >>> -    struct hmap ls_dp_groups; /* hmap of logical switch dp groups.
> */
> > >>> -    struct hmap lr_dp_groups; /* hmap of logical router dp groups.
> */
> > >>> -    ssize_t max_seen_lflow_size;
> > >>> +    enum ovn_lflow_state sync_state;
> > >>>  };
> > >>>
> > >>>  struct lflow_table *
> > >>> @@ -210,10 +212,14 @@ lflow_table_init(struct lflow_table
> *lflow_table)
> > >>>  }
> > >>>
> > >>>  void
> > >>> -lflow_table_clear(struct lflow_table *lflow_table)
> > >>> +lflow_table_clear(struct lflow_table *lflow_table, bool destroy_all)
> > >>>  {
> > >>>      struct ovn_lflow *lflow;
> > >>>      HMAP_FOR_EACH_SAFE (lflow, hmap_node, &lflow_table->entries) {
> > >>> +        if (!destroy_all) {
> > >>> +            lflow->sync_state = LFLOW_STALE;
> > >>> +            continue;
> > >>> +        }
> > >>>          ovn_lflow_destroy(lflow_table, lflow);
> > >>>      }
> > >>>
> > >>> @@ -224,7 +230,7 @@ lflow_table_clear(struct lflow_table
> *lflow_table)
> > >>>  void
> > >>>  lflow_table_destroy(struct lflow_table *lflow_table)
> > >>>  {
> > >>> -    lflow_table_clear(lflow_table);
> > >>> +    lflow_table_clear(lflow_table, true);
> > >>>      hmap_destroy(&lflow_table->entries);
> > >>>      ovn_dp_groups_destroy(&lflow_table->ls_dp_groups);
> > >>>      ovn_dp_groups_destroy(&lflow_table->lr_dp_groups);
> > >>> @@ -257,16 +263,42 @@ lflow_table_sync_to_sb(struct lflow_table
> > *lflow_table,
> > >>>                         const struct sbrec_logical_flow_table
> > *sb_flow_table,
> > >>>                         const struct sbrec_logical_dp_group_table
> > *dpgrp_table)
> > >>>  {
> > >>> +    struct uuidset sb_uuid_set = UUIDSET_INITIALIZER(&sb_uuid_set);
> > >>>      struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp);
> > >>>      struct hmap *lflows = &lflow_table->entries;
> > >>>      struct ovn_lflow *lflow;
> > >>> +    const struct sbrec_logical_flow *sbflow;
> > >>>
> > >>>      fast_hmap_size_for(&lflows_temp,
> > >>>                         lflow_table->max_seen_lflow_size);
> > >>>
> > >>> +    HMAP_FOR_EACH_SAFE (lflow, hmap_node, lflows) {
> > >>> +        if (lflow->sync_state == LFLOW_STALE) {
> > >>> +            ovn_lflow_destroy(lflow_table, lflow);
> > >>> +            continue;
> > >>> +        }
> > >>> +        sbflow = NULL;
> > >>> +        if (!uuid_is_zero(&lflow->sb_uuid)) {
> > >>> +            sbflow =
> > sbrec_logical_flow_table_get_for_uuid(sb_flow_table,
> > >>> +
> >  &lflow->sb_uuid);
> > >>> +        }
> > >>> +        sync_lflow_to_sb(lflow, ovnsb_txn, lflow_table,
> ls_datapaths,
> > >>> +                         lr_datapaths, ovn_internal_version_changed,
> > >>> +                         sbflow, dpgrp_table);
> > >>> +        uuidset_insert(&sb_uuid_set, &lflow->sb_uuid);
> > >>> +        hmap_remove(lflows, &lflow->hmap_node);
> > >>> +        hmap_insert(&lflows_temp, &lflow->hmap_node,
> > >>> +                    hmap_node_hash(&lflow->hmap_node));
> > >>> +    }
> >
> > Hi, Lucas, Numan, Mark, others.
> >
> > This change introduces a significant performance issue when northd
> > is restarted or when one northd goes on standby and the other becomes
> > active.
> >
> > The problem is that the code above is using lflow->sb_uuid that is
> > always zero for logical flows not created by the currently active
> > northd process.  So, if the flows existed in the database before the
> > current process became active, it will remove all of these flows and
> > re-create anew.  The issue can be easily reproduced in the sanbdbox
> > by adding some resources, then stopping northd and restarting it.
> > Checking the database file we can see all the logical flows re-created.
> >
> > In ovn-heater setup this behavior is causing a significant iteration
> > time spike for up to 140 seconds in 500-node cluster-density test
> > every time SB leader changes and a new northd becomes active.  It
> > overwrites about 200 MB of logical flows, making Sb ovsdb-server send
> > 200 x 150-ish (clients) MB of data momentarily consuming 30 GB of RAM.
> > All ovn-controllers also spend 20-30 seconds processing the new update.
> >
> > We still need a way to search for logical flows in the southbound
> > database directly, as not every lflow was created by the current process.
> > We need to find a way to do that on top of this change or revert it.
> > I assume, revert might be easier in the short term as serching Sb is
> > exacly what this change was trying to avoid.
> >
> > Best regards, Ilya Maximets.
> >
> >
> Hi,
>
> based on the above I have posted a revert for now. I think it will be best
> to unblock CI and focus on the development outside for now.
>
> Thanks,
> Ales
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

-- 




_‘Esta mensagem é direcionada apenas para os endereços constantes no 
cabeçalho inicial. Se você não está listado nos endereços constantes no 
cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa 
mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão 
imediatamente anuladas e proibidas’._


* **‘Apesar do Magazine Luiza tomar 
todas as precauções razoáveis para assegurar que nenhum vírus esteja 
presente nesse e-mail, a empresa não poderá aceitar a responsabilidade por 
quaisquer perdas ou danos causados por esse e-mail ou por seus anexos’.*



_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to