On Tue, Jun 27, 2023 at 6:51 AM Dumitru Ceara <[email protected]> wrote:
>
> On 6/8/23 07:55, Han Zhou wrote:
> > On Wed, Jun 7, 2023 at 12:39 PM Numan Siddique <[email protected]> wrote:
> >>
> >> On Fri, Jun 2, 2023 at 12:13 AM Han Zhou <[email protected]> wrote:
> >>>
> >>> This patch introduces a change handler for NB logical_switch within
the
> >>> 'northd' node. It specifically handles cases where logical switch
ports
> >>> in the tracked logical switches are changed (added/updated/deleted).
> >>> Only regular logical switch ports - which are common for VMs/Pods -
are
> >>> addressed. For other scenarios, it reverts to recompute.
> >>>
> >>> This update avoids recompute in the northd node (especially the
> >>> resource-intensive build_ports()) for NB changes of VIF
> >>> add/update/delete. However, it does not eliminate the need for lflow
> >>> recompute.
> >>>
> >>> Below are the performance test results simulating an ovn-k8s topology
of
> >>> 500 nodes x 50 lsp per node:
> >>>
> >>> Before:
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>> ovn-northd completion: 955ms
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> > -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> > lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>> ovn-northd completion: 919ms
> >>>
> >>> After:
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-del lsp_1_0001_01
> >>> ovn-northd completion: 776ms
> >>> ovn-nbctl --wait=hv --print-wait-time lsp-add ls_1_0001 lsp_1_0001_01
> > -- lsp-set-addresses lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101" --
> > lsp-set-port-security lsp_1_0001_01 "ff:f1:bb:00:01:01 1.0.1.101"
> >>> ovn-northd completion: 773ms
> >>>
> >>> Both addition and deletion show ~20% reduction of ovn-northd
completion
> >>> time. Note: the test uses only 1 thread of ovn-northd for flow
> >>> recompute. Using multithread should show a larger percentage of
> >>> improvement.
> >>>
> >>> Signed-off-by: Han Zhou <[email protected]>
> >>> Reviewed-by: Ales Musil <[email protected]>
> >>
> >> Few nits below.
> >>
> >> Acked-by: Numan Siddique <[email protected]>
> >>
> >> Numan
> >>
>
> Hi Han, Numan,
>
> First of all, sorry for the late review, I had these patches on my list
> for a while and only got to them after they were merged.
>
Hi Dumitru, thanks for your diligent review and comments are always welcome
any time.
> >>
> >>> ---
> >>> lib/ovn-util.c | 15 ++
> >>> lib/ovn-util.h | 1 +
> >>> northd/en-northd.c | 130 ++++++---
> >>> northd/en-northd.h | 2 +
> >>> northd/inc-proc-northd.c | 12 +-
> >>> northd/northd.c | 567
+++++++++++++++++++++++++++++++++------
> >>> northd/northd.h | 27 +-
> >>> tests/ovn-northd.at | 58 ++++
> >>> 8 files changed, 685 insertions(+), 127 deletions(-)
> >>>
>
> [...]
>
> >>> +bool
> >>> +northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>> + const struct northd_input *ni,
> >>> + struct northd_data *nd)
> >>> +{
> >>> + const struct nbrec_logical_switch *changed_ls;
> >>> + struct ls_change *ls_change = NULL;
> >>> +
> >>> + NBREC_LOGICAL_SWITCH_TABLE_FOR_EACH_TRACKED (changed_ls,
> >>> +
> > ni->nbrec_logical_switch_table) {
> >>> + ls_change = NULL;
> >>> + if (nbrec_logical_switch_is_new(changed_ls) ||
> >>> + nbrec_logical_switch_is_deleted(changed_ls)) {
> >>> + goto fail;
> >>> + }
> >>> + struct ovn_datapath *od = ovn_datapath_find(
> >>> + &nd->ls_datapaths.datapaths,
> >>> + &changed_ls->header_.uuid);
> >>> + if (!od) {
> >>> + static struct vlog_rate_limit rl =
VLOG_RATE_LIMIT_INIT(1,
> > 1);
> >>> + VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS
> > doesn't "
> >>> + "exist in ls_datapaths: "UUID_FMT,
> >>> + UUID_ARGS(&changed_ls->header_.uuid));
> >>> + goto fail;
> >>> + }
> >>> +
> >>> + /* Now only able to handle lsp changes. */
> >>> + if (check_ls_changes_other_than_lsp(changed_ls)) {
> >>> + goto fail;
> >>> + }
> >>> +
> >>> + ls_change = xzalloc(sizeof *ls_change);
> >>> + ls_change->od = od;
> >>> + ovs_list_init(&ls_change->added_ports);
> >>> + ovs_list_init(&ls_change->deleted_ports);
> >>> + ovs_list_init(&ls_change->updated_ports);
> >>> +
> >>> + struct ovn_port *op;
> >>> + HMAP_FOR_EACH (op, dp_node, &od->ports) {
> >>> + op->visited = false;
> >>> + }
> >>> +
> >>> + /* Compare the individual ports in the old and new Logical
> > Switches */
> >>> + for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> >>> + struct nbrec_logical_switch_port *new_nbsp =
> > changed_ls->ports[j];
> >>> + op = ovn_port_find_in_datapath(od, new_nbsp->name);
> >>> +
> >>> + if (!op) {
> >>> + if (!lsp_can_be_inc_processed(new_nbsp)) {
> >>> + goto fail;
> >>> + }
> >>> + op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>> + new_nbsp->name, new_nbsp, od,
NULL,
> >>> + ni->sbrec_mirror_table,
> >>> + ni->sbrec_chassis_table,
> >>> + ni->sbrec_chassis_by_name,
> >>> + ni->sbrec_chassis_by_hostname);
> >>> + if (!op) {
> >>> + goto fail;
> >>> + }
> >>> + ovs_list_push_back(&ls_change->added_ports,
> >>> + &op->list);
> >>> + } else if (ls_port_has_changed(op->nbsp, new_nbsp)) {
> >>> + /* Existing port updated */
> >>> + bool temp = false;
> >>> + if (lsp_is_type_changed(op->sb, new_nbsp, &temp) ||
> >>> + !op->lsp_can_be_inc_processed ||
> >>> + !lsp_can_be_inc_processed(new_nbsp)) {
> >>> + goto fail;
> >>> + }
> >>> + const struct sbrec_port_binding *sb = op->sb;
> >>> + if (sset_contains(&nd->svc_monitor_lsps,
> > new_nbsp->name)) {
> >>> + /* This port is used for svc monitor, which may
be
> > impacted
> >>> + * by this change. Fallback to recompute. */
> >>> + goto fail;
> >>> + }
> >>> + ovn_port_destroy(&nd->ls_ports, op);
> >>> + op = ls_port_create(ovnsb_idl_txn, &nd->ls_ports,
> >>> + new_nbsp->name, new_nbsp, od, sb,
> >>> + ni->sbrec_mirror_table,
> >>> + ni->sbrec_chassis_table,
> >>> + ni->sbrec_chassis_by_name,
> >>> + ni->sbrec_chassis_by_hostname);
> >>> + if (!op) {
> >>> + goto fail;
> >>> + }
> >>> + ovs_list_push_back(&ls_change->updated_ports,
> > &op->list);
> >>> + }
> >>> + op->visited = true;
> >>> + }
> >>> +
> >>> + /* Check for deleted ports */
> >>> + HMAP_FOR_EACH_SAFE (op, dp_node, &od->ports) {
> >>> + if (!op->visited) {
> >>> + if (!op->lsp_can_be_inc_processed) {
> >>> + goto fail;
> >>> + }
> >>> + if (sset_contains(&nd->svc_monitor_lsps, op->key)) {
> >>> + /* This port was used for svc monitor, which may
be
> >>> + * impacted by this deletion. Fallback to
> > recompute. */
> >>> + goto fail;
> >>> + }
> >>> + ovs_list_push_back(&ls_change->deleted_ports,
> >>> + &op->list);
> >>> + hmap_remove(&nd->ls_ports, &op->key_node);
> >>> + hmap_remove(&od->ports, &op->dp_node);
> >>> + sbrec_port_binding_delete(op->sb);
>
> Quoting from ovsdb-idl.c:
>
> /* Deletes 'row_' from its table. May free 'row_', so it must not be
> * accessed afterward.
>
> So, in theory, we shouldn't call delete here because we might end up
> failing for other ports/datapath and falling back to recompute.
>
> As far as I can tell, only newly inserted rows may be deleted by the
> current IDL implementation but I don't think it's safe to assume this
> will never change in the future.
In general, when falling back to recompute in the middle of I-P, the
recompute function will do a full sync between desired states and existing
states. Now the existing states are partially synced during the I-P before
recompute, the recompute function will just see the synced part as "already
in sync" and only apply for the rest of the changes. I think this mechanism
applies for add/update/delete the same way. Here for delete, if the record
is deleted from IDL, it won't be found by the recompute function, so there
is no way to access it again. Did I miss something?
>
> We should probably accumulate all records that need to be deleted and
> delete them at the end of the whole I-P run. Do we need some I-P infra
> changes? Maybe users can provide a callback that can be called whenever
> an I-P run has finished?
>
I thought about this approach, but not for the reason above. I would do the
same for all DB changes to avoid directly updating DBs in the I-P engine.
However, the change is going to be quite big, considering how the current
code base depends on the IDL. Changes to IDL are all over the place. In
addition to the potential memory and CPU cost to maintain the delta, there
may be other tricky things to handle, such as:
- Sometimes it is problematic to see the old data (instead of the updated
ones) in the IDL in the same round of I-P. (this has happened in
ovn-controller due to out-of-date IDL index, which was fixed accordingly)
- Merge accumulated changes to the same DB record.
So, I think it is better to avoid such a change unless it is really
necessary.
Thanks,
Han
> What do you think?
>
> Thanks,
> Dumitru
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev