Hi Dumitru

Thanks for the review.

On Fri, Dec 5, 2025 at 10:46 AM Dumitru Ceara <[email protected]> wrote:

> On 12/4/25 6:16 PM, Xavier Simonart via dev wrote:
> > If a Logical Switch Port was added/updated and its Logical Switch deleted
> > within the same northd iteration, ovn-northd was asserting [0].
> >
> > [0] lib/ovsdb-idl.c:3668: assertion row->new_datum != NULL failed in
> ovsdb_idl_txn_write__()
> >
> > Stack trace:
> > 6  ovs_assert_failure                    lib/util.c:90
> > 7  ovsdb_idl_txn_write__                 lib/ovsdb-idl.c:3668
> > 8  ovsdb_idl_txn_write__                 lib/ovsdb-idl.c:3668
> > 9  sbrec_port_binding_set_nat_addresses  lib/ovn-sb-idl.c:40588
> > 10 sync_pbs_for_northd_changed_ovn_ports northd/northd.c:3959
> > 11 sync_to_sb_pb_northd_handler          northd/en-sync-sb.c:420
> > 12 engine_compute                        lib/inc-proc-eng.c:478
> > 13 engine_run_node                       lib/inc-proc-eng.c:550
> > 14 engine_run                            lib/inc-proc-eng.c:579
> > 15 inc_proc_northd_run                   northd/inc-proc-northd.c:608
> > 16 main                                  northd/ovn-northd.c:1134
> >
> > Fixes: 06e2c1bf0ce7 ("northd: I-P for logical switch creation/deletion
> in en_northd.")
> > Signed-off-by: Xavier Simonart <[email protected]>
> > ---
>
> Hi Xavier,
>
> Thanks for the patch!
>
> >  northd/northd.c     |  5 +++++
> >  tests/ovn-northd.at | 33 +++++++++++++++++++++++++++++++++
> >  2 files changed, 38 insertions(+)
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index a8d435be8..1f5cb7830 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -3955,6 +3955,11 @@ sync_pbs_for_northd_changed_ovn_ports(
> >  {
> >      struct hmapx_node *hmapx_node;
> >
> > +    HMAPX_FOR_EACH (hmapx_node, &trk_ovn_ports->deleted) {
> > +        struct ovn_port *op = (struct ovn_port *) hmapx_node->data;
> > +        hmapx_find_and_delete(&trk_ovn_ports->created, op);
> > +        hmapx_find_and_delete(&trk_ovn_ports->updated, op);
> > +    }
>
> This breaks assumptions of the I-P engine.
>
> This is a handler for the en_northd -> en_sync_to_sb_pb data dependency,
> running for the en_sync_to_sb_pb node.  And trk_ovn_ports is actually
> data populated by the en_northd node, so "input data" for en_sync_to_sb_pb.
>
You're right.

>
> Change handlers should not change input data.  Maybe we should also
> change engine_get_input_data() to return const void * in the future.
>
> It does sound though that the actual problem is that we have another
> case of record (Logical_Switch_Port) being created/updated+deleted in
> multiple transactions that happen to be processed in the same iteration
> of ovn-northd.
>
Yes, that 's what is happening.
>
>
> Discussing with Ales, the actual root cause seems to be in
> ls_handle_lsp_changes().
>
> There, if "ls_deleted == true" we shouldn't be adding any of the switch
> ports to the tracked updated/created maps (the calls to
> add_op_to_northd_tracked_ports()).
>
> If I skip this whole for loop when ls_deleted == true:
>
> https://github.com/ovn-org/ovn/blob/53057d35a5d1f50f9a96ae14e70ea1d80cff5a72/northd/northd.c#L4602-L4671
>
> Then your test passes as well (with the change handler modification
> reverted).
>
> Isn't that the correct way of fixing the problem?
>
Agreed, this is much better. I'll send v2.

>
> Side rant: ls_handle_lsp_changes() is horrible, we should improve it in
> the near future.
>
> >      HMAPX_FOR_EACH (hmapx_node, &trk_ovn_ports->created) {
> >          sync_pb_for_lsp(hmapx_node->data, lr_stateful_table);
> >      }
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index b9bd291b7..ffbf704c0 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -18972,3 +18972,36 @@ check ovn-nbctl --wait=sb sync
> >  OVN_CLEANUP_NORTHD
> >  AT_CLEANUP
> >  ])
> > +
> > +OVN_FOR_EACH_NORTHD_NO_HV([
> > +AT_SETUP([LS/LSP add and delete])
> > +
> > +ovn_start
> > +
> > +check ovn-nbctl --wait=sb ls-add ls1
> > +
> > +sleep_northd
> > +AS_BOX([$(date +%H:%M:%S.%03N) Add/delete LSP while northd is sleeping])
>
> Do we really need the date in the output?  Is this a left over from your
> debugging?
>
In tests such as this one which repeat similar operations multiple times
(e.g we delete
three times ls1) I found it much easier to have a date in the
testsuite.log, so we can easily
find in other logs or db what happens at that time. So yes, it's debugging,
but does not cost much
and should not influence the dynamics of the test (as would a vlog/set dbg
for instance).
So I find it easier if we keep the date, but I can remove it if you find we
should not have such debug info
in the tests.

>
> > +check ovn-nbctl lsp-add ls1 lsp1
> > +check ovn-nbctl lsp-del lsp1
> > +check ovn-nbctl lsp-add ls1 lsp1
> > +wake_up_northd
> > +check ovn-nbctl --wait=sb sync
> > +
> > +sleep_northd
> > +AS_BOX([$(date +%H:%M:%S.%03N) Update/delete LS while northd is
> sleeping])
> > +check ovn-nbctl lsp-set-addresses lsp1 "50:54:00:00:00:03 11.0.0.2"
> > +check ovn-nbctl ls-del ls1
> > +wake_up_northd
> > +
> > +check ovn-nbctl --wait=sb ls-add ls1
> > +sleep_northd
> > +AS_BOX([$(date +%H:%M:%S.%03N) Add/delete LS while northd is sleeping])
> > +check ovn-nbctl lsp-add ls1 lsp1
> > +check ovn-nbctl ls-del ls1
> > +wake_up_northd
> > +check ovn-nbctl --wait=sb sync
> > +
> > +OVN_CLEANUP_NORTHD
> > +AT_CLEANUP
> > +])
>
> Regards,
> Dumitru
>
> Thanks
Xavier
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to