On 12/3/25 3:50 PM, Xavier Simonart wrote:
> Hi Dumitru
> 

Hi Xavier,

> Thanks for reviewing this patch.
> Will send v2 (comments below).
> 
> Thanks
> Xavier
> 
> On Wed, Dec 3, 2025 at 3:03 PM Dumitru Ceara <[email protected]> wrote:
> 
>> On 12/3/25 2:22 PM, Xavier Simonart wrote:
>>> 0x00000000005101f4 in ovsdb_idl_row_is_synthetic (row=0x0) at
>> lib/ovsdb-idl.c:2724
>>> ovsdb_idl_txn_delete (row_=0x0) at lib/ovsdb-idl.c:3834
>>> 0x000000000043acdd in port_group_nb_port_group_handler (node=<optimized
>> out>, data_=<optimized out>) at northd/en-port-group.c:616
>>> 0x000000000045b7db in engine_compute (node=<optimized out>,
>> recompute_allowed=<optimized out>) at lib/inc-proc-eng.c:473
>>> engine_run_node (node=0x74cb20 <en_port_group>,
>> recompute_allowed=<optimized out>) at lib/inc-proc-eng.c:545
>>> engine_run (recompute_allowed=recompute_allowed@entry=true) at
>> lib/inc-proc-eng.c:571
>>> 0x000000000044d58b in inc_proc_northd_run 
>>> (ovnnb_txn=ovnnb_txn@entry=0xdd68060,
>> ovnsb_txn=ovnsb_txn@entry=0xdd66ed0, ctx=ctx@entry=0x7ffe3be7ab60) at
>> northd/inc-proc-northd.c:580
>>> 0x00000000004047d6 in main (argc=<optimized out>, argv=<optimized out>)
>> at northd/ovn-northd.c:1096
>>>
>>> Signed-off-by: Xavier Simonart <[email protected]>
>>> ---
>>
>> Hi Xavier,
>>
>> Thanks for the fix!
>>
>> I guess we're missing a "Fixes:" tag here.
>>
>> Also maybe it's a bit more readable if we skip the backtrace from the
>> commit message?  In the end what we need to say is that
>> sb_port_group_lookup_by_name() can return NULL if the (stale) record
>> it's searching for has been deleted in transactions that are processed
>> in the current iteration.  What do you think?
>>
> Looks much better.
> If you're ok, I would add the explanation you provided but would keep the
> last lines of the stack trace (maybe w/o the address).
> I found it sometimes easier when hitting a segfault and checking whether
> this is an already fixed one if the patch has some kind of stack trace.
> 

Sure, that's perfect.
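
For the commit message, the failure mode discussed above can be sketched in isolation. This is only a minimal illustration, not the northd code: `struct pg_row`, `pg_lookup_by_name()` and `pg_delete_if_present()` are hypothetical stand-ins for the SB Port_Group row, the index lookup and the delete call. The point is that a lookup for a stale name may return NULL when the row is already gone, so the delete has to be guarded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an SB Port_Group row; not the real IDL type. */
struct pg_row {
    const char *name;
    int deleted;        /* Nonzero once the row has been removed. */
};

/* Hypothetical lookup: returns NULL when no live row has the given name,
 * e.g. because the row was already deleted earlier in this iteration. */
static struct pg_row *
pg_lookup_by_name(struct pg_row *rows, size_t n, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!rows[i].deleted && !strcmp(rows[i].name, name)) {
            return &rows[i];
        }
    }
    return NULL;
}

/* Deletes the row only if the lookup found it.
 * Returns 1 if a row was deleted, 0 if there was nothing to delete. */
static int
pg_delete_if_present(struct pg_row *rows, size_t n, const char *name)
{
    struct pg_row *row = pg_lookup_by_name(rows, n, name);

    if (row) {
        row->deleted = 1;
        return 1;
    }
    return 0;
}
```

Calling `pg_delete_if_present()` twice for the same name is then harmless: the second call finds nothing and returns 0 instead of dereferencing a NULL row.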

>>
>>>  northd/en-port-group.c |  2 ++
>>>  tests/ovn-northd.at    | 26 ++++++++++++++++++++++++++
>>>  2 files changed, 28 insertions(+)
>>>
>>> diff --git a/northd/en-port-group.c b/northd/en-port-group.c
>>> index 476c0a18d..d0b7961fb 100644
>>> --- a/northd/en-port-group.c
>>> +++ b/northd/en-port-group.c
>>> @@ -613,7 +613,9 @@ port_group_nb_port_group_handler(struct engine_node
>> *node, void *data_)
>>>          const struct sbrec_port_group *sb_pg =
>>>              sb_port_group_lookup_by_name(sbrec_port_group_by_name,
>>>                                           stale_sb_port_group_name);
>>> +        if (sb_pg) {
>>>              sbrec_port_group_delete(sb_pg);
>>> +        }
>>
>> I think this is correct.
>>
>>>      }
>>>
>>>      sset_destroy(&stale_sb_port_groups);
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index 931064fe6..7cfd30c6b 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -18914,3 +18914,29 @@ wait_row_count Igmp_Group 0 address=mrouters
>>>  OVN_CLEANUP_NORTHD
>>>  AT_CLEANUP
>>>  ])
>>> +
>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>> +AT_SETUP([port group segfault])
>>> +
>>> +ovn_start
>>> +
>>> +check ovn-nbctl ls-add sw0
>>> +check ovn-nbctl lsp-add sw0 sw0-port1 -- lsp-set-addresses sw0-port1
>> "50:54:00:00:00:01 192.168.0.2"
>>
>> Nit: I'd split this long line.
>>
> Will do in v2
> 
>>
>>> +
>>> +check ovn-nbctl pg-add pg1
>>> +check ovn-nbctl pg-set-ports pg1 sw0-port1
>>
>> Wouldn't it be better to use --wait=sb here to make sure that the PG
>> actually has ports before northd goes to sleep?
>>
> Agreed. The only risk was ... potentially not hitting the issue in case of
> race condition, but it is better with --wait=sb.
> Will do in v2.
> 

I did try it locally (without the fix) and it crashed on each run.

>>
>>> +
>>> +sleep_sb
>>> +sleep_northd
>>> +check ovn-nbctl pg-del pg1
>>> +check ovn-nbctl pg-add pg1
>>> +check ovn-nbctl pg-set-ports pg1 sw0-port1
>>> +
>>> +wake_up_sb
>>> +wake_up_northd
>>> +check ovn-nbctl pg-del pg1
>>> +check ovn-nbctl --wait=hv sync
>>
>> Nit: not technically wrong, but I'd use --wait=sb instead of
>> --wait=hv here.
>>
> Agreed, will do in v2.
> 

Thanks!

Regards,
Dumitru

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
