On Thu, Feb 13, 2025 at 9:39 AM <[email protected]> wrote:

> [I'll squash this reply to make it more readable]
> On Thu, 2025-02-13 at 00:46 +0100, Dumitru Ceara wrote:
> > > > >
> > > > > diff --git a/northd/northd.c b/northd/northd.c
> > > > > index 9362624d63..9795c849f6 100644
> > > > > --- a/northd/northd.c
> > > > > +++ b/northd/northd.c
> > > > > @@ -11421,7 +11421,7 @@ build_lb_parsed_routes(const struct ovn_datapath *od,
> > > > >           *
> > > > >           * Advertise the LB IPs via all 'op' if this is a gateway router or
> > > > >           * throuh all DGPs of this distributed router otherwise. */
> > > > > -        struct ovn_port *op_ = CONST_CAST(struct ovn_port *, op);
> > > > > +        struct ovn_port *op_ = NULL;
> > > > >          size_t n_tracked_ports = !od->is_gw_router ? od->n_l3dgw_ports : 1;
> > > > >          struct ovn_port **tracked_ports = !od->is_gw_router
> > > > >                                            ? od->l3dgw_ports
> > > >
> > > > I also included this incremental change in the last version of
> > > > the
> > > > branch I linked above.
> > > >
> > > > Again, if all this looks good to you, feel free to use it in v6.
> > >
> > > Thanks, I will add it. In the meantime I'm also fighting with
> > > segfaults that appeared when I was adding tests. The tests add
> > > "stuff" to OVN in big batches and it crashes fairly consistently
> > > (though not always). This didn't occur in my manual testing.
> > > Sometimes the crashes are silent, but from time to time there are
> > > transaction errors complaining about references to non-existing DPs
> > > or PBs (both for logical ports and tracked ports). I don't presume
> > > that your patch will fix that, since it happens with pure LBs too.
> > > I wonder if it's enough to have a dependency on lr_stateful. Is it
> > > possible that we need to depend on something else? Like
> > > "en_sync_to_sb_pb"?
> > >
> >
> > Would it be possible to share such a test?  I can look into it
> > tomorrow.
> >
> > Thanks,
> > Dumitru
> > > > >
> One thing that I noticed when debugging the crash is that it occurs in
> the `ar_add_entry` function, but not always on the same line. This
> makes me suspect that we may be reading an address from a larger struct
> that is being destroyed in a different thread?
> Also, a few times when the transaction error popped up, it said that
> the transaction refers to a non-existing row with an ID like
> "a5a5a5a5-a5a5-a5a5-a5a5-a5a5a5a5a5a5", and that doesn't look like a
> random UUID to me :D
>
> Martin
>
>
a5 is the value inserted by clang when it analyzes memory usage, so there
is a use-after-free or an out-of-bounds read. Either way, there should be
a sanitizers.XXXX file.
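
To illustrate why that row ID points at poisoned memory rather than at a
real row, here is a minimal sketch in C. It assumes a simplified,
hypothetical UUID type (`sketch_uuid`, four 32-bit words; this is not the
OVS `struct uuid` API) and simply fills it with 0xa5 bytes, the way a
memory-debugging tool might fill a buffer. None of these helper names
exist in OVS or OVN.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit UUID; not the OVS 'struct uuid'. */
struct sketch_uuid {
    uint32_t parts[4];
};

/* Formats 'u' as 8-4-4-4-12 hex digits into 'buf' (36 chars plus NUL). */
static void
sketch_uuid_format(const struct sketch_uuid *u, char buf[37])
{
    int n = snprintf(buf, 37, "%08x-%04x-%04x-%04x-%04x%08x",
                     (unsigned) u->parts[0],
                     (unsigned) (u->parts[1] >> 16),
                     (unsigned) (u->parts[1] & 0xffff),
                     (unsigned) (u->parts[2] >> 16),
                     (unsigned) (u->parts[2] & 0xffff),
                     (unsigned) u->parts[3]);
    assert(n == 36);
}

/* Simulates reading a UUID out of memory that was filled with 0xa5,
 * e.g. a struct that has not been initialized or was already freed. */
static void
sketch_poisoned_uuid(char buf[37])
{
    struct sketch_uuid u;

    memset(&u, 0xa5, sizeof u);
    sketch_uuid_format(&u, buf);
}
```

Calling `sketch_poisoned_uuid(buf)` leaves the string
"a5a5a5a5-a5a5-a5a5-a5a5-a5a5a5a5a5a5" in `buf`, which matches the ID in
the transaction error. So the UUID was most likely read from memory that
held only the fill pattern.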

Regards,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
