On Thu, Feb 13, 2025 at 9:39 AM <[email protected]> wrote:
> [I'll squash this reply to make it more readable]
> On Thu, 2025-02-13 at 00:46 +0100, Dumitru Ceara wrote:
> > > >
> > > > > diff --git a/northd/northd.c b/northd/northd.c
> > > > > index 9362624d63..9795c849f6 100644
> > > > > --- a/northd/northd.c
> > > > > +++ b/northd/northd.c
> > > > > @@ -11421,7 +11421,7 @@ build_lb_parsed_routes(const struct ovn_datapath *od,
> > > > >       *
> > > > >       * Advertise the LB IPs via all 'op' if this is a gateway router or
> > > > >       * throuh all DGPs of this distributed router otherwise.
> > > > >       */
> > > > > -    struct ovn_port *op_ = CONST_CAST(struct ovn_port *, op);
> > > > > +    struct ovn_port *op_ = NULL;
> > > > >      size_t n_tracked_ports = !od->is_gw_router ? od->n_l3dgw_ports : 1;
> > > > >      struct ovn_port **tracked_ports = !od->is_gw_router
> > > > >          ? od->l3dgw_ports
> > > >
> > > > I also included this incremental change in the last version of
> > > > the branch I linked above.
> > > >
> > > > Again, if all this looks good to you, feel free to use it in v6.
> > >
> > > Thanks, I will add it. In the meantime I'm also fighting with
> > > segfaults that appeared when I was adding tests. Tests add "stuff"
> > > to the OVN in big batches and it crashes fairly consistently
> > > (though not always). This didn't occur in my manual testing.
> > > Sometimes the crashes are silent, but from time to time, there are
> > > transaction errors complaining about referencing non-existing DPs
> > > or PBs (both for logical ports and tracked ports). I don't presume
> > > that your patch will fix that, since it happens with pure LBs too.
> > > I wonder if it's enough to have a dependency on lr_stateful. Is it
> > > possible that we need to depend on something else? Like
> > > "en_sync_to_sb_pb"?
> > >
> >
> > Would it be possible to share such a test? I can look into it
> > tomorrow.
> >
> > Thanks,
> > Dumitru
> >
>
> One thing that I noticed when debugging the crash is that it occurs in
> the `ar_add_entry` function, but it's not always on the same line. This
> makes me suspect that we are maybe trying to read an address from a
> larger struct that is getting destroyed in a different thread?
> Also, a few times when the transaction error popped up, it was saying
> that the transaction refers to a non-existing row with an ID like
> "a5a5a5a5-a5a5-a5a5-a5a5-a5a5a5a5a5a5" and that doesn't look like a
> random UUID to me :D
>
> Martin
>

a5 is the value inserted by clang when it analyzes memory usage, so there
is a use-after-free or an out-of-bounds read. Either way, there should be
a sanitizers.XXXX file.

Regards,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
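
To make the "a5a5a5a5-..." observation concrete, here is a minimal sketch
of a check that flags a 16-byte ID consisting entirely of one fill byte.
It assumes the fill byte is 0xa5, as seen in the thread. The names
`example_uuid`, `is_filled_with` and `uuid_looks_poisoned` are hypothetical
and are not part of OVS or OVN, whose own `struct uuid` has a different
layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte ID holder, for illustration only. */
struct example_uuid {
    uint8_t bytes[16];
};

/* Returns true if 'len' is nonzero and every byte of 'buf' equals 'fill'. */
static bool
is_filled_with(const void *buf, size_t len, uint8_t fill)
{
    const uint8_t *p = buf;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != fill) {
            return false;
        }
    }
    return len > 0;
}

/* Returns true if 'uuid' is made up solely of the 0xa5 byte, which hints
 * that it was read from poisoned memory rather than from a live row.
 * The 0xa5 value is an assumption taken from the thread. */
static bool
uuid_looks_poisoned(const struct example_uuid *uuid)
{
    return is_filled_with(uuid->bytes, sizeof uuid->bytes, 0xa5);
}
```

A check like this could sit in a debug-only assertion just before a row
reference is added to a transaction, so the failure shows up at the point
where the stale pointer is read instead of later, as a transaction error.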
