On Tue, Oct 25, 2022 at 5:51 AM Han Zhou <[email protected]> wrote: > > > On Sun, Oct 23, 2022 at 10:21 PM Ales Musil <[email protected]> wrote: > > > > > > > > On Sun, Oct 23, 2022 at 10:16 PM Han Zhou <[email protected]> wrote: > >> > >> > >> > >> On Fri, Oct 21, 2022 at 1:37 AM Ales Musil <[email protected]> wrote: > >> > > >> > Run ovs_list_init on list_node from desired flow. > >> > Not doing that can potentially cause crash on > >> > ovs_assert(ovs_list_is_empty(&f->list_node)) > >> > this might happen when we have conditional monitoring > >> > enabled and have some flow like below being processed > >> > by ofctrl in a single run. > >> > > >> > flowA is removed, flowA is added, flowA is removed. > >> > > >> > > > > Hi Han, > > > >> > >> Thanks Ales for the patch. First of all I agree this patch is not > harmful and is the right thing to do, so I am going to apply it. > >> At the same time, I still don't understand the root cause analysis > above and like you said, I am not sure if it would fix the bug. Two > questions here: > > > > > > The description is based on some assumptions and might not be the > correct way to get into that state. > > > >> > >> 1. How would it relate to conditional monitoring? > > > > > > It is more likely to receive multiple updates at once with conditional > monitoring or am I wrong? > > > I don't see a reason for this. I think it only depends on what changes > happened in SB and notified and pending to ovn-controller during an > iteration of ovn-controller control loop, which doesn't seem to have > relationship with conditional monitoring. Do you have more details on the > reasoning? >
It was based on the configuration, however it was confirmed that it still happens even with conditional monitoring disabled. > > >> > >> 2. For the example "flowA is removed, flowA is added, flowA is > removed", I assume flowA is "a desired OVS flow", not a logical flow, > correct? So when flowA is removed, it should be removed from the desired > flow table and added to the "tracked_flows". Then when flowA is added, it > would in fact be a newly allocated desired_flow, and no chance to find the > earlier tracked deleted flowA from the desired flow table. The 2nd flowA > deletion would find the 2nd flowA instance and destroy it, but the first > deleted desired_flow instance of flowA would not be referenced any more and > have no chance to use the list_node before it is finally destroyed, right? > > > > > > That makes sense, as I said, I was not able to reproduce it, the issue > seemed to make sense under those conditions. Anyway I can update the commit > message leaving the flows part out. > > Understand, please update the commit message. > I tend to believe that the core dump was caused by some other bug that may > result in corruption of the list/memory (in such case this change won't > help), and also not sure if it is a problem only with this old version > v20.12 or still a problem of the latest code. > Ack, I will update the commit message in v2. We have also provided a build so we will get feedback if it still happens with this patch or not. I will postpone v2 until then. > > > > > Also one thing to keep in mind, it was originally reported against > v20.12. We should also backport it if possible. > > I am not sure if it is a candidate for backporting. Usually we backport > critical bug fixes only, especially for such an old version. But in this > case the change itself is so simple and we are almost sure it won't cause > any harm, although no evidence of any benefit, either. > > What's your opinion of backporting this? Dumitru, Mark, Numan? > > Thanks, > Han > > > > Thanks, > > Ales > > > >> > >> > >> Thanks, > >> Han > >> > >> > This is most likely related to BZ[0]. > >> > > >> > [0] https://bugzilla.redhat.com/2125723 > >> > Suggested-By: Dumitru Ceara <[email protected]> > >> > Signed-off-by: Ales Musil <[email protected]> > >> > --- > >> > I'm not adding the BZ as Reported-At because this might > >> > not fix the issue they are facing. It seems to be very hard > >> > to reproduce, however it still is a good idea to apply this > >> > fix as in the context it's the right to do and shouldn't cause > >> > any harm. > >> > --- > >> > controller/ofctrl.c | 1 + > >> > 1 file changed, 1 insertion(+) > >> > > >> > diff --git a/controller/ofctrl.c b/controller/ofctrl.c > >> > index c77991258..fb0e0869c 100644 > >> > --- a/controller/ofctrl.c > >> > +++ b/controller/ofctrl.c > >> > @@ -1535,6 +1535,7 @@ remove_flows_from_sb_to_flow(struct > ovn_desired_flow_table *flow_table, > >> > free(sfr); > >> > } > >> > ovs_list_remove(&f->list_node); > >> > + ovs_list_init(&f->list_node); > >> > if (log_msg) { > >> > ovn_flow_log(&f->flow, log_msg); > >> > } > >> > -- > >> > 2.37.3 > >> > > >> > _______________________________________________ > >> > dev mailing list > >> > [email protected] > >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > -- > > > > Ales Musil > > > > Senior Software Engineer - OVN Core > > > > Red Hat EMEA > > > > [email protected] IM: amusil > Thanks, Ales -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
