On Tue, Oct 25, 2022 at 5:51 AM Han Zhou <[email protected]> wrote:

>
>
> On Sun, Oct 23, 2022 at 10:21 PM Ales Musil <[email protected]> wrote:
> >
> >
> >
> > On Sun, Oct 23, 2022 at 10:16 PM Han Zhou <[email protected]> wrote:
> >>
> >>
> >>
> >> On Fri, Oct 21, 2022 at 1:37 AM Ales Musil <[email protected]> wrote:
> >> >
> >> > Run ovs_list_init on list_node from desired flow.
> >> > Not doing that can potentially cause crash on
> >> > ovs_assert(ovs_list_is_empty(&f->list_node))
> >> > this might happen when we have conditional monitoring
> >> > enabled and have some flow like below being processed
> >> > by ofctrl in a single run.
> >> >
> >> > flowA is removed, flowA is added, flowA is removed.
> >> >
> >>
> >
> > Hi Han,
> >
> >>
> >> Thanks Ales for the patch. First of all I agree this patch is not
> harmful and is the right thing to do, so I am going to apply it.
> >> At the same time, I still don't understand the root cause analysis
> above and like you said, I am not sure if it would fix the bug. Two
> questions here:
> >
> >
> > The description is based on some assumptions and might not be the
> correct way to get into that state.
> >
> >>
> >> 1. How would it relate to conditional monitoring?
> >
> >
> > It is more likely to receive multiple updates at once with conditional
> monitoring or am I wrong?
> >
> I don't see a reason for this. I think it only depends on what changes
> happened in SB and notified and pending to ovn-controller during an
> iteration of ovn-controller control loop, which doesn't seem to have
> relationship with conditional monitoring. Do you have more details on the
> reasoning?
>

It was based on the configuration, however it was confirmed that it still
happens even with conditional monitoring disabled.


>
> >>
> >> 2. For the example "flowA is removed, flowA is added, flowA is
> removed", I assume flowA is "a desired OVS flow", not a logical flow,
> correct? So when flowA is removed, it should be removed from the desired
> flow table and added to the "tracked_flows". Then when flowA is added, it
> would in fact be a newly allocated desired_flow, and no chance to find the
> earlier tracked deleted flowA from the desired flow table. The 2nd flowA
> deletion would find the 2nd flowA instance and destroy it, but the first
> deleted desired_flow instance of flowA would not be referenced any more and
> have no chance to use the list_node before it is finally destroyed, right?
> >
> >
> > That makes sense, as I said, I was not able to reproduce it, the issue
> seemed to make sense under those conditions. Anyway I can update the commit
> message leaving the flows part out.
>
> Understand, please update the commit message.
> I tend to believe that the core dump was caused by some other bug that may
> result in corruption of the list/memory (in such case this change won't
> help), and also not sure if it is a problem only with this old version
> v20.12 or still a problem of the latest code.
>

Ack, I will update the commit message in  v2. We have also provided a build
so we will get feedback if it still happens with this patch or not. I will
postpone v2 until then.


>
> >
> > Also one thing to keep in mind, it was originally reported against
> v20.12. We should also backport it if possible.
>
> I am not sure if it is a candidate for backporting. Usually we backport
> critical bug fixes only, especially for such an old version. But in this
> case the change itself is so simple and we are almost sure it won't cause
> any harm, although no evidence of any benefit, either.
>
> What's your opinion of backporting this? Dumitru, Mark, Numan?
>
> Thanks,
> Han
> >
> > Thanks,
> > Ales
> >
> >>
> >>
> >> Thanks,
> >> Han
> >>
> >> > This is most likely related to BZ[0].
> >> >
> >> > [0] https://bugzilla.redhat.com/2125723
> >> > Suggested-By: Dumitru Ceara <[email protected]>
> >> > Signed-off-by: Ales Musil <[email protected]>
> >> > ---
> >> > I'm not adding the BZ as Reported-At because this might
> >> > not fix the issue they are facing. It seems to be very hard
> >> > to reproduce, however it still is a good idea to apply this
> >> > fix as in the context it's the right to do and shouldn't cause
> >> > any harm.
> >> > ---
> >> >  controller/ofctrl.c | 1 +
> >> >  1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/controller/ofctrl.c b/controller/ofctrl.c
> >> > index c77991258..fb0e0869c 100644
> >> > --- a/controller/ofctrl.c
> >> > +++ b/controller/ofctrl.c
> >> > @@ -1535,6 +1535,7 @@ remove_flows_from_sb_to_flow(struct
> ovn_desired_flow_table *flow_table,
> >> >              free(sfr);
> >> >          }
> >> >          ovs_list_remove(&f->list_node);
> >> > +        ovs_list_init(&f->list_node);
> >> >          if (log_msg) {
> >> >              ovn_flow_log(&f->flow, log_msg);
> >> >          }
> >> > --
> >> > 2.37.3
> >> >
> >> > _______________________________________________
> >> > dev mailing list
> >> > [email protected]
> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> >
> > --
> >
> > Ales Musil
> >
> > Senior Software Engineer - OVN Core
> >
> > Red Hat EMEA
> >
> > [email protected]    IM: amusil
>


Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to