On Tue, Feb 23, 2021 at 4:56 PM Dumitru Ceara <[email protected]> wrote:
>
> On 2/22/21 7:48 PM, [email protected] wrote:
> > From: Numan Siddique <[email protected]>
> >
> > In one of the scaled deployments, ovn-controller is asserting with the
> > below stack trace
> >
> >      ***
> >       (gdb) bt
> >         0  raise () from /lib64/libc.so.6
> >         1  abort () from /lib64/libc.so.6
> >         2  ovs_abort_valist ("%s: assertion %s failed in %s()") at 
> > lib/util.c:419
> >         3  vlog_abort_valist ("%s: assertion %s failed in %s()") at 
> > lib/vlog.c:1249
> >         4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
> >         5  ovs_assert_failure (where="controller/ofctrl.c:1216",
> >                                function="flood_remove_flows_for_sb_uuid",
> >                                
> > condition="ovs_list_is_empty(&f->list_node)") at lib/util.c:86
> >         6  flood_remove_flows_for_sb_uuid (sb_uuid=...134,
> >              flood_remove_nodes=...ef0) at controller/ofctrl.c:1216
> >         9  ofctrl_flood_remove_flows (flood_remove_nodes=...ef0) at 
> > controller/ofctrl.c:1280
> >         10 lflow_handle_changed_ref (ref_type=REF_TYPE_PORTGROUP,
> >              ref_name= "5564_pg_14...aac") at controller/lflow.c:522
> >         11 _flow_output_resource_ref_handler (ref_type=REF_TYPE_PORTGROUP)
> >              at controller/ovn-controller.c:2220
> >         12 engine_compute () at lib/inc-proc-eng.c:306
> >         13 engine_run_node (recompute_allowed=true) at 
> > lib/inc-proc-eng.c:352
> >         14 engine_run (recompute_allowed=true) at lib/inc-proc-eng.c:377
> >         15 main () at controller/ovn-controller.c:2887
> >      ***
> >
> > The reason for this assert is different from the assertion bug fixed in
> > the commit 858d1dd716("ofctrl: Fix the assert seen when flood removing
> > flows.")
> >
> > The above assertion is seen if lflow.c calls ofctrl_add_or_append_flow()
> > twice for the same flow uuid 'S'.  When this function is called for
> > the first time, a new desired flow 'f' is created and 'f->references'
> > will have [(S, f)].  When it is called for the 2nd time, the list
> > 'f->references' is updated as [(S, f), (S,f)].  Later when
> > flood_remove_flows_for_sb_uuid() is called for 'S', the above assertion
> > is seen.
> >
> > This scenarion can happen when ovn-controller receives an update message
> > from SB ovsdb-server in which a new logical flow 'F' belonging to a
> > datapath 'D' is added which would result in conjunction flows.  If this
> > datapath 'D' is added to 'local_datapaths' in the same loop, then
> > logical flow 'F' is first processed in lflow_add_flows_for_datapath()
> > and the second time in lflow_handle_changed_flows().
> >
> > This issue can be fixed in 2 ways
> > 1. Flood remove all the tracked flows in lflow_handle_changed_flows()
> >     first and then process the modified or newly added logical flows.
> >
> > 2. In the function ofctrl_add_or_append_flow(), check if there is
> >     already a reference for the flow uuid 'S' in the existing desired
> >     flows and only append the actions if it doesn't exist.
> >
> > This patch has taken the approach (1) to ensure correctness of flows.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1929978
> > Signed-off-by: Numan Siddique <[email protected]>
> > ---
>
> Looks good to me, thanks!
>
> Acked-by: Dumitru Ceara <[email protected]>

Thanks Dumitru. I applied this patch to the main branch and backported
upto branch-20.09.

Thanks
Numan

>
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to