Hello all, has this issue been fixed in any newer OVN release?
[Mon Jan 17 10:05:30 2022] openvswitch: ovs-system: deferred action limit reached, drop recirc action
[Mon Jan 17 10:05:31 2022] openvswitch: ovs-system: deferred action limit reached, drop recirc action
[Mon Jan 17 10:05:31 2022] openvswitch: ovs-system: deferred action limit reached, drop recirc action
[Mon Jan 17 10:05:31 2022] openvswitch: ovs-system: deferred action limit reached, drop recirc action

Ammad

On Tue, Nov 2, 2021 at 11:59 AM Ammad Syed <syedamma...@gmail.com> wrote:

> Hi,
>
> I just upgraded my ovn and ovs to the latest releases, i.e. ovn 21.09 and
> ovs 2.16.0. I am still getting the same messages in my dmesg logs.
>
> The issue can be reproduced with the steps below.
>
> - Add a neutron router.
> - Set its external gateway.
> - Add a local network subnet with a router. In my case geneve is the tenant
>   network and vlan is the provider external network.
> - Now try to access the SNAT public / external IP that is assigned to the
>   router by any means (you can just put that IP in your web browser and
>   press enter); you will see the logs below in dmesg.
> - The logs can only be seen on the external gateway chassis.
>
> [Tue Nov 2 11:48:12 2021] openvswitch: ovs-system: deferred action limit reached, drop recirc action
> [Tue Nov 2 11:48:19 2021] openvswitch: ovs-system: deferred action limit reached, drop recirc action
> [Tue Nov 2 11:48:39 2021] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>
> - Ammad
>
>
> On Thu, Sep 9, 2021 at 7:25 PM Odintsov Vladislav <vlodint...@croc.ru> wrote:
>
>> Hi Han,
>>
>> I’ll try to answer the first question to move this discussion forward.
>>
>> Below is the output of ovs-appctl ofproto/trace <flow> | ovn-detrace for my topology.
>> It shows the last stages of the lr egress pipeline and the jump to lr ingress.
>> The full output is in the attachment.
>> Hope this can help.
>>
>>
>> 25.
metadata=0x4, priority 0, cookie 0xb4d0917
>>     resubmit(,26)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   *     "lr0" (f0acad28-1531-4c32-98f1-6e95c528c2a5) [ingress]
>>   * Logical flow: table=17 (lr_in_larger_pkts), priority=0, match=(1), actions=(next;)
>> 26. reg15=0x1,metadata=0x4, priority 50, cookie 0x1d634149
>>     set_field:0x2->reg15
>>     resubmit(,27)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   * Logical flow: table=18 (lr_in_gw_redirect), priority=50, match=(outport == "lr0-wan), actions=(outport = "cr-lr0-wan"; next;)
>>   * Logical Router Port: lr0-wan mac 0e:01:aa:29:41:03 networks ['172.16.0.1/32'] ipv6_ra_configs {}
>> 27. metadata=0x4, priority 0, cookie 0x433abe7d
>>     resubmit(,37)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   *     "lr0" (f0acad28-1531-4c32-98f1-6e95c528c2a5) [ingress]
>>   * Logical flow: table=19 (lr_in_arp_request), priority=0, match=(1), actions=(output;)
>> 37. priority 0
>>     resubmit(,38)
>> 38. reg15=0x2,metadata=0x4, priority 100, cookie 0xf7faafb5
>>     set_field:0x1->reg15
>>     set_field:0x9->reg11
>>     set_field:0xb->reg12
>>     resubmit(,39)
>>   * Logical datapath: "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2)
>>   * Port Binding: logical_port "cr-lr0-wan", tunnel_key 2, chassis-name "ai10", chassis-str "ai10.ai315t.int.c2.croc.ru"
>> 39. priority 0
>>     set_field:0->reg0
>>     set_field:0->reg1
>>     set_field:0->reg2
>>     set_field:0->reg3
>>     set_field:0->reg4
>>     set_field:0->reg5
>>     set_field:0->reg6
>>     set_field:0->reg7
>>     set_field:0->reg8
>>     set_field:0->reg9
>>     resubmit(,40)
>> 40. ip,metadata=0x4, priority 50, cookie 0x851809e6
>>     set_field:0x1/0x1->reg10
>>     ct(table=41,zone=NXM_NX_REG11[0..15],nat)
>>     nat
>>      -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 41.
>>      -> Sets the packet to an untracked state, and clears all the conntrack fields.
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [egress]
>>   * Logical flow: table=0 (lr_out_undnat), priority=50, match=(ip), actions=(flags.loopback = 1; ct_dnat;)
>>
>> Final flow: recirc_id=0x3223,eth,tcp,reg10=0x1,reg11=0x9,reg12=0xb,reg14=0x1,reg15=0x1,metadata=0x4,in_port=132,vlan_tci=0x0000,dl_src=0e:01:aa:29:41:03,dl_dst=0e:01:aa:29:41:03,nw_src=10.0.0.21,nw_dst=172.16.0.1,nw_tos=0,nw_ecn=0,nw_ttl=56,tp_src=0,tp_dst=22,tcp_flags=0
>> Megaflow: recirc_id=0x3223,ct_state=+new-est-rel-rpl-inv+trk,ct_label=0/0x1,eth,ip,in_port=132,dl_src=84:3d:c6:da:5f:ff,dl_dst=0e:01:aa:29:41:03,nw_src=0.0.0.0/1,nw_dst=172.16.0.1,nw_ttl=57,nw_frag=no
>> Datapath actions: set(eth(src=0e:01:aa:29:41:03)),set(ipv4(ttl=56)),ct(zone=9,nat),recirc(0x329c)
>>
>>
>> ===============================================================================
>> recirc(0x329c) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
>> Replacing src/dst IP/ports to simulate NAT:
>>  Initial flow:
>>  Modified flow:
>> ===============================================================================
>>
>> Flow: recirc_id=0x329c,ct_state=new|trk,ct_zone=9,eth,tcp,reg10=0x1,reg11=0x9,reg12=0xb,reg14=0x1,reg15=0x1,metadata=0x4,in_port=132,vlan_tci=0x0000,dl_src=0e:01:aa:29:41:03,dl_dst=0e:01:aa:29:41:03,nw_src=10.0.0.21,nw_dst=172.16.0.1,nw_tos=0,nw_ecn=0,nw_ttl=56,tp_src=0,tp_dst=22,tcp_flags=0
>>
>> bridge("internet")
>> ------------------
>>     thaw
>>         Resuming from table 41
>> 41. ct_state=+new+trk,ip,metadata=0x4, priority 50, cookie 0xc26ec65c
>>     ct(commit,zone=NXM_NX_REG11[0..15],nat(src))
>>     nat(src)
>>      -> Sets the packet to an untracked state, and clears all the conntrack fields.
>>     resubmit(,42)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [egress]
>>   * Logical flow: table=1 (lr_out_post_undnat), priority=50, match=(ip && ct.new), actions=(ct_commit { } ; next; )
>> 42.
metadata=0x4, priority 0, cookie 0xa404a92a
>>     resubmit(,43)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [egress]
>>   *     "lr0" (f0acad28-1531-4c32-98f1-6e95c528c2a5) [egress]
>>   * Logical flow: table=2 (lr_out_snat), priority=0, match=(1), actions=(next;)
>> 43. ip,reg15=0x1,metadata=0x4,nw_dst=172.16.0.1, priority 100, cookie 0x48dfa8d3
>>     clone(ct_clear,move:NXM_NX_REG15[]->NXM_NX_REG14[],set_field:0->reg15,push:NXM_OF_ETH_SRC[],push:NXM_OF_ETH_DST[],pop:NXM_OF_ETH_SRC[],pop:NXM_OF_ETH_DST[],set_field:0->reg10,set_field:0x1/0x1->reg10,set_field:0/0xffffffff000000000000000000000000->xxreg0,set_field:0/0xffffffff0000000000000000->xxreg0,set_field:0/0xffffffff00000000->xxreg0,set_field:0/0xffffffff->xxreg0,set_field:0/0xffffffff000000000000000000000000->xxreg1,set_field:0/0xffffffff0000000000000000->xxreg1,set_field:0/0xffffffff00000000->xxreg1,set_field:0/0xffffffff->xxreg1,set_field:0/0xffffffff00000000->xreg4,set_field:0/0xffffffff->xreg4,set_field:0x1/0x1->xreg4,resubmit(,8))
>>     ct_clear
>>     move:NXM_NX_REG15[]->NXM_NX_REG14[]
>>      -> NXM_NX_REG14[] is now 0x1
>>     set_field:0->reg15
>>     push:NXM_OF_ETH_SRC[]
>>     push:NXM_OF_ETH_DST[]
>>     pop:NXM_OF_ETH_SRC[]
>>      -> NXM_OF_ETH_SRC[] is now 0e:01:aa:29:41:03
>>     pop:NXM_OF_ETH_DST[]
>>      -> NXM_OF_ETH_DST[] is now 0e:01:aa:29:41:03
>>     set_field:0->reg10
>>     set_field:0x1/0x1->reg10
>>     set_field:0/0xffffffff000000000000000000000000->xxreg0
>>     set_field:0/0xffffffff0000000000000000->xxreg0
>>     set_field:0/0xffffffff00000000->xxreg0
>>     set_field:0/0xffffffff->xxreg0
>>     set_field:0/0xffffffff000000000000000000000000->xxreg1
>>     set_field:0/0xffffffff0000000000000000->xxreg1
>>     set_field:0/0xffffffff00000000->xxreg1
>>     set_field:0/0xffffffff->xxreg1
>>     set_field:0/0xffffffff00000000->xreg4
>>     set_field:0/0xffffffff->xreg4
>>     set_field:0x1/0x1->xreg4
>>     resubmit(,8)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [egress]
>>   * Logical flow: table=3
(lr_out_egr_loop), priority=100, match=(ip4.dst == 172.16.0.1 && outport == "lr0-wan" && is_chassis_resident("cr-lr0-wan")), actions=(clone { ct_clear; inport = outport; outport = ""; eth.dst <-> eth.src; flags = 0; flags.loopback = 1; reg0 = 0; reg1 = 0; reg2 = 0; reg3 = 0; reg4 = 0; reg5 = 0; reg6 = 0; reg7 = 0; reg8 = 0; reg9 = 0; reg9[0] = 1; next(pipeline=ingress, table=0); };)
>>   * NAT: external IP 172.16.0.1 external_mac [] logical_ip 192.168.0.0/16 logical_port [] type snat
>>  8. reg14=0x1,metadata=0x4,dl_dst=0e:01:aa:29:41:03, priority 50, cookie 0xce77dac6
>>     set_field:0xe01aa2941030000000000000000/0xffffffffffff0000000000000000->xxreg0
>>     resubmit(,9)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   * Logical flow: table=0 (lr_in_admission), priority=50, match=(eth.dst == 0e:01:aa:29:41:03 && inport == "lr0-wan" && is_chassis_resident("cr-lr0-wan")), actions=(xreg0[0..47] = 0e:01:aa:29:41:03; next;)
>>   * Logical Router Port: lr0-wan mac 0e:01:aa:29:41:03 networks ['172.16.0.1/32'] ipv6_ra_configs {}
>>  9. metadata=0x4, priority 0, cookie 0x27b6069b
>>     set_field:0x4/0x4->xreg4
>>     resubmit(,10)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   *     "lr0" (f0acad28-1531-4c32-98f1-6e95c528c2a5) [ingress]
>>   * Logical flow: table=1 (lr_in_lookup_neighbor), priority=0, match=(1), actions=(reg9[2] = 1; next;)
>> 10.
reg9=0x4/0x4,metadata=0x4, priority 100, cookie 0xebcd30a8
>>     resubmit(,11)
>>   * Logical datapaths:
>>   *     "lr0-edge" (c55eb989-eda9-47b9-8b34-e898dc1c6be2) [ingress]
>>   *     "lr0" (f0acad28-1531-4c32-98f1-6e95c528c2a5) [ingress]
>>   * Logical flow: table=2 (lr_in_learn_neighbor), priority=100, match=(reg9[2] == 1), actions=(next;)
>>
>>
>> Regards,
>> Vladislav Odintsov
>>
>> On 4 Aug 2021, at 21:02, Han Zhou <hz...@ovn.org> wrote:
>>
>> On Wed, Aug 4, 2021 at 6:41 AM Numan Siddique <num...@ovn.org> wrote:
>> >
>> > On Wed, Aug 4, 2021 at 4:17 AM Krzysztof Klimonda
>> > <kklimo...@syntaxhighlighted.com> wrote:
>> > >
>> > > Hi Ammad,
>> > >
>> > > (Re-adding ovs-discuss@openvswitch.org to CC to keep track of the discussion)
>> > >
>> > > Thanks for testing it with SNAT enabled/disabled and verifying that it seems to be related.
>> > >
>> > > As for the impact of this bug, I have to say I'm unsure. I have theorized that this could be the cause of (or at least connected to) BFD sessions being dropped between gateway chassis, but I couldn't really validate it.
>> > >
>> > > My linked patch is pretty old and no longer applies cleanly on master, but I'd be interested in getting some feedback from developers on whether I'm even fixing the right thing.
>> >
>> > Hi Krzysztof,
>> >
>> > Your patch is in the "change requested" stage. I see from the comment
>> > that the ddlog part of the code is missing.
>> >
>> > Seems like a valid case to me. The issue is seen when the packet is
>> > destined to the router port IP, right?
>> >
>> > In the case of ovn-kubernetes, the router port IP is also used as a
>> > load balancer backend IP.
>> >
>> > Will your patch have any impact if the logical router has this load
>> > balancer configured? (for the system test case you've added)
>> >
>> > ovn-nbctl lb-add lb1 172.16.1.254:90 192.168.1.100:90
>> > ovn-nbctl lr-lb-add R1 lb1
>> >
>> > Can you please repost the patch for further review.
It would be great
>> > if you can add ddlog code. Or you can repost the patch
>> > and the ddlog part can be added if the reviewers are fine with the patch.
>> >
>> > Thanks
>> > Numan
>> >
>>
>> Thanks Krzysztof, this is interesting. Could you share more on the root
>> cause since you debugged it - how did the loop happen? When a packet
>> destined to the SNAT IP hits the router ingress pipeline, what's the next
>> hop? How is the L2 dst populated for the dst IP, and how is the packet
>> forwarded back to the router pipeline? How did a /32 IP (instead of a
>> subnet) in the SNAT config make a difference?
>>
>> > >
>> > > Regards,
>> > > Krzysztof
>> > >
>> > > On Wed, Aug 4, 2021, at 09:02, Ammad Syed wrote:
>> > > > I am able to reproduce this issue with an SNAT-enabled network:
>> > > > accessing the SNAT IP from the external network triggers it.
>> > > > If I keep SNAT disabled, I don't see these logs in syslog.
>> > > >
>> > > > Ammad
>> > > >
>> > > > On Tue, Aug 3, 2021 at 6:39 PM Ammad Syed <syedamma...@gmail.com> wrote:
>> > > > > Thanks. Let me try to reproduce it this way.
>> > > > >
>> > > > > Can you please advise if this will cause any trouble if we have this bug in production? Any workaround to avoid this issue?
>> > > > >
>> > > > > Ammad
>> > > > >
>> > > > > On Tue, Aug 3, 2021 at 5:56 PM Krzysztof Klimonda <kklimo...@syntaxhighlighted.com> wrote:
>> > > > >> Hi,
>> > > > >>
>> > > > >> To reproduce it (on openstack, although the issue does not seem to be openstack-specific) I've created a network with SNAT enabled (which is the default) and set its external gateway to my external network. Next, I've tried establishing a TCP session from the outside to the IP address assigned to the router and checked dmesg on the chassis that the port is assigned to for "ovs-system: deferred action limit reached, drop recirc action" messages.
>> > > > >>
>> > > > >> Best Regards,
>> > > > >> Krzysztof
>> > > > >>
>> > > > >> On Tue, Aug 3, 2021, at 09:05, Ammad Syed wrote:
>> > > > >> > Hi Krzysztof,
>> > > > >> >
>> > > > >> > Yes, I might be hitting this issue. How can I check if there is any
>> > > > >> > loop in lflow-list?
>> > > > >> >
>> > > > >> > Ammad
>> > > > >> >
>> > > > >> > On Tue, Aug 3, 2021 at 2:14 AM Krzysztof Klimonda
>> > > > >> > <kklimo...@syntaxhighlighted.com> wrote:
>> > > > >> > > Hi,
>> > > > >> > >
>> > > > >> > > Not sure if it's related, but I've seen this bug in the ovn 20.12 release, where a routing loop was related to flows created to handle SNAT. I sent an RFC patch a few months back but didn't really have time to follow up on it since then to get some feedback: https://www.mail-archive.com/ovs-dev@openvswitch.org/msg53195.html
>> > > > >> > > I was planning on re-testing it with the 21.06 release and following up on the patch.
>> > > > >> > >
>> > > > >> > > On Mon, Aug 2, 2021, at 21:31, Han Zhou wrote:
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > > On Mon, Aug 2, 2021 at 5:07 AM Ammad Syed <syedamma...@gmail.com> wrote:
>> > > > >> > > > >
>> > > > >> > > > > Hello,
>> > > > >> > > > >
>> > > > >> > > > > I am using openstack with OVN 20.12 and OVS 2.15.0 on ubuntu 20.04. I am using a geneve tenant network and a vlan provider network.
>> > > > >> > > > >
>> > > > >> > > > > I am continuously getting the messages below in my dmesg logs, on compute node 1 only; the other two compute nodes have no such messages.
>> > > > >> > > > >
>> > > > >> > > > > [275612.826698] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>> > > > >> > > > > [275683.750343] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>> > > > >> > > > > [276102.200772] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>> > > > >> > > > > [276161.575494] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>> > > > >> > > > > [276210.262524] openvswitch: ovs-system: deferred action limit reached, drop recirc action
>> > > > >> > > > >
>> > > > >> > > > > I have tried reinstalling compute node 1 (OS and everything) but I am still getting the same errors.
>> > > > >> > > > >
>> > > > >> > > > > I need your advice.
>> > > > >> > > > >
>> > > > >> > > > > --
>> > > > >> > > > > Regards,
>> > > > >> > > > >
>> > > > >> > > > > Syed Ammad Ali
>> > > > >> > > > > _______________________________________________
>> > > > >> > > > > discuss mailing list
>> > > > >> > > > > disc...@openvswitch.org
>> > > > >> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>> > > > >> > > >
>> > > > >> > > > Hi Syed,
>> > > > >> > > >
>> > > > >> > > > Could you check if you have routing loops (i.e. a packet being routed
>> > > > >> > > > back and forth between logical routers infinitely) in your logical
>> > > > >> > > > topology?
>> > > > >> > > >
>> > > > >> > > > Thanks,
>> > > > >> > > > Han

--
Regards,
Syed Ammad Ali
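
A note for readers following the trace in this thread: the jump from OpenFlow table 43 back to table 8 is where the lr_out_egr_loop flow hands the packet from the router egress pipeline back to ingress (`next(pipeline=ingress, table=0)`). Below is a minimal Python sketch, not part of the original messages, of how one might scan saved `ovs-appctl ofproto/trace` output for such backward jumps. The function names, the regular expression, and the abbreviated SAMPLE text are assumptions made for illustration, and a backward jump on its own does not prove a loop; it only marks a spot worth inspecting.

```python
import re

# Assumption: every rule line in the trace starts with "<table>. ", possibly
# preceded by whitespace or mail quote markers such as ">> ".
TABLE_LINE = re.compile(r"^[>\s]*(\d+)\.\s")


def visited_tables(trace_text):
    """Return the OpenFlow table numbers in the order the trace visits them."""
    tables = []
    for line in trace_text.splitlines():
        match = TABLE_LINE.match(line)
        if match:
            tables.append(int(match.group(1)))
    return tables


def backward_jumps(tables):
    """Return (from_table, to_table) pairs where the table number decreases."""
    return [(a, b) for a, b in zip(tables, tables[1:]) if b < a]


# Abbreviated excerpt modelled on the trace in this thread (actions shortened).
SAMPLE = """\
42. metadata=0x4, priority 0, cookie 0xa404a92a
    resubmit(,43)
43. ip,reg15=0x1,metadata=0x4,nw_dst=172.16.0.1, priority 100, cookie 0x48dfa8d3
    resubmit(,8)
 8. reg14=0x1,metadata=0x4,dl_dst=0e:01:aa:29:41:03, priority 50, cookie 0xce77dac6
    resubmit(,9)
 9. metadata=0x4, priority 0, cookie 0x27b6069b
    resubmit(,10)
"""

if __name__ == "__main__":
    print(backward_jumps(visited_tables(SAMPLE)))  # prints [(43, 8)]
```

To use it on a real trace, save the command output to a file and pass its contents to `visited_tables`; each reported pair can then be matched against the logical flow that ovn-detrace prints for the "from" table.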