Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron
On Thu, Feb 4, 2021 at 7:39 PM Michał Nasiadka wrote:
> Hello Numan,
>
> But the latest package in the CentOS NFV SIG repo was built/uploaded on
> 16 Dec 2020. Are you sure it contains the fix?

I think you need to check with the CentOS NFV/RDO team for that.

Thanks
Numan
On Thu, Feb 4, 2021 at 7:39 PM, Michał Nasiadka wrote:

Hello Numan,

But the latest package in the CentOS NFV SIG repo was built/uploaded on
16 Dec 2020. Are you sure it contains the fix?
On Thu, Feb 4, 2021 at 13:53, Numan Siddique wrote:

On Thu, Feb 4, 2021 at 5:04 PM Michał Nasiadka wrote:
> Hello,
>
> I've been hitting the same issue with OVN 20.09 from the CentOS NFV SIG
> repo - is there a chance to backport this change to 20.09?

It's already been backported a while ago:

https://github.com/ovn-org/ovn/commit/ab1e46d7ec7a2ba80d68fbf6a45d09eef3269541

Numan
On Thu, Feb 4, 2021 at 5:04 PM, Michał Nasiadka wrote:

Hello,

I've been hitting the same issue with OVN 20.09 from the CentOS NFV SIG
repo - is there a chance to backport this change to 20.09?
On Thu, Dec 17, 2020 at 11:05, Daniel Alvarez Sanchez wrote:

On Tue, Dec 15, 2020 at 11:39 AM Krzysztof Klimonda <kklimo...@syntaxhighlighted.com> wrote:
> Hi,
>
> Just as a quick update - I've updated our OVN version to a 20.12.0
> snapshot (d8bc0377c) and so far the problem hasn't reoccurred after over
> 24 hours of tempest testing.

We could reproduce the issue with 20.12 and master. This is also not
related exclusively to localports - potentially any port can be affected.
Dumitru posted a fix for this:

http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/

Thanks!
daniel
On Tue, Dec 15, 2020 at 11:39 AM, Krzysztof Klimonda wrote:

Hi,

Just as a quick update - I've updated our OVN version to a 20.12.0 snapshot
(d8bc0377c) and so far the problem hasn't reoccurred after over 24 hours of
tempest testing.

Best Regards,
-Chris
On Tue, Dec 15, 2020, at 11:13, Daniel Alvarez Sanchez wrote:

Hey Krzysztof,

On Fri, Nov 20, 2020 at 1:17 PM Krzysztof Klimonda <kklimo...@syntaxhighlighted.com> wrote:
> Hi,
>
> Doing some tempest runs on our pre-prod environment (stable/ussuri with
> the OVN 20.06.2 release) I've noticed that some network connectivity tests
> were failing randomly. I've reproduced that by continuously rescuing and
> unrescuing an instance - network connectivity from and to the VM works in
> general (DHCP is fine, access from outside is fine), but the VM has no
> access to its metadata server (via the 169.254.169.254 IP address).
> Tracing a packet from the VM to metadata via:
>
> 8<8<8<
> ovs-appctl ofproto/trace br-int in_port=tapa489d406-91,dl_src=fa:16:3e:2c:b0:fd,dl_dst=fa:16:3e:8b:b5:39
> 8<8<8<
>
> ends with
>
> 8<8<8<
> 65. reg15=0x1,metadata=0x97e, priority 100, cookie 0x15ec4875
>     output:1187
>     >> Nonexistent output port
> 8<8<8<
>
> And I can verify that there is no flow for the actual ovnmeta tap
> interface (tap67731b0a-c0):
>
> 8<8<8<
> # docker exec -it openvswitch_vswitchd ovs-ofctl dump-flows br-int | grep -E output:'("tap67731b0a-c0"|1187)'
> cookie=0x15ec4875, duration=1868.378s, table=65, n_packets=524, n_bytes=40856, priority=100,reg15=0x1,metadata=0x97e actions=output:1187
> #
> 8<8<8<
>
> From ovs-vswitchd.log it seems the interface tap67731b0a-c0 was added
> with index 1187, then deleted, and re-added with index 1189 - probably
> because that is the only VM in that network and I'm constantly hard
> rebooting it via rescue/unrescue:
>
> 8<8<8<
> 2020-11-20T11:41:18.347Z|08043|bridge|INFO|bridge br-int: added interface tap67731b0a-c0 on port 1187
> 2020-11-20T11:41:30.813Z|08044|bridge|INFO|bridge br-int: deleted interface tapa489d406-91 on port 1186
> 2020-11-20T11:41:30.816Z|08045|bridge|WARN|could not open network device tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.040Z|08046|bridge|INFO|bridge br-int: deleted interface tap67731b0a-c0 on port 1187
> 2020-11-20T11:41:31.044Z|08047|bridge|WARN|could not open network device tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.050Z|08048|bridge|WARN|could not open network device tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.235Z|08049|connmgr|INFO|br-int<->unix#31: 2069 flow_mods in the last 43 s (858 adds, 814 deletes, 397 modifications)
> 2020-11-20T11:41:33.057Z|08050|bridge|INFO|bridge br-int: added interface tapa489d406-91 on port 1188
> 2020-11-20T11:41:33.582Z|08051|bridge|INFO|bridge br-int: added interface tap67731b0a-c0 on port 1189
> 2020-11-20T11:42:31.235Z|08052|connmgr|INFO|br-int<->unix#31: 168 flow_mods in the 2 s starting 59 s ago (114 adds, 10 deletes, 44 modifications)
> 8<8<8<
>
> Once I restart ovn-controller it recalculates the local OVS flows and the
> problem is fixed, so I'm assuming it's a local problem and not related to
> the NB and SB databases.
>
> --
> Krzysztof Klimonda
> kklimo...@syntaxhighlighted.com

I have seen exactly the same with 20.09; for the same port, the input and
output ofports do not match:

bash-4.4# ovs-ofctl dump-flows br-int table=0 | grep 745
cookie=0x38937d8e, duration=40387.372s, table=0, n_packets=1863, n_bytes=111678, idle_age=1, priority=100,in_port=745 actions=load:0x4b->NXM_NX_REG13[],load:0x6a->NXM_NX_REG11[],load:0x69->NXM_NX_REG12[],load:0x18d->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)

bash-4.4# ovs-ofctl dump-flows br-int table=65 | grep 8937d8e
cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848, n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d actions=output:737

In table=0 the ofport is fine (745), but the output stage is using a
different one (737).

By checking the OVS database transaction history, that port, at some point,
had the id 737:

record 6516: 2020-12-14 22:22:54.184
  table Interface row "tap71a5dfc1-10" (073801e2):
    ofport=737
  table Open_vSwitch row 1d9566c8 (1d9566c8):
    cur_cfg=2023

So it looks like ovn-controller is not updating the ofport in the physical
flows for the output stage.

We'll try to figure out whether this also happens on master.

Thanks,
daniel

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
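The symptom Daniel identifies - the table=0 flow classifying on the new in_port (745) while the table=65 output flow for the same cookie still uses the stale ofport (737) - can be spotted mechanically from the flow dumps. Below is a minimal sketch, not from the thread: it assumes the standard `ovs-ofctl dump-flows` text output and uses the two flow lines quoted above as sample data to extract both ofports and flag the mismatch.

```python
import re

# Sample data: the two flows quoted above for cookie 0x38937d8e, as printed
# by "ovs-ofctl dump-flows br-int" (standard text output format).
TABLE0 = ("cookie=0x38937d8e, duration=40387.372s, table=0, n_packets=1863, "
          "n_bytes=111678, idle_age=1, priority=100,in_port=745 "
          "actions=load:0x18d->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)")
TABLE65 = ("cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848, "
           "n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d "
           "actions=output:737")

def in_port(flow: str) -> int:
    """ofport the table=0 (input classification) flow matches on."""
    return int(re.search(r"in_port=(\d+)", flow).group(1))

def output_port(flow: str) -> int:
    """ofport the table=65 (output stage) flow sends to."""
    return int(re.search(r"output:(\d+)", flow).group(1))

if in_port(TABLE0) != output_port(TABLE65):
    # On a node hitting this bug, the two ofports disagree.
    print(f"stale ofport: table=0 uses {in_port(TABLE0)}, "
          f"table=65 outputs to {output_port(TABLE65)}")
```

Fed with live dumps instead of the sample strings (e.g. one `ovs-ofctl dump-flows br-int` per table, filtered by cookie), a mismatch like this would confirm the desync on a chassis without having to restart ovn-controller.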