Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2021-02-04 Thread Numan Siddique
On Thu, Feb 4, 2021 at 7:39 PM Michał Nasiadka  wrote:
>
> Hello Numan,
>
> But the latest package in the CentOS NFV SIG repo was built/uploaded on 16 Dec 2020.
> Are you sure it contains the fix?

I think you need to check with the CentOS NFV/RDO team for that.

Thanks
Numan


Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2021-02-04 Thread Michał Nasiadka
Hello Numan,

But the latest package in the CentOS NFV SIG repo was built/uploaded on 16 Dec 2020.
Are you sure it contains the fix?
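
(For what it's worth, a rough way to check whether an installed build already
carries a given fix - a sketch only; the package name and how much detail the
changelog carries depend on how the CentOS NFV SIG package was built:)

# inspect the installed OVN package and its changelog
# ("ovn" is an assumed package name - adjust to whatever rpm -qa shows)
rpm -qi ovn
rpm -q --changelog ovn | head -50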

On Thu, 4 Feb 2021 at 13:53, Numan Siddique wrote:

> On Thu, Feb 4, 2021 at 5:04 PM Michał Nasiadka 
> wrote:
> >
> > Hello,
> >
> > I've been hitting the same issue with OVN 20.09 from CentOS NFV SIG repo
> - is there a chance to backport this change to 20.09?
>
> It's already backported a while ago -
>
> https://github.com/ovn-org/ovn/commit/ab1e46d7ec7a2ba80d68fbf6a45d09eef3269541
>
> Numan

Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2021-02-04 Thread Numan Siddique
On Thu, Feb 4, 2021 at 5:04 PM Michał Nasiadka  wrote:
>
> Hello,
>
> I've been hitting the same issue with OVN 20.09 from the CentOS NFV SIG repo -
> is there a chance to backport this change to 20.09?

It's already backported a while ago -
https://github.com/ovn-org/ovn/commit/ab1e46d7ec7a2ba80d68fbf6a45d09eef3269541

Numan
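
(If you have an OVN source checkout handy, a quick way to confirm whether a
given branch or tag contains that backport - a sketch, using the commit id
above and assuming a clone in ./ovn:)

# exit status 0 means the fix is an ancestor of the ref being checked
git -C ovn fetch origin
git -C ovn merge-base --is-ancestor \
    ab1e46d7ec7a2ba80d68fbf6a45d09eef3269541 origin/branch-20.09 \
    && echo "fix present" || echo "fix missing"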


Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2021-02-04 Thread Michał Nasiadka
Hello,

I've been hitting the same issue with OVN 20.09 from the CentOS NFV SIG repo -
is there a chance to backport this change to 20.09?


Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2020-12-17 Thread Daniel Alvarez Sanchez
On Tue, Dec 15, 2020 at 11:39 AM Krzysztof Klimonda <
kklimo...@syntaxhighlighted.com> wrote:

> Hi,
>
> Just as a quick update - I've updated our ovn version to 20.12.0 snapshot
> (d8bc0377c) and so far the problem hasn't yet reoccurred after over 24
> hours of tempest testing.
>

We could reproduce the issue with 20.12 and master. Also, this is not related
exclusively to localports; it can potentially affect any port.
Dumitru posted a fix for this:

http://patchwork.ozlabs.org/project/ovn/patch/1608197000-637-1-git-send-email-dce...@redhat.com/

Thanks!
daniel


Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2020-12-15 Thread Krzysztof Klimonda
Hi,

Just as a quick update - I've updated our OVN version to the 20.12.0 snapshot
(d8bc0377c), and so far the problem hasn't reoccurred after over 24 hours of
tempest testing.
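
(A quick way to double-check which build is actually running - just a sketch;
in a containerized deployment the commands have to be run inside the
ovn-controller container:)

# report the version of the installed binary and of the running daemon
ovn-controller --version
ovn-appctl -t ovn-controller version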

Best Regards,
-Chris


Re: [ovs-discuss] [ovn] Broken ovs localport flow for ovnmeta namespaces created by neutron

2020-12-15 Thread Daniel Alvarez Sanchez
Hey Krzysztof,

On Fri, Nov 20, 2020 at 1:17 PM Krzysztof Klimonda <
kklimo...@syntaxhighlighted.com> wrote:

> Hi,
>
> Doing some tempest runs on our pre-prod environment (stable/ussuri with
> the OVN 20.06.2 release) I've noticed that some network connectivity tests
> were failing randomly. I've reproduced that by continuously rescuing and
> unrescuing an instance - network connectivity from and to the VM works in
> general (DHCP is fine, access from outside is fine), however the VM has no
> access to its metadata server (via the 169.254.169.254 IP address). Tracing
> a packet from the VM to metadata via:
>
> 8<8<8<
> ovs-appctl ofproto/trace br-int
> in_port=tapa489d406-91,dl_src=fa:16:3e:2c:b0:fd,dl_dst=fa:16:3e:8b:b5:39
> 8<8<8<
>
> ends with
>
> 8<8<8<
> 65. reg15=0x1,metadata=0x97e, priority 100, cookie 0x15ec4875
> output:1187
>  >> Nonexistent output port
> 8<8<8<
>
> And I can verify that there is no flow for the actual ovnmeta tap
> interface (tap67731b0a-c0):
>
> 8<8<8<
> # docker exec -it openvswitch_vswitchd ovs-ofctl dump-flows br-int |grep
> -E output:'("tap67731b0a-c0"|1187)'
>  cookie=0x15ec4875, duration=1868.378s, table=65, n_packets=524,
> n_bytes=40856, priority=100,reg15=0x1,metadata=0x97e actions=output:1187
> #
> 8<8<8<
>
> From ovs-vswitchd.log it seems the interface tap67731b0a-c0 was added with
> index 1187, then deleted, and re-added with index 1189 - probably because it
> is the only VM in that network and I'm constantly hard-rebooting it via
> rescue/unrescue:
>
> 8<8<8<
> 2020-11-20T11:41:18.347Z|08043|bridge|INFO|bridge br-int: added interface
> tap67731b0a-c0 on port 1187
> 2020-11-20T11:41:30.813Z|08044|bridge|INFO|bridge br-int: deleted
> interface tapa489d406-91 on port 1186
> 2020-11-20T11:41:30.816Z|08045|bridge|WARN|could not open network device
> tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.040Z|08046|bridge|INFO|bridge br-int: deleted
> interface tap67731b0a-c0 on port 1187
> 2020-11-20T11:41:31.044Z|08047|bridge|WARN|could not open network device
> tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.050Z|08048|bridge|WARN|could not open network device
> tapa489d406-91 (No such device)
> 2020-11-20T11:41:31.235Z|08049|connmgr|INFO|br-int<->unix#31: 2069
> flow_mods in the last 43 s (858 adds, 814 deletes, 397 modifications)
> 2020-11-20T11:41:33.057Z|08050|bridge|INFO|bridge br-int: added interface
> tapa489d406-91 on port 1188
> 2020-11-20T11:41:33.582Z|08051|bridge|INFO|bridge br-int: added interface
> tap67731b0a-c0 on port 1189
> 2020-11-20T11:42:31.235Z|08052|connmgr|INFO|br-int<->unix#31: 168
> flow_mods in the 2 s starting 59 s ago (114 adds, 10 deletes, 44
> modifications)
> 8<8<8<
>
> Once I restart ovn-controller, it recalculates the local OVS flows and the
> problem is fixed, so I'm assuming it's a local problem and not related to
> the NB and SB databases.
>
>
I have seen exactly the same with 20.09: for the same port, the input and
output ofports do not match:

bash-4.4# ovs-ofctl dump-flows br-int table=0 | grep 745
 cookie=0x38937d8e, duration=40387.372s, table=0, n_packets=1863,
n_bytes=111678, idle_age=1, priority=100,in_port=745
actions=load:0x4b->NXM_NX_REG13[],load:0x6a->NXM_NX_REG11[],load:0x69->NXM_NX_REG12[],load:0x18d->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)


bash-4.4# ovs-ofctl dump-flows br-int table=65 | grep 8937d8e
 cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848,
n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d
actions=output:737


In table=0, the ofport is fine (745) but in the output stage it is using a
different one (737).
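
(For reference, the mismatch can be confirmed directly against the live OVSDB -
a small sketch using the interface from this report:)

# current ofport assigned to the interface (should match what table=0 uses)
ovs-vsctl get Interface tap71a5dfc1-10 ofport

# does any interface still own the ofport that the table=65 output references?
ovs-vsctl --bare --columns=name find Interface ofport=737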

Checking the OVS database transaction history shows that, at some point, that
port had ofport 737:

record 6516: 2020-12-14 22:22:54.184

  table Interface row "tap71a5dfc1-10" (073801e2):
ofport=737
  table Open_vSwitch row 1d9566c8 (1d9566c8):
cur_cfg=2023
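
(That kind of transaction history can be pulled from the local conf.db with
something like the following - the database path is an assumption and differs
between distros and containerized deployments:)

# summarized transaction log of the local Open vSwitch database
ovsdb-tool show-log -m /etc/openvswitch/conf.db | grep -B2 -A3 'tap71a5dfc1-10'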

So it looks like ovn-controller is not updating the ofport in the physical
flows for the output stage.
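
(Until a fixed build is deployed, the workaround is the one mentioned earlier
in the thread - make ovn-controller regenerate its flows. A sketch; if the
recompute unixctl command is not available on your version, a plain restart of
the service/container - name depends on the deployment - achieves the same:)

# force a full flow recompute if supported, otherwise restart the daemon
ovn-appctl -t ovn-controller recompute || systemctl restart ovn-controller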

We'll try to figure out if this happens also in master.
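
(In case it helps anyone else hitting this, a rough sweep to spot stale output
flows on a chassis - purely a sketch, it only looks at numeric output actions
in table=65:)

# flag table=65 output ofports that no longer exist in the local OVSDB
ovs-ofctl dump-flows br-int table=65 \
  | grep -oE 'output:[0-9]+' | sort -u \
  | while read -r out; do
      port=${out#output:}
      name=$(ovs-vsctl --bare --columns=name find Interface ofport="$port")
      [ -n "$name" ] || echo "stale flow: output:$port has no matching Interface"
    done

Note that an ofport which has been reassigned to a different interface (like
the 737/745 case above) would not be caught by this, so it is only a partial
check.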

Thanks,
daniel


> --
>   Krzysztof Klimonda
>   kklimo...@syntaxhighlighted.com
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss