Re: [ovs-dev] [PATCH] datapath: use after free flow mask on destroy flow table

2022-03-24 Thread Wentao Jia

Hi TongHao and Ilya,

This issue was reported at
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-August/050489.html
and it was fixed by commit 1f3a090 ("net: openvswitch: introduce common
code for flushing flows").

Best regards, Wentao Jia



Hi TongHao and Ilya,

I guess this bug was fixed by commit 1f3a090 ("net: openvswitch:
introduce common code for flushing flows"), but that patch was not
meant to fix the use-after-free of the flow mask.

>> On 3/24/22 05:17, Wentao Jia wrote:
>> >
>> > On destroy of a flow table instance, a referenced flow mask may be
>> > released too.  In function ovs_flow_tbl_destroy(), the flow mask is
>> > released first and the flow table instance is destroyed afterwards.
>> > This triggers a kernel panic on destroy of the datapath:
>> >
>> > [ 377.647756] kernel BUG at .../datapath/linux/flow_table.c:272!
>> > [ 377.653794] invalid opcode:  [#1] SMP PTI
>> > [ 377.666827] RIP: 0010:table_instance_flow_free.isra.7+0x148/0x150
>> > [ 377.711465] Call Trace:
>> > [ 377.715238]  <IRQ>
>> > [ 377.718964] table_instance_destroy+0xbe/0x160 [openvswitch]
>> > [ 377.722793] destroy_dp_rcu+0x12/0x40 [openvswitch]
>> > [ 377.726651] rcu_process_callbacks+0x297/0x460
>> > [ 377.736795] __do_softirq+0xe3/0x30a
>> > [ 377.740654] ? ktime_get+0x36/0xa0
>> > [ 377.744490] irq_exit+0x100/0x110
>> > [ 377.748514] smp_apic_timer_interrupt+0x74/0x140
>> > [ 377.752817] apic_timer_interrupt+0xf/0x20
>> > [ 377.758802]  </IRQ>
>> >
>> > Fixes: 6d1cf7f3e ("datapath: fix possible memleak on destroy
>> > flow-table")
>
> for linux upstream, fix tag:
> Fixes: 50b0e61b32ee ("net: openvswitch: fix possible memleak on
> destroy flow-table")
>
>> > Signed-off-by: Wentao Jia
>> > Signed-off-by: Chuanjie Zeng
>> > ---
>>
>> Hi, Wentao Jia.  Thanks for the patch!
>>
>> Please, send it to the mainline linux kernel ('netdev' mailing list,
>> keeping the ovs-dev in CC) using the linux kernel process for
>> submitting patches.
>>
>> When it is accepted to the upstream kernel, it can be backported to
>> the OOT kernel module in OVS repository.
>>
>> Best regards, Ilya Maximets.
>>
>> > datapath/flow_table.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/datapath/flow_table.c b/datapath/flow_table.c
>> > index 650338fb0..b2f4b1108 100644
>> > --- a/datapath/flow_table.c
>> > +++ b/datapath/flow_table.c
>> > @@ -415,8 +415,8 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
>> >     struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
>> >
>> >     free_percpu(table->mask_cache);
>> > -   kfree(rcu_dereference_raw(table->mask_array));
>> >     table_instance_destroy(table, ti, ufid_ti, false);
>> > +   kfree(rcu_dereference_raw(table->mask_array));
>> > }
>
> --
> Best regards, Tonghao
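
For reference, the corrected teardown order from the quoted diff, written out
as the resulting function (a sketch reconstructed from the diff alone; the
'ti' variable is implied by the table_instance_destroy() call):

    void ovs_flow_tbl_destroy(struct flow_table *table)
    {
        struct table_instance *ti = rcu_dereference_raw(table->ti);
        struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);

        free_percpu(table->mask_cache);
        /* Destroy the table instances (which frees their flows) first;
         * per the quoted backtrace, freeing a flow dereferences its flow
         * mask, so the mask array must still be alive at that point. */
        table_instance_destroy(table, ti, ufid_ti, false);
        /* Only now is it safe to release the mask array. */
        kfree(rcu_dereference_raw(table->mask_array));
    }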







Re: [ovs-dev] [PATCH] datapath: use after free flow mask on destroy flow table

2022-03-24 Thread Wentao Jia

Hi TongHao and Ilya,

I guess this bug was fixed by commit 1f3a090 ("net: openvswitch:
introduce common code for flushing flows"), but that patch was not
meant to fix the use-after-free of the flow mask.
Best regards, Wentao Jia

>> On 3/24/22 05:17, Wentao Jia wrote:
>> >
>> > On destroy of a flow table instance, a referenced flow mask may be
>> > released too.  In function ovs_flow_tbl_destroy(), the flow mask is
>> > released first and the flow table instance is destroyed afterwards.
>> > This triggers a kernel panic on destroy of the datapath:
>> >
>> > [ 377.647756] kernel BUG at .../datapath/linux/flow_table.c:272!
>> > [ 377.653794] invalid opcode:  [#1] SMP PTI
>> > [ 377.666827] RIP: 0010:table_instance_flow_free.isra.7+0x148/0x150
>> > [ 377.711465] Call Trace:
>> > [ 377.715238]  <IRQ>
>> > [ 377.718964] table_instance_destroy+0xbe/0x160 [openvswitch]
>> > [ 377.722793] destroy_dp_rcu+0x12/0x40 [openvswitch]
>> > [ 377.726651] rcu_process_callbacks+0x297/0x460
>> > [ 377.736795] __do_softirq+0xe3/0x30a
>> > [ 377.740654] ? ktime_get+0x36/0xa0
>> > [ 377.744490] irq_exit+0x100/0x110
>> > [ 377.748514] smp_apic_timer_interrupt+0x74/0x140
>> > [ 377.752817] apic_timer_interrupt+0xf/0x20
>> > [ 377.758802]  </IRQ>
>> >
>> > Fixes: 6d1cf7f3e ("datapath: fix possible memleak on destroy
>> > flow-table")
>
> for linux upstream, fix tag:
> Fixes: 50b0e61b32ee ("net: openvswitch: fix possible memleak on
> destroy flow-table")
>
>> > Signed-off-by: Wentao Jia
>> > Signed-off-by: Chuanjie Zeng
>> > ---
>>
>> Hi, Wentao Jia.  Thanks for the patch!
>>
>> Please, send it to the mainline linux kernel ('netdev' mailing list,
>> keeping the ovs-dev in CC) using the linux kernel process for
>> submitting patches.
>>
>> When it is accepted to the upstream kernel, it can be backported to
>> the OOT kernel module in OVS repository.
>>
>> Best regards, Ilya Maximets.
>>
>> > datapath/flow_table.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/datapath/flow_table.c b/datapath/flow_table.c
>> > index 650338fb0..b2f4b1108 100644
>> > --- a/datapath/flow_table.c
>> > +++ b/datapath/flow_table.c
>> > @@ -415,8 +415,8 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
>> >     struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
>> >
>> >     free_percpu(table->mask_cache);
>> > -   kfree(rcu_dereference_raw(table->mask_array));
>> >     table_instance_destroy(table, ti, ufid_ti, false);
>> > +   kfree(rcu_dereference_raw(table->mask_array));
>> > }
>
> --
> Best regards, Tonghao





Re: [ovs-dev] [PATCH ovn v6] Add a northbound interface to program MAC_Binding table

2022-03-24 Thread 0-day Robot
References:  <20220324214158.137613-1-svc.eng.git-m...@nutanix.com>
 

Bleep bloop.  Greetings svc.eng.git-mail, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
#1033 FILE: utilities/ovn-nbctl.c:456:
  static-mac-binding-add LOGICAL_PORT IP MAC\n\

WARNING: Line lacks whitespace around operator
#1035 FILE: utilities/ovn-nbctl.c:458:
  static-mac-binding-del LOGICAL_PORT IP\n\

WARNING: Line lacks whitespace around operator
#1037 FILE: utilities/ovn-nbctl.c:460:
  static-mac-binding-list   List all Static_MAC_Binding entries\n\

Lines checked: 1209, Warnings: 3, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH ovn v5] Add a northbound interface to program MAC_Binding table

2022-03-24 Thread svc.eng.git-mail
Hi,

I’ve updated the patch and sent out v6. There was a memory leak in ovn-nbctl.c 
which is fixed now. Hopefully the build should pass now.

Thank you,
Karthik C

On 3/24/22, 9:27 AM, "Lorenzo Bianconi"  wrote:
> From: Karthik Chandrashekar <karthi...@nutanix.com>
>
> Add a new NB and SB table for managing Static MAC_Binding entries.
> This table is currently supported for logical routers. OVN northd
> is responsible for propagating the values from NB to SB. OVN controller
> is responsible for installing MAC lookup flows. The priority of
> the installed flows is based on the override_dynamic_mac flag. This helps
> control the precedence of statically programmed vs dynamically learnt
> MAC Bindings.

Hi Karthik,

thx for working on v5. I think this is mostly fine.
Just two comments inline.

Regards,
Lorenzo

>
> Signed-off-by: Karthik Chandrashekar <karthi...@nutanix.com>
> ---
>  controller/lflow.c | 103 -
>  controller/lflow.h |  10 +-
>  controller/ovn-controller.c|  38 +++-
>  lib/automake.mk|   2 +
>  lib/static-mac-binding-index.c |  43 +
>  lib/static-mac-binding-index.h |  27 ++
>  northd/en-northd.c |   8 ++
>  northd/inc-proc-northd.c   |  14 ++-
>  northd/northd.c|  76 
>  northd/northd.h|   5 +
>  ovn-nb.ovsschema   |  15 ++-
>  ovn-nb.xml |  29 ++
>  ovn-sb.ovsschema   |  15 ++-
>  ovn-sb.xml |  27 ++
>  tests/ovn-nbctl.at |  69 ++
>  tests/ovn-northd.at|  25 -
>  tests/ovn.at   |  90 ++
>  utilities/ovn-nbctl.c  | 162 -
>  18 files changed, 718 insertions(+), 40 deletions(-)
>  create mode 100644 lib/static-mac-binding-index.c
>  create mode 100644 lib/static-mac-binding-index.h
>
> diff --git a/controller/lflow.c b/controller/lflow.c
> index e169edef1..075373e7b 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -1622,40 +1622,50 @@ static void
>  consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>                         const struct hmap *local_datapaths,
>                         const struct sbrec_mac_binding *b,
> -                       struct ovn_desired_flow_table *flow_table)
> +                       const struct sbrec_static_mac_binding *smb,
> +                       struct ovn_desired_flow_table *flow_table,
> +                       uint16_t priority)
>  {
> +    if (!b && !smb) {
> +        return;
> +    }
> +
> +    char *logical_port = !b ? smb->logical_port : b->logical_port;
> +    char *ip = !b ? smb->ip : b->ip;
> +    char *mac = !b ? smb->mac : b->mac;
> +
>      const struct sbrec_port_binding *pb
> -        = lport_lookup_by_name(sbrec_port_binding_by_name, b->logical_port);
> +        = lport_lookup_by_name(sbrec_port_binding_by_name, logical_port);
>      if (!pb || !get_local_datapath(local_datapaths,
>                                     pb->datapath->tunnel_key)) {
>          return;
>      }
>
> -    struct eth_addr mac;
> -    if (!eth_addr_from_string(b->mac, &mac)) {
> +    struct eth_addr mac_addr;
> +    if (!eth_addr_from_string(mac, &mac_addr)) {
>          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -        VLOG_WARN_RL(&rl, "bad 'mac' %s", b->mac);
> +        VLOG_WARN_RL(&rl, "bad 'mac' %s", mac);
>          return;
>      }
>
>      struct match get_arp_match = MATCH_CATCHALL_INITIALIZER;
>      struct match lookup_arp_match = MATCH_CATCHALL_INITIALIZER;
>
> -    if (strchr(b->ip, '.')) {
> -        ovs_be32 ip;
> -        if (!ip_parse(b->ip, &ip)) {
> +    if (strchr(ip, '.')) {
> +        ovs_be32 ip_addr;
> +        if (!ip_parse(ip, &ip_addr)) {
>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
> +            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip);
>              return;
>          }
> -        match_set_reg(&get_arp_match, 0, ntohl(ip));
> -        match_set_reg(&lookup_arp_match, 0, ntohl(ip));
> +        match_set_reg(&get_arp_match, 0, ntohl(ip_addr));
> +        match_set_reg(&lookup_arp_match, 0, ntohl(ip_addr));
>          match_set_dl_type(&lookup_arp_match, htons(ETH_TYPE_ARP));
>      } else {
>          struct in6_addr ip6;
> -        if (!ipv6_parse(b->ip, &ip6)) {
> +        if (!ipv6_parse(ip, &ip6)) {
>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
> +            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip);
>              return;
>          }
>          ovs_be128 value;
> @@ -1678,20 +1688,22 @@ consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name,
>      uint64_t stub[1024 / 8];
>      struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
>      uint8_t value = 1;
> -    put_load(mac.ea, sizeof mac.ea, 

[ovs-dev] [PATCH ovn v6] Add a northbound interface to program MAC_Binding table

2022-03-24 Thread svc.eng.git-mail
From: Karthik Chandrashekar 

Add a new NB and SB table for managing Static MAC_Binding entries.
This table is currently supported for logical routers. OVN northd
is responsible for propagating the values from NB to SB. OVN controller
is responsible for installing MAC lookup flows. The priority of
the installed flows is based on the override_dynamic_mac flag. This helps
control the precedence of statically programmed vs dynamically learnt
MAC Bindings.
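
For illustration, the new ovn-nbctl commands then look like this (placeholder
arguments; command names per the help text this patch adds):

    $ ovn-nbctl static-mac-binding-add lr0-p0 10.0.0.10 00:00:00:00:00:10
    $ ovn-nbctl static-mac-binding-list
    $ ovn-nbctl static-mac-binding-del lr0-p0 10.0.0.10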

Signed-off-by: Karthik Chandrashekar 
---
 controller/lflow.c | 103 -
 controller/lflow.h |  10 +-
 controller/ovn-controller.c|  38 +++-
 lib/automake.mk|   2 +
 lib/static-mac-binding-index.c |  43 +
 lib/static-mac-binding-index.h |  27 ++
 northd/en-northd.c |   8 ++
 northd/inc-proc-northd.c   |  14 ++-
 northd/northd.c|  76 +++
 northd/northd.h|   5 +
 ovn-nb.ovsschema   |  15 ++-
 ovn-nb.xml |  29 ++
 ovn-sb.ovsschema   |  15 ++-
 ovn-sb.xml |  27 ++
 tests/ovn-nbctl.at |  69 ++
 tests/ovn-northd.at|  25 -
 tests/ovn.at   |  90 ++
 utilities/ovn-nbctl.c  | 164 -
 18 files changed, 720 insertions(+), 40 deletions(-)
 create mode 100644 lib/static-mac-binding-index.c
 create mode 100644 lib/static-mac-binding-index.h

diff --git a/controller/lflow.c b/controller/lflow.c
index e169edef1..075373e7b 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1622,40 +1622,50 @@ static void
 consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name,
                        const struct hmap *local_datapaths,
                        const struct sbrec_mac_binding *b,
-                       struct ovn_desired_flow_table *flow_table)
+                       const struct sbrec_static_mac_binding *smb,
+                       struct ovn_desired_flow_table *flow_table,
+                       uint16_t priority)
 {
+    if (!b && !smb) {
+        return;
+    }
+
+    char *logical_port = !b ? smb->logical_port : b->logical_port;
+    char *ip = !b ? smb->ip : b->ip;
+    char *mac = !b ? smb->mac : b->mac;
+
     const struct sbrec_port_binding *pb
-        = lport_lookup_by_name(sbrec_port_binding_by_name, b->logical_port);
+        = lport_lookup_by_name(sbrec_port_binding_by_name, logical_port);
     if (!pb || !get_local_datapath(local_datapaths,
                                    pb->datapath->tunnel_key)) {
         return;
     }
 
-    struct eth_addr mac;
-    if (!eth_addr_from_string(b->mac, &mac)) {
+    struct eth_addr mac_addr;
+    if (!eth_addr_from_string(mac, &mac_addr)) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-        VLOG_WARN_RL(&rl, "bad 'mac' %s", b->mac);
+        VLOG_WARN_RL(&rl, "bad 'mac' %s", mac);
         return;
     }
 
     struct match get_arp_match = MATCH_CATCHALL_INITIALIZER;
     struct match lookup_arp_match = MATCH_CATCHALL_INITIALIZER;
 
-    if (strchr(b->ip, '.')) {
-        ovs_be32 ip;
-        if (!ip_parse(b->ip, &ip)) {
+    if (strchr(ip, '.')) {
+        ovs_be32 ip_addr;
+        if (!ip_parse(ip, &ip_addr)) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
+            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip);
             return;
         }
-        match_set_reg(&get_arp_match, 0, ntohl(ip));
-        match_set_reg(&lookup_arp_match, 0, ntohl(ip));
+        match_set_reg(&get_arp_match, 0, ntohl(ip_addr));
+        match_set_reg(&lookup_arp_match, 0, ntohl(ip_addr));
         match_set_dl_type(&lookup_arp_match, htons(ETH_TYPE_ARP));
     } else {
         struct in6_addr ip6;
-        if (!ipv6_parse(b->ip, &ip6)) {
+        if (!ipv6_parse(ip, &ip6)) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-            VLOG_WARN_RL(&rl, "bad 'ip' %s", b->ip);
+            VLOG_WARN_RL(&rl, "bad 'ip' %s", ip);
             return;
         }
         ovs_be128 value;
@@ -1678,20 +1688,22 @@ consider_neighbor_flow(struct ovsdb_idl_index *sbrec_port_binding_by_name,
     uint64_t stub[1024 / 8];
     struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
     uint8_t value = 1;
-    put_load(mac.ea, sizeof mac.ea, MFF_ETH_DST, 0, 48, &ofpacts);
+    put_load(mac_addr.ea, sizeof mac_addr.ea, MFF_ETH_DST, 0, 48, &ofpacts);
     put_load(&value, sizeof value, MFF_LOG_FLAGS, MLF_LOOKUP_MAC_BIT, 1,
              &ofpacts);
-    ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100,
-                    b->header_.uuid.parts[0], &get_arp_match,
-                    &ofpacts, &b->header_.uuid);
+    ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, priority,
+                    !b ? smb->header_.uuid.parts[0] : b->header_.uuid.parts[0],
+                    &get_arp_match, &ofpacts,
+                    !b ? &smb->header_.uuid : &b->header_.uuid);
 
     ofpbuf_clear(&ofpacts);
 

Re: [ovs-dev] [PATCH ovn] rhel: fix logrotate user config option

2022-03-24 Thread Vladislav Odintsov
Thanks Numan.

Regards,
Vladislav Odintsov

> On 24 Mar 2022, at 17:54, Numan Siddique  wrote:
> 
> On Tue, Mar 22, 2022 at 2:33 PM Mark Michelson  wrote:
>> 
>> Acked-by: Mark Michelson 
>> 
> 
> Thanks.
> 
> I applied this patch to the main branch and backported it up to branch-21.03.
> 
> Numan
>> On 3/22/22 13:37, Vladislav Odintsov wrote:
>>> If rpm is built with libcapng (default), /etc/logrotate.d/ovn
>>> config file user is SEDed in postinstall section with 'ovn ovn'.
>>> 
>>> If one runs logrotate, the following error occurs:
>>> ```
>>> [root@host ~]# logrotate /etc/logrotate.d/ovn
>>> error: /etc/logrotate.d/ovn:9 unknown user 'ovn'
>>> error: found error in /var/log/ovn/*.log , skipping
>>> ```
>>> 
>>> Replace 'ovn' user with 'openvswitch' to fix the issue.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
>>> ---
>>>  rhel/ovn-fedora.spec.in | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/rhel/ovn-fedora.spec.in b/rhel/ovn-fedora.spec.in
>>> index 3fb854a37..821eb03cc 100644
>>> --- a/rhel/ovn-fedora.spec.in
>>> +++ b/rhel/ovn-fedora.spec.in
>>> @@ -323,7 +323,7 @@ ln -sf ovn_detrace.py %{_bindir}/ovn-detrace
>>>  %if %{with libcapng}
>>>  if [ $1 -eq 1 ]; then
>>>  sed -i 's:^#OVN_USER_ID=:OVN_USER_ID=:' %{_sysconfdir}/sysconfig/ovn
>>> -sed -i 's:\(.*su\).*:\1 ovn ovn:' %{_sysconfdir}/logrotate.d/ovn
>>> +sed -i 's:\(.*su\).*:\1 openvswitch openvswitch:' 
>>> %{_sysconfdir}/logrotate.d/ovn
>>>  fi
>>>  %endif
> 
> 
>>> 
>>> 
>> 
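
For illustration, the net effect of the quoted sed change is that the 'su'
directive in the generated /etc/logrotate.d/ovn goes from 'su ovn ovn' (which
made logrotate fail with "unknown user 'ovn'") to:

    su openvswitch openvswitch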


Re: [ovs-dev] [PATCH ovn] northd: Use separate SNAT for already-DNATted traffic.

2022-03-24 Thread Numan Siddique
On Thu, Mar 24, 2022 at 4:47 PM Mark Michelson  wrote:
>
> Hi Numan,
>
> This message got buried in my inbox, so sorry about only replying now. I
> have some replies in-line below
>
> On 3/18/22 16:34, Numan Siddique wrote:
> > On Wed, Mar 9, 2022 at 2:38 PM Mark Michelson  wrote:
> >>
> >> Commit 4deac4509abbedd6ffaecf27eed01ddefccea40a introduced functionality
> >> in ovn-northd to use a single zone for SNAT and DNAT when possible. It
> >> accounts for certain situations, such as hairpinned traffic, where we
> >> still need a separate SNAT and DNAT zone.
> >>
> >> However, one situation it did not account for was when traffic traverses
> >> a logical router and is DNATted as a result of a load balancer, then
> >> when the traffic egresses the router, it needs to be SNATted. In this
> >> situation, we try to use the same CT zone for the SNAT as for the load
> >> balancer DNAT, which does not work.
> >>
> >> This commit fixes the issue by installing a flow in the OUT_SNAT stage
> >> of the logical router egress pipeline for each SNAT. This flow will fall
> >> back to using a separate CT zone for SNAT if the ct_label.natted flag is
> >> set.
> >>
> >> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2044127
> >>
> >> Signed-off-by: Mark Michelson 
> >
> >
> > Hi Mark,
> >
> > Few comments,
> >
> >   - There are a few test cases failing with this patch.
>
> Interesting. I'll have to double-check about that but I hadn't noticed
> it happening locally.
>
> >
> >   - I'm pretty sure the issue will not be seen if the load balancer is
> > also associated with the logical switch.  In that case, dnat would
> > happen in the logical switch pipeline.
> > But still we should address this issue as it's a regression
> > introduced by the commit 4deac4509abb.
> >
> >   - Only small concern I've with this patch is the usage of
> > ct_label.natted field in the egress pipeline.  In the ingress pipeline
> > of the router we send the pkt to conntrack and then do NAT.
> > I'm just worried what if the ct states/labels are not valid or they
> > get cleared when the pkt enters ingress to egress pipeline.  Looks
> > like we don't clear the conntrack state now.
> > But future patches could.  I think we should document this somewhere.
>
> The tricky thing here is that I don't know if there is any other
> reliable way to know in the egress pipeline if the packet was natted in
> the ingress pipeline. AFAIK, there is no scenario where the router
> egress pipeline can run on a different node than the router ingress
> pipeline, so at least the way it currently works should be reliable.
>
> Where is the best place to document this?

I guess a code comment in northd.c would be helpful.  I'd suggest to see
if we can document in ovn-northd.8.xml while updating this file with the newly
added logical flow(s).  I don't have any strong preference.  So please see
if it makes sense to document in ovn-northd.8.xml too.

Numan

>
> >
> > -  This patch needs to add the relevant documentation in the
> > ovn-northd.8.xml for the new flows added.
> >
> > -  Instead of adding the flows in the "lr_out_snat" stage,  I'd
> > suggest to add the below flow in the "lr_out_chk_dnat_local" pipeline
> >
> >  table=0 (lr_out_chk_dnat_local), priority=50   , match=(ip &&
> > ct_label.natted == 1),  action=(reg9[4] = 1; next;)
> >
> > We already have flows in the "lr_out_snat" stage to make use of
> > "ct_snat" action instead of "ct_snat_in_czone" if reg9[4] is set.
> > Can you please see if this suggestion is possible ?
>
> That sounds perfectly reasonable. I'll test it out and see if it works.
>
> >
> >
> > Thanks
> > Numan
>
>
> >
> >
> >> ---
> >> Note: The "SNAT in separate zone from DNAT" test in system-ovn.at can
> >> fail on some systems, and at this point it's not clear what causes it.
> >> On my development system running Fedora 33 and kernel 5.14.18-100, the
> >> test fails because the pings fail. In a container running Fedora 33 and
> >> kernel 5.10.19-200, the test succeeds. It's unknown if it's the kernel
> >> version or some other setting that has a bearing on the success or
> >> failure of the test. The OpenFlow that ovn-controller generates is
> >> identical on both successful and failure scenarios.
> >>
> >> If this inconsistency causes issues with CI, then this patch may need to
> >> be re-examined to determine if there are other parameters that may be
> >> needed to correct the referenced issue.
> >> ---
> >>   northd/northd.c |  53 ++--
> >>   tests/ovn-northd.at |  36 ++
> >>   tests/system-ovn.at | 117 
> >>   3 files changed, 190 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/northd/northd.c b/northd/northd.c
> >> index b264fb850..866a81310 100644
> >> --- a/northd/northd.c
> >> +++ b/northd/northd.c
> >> @@ -12961,23 +12961,44 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct ovn_datapath *od,
> >>

Re: [ovs-dev] [PATCH ovn] northd: Use separate SNAT for already-DNATted traffic.

2022-03-24 Thread Mark Michelson

Hi Numan,

This message got buried in my inbox, so sorry about only replying now. I 
have some replies in-line below


On 3/18/22 16:34, Numan Siddique wrote:

On Wed, Mar 9, 2022 at 2:38 PM Mark Michelson  wrote:


Commit 4deac4509abbedd6ffaecf27eed01ddefccea40a introduced functionality
in ovn-northd to use a single zone for SNAT and DNAT when possible. It
accounts for certain situations, such as hairpinned traffic, where we
still need a separate SNAT and DNAT zone.

However, one situation it did not account for was when traffic traverses
a logical router and is DNATted as a result of a load balancer, then
when the traffic egresses the router, it needs to be SNATted. In this
situation, we try to use the same CT zone for the SNAT as for the load
balancer DNAT, which does not work.

This commit fixes the issue by installing a flow in the OUT_SNAT stage
of the logical router egress pipeline for each SNAT. This flow will fall
back to using a separate CT zone for SNAT if the ct_label.natted flag is
set.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2044127

Signed-off-by: Mark Michelson 



Hi Mark,

Few comments,

  - There are a few test cases failing with this patch.


Interesting. I'll have to double-check about that but I hadn't noticed 
it happening locally.




  - I'm pretty sure the issue will not be seen if the load balancer is
also associated with the logical switch.  In that case, dnat would
happen in the logical switch pipeline.
But still we should address this issue as it's a regression
introduced by the commit 4deac4509abb.

  - Only small concern I've with this patch is the usage of
ct_label.natted field in the egress pipeline.  In the ingress pipeline
of the router we send the pkt to conntrack and then do NAT.
I'm just worried what if the ct states/labels are not valid or they
get cleared when the pkt enters ingress to egress pipeline.  Looks
like we don't clear the conntrack state now.
But future patches could.  I think we should document this somewhere.


The tricky thing here is that I don't know if there is any other 
reliable way to know in the egress pipeline if the packet was natted in 
the ingress pipeline. AFAIK, there is no scenario where the router 
egress pipeline can run on a different node than the router ingress 
pipeline, so at least the way it currently works should be reliable.


Where is the best place to document this?



-  This patch needs to add the relevant documentation in the
ovn-northd.8.xml for the new flows added.

-  Instead of adding the flows in the "lr_out_snat" stage,  I'd
suggest to add the below flow in the "lr_out_chk_dnat_local" pipeline

 table=0 (lr_out_chk_dnat_local), priority=50   , match=(ip &&
ct_label.natted == 1),  action=(reg9[4] = 1; next;)

We already have flows in the "lr_out_snat" stage to make use of
"ct_snat" action instead of "ct_snat_in_czone" if reg9[4] is set.
Can you please see if this suggestion is possible ?


That sounds perfectly reasonable. I'll test it out and see if it works.




Thanks
Numan







---
Note: The "SNAT in separate zone from DNAT" test in system-ovn.at can
fail on some systems, and at this point it's not clear what causes it.
On my development system running Fedora 33 and kernel 5.14.18-100, the
test fails because the pings fail. In a container running Fedora 33 and
kernel 5.10.19-200, the test succeeds. It's unknown if it's the kernel
version or some other setting that has a bearing on the success or
failure of the test. The OpenFlow that ovn-controller generates is
identical on both successful and failure scenarios.

If this inconsistency causes issues with CI, then this patch may need to
be re-examined to determine if there are other parameters that may be
needed to correct the referenced issue.
---
  northd/northd.c |  53 ++--
  tests/ovn-northd.at |  36 ++
  tests/system-ovn.at | 117 
  3 files changed, 190 insertions(+), 16 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index b264fb850..866a81310 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -12961,23 +12961,44 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct ovn_datapath *od,
                             priority, ds_cstr(match),
                             ds_cstr(actions), &nat->header_);
 
-    if (!stateless) {
-        ds_put_cstr(match, " && "REGBIT_DST_NAT_IP_LOCAL" == 1");
-        ds_clear(actions);
-        if (distributed) {
-            ds_put_format(actions, "eth.src = "ETH_ADDR_FMT"; ",
-                          ETH_ADDR_ARGS(mac));
-        }
-        ds_put_format(actions, REGBIT_DST_NAT_IP_LOCAL" = 0; ct_snat(%s",
-                      nat->external_ip);
-        if (nat->external_port_range[0]) {
-            ds_put_format(actions, ",%s", nat->external_port_range);
-        }
-        ds_put_format(actions, ");");
-

Re: [ovs-dev] [PATCH ovn branch-22.03 1/2] Set release date for 22.03.0.

2022-03-24 Thread Mark Michelson

On 3/24/22 12:32, Ilya Maximets wrote:

On 3/10/22 21:49, Mark Michelson wrote:

Signed-off-by: Mark Michelson 
---
  NEWS | 2 +-
  debian/changelog | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 9f3ce3cf3..0f8e068b1 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-OVN v22.03.0 - XX XXX 
+OVN v22.03.0 - 11 Mar 2022
  --
- Refactor CoPP commands introducing a unique name index in CoPP NB table.
  Add following new CoPP commands to manage CoPP table:


Hey, Mark.  It looks like the version of the patch that got applied
to the branch-22.03 doesn't have the NEWS update above.

We also need this patch ported to the main branch to update all the dates.
(And it looks like I myself forgot to do that for OVS 2.17...)

Best regards, Ilya Maximets.


Thanks Ilya, I pushed corrections as follows:

I cherry-picked this change to main and updated NEWS to have the correct 
release date.


I pushed a commit to branch-22.03 that updates NEWS to have the correct 
release date.





diff --git a/debian/changelog b/debian/changelog
index fe0e7f488..a26420b7a 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ ovn (22.03.0-1) unstable; urgency=low
  
 * New upstream version
  
- -- OVN team   Thu, 24 Feb 2022 16:55:45 -0500

+ -- OVN team   Thu, 10 Mar 2022 15:47:41 -0500
  
  ovn (21.12.0-1) unstable; urgency=low
  






Re: [ovs-dev] [PATCH branch-2.17 0/2] Release patches for v2.17.0.

2022-03-24 Thread Ilya Maximets
On 2/17/22 23:13, Ilya Maximets wrote:
> On 2/17/22 19:45, Ilya Maximets wrote:
>>
>> Ilya Maximets (2):
>>   Set release date for 2.17.0.
>>   Prepare for 2.17.1.
>>
>>  NEWS | 7 ++-
>>  configure.ac | 2 +-
>>  debian/changelog | 8 +++-
>>  3 files changed, 14 insertions(+), 3 deletions(-)
>>
> 
> Applied.  Thanks, Flavio!

I did apply the patches to branch-2.17, but I forgot to
apply the first one to the master branch.  Fixed now.

> 
> Will send release announce soon.
> 
> Best regards, Ilya Maximets.



Re: [ovs-dev] [PATCH] system-traffic.at: Fix flaky DNAT load balancing test.

2022-03-24 Thread Paolo Valerio
Hello Ilya,

Ilya Maximets  writes:

> 'conntrack - DNAT load balancing' test fails from time to time
> because not all the group buckets are getting hit.
>
> In short, the test creates a group with 3 buckets with the same
> weight.  It creates 12 TCP sessions and expects that each bucket
> will be used at least once.  However, there is a solid chance that
> this will not happen.  The probability of having at least one
> empty bucket is:
>
>   C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N
>
> Where N is the number of distinct TCP sessions.  For N=12, the
> probability is about 0.023, i.e. there is a 2.3% chance for a
> test to fail, which is not great for CI.
>
> Increasing the number of sessions to 50 to reduce the probability
> of failure down to 4.7e-9.  In my testing, the configuration with
> 50 TCP sessions didn't fail after 6000 runs.  Should be good
> enough for CI systems.
>
> Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.")
> Signed-off-by: Ilya Maximets 
> ---

the patch LGTM,

Acked-by: Paolo Valerio 
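
As a side note, the failure probability cited in the commit message is easy
to check numerically; a standalone C sketch (not part of the patch):

    #include <math.h>
    #include <stdio.h>

    /* P(at least one of 3 equally weighted buckets stays empty after N
     * independent sessions) = C(3,1)*(2/3)^N - C(3,2)*(1/3)^N. */
    static double p_empty_bucket(int n)
    {
        return 3.0 * pow(2.0 / 3.0, n) - 3.0 * pow(1.0 / 3.0, n);
    }

    int main(void)
    {
        printf("N=12: %.3f\n", p_empty_bucket(12));  /* ~0.023  */
        printf("N=50: %.2g\n", p_empty_bucket(50));  /* ~4.7e-9 */
        return 0;
    }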



Re: [ovs-dev] [PATCH] system-traffic.at: Fix flaky DNAT load balancing test.

2022-03-24 Thread Phelan, Michael


> -Original Message-
> From: Ilya Maximets 
> Sent: Thursday 24 March 2022 11:20
> To: ovs-dev@openvswitch.org
> Cc: Aaron Conole ; Phelan, Michael
> ; Ilya Maximets 
> Subject: [PATCH] system-traffic.at: Fix flaky DNAT load balancing test.
> 
> 'conntrack - DNAT load balancing' test fails from time to time because not all
> the group buckets are getting hit.
> 
> In short, the test creates a group with 3 buckets with the same weight.  It
> creates 12 TCP sessions and expects that each bucket will be used at least 
> once.
> However, there is a solid chance that this will not happen.  The probability 
> of
> having at least one empty bucket is:
> 
>   C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N
> 
> Where N is the number of distinct TCP sessions.  For N=12, the probability is
> about 0.023, i.e. there is a 2.3% chance for a test to fail, which is not 
> great for
> CI.
> 
> Increasing the number of sessions to 50 to reduce the probability of failure
> down to 4.7e-9.  In my testing, the configuration with
> 50 TCP sessions didn't fail after 6000 runs.  Should be good enough for CI
> systems.
> 
> Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.")
> Signed-off-by: Ilya Maximets 
> ---
>  tests/system-traffic.at | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at index
> 95383275a..4a7fa49fc 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -6471,7 +6471,7 @@ on_exit 'ovs-appctl revalidator/purge'
>  on_exit 'ovs-appctl dpif/dump-flows br0'
> 
>  dnl Should work with the virtual IP address through NAT
> -for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
> +for i in $(seq 1 50); do
>  echo Request $i
>  NS_CHECK_EXEC([at_ns1], [wget 10.1.1.64 -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
>  done
> --
> 2.34.1

Acked-by: Michael Phelan 


Re: [ovs-dev] [PATCH v2] dpif-netdev: Allow cross-NUMA polling on selected ports

2022-03-24 Thread Kevin Traynor

On 24/03/2022 16:35, Jan Scheurich wrote:

Hi Kevin,

This was a bit of a misunderstanding. We didn't check your RFC patch carefully 
enough to realize that you had meant to encompass our cross-numa-polling 
function in that RFC patch. Sorry for the confusion.

I wouldn't say we are particularly keen on upstreaming exactly our 
implementation of cross-numa-polling for ALB, as long as we get the 
functionality with a per-interface configuration option (preferably as in our 
patch, so that we can maintain backward compatibility with our downstream 
solution).

I suggest we have a closer look at your RFC and come back with comments on that.



No problems. It doesn't have selection logic/tests/docs etc, it was just 
a way of seeing how the additions below could be added. It is mainly 
just reorganising the schedule() fn. to not be always: select numa 
first, then select pmd from numa. And some splitting things into 
functions to help with checking load on all pmds.




'roundrobin' and 'cycles' are dependent on RR of cores per numa, so I agree it
makes sense in those cases to still RR the numas. Otherwise a mix of cross-
numa enabled and cross-numa disabled interfaces would conflict with each
other when selecting a core. So even though the user has selected to ignore
numa for this interface, we don't have a choice but to still RR the numa.

For 'group' it is about finding the lowest loaded pmd core. In that case we 
don't
need to RR numa. We can just find the lowest loaded pmd core from any numa.

This is better because the user is choosing to remove numa based selection for
the interface and as the rxq scheduling algorithm is not dependent on it, we
can fully remove it too by checking pmds from all numas. I have done this in my
RFC.

It is also better to do this where possible because there may not be same
amount of pmd cores on each numa, or one numa could already be more
heavily loaded than the other.

Another difference is with above for 'group' I added tiebreaker for a local-to-
interface numa pmd to be selected if multiple pmds from different numas were
available with same load. This is most likely to be helpful for initial 
selection
when there is no load on any pmds.


At first glance I agree with your reasoning. Choosing the least-loaded PMD from 
all NUMAs using NUMA-locality as a tie-breaker makes sense in 'group' 
algorithm. How do you select between equally loaded PMDs on the same NUMA node? 
Just pick any?


Yes, that is how the group algorithm currently operates. A further way 
to tiebreak on local numa could be if one of the remaining pmds was 
already polling that rxq, or on number of rxqs assigned to the pmd.
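
To make the rule concrete, a minimal sketch of the 'group' selection with the
NUMA-locality tiebreaker (hypothetical types and helper, not the actual
dpif-netdev code):

    #include <stddef.h>

    /* Pick the least-loaded non-isolated PMD across all NUMA nodes,
     * preferring a PMD on the port-local node when loads are equal. */
    struct pmd_info {
        int numa_id;
        unsigned long long load;    /* e.g. measured busy cycles */
    };

    static struct pmd_info *
    pick_pmd(struct pmd_info *pmds, size_t n, int port_numa)
    {
        struct pmd_info *best = NULL;

        for (size_t i = 0; i < n; i++) {
            struct pmd_info *p = &pmds[i];
            if (!best || p->load < best->load
                || (p->load == best->load
                    && p->numa_id == port_numa
                    && best->numa_id != port_numa)) {
                best = p;
            }
        }
        /* Among equally loaded local candidates, the first found wins. */
        return best;
    }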


thanks,
Kevin.



BR, Jan





Re: [ovs-dev] [PATCH v2] dpif-netdev: Allow cross-NUMA polling on selected ports

2022-03-24 Thread Jan Scheurich via dev
Hi Kevin,

This was a bit of a misunderstanding. We didn't check your RFC patch carefully 
enough to realize that you had meant to encompass our cross-numa-polling 
function in that RFC patch. Sorry for the confusion.

I wouldn't say we are particularly keen on upstreaming exactly our 
implementation of cross-numa-polling for ALB, as long as we get the 
functionality with a per-interface configuration option (preferably as in our 
patch, so that we can maintain backward compatibility with our downstream 
solution).

I suggest we have a closer look at your RFC and come back with comments on that.

> 
> 'roundrobin' and 'cycles' are dependent on RR of cores per numa, so I agree 
> it
> makes sense in those cases to still RR the numas. Otherwise a mix of cross-
> numa enabled and cross-numa disabled interfaces would conflict with each
> other when selecting a core. So even though the user has selected to ignore
> numa for this interface, we don't have a choice but to still RR the numa.
> 
> For 'group' it is about finding the lowest loaded pmd core. In that case we 
> don't
> need to RR numa. We can just find the lowest loaded pmd core from any numa.
> 
> This is better because the user is choosing to remove numa based selection for
> the interface and as the rxq scheduling algorithm is not dependent on it, we
> can fully remove it too by checking pmds from all numas. I have done this in 
> my
> RFC.
> 
> It is also better to do this where possible because there may not be same
> amount of pmd cores on each numa, or one numa could already be more
> heavily loaded than the other.
> 
> Another difference is with above for 'group' I added tiebreaker for a 
> local-to-
> interface numa pmd to be selected if multiple pmds from different numas were
> available with same load. This is most likely to be helpful for initial 
> selection
> when there is no load on any pmds.

At first glance I agree with your reasoning. Choosing the least-loaded PMD from 
all NUMAs using NUMA-locality as a tie-breaker makes sense in 'group' 
algorithm. How do you select between equally loaded PMDs on the same NUMA node? 
Just pick any?

BR, Jan



Re: [ovs-dev] [PATCH ovn branch-22.03 1/2] Set release date for 22.03.0.

2022-03-24 Thread Ilya Maximets
On 3/10/22 21:49, Mark Michelson wrote:
> Signed-off-by: Mark Michelson 
> ---
>  NEWS | 2 +-
>  debian/changelog | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 9f3ce3cf3..0f8e068b1 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,4 +1,4 @@
> -OVN v22.03.0 - XX XXX 
> +OVN v22.03.0 - 11 Mar 2022
>  --
>- Refactor CoPP commands introducing a unique name index in CoPP NB table.
>  Add following new CoPP commands to manage CoPP table:

Hey, Mark.  It looks like the version of the patch that got applied
to the branch-22.03 doesn't have the NEWS update above.

We also need this patch ported to the main branch to update all the dates.
(And it looks like I myself forgot to do that for OVS 2.17...)

Best regards, Ilya Maximets.

> diff --git a/debian/changelog b/debian/changelog
> index fe0e7f488..a26420b7a 100644
> --- a/debian/changelog
> +++ b/debian/changelog
> @@ -2,7 +2,7 @@ ovn (22.03.0-1) unstable; urgency=low
>  
> * New upstream version
>  
> - -- OVN team   Thu, 24 Feb 2022 16:55:45 -0500
> + -- OVN team   Thu, 10 Mar 2022 15:47:41 -0500
>  
>  ovn (21.12.0-1) unstable; urgency=low
>  



Re: [ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-24 Thread 0-day Robot
References:  <20220324161239.61857-1-sangana.abhi...@nutanix.com>
 

Bleep bloop.  Greetings Abhiram Sangana, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 275 characters long (recommended limit is 79)
#1210 FILE: utilities/ovn-nbctl.8.xml:1145:
  [--may-exist] [--stateless] [--gateway_port=GATEWAY_PORT] lr-nat-add router type external_ip logical_ip [logical_port external_mac]

WARNING: Line is 143 characters long (recommended limit is 79)
#1255 FILE: utilities/ovn-nbctl.8.xml:1222:
  [--if-exists] lr-nat-del router [type [ip] [gateway_port]]

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1301 FILE: utilities/ovn-nbctl.c:381:
  [--gateway-port=GATEWAY_PORT]\n\

WARNING: Line lacks whitespace around operator
#1306 FILE: utilities/ovn-nbctl.c:385:
  lr-nat-del ROUTER [TYPE [IP] [GATEWAY_PORT]]\n\

Lines checked: 1592, Warnings: 6, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


[ovs-dev] [PATCH ovn v4] northd: Add support for NAT with multiple DGP

2022-03-24 Thread Abhiram Sangana
Currently, if multiple distributed gateway ports (DGP) are configured
on a logical router, NAT is disabled as part of commit 15348b7
(northd: Multiple distributed gateway port support.)

This patch adds a new column called "gateway_port" in NAT table that
references a distributed gateway port in the Logical_Router_Port
table. A NAT rule is only applied on matching packets entering or
leaving the DGP configured for the rule, when a router has
multiple DGPs.

If a router has a single DGP, NAT rules are applied at the DGP even if
the "gateway_port" column is not set. It is an error to not set this
column for a NAT rule when the router has multiple DGPs.

This patch also updates the NAT commands in ovn-nbctl to support the
new column.
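
For illustration, usage with the new column then looks like this (placeholder
names; the --gateway-port option and lr-nat-del arguments follow the help
text this patch adds):

    $ ovn-nbctl --gateway-port=lr0-dgp1 lr-nat-add lr0 dnat_and_snat 172.16.1.10 10.0.0.10
    $ ovn-nbctl lr-nat-del lr0 dnat_and_snat 172.16.1.10 lr0-dgp1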

Signed-off-by: Abhiram Sangana 
---
 NEWS  |   1 +
 northd/northd.c   | 184 +--
 northd/ovn-northd.8.xml   |  27 +++---
 ovn-architecture.7.xml|   6 +-
 ovn-nb.ovsschema  |  10 +-
 ovn-nb.xml|  37 ++-
 tests/ovn-nbctl.at| 197 +-
 tests/ovn-northd.at   | 172 +++--
 tests/ovn.at  |   2 +-
 utilities/ovn-nbctl.8.xml |  48 ++
 utilities/ovn-nbctl.c | 170 +++-
 11 files changed, 647 insertions(+), 207 deletions(-)

diff --git a/NEWS b/NEWS
index 5ba9c3cf4..458db45a0 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,7 @@ Post v22.03.0
 -
   - Support IGMP and MLD snooping on transit logical switches that connect
 different OVN Interconnection availability zones.
+  - Support NAT for logical routers with multiple distributed gateway ports.
 
 OVN v22.03.0 - XX XXX 
 --
diff --git a/northd/northd.c b/northd/northd.c
index a2cf8d6fc..df890d97d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -606,11 +606,11 @@ struct ovn_datapath {
 
 /* Applies to only logical router datapath.
  * True if logical router is a gateway router. i.e options:chassis is set.
- * If this is true, then 'l3dgw_port' will be ignored. */
+ * If this is true, then 'l3dgw_ports' will be ignored. */
 bool is_gw_router;
 
-/* OVN northd only needs to know about the logical router gateway port for
- * NAT on a distributed router.  The "distributed gateway ports" are
+/* OVN northd only needs to know about logical router gateway ports for
+ * NAT/LB on a distributed router.  The "distributed gateway ports" are
  * populated only when there is a gateway chassis or ha chassis group
  * specified for some of the ports on the logical router. Otherwise this
  * will be NULL. */
@@ -2702,8 +2702,9 @@ join_logical_ports(struct northd_input *input_data,
  * by one or more IP addresses, and if the port is a distributed gateway
  * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
  * LPORT_NAME is the name of the L3 redirect port or the name of the
- * logical_port specified in a NAT rule.  These strings include the
- * external IP addresses of all NAT rules defined on that router, and all
+ * logical_port specified in a NAT rule. These strings include the
+ * external IP addresses of NAT rules defined on that router which have
+ * gateway_port not set or have gateway_port as the router port 'op', and all
  * of the IP addresses used in load balancer VIPs defined on that router.
  *
  * The caller must free each of the n returned strings with free(),
@@ -2716,8 +2717,7 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only,
 struct eth_addr mac;
 if (!op || !op->nbrp || !op->od || !op->od->nbr
 || (!op->od->nbr->n_nat && !op->od->has_lb_vip)
-|| !eth_addr_from_string(op->nbrp->mac, )
-|| op->od->n_l3dgw_ports > 1) {
+|| !eth_addr_from_string(op->nbrp->mac, )) {
 *n = n_nats;
 return NULL;
 }
@@ -2746,6 +2746,12 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only,
 continue;
 }
 
+/* Not including external IP of NAT rules whose gateway_port is
+ * not 'op'. */
+if (nat->gateway_port && nat->gateway_port != op->nbrp) {
+continue;
+}
+
 /* Determine whether this NAT rule satisfies the conditions for
  * distributed NAT processing. */
 if (op->od->n_l3dgw_ports && !strcmp(nat->type, "dnat_and_snat")
@@ -2816,9 +2822,9 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only,
 if (central_ip_address) {
 /* Gratuitous ARP for centralized NAT rules on distributed gateway
  * ports should be restricted to the gateway chassis. */
-if (op->od->n_l3dgw_ports) {
+if (is_l3dgw_port(op)) {
 ds_put_format(&c_addresses, " is_chassis_resident(%s)",
-  op->od->l3dgw_ports[0]->cr_port->json_key);
+  op->cr_port->json_key);

[ovs-dev] OVS DPDK DMA-Dev library/Design Discussion

2022-03-24 Thread Stokes, Ian
Hi All,

This meeting is a follow up to the call earlier this week.

This week Sunil presented 3 different approaches to integrating DMA-Dev with 
OVS along with the performance impacts.

https://github.com/Sunil-Pai-G/OVS-DPDK-presentation-share/blob/main/OVS%20vhost%20async%20datapath%20design%202022.pdf

The approaches were as follows:

*   Defer work.
*   Tx completions from Rx context.
*   Tx completions from Rx context + lockless ring.

The pros and cons of each approach were discussed but there was no clear 
solution reached.

As such a follow up call was suggested to continue discussion and to reach a 
clear decision on the approach to take.

Please see agenda as it stands below:

Agenda
*   Opens
*   Continue discussion of 3x approaches from last week (Defer work, "V3", 
V4, links to patches in Sunil's slides above)
*   Design Feedback (please review solutions of above & slide-deck from 
last week before call to be informed)
*   Dynamic Allocation of DMA engine per queue
*   Code Availability (DPDK GitHub, OVS GitHub branches)

Please feel free to respond with any other items to be added to the agenda.

Google Meet: https://meet.google.com/hme-pygf-bfb

Regards
Ian



Re: [ovs-dev] [PATCH] system-traffic.at: Fix flaky DNAT load balancing test.

2022-03-24 Thread Eelco Chaudron



On 24 Mar 2022, at 12:20, Ilya Maximets wrote:

> 'conntrack - DNAT load balancing' test fails from time to time
> because not all the group buckets are getting hit.
>
> In short, the test creates a group with 3 buckets with the same
> weight.  It creates 12 TCP sessions and expects that each bucket
> will be used at least once.  However, there is a solid chance that
> this will not happen.  The probability of having at least one
> empty bucket is:
>
>   C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N
>
> Where N is the number of distinct TCP sessions.  For N=12, the
> probability is about 0.023, i.e. there is a 2.3% chance for a
> test to fail, which is not great for CI.
>
> Increasing the number of sessions to 50 to reduce the probability
> of failure down to 4.7e-9.  In my testing, the configuration with
> 50 TCP sessions didn't fail after 6000 runs.  Should be good
> enough for CI systems.
>
> Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.")
> Signed-off-by: Ilya Maximets 
> ---

Change looks good to me, and the test passed 6 times ;)

Acked-by: Eelco Chaudron 



Re: [ovs-dev] [External] Re: [PATCH] dpif-netdev: add an option to assign pmd rxq to all numas

2022-03-24 Thread Kevin Traynor

Hi Anurag,

On 16/03/2022 12:29, Anurag Agarwal wrote:

Hello Kevin,
Thanks for your inputs.

In this scenario we have one VM each on NUMA0 and NUMA1 (VM1 is on NUMA0, VM2 
is on NUMA1), dpdk port is on NUMA1.

Without cross-numa-polling, VM/VHU queue traffic is evenly distributed based on 
load on their respective NUMA sockets.

However, DPDK traffic is only load balanced on NUMA1 PMDs, thereby exhibiting 
an aggregate load imbalance in the system (i.e. NUMA1 PMDs having more load 
vs. NUMA0 PMDs).

Please refer example below (cross-numa-polling is not enabled)

pmd thread numa_id 0 core_id 2:
   isolated : false
   port: vhu-vm1p1 queue-id:  2 (enabled)   pmd usage: 11 %
   port: vhu-vm1p1 queue-id:  4 (enabled)   pmd usage:  0 %
   overhead:  0 %
pmd thread numa_id 1 core_id 3:
   isolated : false
   port: dpdk0 queue-id:  0 (enabled)   pmd usage: 13 %
   port: dpdk0 queue-id:  2 (enabled)   pmd usage: 15 %
   port: vhu-vm2p1 queue-id:  3 (enabled)   pmd usage:  9 %
   port: vhu-vm2p1 queue-id:  4 (enabled)   pmd usage:  0 %
   overhead:  0 %

With cross-numa-polling enabled,  the rxqs from DPDK port are distributed to 
both NUMAs, and then the 'group' scheduling algorithm assigns the rxqs to PMDs 
based on load.

Please refer example below, after cross-numa-polling is enabled on dpdk0 port.

pmd thread numa_id 0 core_id 2:
   isolated : false
   port: dpdk0 queue-id:  5 (enabled)   pmd usage: 11 %
   port: vhu-vm1p1 queue-id:  3 (enabled)   pmd usage:  4 %
   port: vhu-vm1p1 queue-id:  5 (enabled)   pmd usage:  4 %
   overhead:  2 %
pmd thread numa_id 1 core_id 3:
   isolated : false
   port: dpdk0 queue-id:  2 (enabled)   pmd usage: 10 %
   port: vhu-vm2p1 queue-id:  0 (enabled)   pmd usage:  4 %
   port: vhu-vm2p1 queue-id:  6 (enabled)   pmd usage:  4 %
   overhead:  3 %



Yes, this illustrates the operation well. It implies that if the 
traffic rate were increased, the cross-numa-disabled case would hit a 
bottleneck first, by virtue of having 2 dpdk rxqs + 2 VM rxqs on one core.


Kevin.


Regards,
Anurag

-Original Message-
From: Kevin Traynor 
Sent: Thursday, March 10, 2022 11:02 PM
To: Anurag Agarwal ; Jan Scheurich 
; Wan Junjie 
Cc: d...@openvswitch.org
Subject: Re: [External] Re: [PATCH] dpif-netdev: add an option to assign pmd 
rxq to all numas

On 04/03/2022 17:57, Anurag Agarwal wrote:

Hello Kevin,
 I have prepared a patch for "per port cross-numa-polling" and 
attached herewith.

The results are captured in 'cross-numa-results.txt'. We see PMD to RxQ 
assignment evenly balanced across all PMDs with this patch.

Please take a look and let us know your inputs.


Hi Anurag,

I think what this is showing is more related to txqs used for sending to the 
VM. As you are allowing the rxqs from the phy port to be handled by more pmds, 
and all those rxqs have traffic then in turn more txqs are used for sending to 
the VM. The result of using more txqs when sending to the VM in this case is 
that the traffic is returned on the more rxqs.

Allowing cross-numa does not guarantee that the different pmd cores will poll 
rxqs from an interface. At least with group algorithm, the pmds will be 
selected purely on load. The right way to ensure that all VM
txqs(/rxqs) are used is to enable the Tx-steering feature [0].

So you might be seeing some benefit in this case, but to me it's not the core 
use case of cross-numa polling. That is more about allowing the pmds on every 
numa to be used when the traffic load is primarily coming from one numa.

Kevin.

[0] https://docs.openvswitch.org/en/latest/topics/userspace-tx-steering/


Please find some of my inputs inline, in response to your comments.

Regards,
Anurag

-Original Message-
From: Kevin Traynor 
Sent: Thursday, February 24, 2022 7:54 PM
To: Jan Scheurich ; Wan Junjie

Cc: d...@openvswitch.org; Anurag Agarwal 
Subject: Re: [External] Re: [PATCH] dpif-netdev: add an option to
assign pmd rxq to all numas

Hi Jan,

On 17/02/2022 14:21, Jan Scheurich wrote:

Hi Kevin,


We have done extensive benchmarking and found that we get better
overall

PMD load balance and resulting OVS performance when we do not
statically pin any rx queues and instead let the auto-load-balancing
find the optimal distribution of phy rx queues over both NUMA nodes
to balance an asymmetric load of vhu rx queues (polled only on the local NUMA 
node).


Cross-NUMA polling of vhu rx queues comes with a very high latency
cost due

to cross-NUMA access to volatile virtio ring pointers in every
iteration (not only when actually copying packets). Cross-NUMA
polling of phy rx queues doesn't have a similar issue.




I agree that for vhost rxq polling, it always causes a performance
penalty when 

Re: [ovs-dev] [PATCH v2] dpif-netdev: Allow cross-NUMA polling on selected ports

2022-03-24 Thread Kevin Traynor

Hi Anurag/Jan,

On 21/03/2022 09:42, Anurag Agarwal wrote:

From: Jan Scheurich 

  Today dpif-netdev considers PMD threads on a non-local NUMA node for
  automatic assignment of the rxqs of a port only if there are no local,
  non-isolated PMDs.

  On typical servers with both physical ports on one NUMA node, this
  often
  leaves the PMDs on the other NUMA node under-utilized, wasting CPU
  resources. The alternative, to manually pin the rxqs to PMDs on remote
  NUMA nodes, also has drawbacks as it limits OVS' ability to auto
  load-balance the rxqs.

  This patch introduces a new interface configuration option to allow
  ports to be automatically polled by PMDs on any NUMA node:

  ovs-vsctl set interface  other_config:cross-numa-polling=true

  If this option is not present or set to false, legacy behaviour
  applies.



I mentioned some additional features to improve this and you seemed to 
agree they were useful so I sent an RFC with them for you to check out.


https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392310.html

Then you sent this patch, so I am a bit confused. Maybe you missed it?

I will highlight some differences between the patchsets on rxq 
scheduling below.


I think general comments on the trade-offs of this feature in the other 
thread are still welcome and maybe more important than implementation 
discussions.



Signed-off-by: Jan Scheurich 
Signed-off-by: Anurag Agarwal 
---
  Documentation/topics/dpdk/pmd.rst | 24 ++--
  lib/dpif-netdev.c | 31 +++
  tests/pmd.at  | 30 ++
  vswitchd/vswitch.xml  | 20 
  4 files changed, 95 insertions(+), 10 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
index b259cc8b3..f6c9671d7 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -132,8 +132,28 @@ or can be triggered by using::
  
  Port/Rx Queue assignment to PMD threads by manual pinning

  ~
-Rx queues may be manually pinned to cores. This will change the default Rx
-queue assignment to PMD threads::
+
+Normally, Rx queues are assigned to PMD threads automatically.  By default
+OVS only assigns Rx queues to PMD threads executing on the same NUMA
+node in order to avoid unnecessary latency for accessing packet buffers
+across the NUMA boundary.  Typically this overhead is higher for vhostuser
+ports than for physical ports due to the packet copy that is done for all
+rx packets.
+
+On NUMA servers with physical ports only on one NUMA node, the NUMA-local
+polling policy can lead to an under-utilization of the PMD threads on the
+remote NUMA node.  For the overall OVS performance it may in such cases be
+beneficial to utilize the spare capacity and allow polling of a physical
+port's rxqs across NUMA nodes despite the overhead involved.
+The policy can be set per port with the following configuration option::
+
+$ ovs-vsctl set Interface  \
+other_config:cross-numa-polling=true|false
+
+The default value is false.
+
+Rx queues may also be manually pinned to cores. This will change the default
+Rx queue assignment to PMD threads::
  
  $ ovs-vsctl set Interface  \

  other_config:pmd-rxq-affinity=
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 720818e30..4fda8d7a0 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -466,6 +466,7 @@ struct dp_netdev_port {
  char *type; /* Port type as requested by user. */
  char *rxq_affinity_list;/* Requested affinity of rx queues. */
  enum txq_req_mode txq_requested_mode;
+bool cross_numa_polling;/* If true cross polling will be enabled */
  };
  
  static bool dp_netdev_flow_ref(struct dp_netdev_flow *);

@@ -5018,6 +5019,7 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no,
  bool emc_enabled = smap_get_bool(cfg, "emc-enable", true);
  const char *tx_steering_mode = smap_get(cfg, "tx-steering");
  enum txq_req_mode txq_mode;
+bool cross_numa_polling = smap_get_bool(cfg, "cross-numa-polling", false);
  
  ovs_rwlock_wrlock(&dp->port_rwlock);

  error = get_port_by_number(dp, port_no, &port);
@@ -5086,6 +5088,14 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no,
  dp_netdev_request_reconfigure(dp);
  }
  
+    if (cross_numa_polling != port->cross_numa_polling) {
+        port->cross_numa_polling = cross_numa_polling;
+        VLOG_INFO("%s: cross-numa-polling has been %s.",
+                  netdev_get_name(port->netdev),
+                  cross_numa_polling ? "enabled" : "disabled");
+        dp_netdev_request_reconfigure(dp);
+    }
+
  unlock:
  ovs_rwlock_unlock(>port_rwlock);
  return error;
@@ -5885,7 +5895,7 @@ sched_numa_list_schedule(struct sched_numa_list *numa_list,
  {
  struct 

Re: [ovs-dev] [PATCH ovn] rhel: fix logrotate user config option

2022-03-24 Thread Numan Siddique
On Tue, Mar 22, 2022 at 2:33 PM Mark Michelson  wrote:
>
> Acked-by: Mark Michelson 
>

Thanks.

I applied this patch to the main branch and backported it up to branch-21.03.

Numan
> On 3/22/22 13:37, Vladislav Odintsov wrote:
> > If the rpm is built with libcapng (the default), the user in the
> > /etc/logrotate.d/ovn config file is rewritten by sed in the
> > post-install section to 'ovn ovn'.
> >
> > If one runs logrotate, the following error occurs:
> > ```
> > [root@host ~]# logrotate /etc/logrotate.d/ovn
> > error: /etc/logrotate.d/ovn:9 unknown user 'ovn'
> > error: found error in /var/log/ovn/*.log , skipping
> > ```
> >
> > Replace 'ovn' user with 'openvswitch' to fix the issue.
> >
> > Signed-off-by: Vladislav Odintsov 
> > ---
> >   rhel/ovn-fedora.spec.in | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/rhel/ovn-fedora.spec.in b/rhel/ovn-fedora.spec.in
> > index 3fb854a37..821eb03cc 100644
> > --- a/rhel/ovn-fedora.spec.in
> > +++ b/rhel/ovn-fedora.spec.in
> > @@ -323,7 +323,7 @@ ln -sf ovn_detrace.py %{_bindir}/ovn-detrace
> >   %if %{with libcapng}
> >   if [ $1 -eq 1 ]; then
> >   sed -i 's:^#OVN_USER_ID=:OVN_USER_ID=:' %{_sysconfdir}/sysconfig/ovn
> > -sed -i 's:\(.*su\).*:\1 ovn ovn:' %{_sysconfdir}/logrotate.d/ovn
> > +sed -i 's:\(.*su\).*:\1 openvswitch openvswitch:' %{_sysconfdir}/logrotate.d/ovn
> >   fi
> >   %endif




Re: [ovs-dev] Possible backport candidate

2022-03-24 Thread Numan Siddique
On Tue, Mar 22, 2022 at 12:31 PM Mark Michelson  wrote:
>
> Hi everyone,
>
> Last month, commit d7514abe1 (acl-log: Log the direction (logical
> pipeline) of the matching ACL.) was added to the main branch of OVN.
> This commit was made because of the issue report at
> https://bugzilla.redhat.com/show_bug.cgi?id=1992641 .
>
> At the time the change was added, it was not considered for a backport
> to OVN 21.12 because it modifies the contents of ACL logs. The fear was
> that if people have written parsers for ACL logs, the addition of a new
> field could cause issues. This is not a very friendly thing to do during
> a minor update.
>
> However, the change was made in an attempt to correct a deficiency in
> the ACL logs. Without the change, it appears that there are duplicate
> ACL logs since there is nothing to distinguish the pipeline on which the
> ACL matched. Also, there is no documented contract about ACL log format;
> they are intended to be human readable, and that's all.
>
> My question is if the community would be OK with backporting this fix to
> OVN 21.12 in light of the change to the ACL log contents. Please let me
> know what you think.

Hi Mark,

No objections from me.

Numan

>
> Thanks,
> Mark Michelson
>


Re: [ovs-dev] [PATCH v5 3/5] pmd.at: Add tests for multi non-local numa pmds.

2022-03-24 Thread David Marchand
On Wed, Mar 23, 2022 at 3:09 PM Kevin Traynor  wrote:
>
> Ensure that, if no NUMA-local PMD thread cores are
> available, PMD cores from all other non-local NUMA
> nodes will be used.
>
> Signed-off-by: Kevin Traynor 
Acked-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH] datapath: use after free flow mask on destroy flow table

2022-03-24 Thread Tonghao Zhang
On Thu, Mar 24, 2022 at 6:32 PM Ilya Maximets  wrote:
>
> On 3/24/22 05:17, Wentao Jia wrote:
> >
> >
> > On destroying a flow table instance, a referenced flow mask may be
> > released too. In function ovs_flow_tbl_destroy(), the flow mask is
> > released first and then the flow table instance is destroyed. This
> > triggers a kernel panic on datapath destroy.
> >
> >
> > [  377.647756] kernel BUG at .../datapath/linux/flow_table.c:272!
> > [  377.653794] invalid opcode:  [#1] SMP PTI
> > [  377.666827] RIP: 0010:table_instance_flow_free.isra.7+0x148/0x150
> > [  377.711465] Call Trace:
> > [  377.715238]  <IRQ>
> > [  377.718964]  table_instance_destroy+0xbe/0x160 [openvswitch]
> > [  377.722793]  destroy_dp_rcu+0x12/0x40 [openvswitch]
> > [  377.726651]  rcu_process_callbacks+0x297/0x460
> > [  377.736795]  __do_softirq+0xe3/0x30a
> > [  377.740654]  ? ktime_get+0x36/0xa0
> > [  377.744490]  irq_exit+0x100/0x110
> > [  377.748514]  smp_apic_timer_interrupt+0x74/0x140
> > [  377.752817]  apic_timer_interrupt+0xf/0x20
> > [  377.758802]  </IRQ>
> >
> >
> > Fixes: 6d1cf7f3e ("datapath: fix possible memleak on destroy
> > flow-table")
For Linux upstream, the fix tag should be:
Fixes: 50b0e61b32ee ("net: openvswitch: fix possible memleak on
destroy flow-table")
> >
> > Signed-off-by: Wentao Jia 
> > Signed-off-by: Chuanjie Zeng 
> > ---
>
> Hi, Wentao Jia.  Thanks for the patch!
>
> Please, send it to the mainline linux kernel ('netdev' mailing list,
> keeping the ovs-dev in CC) using the linux kernel process for
> submitting patches.
>
> When it is accepted to the upstream kernel, it can be backported to
> the OOT kernel module in OVS repository.
>
> Best regards, Ilya Maximets.
>
> >  datapath/flow_table.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >
> > diff --git a/datapath/flow_table.c b/datapath/flow_table.c
> > index 650338fb0..b2f4b1108 100644
> > --- a/datapath/flow_table.c
> > +++ b/datapath/flow_table.c
> > @@ -415,8 +415,8 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> > struct table_instance *ufid_ti = 
> > rcu_dereference_raw(table->ufid_ti);
> >
> >
> > free_percpu(table->mask_cache);
> > -   kfree(rcu_dereference_raw(table->mask_array));
> > table_instance_destroy(table, ti, ufid_ti, false);
> > +   kfree(rcu_dereference_raw(table->mask_array));
> >  }
>


-- 
Best regards, Tonghao


[ovs-dev] [PATCH] system-traffic.at: Fix flaky DNAT load balancing test.

2022-03-24 Thread Ilya Maximets
'conntrack - DNAT load balancing' test fails from time to time
because not all the group buckets are getting hit.

In short, the test creates a group with 3 buckets with the same
weight.  It creates 12 TCP sessions and expects that each bucket
will be used at least once.  However, there is a solid chance that
this will not happen.  The probability of having at least one
empty bucket is:

  C(3, 1) x (2/3)^N - C(3, 2) x (1/3)^N

Where N is the number of distinct TCP sessions.  For N=12, the
probability is about 0.023, i.e. there is a 2.3% chance for a
test to fail, which is not great for CI.

Increase the number of sessions to 50 to reduce the probability of
failure down to 4.7e-9.  In my testing, the configuration with
50 TCP sessions didn't fail after 6000 runs.  Should be good
enough for CI systems.
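
As a quick sanity check of those two figures, here is a tiny standalone
C program (an editorial sketch, not part of the patch) that evaluates
the inclusion-exclusion formula above:

    #include <math.h>
    #include <stdio.h>

    /* Probability that at least one of the 3 equally weighted buckets is
     * never hit by n uniformly balanced sessions, by inclusion-exclusion:
     * C(3,1)*(2/3)^n - C(3,2)*(1/3)^n. */
    static double
    p_empty_bucket(int n)
    {
        return 3 * pow(2.0 / 3, n) - 3 * pow(1.0 / 3, n);
    }

    int
    main(void)
    {
        printf("N=12: %.3f\n", p_empty_bucket(12));   /* ~0.023  */
        printf("N=50: %.2g\n", p_empty_bucket(50));   /* ~4.7e-9 */
        return 0;
    }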

Fixes: 2c66ebe47a88 ("ofp-actions: Allow conntrack action in group buckets.")
Signed-off-by: Ilya Maximets 
---
 tests/system-traffic.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 95383275a..4a7fa49fc 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6471,7 +6471,7 @@ on_exit 'ovs-appctl revalidator/purge'
 on_exit 'ovs-appctl dpif/dump-flows br0'
 
 dnl Should work with the virtual IP address through NAT
-for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
+for i in $(seq 1 50); do
 echo Request $i
 NS_CHECK_EXEC([at_ns1], [wget 10.1.1.64 -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
 done
-- 
2.34.1



Re: [ovs-dev] [PATCH] datapath: use after free flow mask on destroy flow table

2022-03-24 Thread Ilya Maximets
On 3/24/22 05:17, Wentao Jia wrote:
> 
> 
> On destroying a flow table instance, a referenced flow mask may be
> released too. In function ovs_flow_tbl_destroy(), the flow mask is
> released first and then the flow table instance is destroyed. This
> triggers a kernel panic on datapath destroy.
> 
> 
> [  377.647756] kernel BUG at .../datapath/linux/flow_table.c:272!
> [  377.653794] invalid opcode:  [#1] SMP PTI
> [  377.666827] RIP: 0010:table_instance_flow_free.isra.7+0x148/0x150
> [  377.711465] Call Trace:
> [  377.715238]  <IRQ>
> [  377.718964]  table_instance_destroy+0xbe/0x160 [openvswitch]
> [  377.722793]  destroy_dp_rcu+0x12/0x40 [openvswitch]
> [  377.726651]  rcu_process_callbacks+0x297/0x460
> [  377.736795]  __do_softirq+0xe3/0x30a
> [  377.740654]  ? ktime_get+0x36/0xa0
> [  377.744490]  irq_exit+0x100/0x110
> [  377.748514]  smp_apic_timer_interrupt+0x74/0x140
> [  377.752817]  apic_timer_interrupt+0xf/0x20
> [  377.758802]  </IRQ>
> 
> 
> Fixes: 6d1cf7f3e ("datapath: fix possible memleak on destroy
> flow-table")
> 
> 
> Signed-off-by: Wentao Jia 
> Signed-off-by: Chuanjie Zeng 
> ---

Hi, Wentao Jia.  Thanks for the patch!

Please, send it to the mainline linux kernel ('netdev' mailing list,
keeping the ovs-dev in CC) using the linux kernel process for
submitting patches.

When it is accepted to the upstream kernel, it can be backported to
the OOT kernel module in OVS repository.

Best regards, Ilya Maximets.

>  datapath/flow_table.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/datapath/flow_table.c b/datapath/flow_table.c
> index 650338fb0..b2f4b1108 100644
> --- a/datapath/flow_table.c
> +++ b/datapath/flow_table.c
> @@ -415,8 +415,8 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
> struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
> 
> 
> free_percpu(table->mask_cache);
> -   kfree(rcu_dereference_raw(table->mask_array));
> table_instance_destroy(table, ti, ufid_ti, false);
> +   kfree(rcu_dereference_raw(table->mask_array));
>  }
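
To make the ordering issue concrete, here is a small userspace model (an
editorial sketch with hypothetical, heavily simplified types; the real
logic lives in datapath/flow_table.c): destroying the table instance
still dereferences the mask array to drop per-flow mask references, so
the array must be freed last.

    #include <stdio.h>
    #include <stdlib.h>

    struct mask_array { int refs[8]; };

    struct flow_table {
        struct mask_array *mask_array;
        int n_flows;                /* Stand-in for the instance's flows. */
    };

    /* Drops each remaining flow's mask reference; this dereferences
     * freed memory if mask_array has already been freed. */
    static void
    table_instance_destroy(struct flow_table *table)
    {
        for (int i = 0; i < table->n_flows; i++) {
            table->mask_array->refs[i]--;
        }
    }

    int
    main(void)
    {
        struct flow_table t = {
            .mask_array = calloc(1, sizeof(struct mask_array)),
            .n_flows = 3,
        };

        /* The fixed order: destroy the instance first, then free the
         * mask array it references.  Swapping these two calls is the
         * use-after-free that the patch removes. */
        table_instance_destroy(&t);
        free(t.mask_array);
        printf("flows freed before the mask array: no use-after-free\n");
        return 0;
    }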



Re: [ovs-dev] [PATCH v5 00/15] Fix undefined behavior in loop macros

2022-03-24 Thread Dumitru Ceara
On 3/23/22 17:24, Ilya Maximets wrote:
> On 3/23/22 17:11, Adrian Moreno wrote:
>>
>>
>> On 3/23/22 14:25, Eelco Chaudron wrote:
>>> One small nit in patch 4, the rest of the changes from v4 to v5 look good.
>>>
>>
>> Sorry about that. Dumitru said he might take a look at the latest version. 
>> I'll give him some time and resend.
> 
> If there are no further comments from Dumitru,

I had another look at v5 and I acked all remaining patches.

> I can fix that one comment on commit.

Sounds good to me.

Thanks,
Dumitru



Re: [ovs-dev] [PATCH v5 12/15] rculist: use multi-variable helpers for loop macros

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Use multi-variable iteration helpers to rewrite rculist loops macros.
> 
> There is an important behavior change compared with the previous
> implementation: when the loop ends normally (i.e., not via "break;"),
> the object pointer provided by the user is NULL. This is safer
> because it's not guaranteed that it would otherwise end up pointing
> to a valid address.
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 10/15] hindex: use multi-variable iterators

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Re-write hindex's loops using multi-variable helpers.
> 
> For safe loops, use the LONG version to maintain backwards
> compatibility.
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 09/15] cmap: use multi-variable iterators

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Re-write cmap's loops using multi-variable helpers.
> 
> For iterators based on cmap_cursor, we just need to make sure the NODE
> variable is not used after the loop, so we set it to NULL.
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 06/15] hmap: use multi-variable helpers for hmap loops

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Rewrite hmap's loops using multi-variable helpers.
> 
> For SAFE loops, use the LONG version of the multi-variable macros to
> keep backwards compatibility.
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 04/15] list: use multi-variable helpers for list loops

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Use multi-variable iteration helpers to rewrite non-safe loops.
> 
> There is an important behavior change compared with the previous
> implementation: when the loop ends normally (i.e., not via "break;"), the
> object pointer provided by the user is NULL. This is safer because it's
> not guaranteed that it would otherwise end up pointing to a valid address.
> 
> For pop iterator, set the variable to NULL when the loop ends.
> 
> Clang-analyzer successfully caught the potential null-pointer
> dereference in the code that triggered this change (bond.c), and nothing
> else has been detected.
> 
> For _SAFE loops, use the LONG version for backwards compatibility.
> 
> Acked-by: Eelco Chaudron 
> Signed-off-by: Adrian Moreno 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 02/15] util: add safe multi-variable iterators

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Safe versions of the multi-variable iterator helpers declare an internal
> variable to temporarily store the next value of the iterator.
> 
> Two versions of the macro are provided, one that still uses the NEXT
> variable for backwards compatibility and a shorter version that does not
> require the use of an additional variable provided by the user.
> 
> Signed-off-by: Adrian Moreno 
> Acked-by: Eelco Chaudron 
> ---

Acked-by: Dumitru Ceara 



Re: [ovs-dev] [PATCH v5 01/15] util: add multi-variable loop iterator macros

2022-03-24 Thread Dumitru Ceara
On 3/23/22 12:56, Adrian Moreno wrote:
> Multi-variable loop iterators avoid potential undefined behavior by
> using an internal iterator variable to perform the iteration and only
> referencing the containing object (via OBJECT_CONTAINING) if the
> iterator has been validated via the second expression of the for
> statement.
> 
> That way, the user can easily implement a loop that never tries to
> obtain the object containing NULL or stack-allocated non-contained
> nodes.
> 
> When the loop ends normally (not via "break;") the user-provided
> variable is set to NULL.
> 
> Signed-off-by: Adrian Moreno 
> Acked-by: Eelco Chaudron 
> ---

Acked-by: Dumitru Ceara 
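
For readers who want to see the pattern in isolation, here is a compact,
self-contained sketch of a multi-variable iterator over a singly linked
list (an editorial sketch with hypothetical macro and type names; it
assumes GNU C 'typeof', and the real OVS helpers differ in detail):

    #include <stddef.h>
    #include <stdio.h>

    struct node { struct node *next; };
    struct item { int value; struct node node; };

    /* Recover the containing object from an embedded node pointer. */
    #define OBJECT_CONTAINING(PTR, OBJ, MEMBER) \
        ((void *) ((char *) (PTR) - offsetof(typeof(*(OBJ)), MEMBER)))

    /* Iterate via an internal node pointer: the containing object is
     * only computed while the node is valid, and VAR is set to NULL
     * when the loop ends normally. */
    #define FOR_EACH_ITEM(VAR, HEAD)                                  \
        for (struct node *it_ = (HEAD);                               \
             it_ ? ((VAR) = OBJECT_CONTAINING(it_, VAR, node), 1)     \
                 : ((VAR) = NULL, 0);                                 \
             it_ = it_->next)

    int
    main(void)
    {
        struct item c = { 3, { NULL } };
        struct item b = { 2, { &c.node } };
        struct item a = { 1, { &b.node } };
        struct item *item;

        FOR_EACH_ITEM (item, &a.node) {
            printf("%d\n", item->value);    /* Prints 1, 2, 3. */
        }
        /* 'item' is NULL here, so accidental reuse is easy to catch. */
        return 0;
    }

The safe variants described in patch 02/15 additionally capture the next
node before the loop body runs, so the current element can be freed
while iterating.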



Re: [ovs-dev] [PATCH] ofproto/bond: Add knob "all_slaves_active"

2022-03-24 Thread Christophe Fontaine
On Wed, Mar 23, 2022 at 10:12 PM Mike Pattrick  wrote:
>
> On Thu, Mar 17, 2022 at 11:54 AM Christophe Fontaine
>  wrote:
> >
> > This config param allows the delivery of broadcast and multicast packets
> > to the secondary interface of non-LACP bonds, identical to the same option
> > for kernel bonds.
> >
> > Tested with an ovs-dpdk balance-slb bond with 2 virtual functions,
> > and a kernel bond with LACP on 2 other virtual functions.
> >
> > Scapy sending 10 broadcast packets from 10 different mac addresses:
> > - with ovs-vsctl set port dpdkbond other_config:all_slaves_active=false
> > (default unmodified behavior) 5 packets are received
> > - with other_config:all_slaves_active=true, all 10 packets are received.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1720935
> > Signed-off-by: Christophe Fontaine 
> > ---
>
> Hello Chris,
>
> This feature seems like it would be useful, and looks correctly
> implemented. However, the nomenclature used throughout the code and
> documentation is 'member' in place of 'slave'.
>

Hi Mike,
Sure, I'll fix that in v2.

Thanks,
Christophe

>
> Cheers,
> M
>
> >  NEWS |  1 +
> >  ofproto/bond.c   |  5 -
> >  ofproto/bond.h   |  2 ++
> >  vswitchd/bridge.c|  3 +++
> >  vswitchd/vswitch.xml | 20 
> >  5 files changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/NEWS b/NEWS
> > index df633e8e2..07f72fd5a 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,7 @@ v2.17.0 - xx xxx 
> >   * Removed experimental tag for PMD Auto Load Balance.
> >   * New configuration knob 'other_config:n-offload-threads' to change 
> > the
> > number of HW offloading threads.
> > + * New configuration knob 'all_slaves_active' for non lacp bonds
> > - DPDK:
> >   * EAL argument --socket-mem is no longer configured by default upon
> > start-up.  If dpdk-socket-mem and dpdk-alloc-mem are not specified,
> > diff --git a/ofproto/bond.c b/ofproto/bond.c
> > index cdfdf0b9d..8bf166a41 100644
> > --- a/ofproto/bond.c
> > +++ b/ofproto/bond.c
> > @@ -145,6 +145,7 @@ struct bond {
> >  struct eth_addr active_member_mac; /* MAC address of the active 
> > member. */
> >  /* Legacy compatibility. */
> >  bool lacp_fallback_ab; /* Fallback to active-backup on LACP failure. */
> > +bool all_slaves_active;
> >
> >  struct ovs_refcount ref_cnt;
> >  };
> > @@ -448,6 +449,7 @@ bond_reconfigure(struct bond *bond, const struct 
> > bond_settings *s)
> >
> >  bond->updelay = s->up_delay;
> >  bond->downdelay = s->down_delay;
> > +bond->all_slaves_active = s->all_slaves_active;
> >
> >  if (bond->lacp_fallback_ab != s->lacp_fallback_ab_cfg) {
> >  bond->lacp_fallback_ab = s->lacp_fallback_ab_cfg;
> > @@ -893,7 +895,8 @@ bond_check_admissibility(struct bond *bond, const void 
> > *member_,
> >
> >  /* Drop all multicast packets on inactive members. */
> >  if (eth_addr_is_multicast(eth_dst)) {
> > -if (bond->active_member != member) {
> > +if (bond->active_member != member &&
> > +bond->all_slaves_active == false) {
> >  goto out;
> >  }
> >  }
> > diff --git a/ofproto/bond.h b/ofproto/bond.h
> > index 1683ec878..4e8c1872a 100644
> > --- a/ofproto/bond.h
> > +++ b/ofproto/bond.h
> > @@ -62,6 +62,8 @@ struct bond_settings {
> > ovs run. */
> >  bool use_lb_output_action;  /* Use lb_output action. Only applicable 
> > for
> > bond mode BALANCE TCP. */
> > +bool all_slaves_active; /* For non LACP bond, also accept multicast
> > +   packets on secondary interface. */
> >  };
> >
> >  /* Program startup. */
> > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> > index 5223aa897..53be7ddf6 100644
> > --- a/vswitchd/bridge.c
> > +++ b/vswitchd/bridge.c
> > @@ -4615,6 +4615,9 @@ port_configure_bond(struct port *port, struct 
> > bond_settings *s)
> >  s->use_lb_output_action = (s->balance == BM_TCP)
> >   && smap_get_bool(&port->cfg->other_config,
> > "lb-output-action", false);
> > +/* all_slaves_active is disabled by default */
> > +s->all_slaves_active = smap_get_bool(&port->cfg->other_config,
> > +                                     "all_slaves_active", false);
> >  }
> >
> >  /* Returns true if 'port' is synthetic, that is, if we constructed it 
> > locally
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index 0c6632617..d420397a3 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -2083,6 +2083,26 @@
> >  true).
> >
> >
> > +    <column name="other_config" key="all_slaves_active"
> > +            type='{"type": "boolean"}'>
> > +
> > +  Enable/Disable delivery of broadcast/multicast packets on secondary
> > +  interface of a bond. Relevant only when  is
> > +