[ovs-dev] [PATCH v2] configure: Properly handle case where libunwind.h is not available.

2019-10-16 Thread Yi-Hung Wei
It is possible that user install libunwind but not libunwind-devel,
and it will run into a compilation error.  So we need to check the
existence of the library and the header file.

Fixes: e2ed6fbeb18c ("fatal-signal: Catch SIGSEGV and print backtrace.")
Suggested-by: Ben Pfaff 
Signed-off-by: Yi-Hung Wei 
---
 m4/openvswitch.m4 | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4
index 79e0be5a33dd..fe323a020b9d 100644
--- a/m4/openvswitch.m4
+++ b/m4/openvswitch.m4
@@ -640,7 +640,9 @@ AC_DEFUN([OVS_CHECK_UNBOUND],
 
 dnl Checks for libunwind.
 AC_DEFUN([OVS_CHECK_UNWIND],
-  [AC_CHECK_LIB(unwind, unw_backtrace, [HAVE_UNWIND=yes], [HAVE_UNWIND=no])
+  [AC_CHECK_LIB([unwind], [unw_backtrace],
+   [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])],
+   [HAVE_UNWIND=no])
if test "$HAVE_UNWIND" = yes; then
  AC_DEFINE([HAVE_UNWIND], [1], [Define to 1 if unwind is detected.])
  LIBS="$LIBS -lunwind"
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] configure: Properly handle case where libunwind.h is not available.

2019-10-16 Thread Yi-Hung Wei
On Wed, Oct 16, 2019 at 6:50 PM Ben Pfaff  wrote:
>
> On Tue, Oct 15, 2019 at 04:49:33PM -0700, Yi-Hung Wei wrote:
> > On Tue, Oct 15, 2019 at 3:25 PM Ben Pfaff  wrote:
> > >
> > > On Tue, Oct 15, 2019 at 02:30:48PM -0700, Yi-Hung Wei wrote:
> > > > It is possible that user install libunwind but not libunwind-devel,
> > > > and it will run into a compilation error.  So check the existence
> > > > of the header file instead of the library.
> > > >
> > > > Fixes: e2ed6fbeb18c ("fatal-signal: Catch SIGSEGV and print backtrace.")
> > > > Signed-off-by: Yi-Hung Wei 
> > > > ---
> > > >  m4/openvswitch.m4 | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4
> > > > index 79e0be5a33dd..da7119951484 100644
> > > > --- a/m4/openvswitch.m4
> > > > +++ b/m4/openvswitch.m4
> > > > @@ -640,7 +640,7 @@ AC_DEFUN([OVS_CHECK_UNBOUND],
> > > >
> > > >  dnl Checks for libunwind.
> > > >  AC_DEFUN([OVS_CHECK_UNWIND],
> > > > -  [AC_CHECK_LIB(unwind, unw_backtrace, [HAVE_UNWIND=yes], 
> > > > [HAVE_UNWIND=no])
> > > > +  [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])
> > >
> > > It might be wise to check for both, e.g.:
> > >
> > > AC_CHECK_LIB([unwind], [unw_backtrace],
> > >   [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])],
> > >   [HAVE_UNWIND=no])
> > >
> >
> > Hi Ben,
> >
> > In most distros, libunwind-devel (or libunwind-dev) depends on
> > libunwind.  Is it necessary for us to check both?
>
> Not if everyone installs from a distro package.  If you install from a
> tarball, it's easy to provide configure options that make the header
> available but not the library, or vice versa.  It's better to catch that
> at configure time.

Thanks for explanation.   It's better to check both.  I will send out v2 soon.

Thanks,

-Yi-Hung
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

2019-10-16 Thread Ben Pfaff
On Wed, Oct 16, 2019 at 02:16:14PM +0200, Ilya Maximets wrote:
> BTW, you will not find any description for statistics provided by the linux
> or DPDK drivers.  You could only look at the driver code or device spec.

I'm not sure the meanings are even consistent.  I've heard that
different Linux drivers count the number of bytes in a packet
differently, for example.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] configure: Properly handle case where libunwind.h is not available.

2019-10-16 Thread Ben Pfaff
On Tue, Oct 15, 2019 at 04:49:33PM -0700, Yi-Hung Wei wrote:
> On Tue, Oct 15, 2019 at 3:25 PM Ben Pfaff  wrote:
> >
> > On Tue, Oct 15, 2019 at 02:30:48PM -0700, Yi-Hung Wei wrote:
> > > It is possible that user install libunwind but not libunwind-devel,
> > > and it will run into a compilation error.  So check the existence
> > > of the header file instead of the library.
> > >
> > > Fixes: e2ed6fbeb18c ("fatal-signal: Catch SIGSEGV and print backtrace.")
> > > Signed-off-by: Yi-Hung Wei 
> > > ---
> > >  m4/openvswitch.m4 | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4
> > > index 79e0be5a33dd..da7119951484 100644
> > > --- a/m4/openvswitch.m4
> > > +++ b/m4/openvswitch.m4
> > > @@ -640,7 +640,7 @@ AC_DEFUN([OVS_CHECK_UNBOUND],
> > >
> > >  dnl Checks for libunwind.
> > >  AC_DEFUN([OVS_CHECK_UNWIND],
> > > -  [AC_CHECK_LIB(unwind, unw_backtrace, [HAVE_UNWIND=yes], 
> > > [HAVE_UNWIND=no])
> > > +  [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])
> >
> > It might be wise to check for both, e.g.:
> >
> > AC_CHECK_LIB([unwind], [unw_backtrace],
> >   [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])],
> >   [HAVE_UNWIND=no])
> >
> 
> Hi Ben,
> 
> In most distros, libunwind-devel (or libunwind-dev) depends on
> libunwind.  Is it necessary for us to check both?

Not if everyone installs from a distro package.  If you install from a
tarball, it's easy to provide configure options that make the header
available but not the library, or vice versa.  It's better to catch that
at configure time.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] compat: Fix small naming issue

2019-10-16 Thread Yifeng Sun
LGTM, thanks Greg.

Reviewed-by: Yifeng Sun 

On Wed, Oct 16, 2019 at 1:21 PM Greg Rose  wrote:
>
> In commit 057772cf2477 the function is missing the rpl_ prefix
> and the define that replaces the original function should come
> after the function definition.
>
> Fixes: 057772cf2477 ("compat: Backport nf_ct_tmpl_alloc().")
> Signed-off-by: Greg Rose 
> ---
>  datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h 
> b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> index 1015801..84cb09e 100644
> --- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> +++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
> @@ -7,11 +7,10 @@
>
>  #include 
>
> -#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc
>  /* Released via destroy_conntrack() */
>  static inline struct nf_conn *
> -nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone,
> -gfp_t flags)
> +rpl_nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone,
> +gfp_t flags)
>  {
> struct nf_conn *tmpl;
>
> @@ -32,6 +31,7 @@ out_free:
> kfree(tmpl);
> return NULL;
>  }
> +#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc
>
>  static void rpl_nf_ct_tmpl_free(struct nf_conn *tmpl)
>  {
> --
> 1.8.3.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] compat: Fix small naming issue

2019-10-16 Thread Greg Rose
In commit 057772cf2477 the function is missing the rpl_ prefix
and the define that replaces the original function should come
after the function definition.

Fixes: 057772cf2477 ("compat: Backport nf_ct_tmpl_alloc().")
Signed-off-by: Greg Rose 
---
 datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h 
b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
index 1015801..84cb09e 100644
--- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
+++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
@@ -7,11 +7,10 @@
 
 #include 
 
-#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc
 /* Released via destroy_conntrack() */
 static inline struct nf_conn *
-nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone,
-gfp_t flags)
+rpl_nf_ct_tmpl_alloc(struct net *net, const struct nf_conntrack_zone *zone,
+gfp_t flags)
 {
struct nf_conn *tmpl;
 
@@ -32,6 +31,7 @@ out_free:
kfree(tmpl);
return NULL;
 }
+#define nf_ct_tmpl_alloc rpl_nf_ct_tmpl_alloc
 
 static void rpl_nf_ct_tmpl_free(struct nf_conn *tmpl)
 {
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Fix virtual port binding when the parents are scheduled in the same chassis

2019-10-16 Thread 0-day Robot
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 201 characters long (recommended limit is 79)
#36 FILE: northd/ovn-northd.8.xml:583:
inport == P   ((arp.op == 1  
arp.spa == VIP  arp.tpa == VIP) || (arp.op == 2 
 arp.spa == VIP))

Lines checked: 151, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

2019-10-16 Thread Kevin Traynor
On 16/10/2019 13:16, Ilya Maximets wrote:
> Hi Kevin,
> 
> Thanks for review. Some comments inline.
> 
> On 16.10.2019 12:15, Kevin Traynor wrote:
>> Hi Sriram,
>>
>> Thanks for working on making more fine grained stats for debugging. As
>> mentioned yesterday, this patch needs rebase so I just reviewed docs and
>> netdev-dpdk.c which applied. Comments below.
>>
>> On 21/09/2019 03:40, Sriram Vatala wrote:
>>> OVS may be unable to transmit packets for multiple reasons and
>>> today there is a single counter to track packets dropped due to
>>> any of those reasons. The most common reason is that a VM is
>>> unable to read packets fast enough causing the vhostuser port
>>> transmit queue on the OVS side to become full. This manifests
>>> as a problem with VNFs not receiving all packets. Having a
>>> separate drop counter to track packets dropped because the
>>> transmit queue is full will clearly indicate that the problem
>>> is on the VM side and not in OVS. Similarly maintaining separate
>>
>> It reads like a bit of a contradiction. On one hand the "The *most
>> common* reason is that a VM is unable to read packets fast enough".
>> While having a stat "*will clearly indicate* that the problem is on the
>> VM side".
>>
>>> counters for all possible drops helps in indicating sensible
>>> cause for packet drops.
>>>
>>> This patch adds custom software stats counters to track packets
>>> dropped at port level and these counters are displayed along with
>>> other stats in "ovs-vsctl get interface  statistics"
>>> command. The detailed stats will be available for both dpdk and
>>> vhostuser ports.
>>>
>>
>> I think the commit msg could be a bit clearer, suggest something like
>> below - feel free to modify (or reject):
>>
>> OVS may be unable to transmit packets for multiple reasons on the DPDK
>> datapath and today there is a single counter to track packets dropped
>> due to any of those reasons.
>>
>> This patch adds custom software stats for the different reasons packets
>> may be dropped during tx on the DPDK datapath in OVS.
>>
>> - MTU drops : drops that occur due to a too large packet size
>> - Qos drops: drops that occur due to egress QOS
>> - Tx failures: drops as returned by the DPDK PMD send function
>>
>> Note that the reason for tx failures is not specificied in OVS. In
>> practice for vhost ports it is most common that tx failures are because
>> there are not enough available descriptors, which is usually caused by
>> misconfiguration of the guest queues and/or because the guest is not
>> consuming packets fast enough from the queues.
>>
>> These counters are displayed along with other stats in "ovs-vsctl get
>> interface  statistics" command and are available for dpdk and
>> vhostuser/vhostuserclient ports.
>>
>> Signed-off-by: Sriram Vatala 
>>
>> ---
>>
>> v9:...
>> v8:...
>>
>>
>> Note that signed-off, --- and version info should be like this ^^^.
>> otherwise the review version comments will be in the git commit log when
>> it is applied.
>>
>>> --
>>> Changes since v8:
>>> Addressed comments given by Ilya.
>>>
>>> Signed-off-by: Sriram Vatala 
>>> ---
>>>   Documentation/topics/dpdk/vhost-user.rst  | 13 ++-
>>>   lib/netdev-dpdk.c | 81 +++
>>>   utilities/bugtool/automake.mk |  3 +-
>>>   utilities/bugtool/ovs-bugtool-get-port-stats  | 25 ++
>>>   .../plugins/network-status/openvswitch.xml|  1 +
>>>   5 files changed, 105 insertions(+), 18 deletions(-)
>>>   create mode 100755 utilities/bugtool/ovs-bugtool-get-port-stats
>>>
>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
>>> b/Documentation/topics/dpdk/vhost-user.rst
>>> index 724aa62f6..89388a2bf 100644
>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>> @@ -551,7 +551,18 @@ processing packets at the required rate.
>>>   The amount of Tx retries on a vhost-user or vhost-user-client interface 
>>> can be
>>>   shown with::
>>>   
>>> -  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
>>> +  $ ovs-vsctl get Interface dpdkvhostclient0 
>>> statistics:netdev_dpdk_tx_retries
>>
>> I think the "netdev_dpdk_" prefixes should be removed for a few reasons,
>>
>> - These are custom stats that will only be displayed for the dpdk ports
>> so they don't need any extra prefix to say they are for dpdk ports
>>
>> - The 'tx_retries' stat is part of OVS 2.12, I don't like to change its
>> name to a different one in OVS 2.13 unless there is a very good reason
>>
>> - The existing stats don't have this type of prefixes (granted most of
>> them are general stats):
> 
> The main reason for the prefix for me is to distinguish those stats
> from the stats that comest from the driver/HW.  Our new stats has
> very generic names that could be named a lot like or even equal to
> HW stats.  This might be not an issue for vhostuser, but for HW
> NICs this may be really confusing for users.
> 
> I'm not insisting on 

[ovs-dev] [PATCH ovn] Fix virtual port binding when the parents are scheduled in the same chassis

2019-10-16 Thread numans
From: Numan Siddique 

If a virtual port has 2 parents and if both of them are scheduled
on the same chassis, then virtual port binding doesn't work.

For virtual port binding we have the below logical flows:
inport == p1 && !is_chassis_resident("vip-port") && arp .. 
actions=bind_vport(vip-port); next;
inport == p2 && !is_chassis_resident("vip-port") && arp .. 
actions=bind_vport(vip-port); next;

Since we have !is_chassis_resident, as soon as p1 binds the port, the 
corresponding OF flows
for both the logical flows above are deleted. Because of this, when p2 becomes 
the
parent of vip-port it can't bind it.

This patch fixes this issue by removing this condition.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1762341
Signed-off-by: Numan Siddique 
---
 northd/ovn-northd.8.xml |  2 +-
 northd/ovn-northd.c |  3 +--
 tests/ovn.at| 39 ---
 3 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index b5dfcd183..d3e0e5ef2 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -580,7 +580,7 @@
   column with the match
 
 
-inport == P  !is_chassis_resident(V) 
 ((arp.op == 1  arp.spa == VIP  
arp.tpa == VIP) || (arp.op == 2  arp.spa == 
VIP))
+inport == P   ((arp.op == 1  
arp.spa == VIP  arp.tpa == VIP) || (arp.op == 2 
 arp.spa == VIP))
 
 
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index d0844dd64..ea8ad7c2d 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -5250,11 +5250,10 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 
 ds_clear();
 ds_put_format(, "inport == \"%s\" && "
-  "!is_chassis_resident(%s) && "
   "((arp.op == 1 && arp.spa == %s && "
   "arp.tpa == %s) || (arp.op == 2 && "
   "arp.spa == %s))",
-  vparent, op->json_key, virtual_ip, virtual_ip,
+  vparent, virtual_ip, virtual_ip,
   virtual_ip);
 ds_clear();
 ds_put_format(,
diff --git a/tests/ovn.at b/tests/ovn.at
index d14136748..6bdb9e8ed 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -15637,7 +15637,7 @@ ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 
10.0.0.10"
 ovn-nbctl lsp-set-port-security sw0-vir "50:54:00:00:00:10 10.0.0.10"
 ovn-nbctl lsp-set-type sw0-vir virtual
 ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10
-ovn-nbctl set logical_switch_port sw0-vir options:virtual-parents=sw0-p1,sw0-p2
+ovn-nbctl set logical_switch_port sw0-vir 
options:virtual-parents=sw0-p1,sw0-p2,sw0-p3
 
 ovn-nbctl lsp-add sw0 sw0-p1
 ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
@@ -15649,7 +15649,7 @@ ovn-nbctl lsp-set-port-security sw0-p2 
"50:54:00:00:00:04 10.0.0.4 10.0.0.10"
 
 ovn-nbctl lsp-add sw0 sw0-p3
 ovn-nbctl lsp-set-addresses sw0-p3 "50:54:00:00:00:05 10.0.0.5"
-ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:05 10.0.0.5"
+ovn-nbctl lsp-set-port-security sw0-p3 "50:54:00:00:00:05 10.0.0.5 10.0.0.10"
 
 # Create the second logical switch with one port
 ovn-nbctl ls-add sw1
@@ -15680,8 +15680,9 @@ ovn-nbctl --wait=hv sync
 ovn-sbctl dump-flows sw0 | grep ls_in_arp_rsp | grep bind_vport > lflows.txt
 
 AT_CHECK([cat lflows.txt], [0], [dnl
-  table=11(ls_in_arp_rsp  ), priority=100  , match=(inport == "sw0-p1" && 
!is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == 10.0.0.10 && 
arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), 
action=(bind_vport("sw0-vir", inport); next;)
-  table=11(ls_in_arp_rsp  ), priority=100  , match=(inport == "sw0-p2" && 
!is_chassis_resident("sw0-vir") && ((arp.op == 1 && arp.spa == 10.0.0.10 && 
arp.tpa == 10.0.0.10) || (arp.op == 2 && arp.spa == 10.0.0.10))), 
action=(bind_vport("sw0-vir", inport); next;)
+  table=11(ls_in_arp_rsp  ), priority=100  , match=(inport == "sw0-p1" && 
((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 
&& arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;)
+  table=11(ls_in_arp_rsp  ), priority=100  , match=(inport == "sw0-p2" && 
((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 
&& arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;)
+  table=11(ls_in_arp_rsp  ), priority=100  , match=(inport == "sw0-p3" && 
((arp.op == 1 && arp.spa == 10.0.0.10 && arp.tpa == 10.0.0.10) || (arp.op == 2 
&& arp.spa == 10.0.0.10))), action=(bind_vport("sw0-vir", inport); next;)
 ])
 
 ovn-sbctl dump-flows lr0 | grep lr_in_arp_resolve | grep "reg0 == 10.0.0.10" \
@@ -15729,6 +15730,29 @@ AT_CHECK([cat lflows.txt], [0], [dnl
   table=11(lr_in_arp_resolve  ), priority=100  , match=(outport == "lr0-sw0" 
&& reg0 == 10.0.0.10), 

Re: [ovs-dev] [PATCHv2] ofproto-dpif: Expose datapath capability to ovsdb.

2019-10-16 Thread Gregory Rose


On 10/15/2019 2:06 PM, Gregory Rose wrote:


On 10/4/2019 3:32 PM, William Tu wrote:

The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()', and write the supported capability
to a new database column, called 'capabilities' under Datapath table.

To see how it works, run:
  # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
  # ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
  'ct_zones={}' 'capabilities={}' \
  -- set Open_vSwitch . datapaths:"netdev"=@m

  # ovs-vsctl list-dp-cap netdev
  ufid=true sample_nesting=true clone=true tnl_push_pop=true \
  ct_orig_tuple=true ct_eventmask=true ct_state=true \
  ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
  max_hash_alg=1 ct_state_nat=true ct_timeout=true \
  ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
  masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true

Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/593749381
Signed-off-by: William Tu 


Patch works as advertised and the code looks fine to me.

Tested-by: Greg Rose 
Reviewed-by: Greg Rose 


I have to take this back.  While doing other work I found that this 
patch has an issue on 32bit

architecture:

ofproto/ofproto-dpif.c:In function ‘get_datapath_cap’:

ofproto/ofproto-dpif.c:5458:27:error: format ‘%lu’ expects argument of 
type ‘long unsigned int’, but argument 2 has type ‘size_t {aka unsigned 
int}’ [-Werror=format=]


str_value = xasprintf("%lu", odp.max_vlan_headers);

^

ofproto/ofproto-dpif.c:5462:27:error: format ‘%lu’ expects argument of 
type ‘long unsigned int’, but argument 2 has type ‘size_t {aka unsigned 
int}’ [-Werror=format=]


str_value = xasprintf("%lu", odp.max_mpls_depth);

^

ofproto/ofproto-dpif.c:5485:27:error: format ‘%lu’ expects argument of 
type ‘long unsigned int’, but argument 2 has type ‘size_t {aka unsigned 
int}’ [-Werror=format=]


str_value = xasprintf("%lu", s.max_hash_alg);

You'll need to fixup your formatting there.  I don't give 32 bit 
architectures all the attention I should but I suppose they're

still in use.

Thanks,

- Greg



Thanks William!

- Greg


---
v2:
rebase to master
---
  ofproto/ofproto-dpif.c | 51 +++
  ofproto/ofproto-provider.h |  2 ++
  ofproto/ofproto.c  | 12 
  ofproto/ofproto.h  |  2 ++
  tests/ovs-vsctl.at | 10 ++
  utilities/ovs-vsctl.8.in   |  6 
  utilities/ovs-vsctl.c  | 28 +
  vswitchd/bridge.c  | 21 +
  vswitchd/vswitch.ovsschema |  5 ++-
  vswitchd/vswitch.xml   | 76 
++

  10 files changed, 212 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 496a16c8a4af..c37181664523 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5440,6 +5440,56 @@ ct_del_zone_timeout_policy(const char 
*datapath_type, uint16_t zone_id)

  }
  }
  +static void
+get_datapath_cap(const char *datapath_type, struct smap *cap)
+{
+    char *str_value;
+    struct odp_support odp;
+    struct dpif_backer_support s;
+    struct dpif_backer *backer = shash_find_data(_dpif_backers,
+ datapath_type);
+    if (!backer) {
+    return;
+    }
+    s = backer->rt_support;
+    odp = s.odp;
+
+    /* ODP_SUPPORT_FIELDS */
+    str_value = xasprintf("%lu", odp.max_vlan_headers);
+    smap_add(cap, "max_vlan_headers", str_value);
+    free(str_value);
+
+    str_value = xasprintf("%lu", odp.max_mpls_depth);
+    smap_add(cap, "max_mpls_depth", str_value);
+    free(str_value);
+
+    smap_add(cap, "recirc", odp.recirc ? "true" : "false");
+    smap_add(cap, "ct_state", odp.ct_state ? "true" : "false");
+    smap_add(cap, "ct_zone", odp.ct_zone ? "true" : "false");
+    smap_add(cap, "ct_mark", odp.ct_mark ? "true" : "false");
+    smap_add(cap, "ct_label", odp.ct_label ? "true" : "false");
+    smap_add(cap, "ct_state_nat", odp.ct_state_nat ? "true" : "false");
+    smap_add(cap, "ct_orig_tuple", odp.ct_orig_tuple ? "true" : 
"false");
+    smap_add(cap, "ct_orig_tuple6", odp.ct_orig_tuple6 ? "true" : 
"false");

+
+    /* DPIF_SUPPORT_FIELDS */
+    smap_add(cap, "masked_set_action", s.masked_set_action ? "true" 
: "false");

+    smap_add(cap, "tnl_push_pop", s.tnl_push_pop ? "true" : "false");
+    smap_add(cap, "ufid", s.ufid ? "true" : "false");
+    smap_add(cap, "trunc", s.trunc ? "true" : "false");
+    smap_add(cap, "clone", s.clone ? "true" : "false");
+    smap_add(cap, "sample_nesting", s.sample_nesting ? "true" : 
"false");

+    smap_add(cap, "ct_eventmask", s.ct_eventmask ? "true" : "false");
+    smap_add(cap, "ct_clear", s.ct_clear ? "true" : "false");
+
+    str_value = xasprintf("%lu", s.max_hash_alg);
+    smap_add(cap, "max_hash_alg", str_value);
+    free(str_value);
+
+    smap_add(cap, "check_pkt_len", s.check_pkt_len ? "true" : "false");
+    smap_add(cap, "ct_timeout", s.ct_timeout ? 

Re: [ovs-dev] [PATCH] rhel: Support RHEL7.7 build and packaging

2019-10-16 Thread Gregory Rose



On 10/11/2019 2:49 PM, Yifeng Sun wrote:

This patch provides essential fixes for OVS to support
RHEL7.7's new kernel.

make rpm-fedora-kmod \
RPMBUILD_OPT='-D "kversion 3.10.0-1062.1.2.el7.x86_64"'

Signed-off-by: Yifeng Sun 


Thanks Yifeng!

Tested-by: Greg Rose 
Reviewed-by: Greg Rose 


---
  rhel/openvswitch-kmod-fedora.spec.in  |  9 +
  rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 14 ++
  2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/rhel/openvswitch-kmod-fedora.spec.in 
b/rhel/openvswitch-kmod-fedora.spec.in
index b3588982ef7a..fbb8366990f1 100644
--- a/rhel/openvswitch-kmod-fedora.spec.in
+++ b/rhel/openvswitch-kmod-fedora.spec.in
@@ -12,8 +12,9 @@
  # Use the kversion macro such as
  # RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64 
3.10.0-693.17.1.el7.x86_64"'
  # to build package for mulitple kernel versions in the same package
-# This only works for kernel 3.10.0 major revision 957 (RHEL 7.6),
-# major revision 693 (RHEL 7.4) and major revision 327 (RHEL 7.2).
+# This only works for kernel 3.10.0 major revision 1062 (RHEL 7.7),
+# major revision 957 (RHEL 7.6), major revision 693 (RHEL 7.4) and
+# major revision 327 (RHEL 7.2).
  # By default, build against the current running kernel version
  #%define kernel 3.1.5-1.fc16.x86_64
  #define kernel %{kernel_source}
@@ -92,8 +93,8 @@ if grep -qs "suse" /etc/os-release; then
  fi
  elif [ "$mainline_major" = "3" ] && [ "$mainline_minor" = "10" ] &&
   { [ "$major_rev" = "327" ] || [ "$major_rev" = "693" ] || \
-   [ "$major_rev" = "957" ]; }; then
-# For RHEL 7.2, 7.4 and 7.6
+   [ "$major_rev" = "957" ] || [ "$major_rev" == "1062" ]; }; then
+# For RHEL 7.2, 7.4, 7.6 and 7.7
  if [ -x "%{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh" ]; then
  %{_datadir}/openvswitch/scripts/ovs-kmod-manage.sh
  fi
diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh 
b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
index 693fb0b744b3..a643b55ff0f8 100644
--- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
+++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
@@ -15,9 +15,10 @@
  # limitations under the License.
  
  # This script is intended to be used on the following kernels.

-#   - 3.10.0 major revision 327 (RHEL 7.2)
-#   - 3.10.0 major revision 693 (RHEL 7.4)
-#   - 3.10.0 major revision 957 (RHEL 7.6)
+#   - 3.10.0 major revision 327  (RHEL 7.2)
+#   - 3.10.0 major revision 693  (RHEL 7.4)
+#   - 3.10.0 major revision 957  (RHEL 7.6)
+#   - 3.10.0 major revision 1062 (RHEL 7.7)
  #   - 4.4.x,  x >= 73   (SLES 12 SP3)
  #   - 4.12.x, x >= 14   (SLES 12 SP4).
  # It is packaged in the openvswitch kmod RPM and run in the post-install
@@ -100,6 +101,11 @@ if [ "$mainline_major" = "3" ] && [ "$mainline_minor" = 
"10" ]; then
  comp_ver=10
  ver_offset=4
  installed_ver="$minor_rev"
+elif [ "$major_rev" = "1062" ]; then
+#echo "rhel77"
+comp_ver=10
+ver_offset=4
+installed_ver="$minor_rev"
  fi
  elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "4" ]; then
  if [ "$mainline_patch" -ge "73" ]; then
@@ -111,7 +117,7 @@ elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = 
"4" ]; then
  elif [ "$mainline_major" = "4" ] && [ "$mainline_minor" = "12" ]; then
  if [ "$mainline_patch" -ge "14" ]; then
  #echo "sles12sp4"
-comp_ver=14
+comp_ver=1
  ver_offset=2
  installed_ver="$mainline_patch"
  fi


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCHv5] netdev-afxdp: Add need_wakeup supprt.

2019-10-16 Thread Loftus, Ciara
> 
> The patch adds support for using need_wakeup flag in AF_XDP rings.
> A new option, use_need_wakeup, is added. When this option is used,
> it means that OVS has to explicitly wake up the kernel RX, using poll()
> syscall and wake up TX, using sendto() syscall. This feature improves
> the performance by avoiding unnecessary sendto syscalls for TX.
> For RX, instead of kernel always busy-spinning on fille queue, OVS wakes
> up the kernel RX processing when fill queue is replenished.
> 
> The need_wakeup feature is merged into Linux kernel bpf-next tee with
> commit
> 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
> and
> OVS enables it by default. Running the feature before this version causes
> xsk bind fails, please use options:use_need_wakeup=false to disable it.
> If users enable it but runs in an older version of libbpf, then the
> need_wakeup feature has no effect, and a warning message is logged.
> 
> For virtual interface, it's better set use_need_wakeup=false, since
> the virtual device's AF_XDP xmit is synchronous: the sendto syscall
> enters kernel and process the TX packet on tx queue directly.
> 
> On Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
> to physical port improves from 6.1Mpps to 7.3Mpps.


Hi William,

Thanks for the patch. 
I'm wondering if/how you've pinned the IRQs for the physical ports in the test 
above.
Are the IRQs pinned to standalone cores or does the PMD thread share the same 
core?
I'd be interested in seeing performance results for both, if you have them.

Thanks,
Ciara

> 
> Suggested-by: Ilya Maximets 
> Signed-off-by: William Tu 
> ---

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

2019-10-16 Thread Sriram Vatala via dev
Hi Kevin & Ilya,
Thanks Kevin for reviewing the patch. Response inline.
Thanks Ilya for your response.

-Original Message-
From: Ilya Maximets 
Sent: 16 October 2019 17:46
To: Kevin Traynor ; Sriram Vatala 
; ovs-dev@openvswitch.org; Ilya Maximets 

Cc: Stokes, Ian 
Subject: Re: [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

Hi Kevin,

Thanks for review. Some comments inline.

On 16.10.2019 12:15, Kevin Traynor wrote:
> Hi Sriram,
>
> Thanks for working on making more fine grained stats for debugging. As
> mentioned yesterday, this patch needs rebase so I just reviewed docs
> and netdev-dpdk.c which applied. Comments below.
>
> On 21/09/2019 03:40, Sriram Vatala wrote:
>> OVS may be unable to transmit packets for multiple reasons and today
>> there is a single counter to track packets dropped due to any of
>> those reasons. The most common reason is that a VM is unable to read
>> packets fast enough causing the vhostuser port transmit queue on the
>> OVS side to become full. This manifests as a problem with VNFs not
>> receiving all packets. Having a separate drop counter to track
>> packets dropped because the transmit queue is full will clearly
>> indicate that the problem is on the VM side and not in OVS. Similarly
>> maintaining separate
>

> It reads like a bit of a contradiction. On one hand the "The *most
> common* reason is that a VM is unable to read packets fast enough".
> While having a stat "*will clearly indicate* that the problem is on
> the VM side".
>

I mentioned one use-case to emphasis the need for separate drop stats 
besides combined drop counter. I will rephrase the commit message.

>> counters for all possible drops helps in indicating sensible cause
>> for packet drops.
>>
>> This patch adds custom software stats counters to track packets
>> dropped at port level and these counters are displayed along with
>> other stats in "ovs-vsctl get interface  statistics"
>> command. The detailed stats will be available for both dpdk and
>> vhostuser ports.
>>
>
> I think the commit msg could be a bit clearer, suggest something like
> below - feel free to modify (or reject):
>
> OVS may be unable to transmit packets for multiple reasons on the DPDK
> datapath and today there is a single counter to track packets dropped
> due to any of those reasons.
>
> This patch adds custom software stats for the different reasons
> packets may be dropped during tx on the DPDK datapath in OVS.
>
> - MTU drops : drops that occur due to a too large packet size
> - Qos drops: drops that occur due to egress QOS
> - Tx failures: drops as returned by the DPDK PMD send function
>
> Note that the reason for tx failures is not specificied in OVS. In
> practice for vhost ports it is most common that tx failures are
> because there are not enough available descriptors, which is usually
> caused by misconfiguration of the guest queues and/or because the
> guest is not consuming packets fast enough from the queues.
>
> These counters are displayed along with other stats in "ovs-vsctl get
> interface  statistics" command and are available for dpdk and
> vhostuser/vhostuserclient ports.
>
Looks good to me. Thanks for the suggestion.
> Signed-off-by: Sriram Vatala 
>
> ---
>
> v9:...
> v8:...
>
>
> Note that signed-off, --- and version info should be like this ^^^.
> otherwise the review version comments will be in the git commit log
> when it is applied.
>
   I will take care of this in my next patch.
>> --
>> Changes since v8:
>> Addressed comments given by Ilya.
>>
>> Signed-off-by: Sriram Vatala 
>> ---
>>   Documentation/topics/dpdk/vhost-user.rst  | 13 ++-
>>   lib/netdev-dpdk.c | 81 +++
>>   utilities/bugtool/automake.mk |  3 +-
>>   utilities/bugtool/ovs-bugtool-get-port-stats  | 25 ++
>>   .../plugins/network-status/openvswitch.xml|  1 +
>>   5 files changed, 105 insertions(+), 18 deletions(-)
>>   create mode 100755 utilities/bugtool/ovs-bugtool-get-port-stats
>>
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>> b/Documentation/topics/dpdk/vhost-user.rst
>> index 724aa62f6..89388a2bf 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -551,7 +551,18 @@ processing packets at the required rate.
>>   The amount of Tx retries on a vhost-user or vhost-user-client interface 
>> can be
>>   shown with::
>>
>> -  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
>> +  $ ovs-vsctl get Interface dpdkvhostclient0
>> + statistics:netdev_dpdk_tx_retries
>
> I think the "netdev_dpdk_" prefixes should be removed for a few
> reasons,
>
> - These are custom stats that will only be displayed for the dpdk
> ports so they don't need any extra prefix to say they are for dpdk
> ports
>
> - The 'tx_retries' stat is part of OVS 2.12, I don't like to change
> its name to a different one in OVS 2.13 unless there is a very good
> reason
>
> - The existing 

Re: [ovs-dev] [PATCH ovn v2] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Dumitru Ceara
On Wed, Oct 16, 2019 at 4:04 PM Numan Siddique  wrote:
>
>
>
> On Wed, Oct 16, 2019 at 6:36 PM Dumitru Ceara  wrote:
>>
>> OVN logical flow "drop" actions can't be combined with other actions.
>> Commit 79308138891a created such a scenario if a logical switch has
>> mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
>> logical router with mcast_relay=enabled.
>>
>> To fix the issue we now explicitly add a drop flow for unregistered IP
>> multicast traffic in a logical switch if mcast_snoop=true,
>> mcast_flood_unregistered=false and the switch doesn't have any ports
>> with mcast_flood=true and isn't connected to a router with
>> mcast_relay=true.
>>
>> Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood 
>> configuration")
>> Signed-off-by: Dumitru Ceara 
>>
>
> Thanks. I applied this to master.
>
> Numan

Thanks!

>
>>
>> ---
>> v2: Address Numan's comments (update documentation).
>> ---
>>  northd/ovn-northd.8.xml | 13 +
>>  northd/ovn-northd.c |  8 +++-
>>  tests/ovn.at| 50 
>> ++---
>>  3 files changed, 67 insertions(+), 4 deletions(-)
>>
>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>> index 937702e..b5dfcd1 100644
>> --- a/northd/ovn-northd.8.xml
>> +++ b/northd/ovn-northd.8.xml
>> @@ -992,6 +992,19 @@ output;
>>
>>
>>
>> +A priority-80 flow that drops all unregistered IP multicast traffic
>> +if 
>> +:mcast_snoop='true' and
>> +
>> +:mcast_flood_unregistered='false' and the switch is
>> +not connected to a logical router that has
>> +
>> +:mcast_relay='true' and the switch doesn't have any
>> +logical port with > table="Logical_Switch_Port"/>
>> +:mcast_flood='true'.
>> +  
>> +
>> +  
>>  A priority-70 flow that outputs all packets with an Ethernet 
>> broadcast
>>  or multicast eth.dst to the MC_FLOOD
>>  multicast group.
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index e41c9d7..d0844dd 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct 
>> hmap *ports,
>>
>>  if (mcast_sw_info->flood_static) {
>>  ds_put_cstr(, "outport =\""MC_STATIC"\"; 
>> output;");
>> -} else {
>> +}
>> +
>> +/* Explicitly drop the traffic if relay or static flooding
>> + * is not configured.
>> + */
>> +if (!mcast_sw_info->flood_relay &&
>> +!mcast_sw_info->flood_static) {
>>  ds_put_cstr(, "drop;");
>>  }
>>
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index df00517..d141367 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -16306,7 +16306,7 @@ sleep 1
>>  OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
>>  OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
>>
>> -# Dissable IGMP querier on sw2.
>> +# Disable IGMP querier on sw2.
>>  ovn-nbctl set Logical_Switch sw2 \
>>  other_config:mcast_querier="false"
>>
>> @@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
>>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
>>  $(ip_to_hex 239 0 1 68) 04 e9b9 \
>>  /dev/null
>> +
>> +# Check that the IGMP Group is learned by all switches.
>> +OVS_WAIT_UNTIL([
>> +total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
>> +test "${total_entries}" = "2"
>> +])
>> +
>> +# Send traffic from sw3 and make sure it is relayed by rtr.
>> +# to ports that joined.
>> +truncate -s 0 expected_routed_sw1
>> +truncate -s 0 expected_routed_sw2
>> +truncate -s 0 expected_empty
>> +
>> +as hv1 reset_pcap_file hv1-vif1 hv1/vif1
>> +as hv1 reset_pcap_file hv1-vif2 hv1/vif2
>> +as hv1 reset_pcap_file hv1-vif3 hv1/vif3
>> +as hv1 reset_pcap_file hv1-vif4 hv1/vif4
>> +as hv2 reset_pcap_file hv2-vif1 hv2/vif1
>> +as hv2 reset_pcap_file hv2-vif2 hv2/vif2
>> +as hv2 reset_pcap_file hv2-vif3 hv2/vif3
>> +as hv2 reset_pcap_file hv2-vif4 hv2/vif4
>> +
>> +send_ip_multicast_pkt hv2-vif4 hv2 \
>> +0001 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
>> +e518e518000a3b3a
>> +store_ip_multicast_pkt \
>> +0100 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
>> +e518e518000a3b3a expected_routed_sw1
>> +store_ip_multicast_pkt \
>> +0200 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
>> +e518e518000a3b3a expected_routed_sw2
>> +
>> +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
>> +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
>> +OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
>> 

Re: [ovs-dev] [PATCH ovn v2] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Numan Siddique
On Wed, Oct 16, 2019 at 6:36 PM Dumitru Ceara  wrote:

> OVN logical flow "drop" actions can't be combined with other actions.
> Commit 79308138891a created such a scenario if a logical switch has
> mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
> logical router with mcast_relay=enabled.
>
> To fix the issue we now explicitly add a drop flow for unregistered IP
> multicast traffic in a logical switch if mcast_snoop=true,
> mcast_flood_unregistered=false and the switch doesn't have any ports
> with mcast_flood=true and isn't connected to a router with
> mcast_relay=true.
>
> Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood
> configuration")
> Signed-off-by: Dumitru Ceara 
>
>
Thanks. I applied this to master.

Numan


> ---
> v2: Address Numan's comments (update documentation).
> ---
>  northd/ovn-northd.8.xml | 13 +
>  northd/ovn-northd.c |  8 +++-
>  tests/ovn.at| 50
> ++---
>  3 files changed, 67 insertions(+), 4 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 937702e..b5dfcd1 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -992,6 +992,19 @@ output;
>
>
>
> +A priority-80 flow that drops all unregistered IP multicast
> traffic
> +if 
> +:mcast_snoop='true' and
> +
> +:mcast_flood_unregistered='false' and the switch is
> +not connected to a logical router that has
> +
> +:mcast_relay='true' and the switch doesn't have any
> +logical port with  table="Logical_Switch_Port"/>
> +:mcast_flood='true'.
> +  
> +
> +  
>  A priority-70 flow that outputs all packets with an Ethernet
> broadcast
>  or multicast eth.dst to the MC_FLOOD
>  multicast group.
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index e41c9d7..d0844dd 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
>
>  if (mcast_sw_info->flood_static) {
>  ds_put_cstr(, "outport =\""MC_STATIC"\";
> output;");
> -} else {
> +}
> +
> +/* Explicitly drop the traffic if relay or static flooding
> + * is not configured.
> + */
> +if (!mcast_sw_info->flood_relay &&
> +!mcast_sw_info->flood_static) {
>  ds_put_cstr(, "drop;");
>  }
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index df00517..d141367 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -16306,7 +16306,7 @@ sleep 1
>  OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
>  OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
>
> -# Dissable IGMP querier on sw2.
> +# Disable IGMP querier on sw2.
>  ovn-nbctl set Logical_Switch sw2 \
>  other_config:mcast_querier="false"
>
> @@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
>  $(ip_to_hex 239 0 1 68) 04 e9b9 \
>  /dev/null
> +
> +# Check that the IGMP Group is learned by all switches.
> +OVS_WAIT_UNTIL([
> +total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
> +test "${total_entries}" = "2"
> +])
> +
> +# Send traffic from sw3 and make sure it is relayed by rtr.
> +# to ports that joined.
> +truncate -s 0 expected_routed_sw1
> +truncate -s 0 expected_routed_sw2
> +truncate -s 0 expected_empty
> +
> +as hv1 reset_pcap_file hv1-vif1 hv1/vif1
> +as hv1 reset_pcap_file hv1-vif2 hv1/vif2
> +as hv1 reset_pcap_file hv1-vif3 hv1/vif3
> +as hv1 reset_pcap_file hv1-vif4 hv1/vif4
> +as hv2 reset_pcap_file hv2-vif1 hv2/vif1
> +as hv2 reset_pcap_file hv2-vif2 hv2/vif2
> +as hv2 reset_pcap_file hv2-vif3 hv2/vif3
> +as hv2 reset_pcap_file hv2-vif4 hv2/vif4
> +
> +send_ip_multicast_pkt hv2-vif4 hv2 \
> +0001 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
> +e518e518000a3b3a
> +store_ip_multicast_pkt \
> +0100 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
> +e518e518000a3b3a expected_routed_sw1
> +store_ip_multicast_pkt \
> +0200 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
> +e518e518000a3b3a expected_routed_sw2
> +
> +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
> +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
> +OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif4-tx.pcap], [expected_empty])
> +

Re: [ovs-dev] [PATCH ovn] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Dumitru Ceara
On Wed, Oct 16, 2019 at 2:43 PM Numan Siddique  wrote:
>
>
>
> On Wed, Oct 16, 2019 at 4:11 PM Dumitru Ceara  wrote:
>>
>> OVN logical flow "drop" actions can't be combined with other actions.
>> Commit 79308138891a created such a scenario if a logical switch has
>> mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
>> logical router with mcast_relay=enabled.
>>
>> To fix the issue we now explicitly add a drop flow for unregistered IP
>> multicast traffic in a logical switch if mcast_snoop=true,
>> mcast_flood_unregistered=false and the switch doesn't have any ports
>> with mcast_flood=true and isn't connected to a router with
>> mcast_relay=true.
>>
>> Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood 
>> configuration")
>> Signed-off-by: Dumitru Ceara 
>
>
> Hi Dumitru,
>
> Can you please update the ovn-northd documentation too about the flow changes 
> ?
>
> Thanks
> Numan

Hi Numan,

Thanks for the review.
I sent a v2: https://patchwork.ozlabs.org/patch/1177904/

Thanks,
Dumitru

>
>>
>> ---
>>  northd/ovn-northd.c |  8 +++-
>>  tests/ovn.at| 50 +++---
>>  2 files changed, 54 insertions(+), 4 deletions(-)
>>
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index e41c9d7..d0844dd 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct 
>> hmap *ports,
>>
>>  if (mcast_sw_info->flood_static) {
>>  ds_put_cstr(, "outport =\""MC_STATIC"\"; 
>> output;");
>> -} else {
>> +}
>> +
>> +/* Explicitly drop the traffic if relay or static flooding
>> + * is not configured.
>> + */
>> +if (!mcast_sw_info->flood_relay &&
>> +!mcast_sw_info->flood_static) {
>>  ds_put_cstr(, "drop;");
>>  }
>>
>> diff --git a/tests/ovn.at b/tests/ovn.at
>> index df00517..d141367 100644
>> --- a/tests/ovn.at
>> +++ b/tests/ovn.at
>> @@ -16306,7 +16306,7 @@ sleep 1
>>  OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
>>  OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
>>
>> -# Dissable IGMP querier on sw2.
>> +# Disable IGMP querier on sw2.
>>  ovn-nbctl set Logical_Switch sw2 \
>>  other_config:mcast_querier="false"
>>
>> @@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
>>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
>>  $(ip_to_hex 239 0 1 68) 04 e9b9 \
>>  /dev/null
>> +
>> +# Check that the IGMP Group is learned by all switches.
>> +OVS_WAIT_UNTIL([
>> +total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
>> +test "${total_entries}" = "2"
>> +])
>> +
>> +# Send traffic from sw3 and make sure it is relayed by rtr.
>> +# to ports that joined.
>> +truncate -s 0 expected_routed_sw1
>> +truncate -s 0 expected_routed_sw2
>> +truncate -s 0 expected_empty
>> +
>> +as hv1 reset_pcap_file hv1-vif1 hv1/vif1
>> +as hv1 reset_pcap_file hv1-vif2 hv1/vif2
>> +as hv1 reset_pcap_file hv1-vif3 hv1/vif3
>> +as hv1 reset_pcap_file hv1-vif4 hv1/vif4
>> +as hv2 reset_pcap_file hv2-vif1 hv2/vif1
>> +as hv2 reset_pcap_file hv2-vif2 hv2/vif2
>> +as hv2 reset_pcap_file hv2-vif3 hv2/vif3
>> +as hv2 reset_pcap_file hv2-vif4 hv2/vif4
>> +
>> +send_ip_multicast_pkt hv2-vif4 hv2 \
>> +0001 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
>> +e518e518000a3b3a
>> +store_ip_multicast_pkt \
>> +0100 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
>> +e518e518000a3b3a expected_routed_sw1
>> +store_ip_multicast_pkt \
>> +0200 01005e000144 \
>> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
>> +e518e518000a3b3a expected_routed_sw2
>> +
>> +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
>> +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
>> +OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty])
>> +OVN_CHECK_PACKETS([hv2/vif4-tx.pcap], [expected_empty])
>> +
>>  # Inject IGMP Join for 239.0.1.68 on sw3-p1.
>>  send_igmp_v3_report hv1-vif4 hv1 \
>>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
>> @@ -16369,8 +16413,8 @@ OVS_WAIT_UNTIL([
>>  test "${total_entries}" = "3"
>>  ])
>>
>> -# Send traffic from sw3 and make sure it is relayed by rtr.
>> -# and ports that joined.
>> +# Send traffic from sw3 and make sure it is relayed by rtr
>> +# to ports that joined.
>>  truncate -s 0 expected_routed_sw1
>>  truncate -s 0 expected_routed_sw2
>>  truncate -s 0 expected_switched
>> --
>> 1.8.3.1
>>
>> 

[ovs-dev] [PATCH ovn v2] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Dumitru Ceara
OVN logical flow "drop" actions can't be combined with other actions.
Commit 79308138891a created such a scenario if a logical switch has
mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
logical router with mcast_relay=enabled.

To fix the issue we now explicitly add a drop flow for unregistered IP
multicast traffic in a logical switch if mcast_snoop=true,
mcast_flood_unregistered=false and the switch doesn't have any ports
with mcast_flood=true and isn't connected to a router with
mcast_relay=true.

Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood configuration")
Signed-off-by: Dumitru Ceara 

---
v2: Address Numan's comments (update documentation).
---
 northd/ovn-northd.8.xml | 13 +
 northd/ovn-northd.c |  8 +++-
 tests/ovn.at| 50 ++---
 3 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 937702e..b5dfcd1 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -992,6 +992,19 @@ output;
   
 
   
+A priority-80 flow that drops all unregistered IP multicast traffic
+if 
+:mcast_snoop='true' and
+
+:mcast_flood_unregistered='false' and the switch is
+not connected to a logical router that has
+
+:mcast_relay='true' and the switch doesn't have any
+logical port with 
+:mcast_flood='true'.
+  
+
+  
 A priority-70 flow that outputs all packets with an Ethernet broadcast
 or multicast eth.dst to the MC_FLOOD
 multicast group.
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index e41c9d7..d0844dd 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 
 if (mcast_sw_info->flood_static) {
 ds_put_cstr(, "outport =\""MC_STATIC"\"; output;");
-} else {
+}
+
+/* Explicitly drop the traffic if relay or static flooding
+ * is not configured.
+ */
+if (!mcast_sw_info->flood_relay &&
+!mcast_sw_info->flood_static) {
 ds_put_cstr(, "drop;");
 }
 
diff --git a/tests/ovn.at b/tests/ovn.at
index df00517..d141367 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -16306,7 +16306,7 @@ sleep 1
 OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
 OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
 
-# Dissable IGMP querier on sw2.
+# Disable IGMP querier on sw2.
 ovn-nbctl set Logical_Switch sw2 \
 other_config:mcast_querier="false"
 
@@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
 0001 $(ip_to_hex 10 0 0 1) f9f8 \
 $(ip_to_hex 239 0 1 68) 04 e9b9 \
 /dev/null
+
+# Check that the IGMP Group is learned by all switches.
+OVS_WAIT_UNTIL([
+total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
+test "${total_entries}" = "2"
+])
+
+# Send traffic from sw3 and make sure it is relayed by rtr.
+# to ports that joined.
+truncate -s 0 expected_routed_sw1
+truncate -s 0 expected_routed_sw2
+truncate -s 0 expected_empty
+
+as hv1 reset_pcap_file hv1-vif1 hv1/vif1
+as hv1 reset_pcap_file hv1-vif2 hv1/vif2
+as hv1 reset_pcap_file hv1-vif3 hv1/vif3
+as hv1 reset_pcap_file hv1-vif4 hv1/vif4
+as hv2 reset_pcap_file hv2-vif1 hv2/vif1
+as hv2 reset_pcap_file hv2-vif2 hv2/vif2
+as hv2 reset_pcap_file hv2-vif3 hv2/vif3
+as hv2 reset_pcap_file hv2-vif4 hv2/vif4
+
+send_ip_multicast_pkt hv2-vif4 hv2 \
+0001 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
+e518e518000a3b3a
+store_ip_multicast_pkt \
+0100 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
+e518e518000a3b3a expected_routed_sw1
+store_ip_multicast_pkt \
+0200 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
+e518e518000a3b3a expected_routed_sw2
+
+OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
+OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
+OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif4-tx.pcap], [expected_empty])
+
 # Inject IGMP Join for 239.0.1.68 on sw3-p1.
 send_igmp_v3_report hv1-vif4 hv1 \
 0001 $(ip_to_hex 10 0 0 1) f9f8 \
@@ -16369,8 +16413,8 @@ OVS_WAIT_UNTIL([
 test "${total_entries}" = "3"
 ])
 
-# Send traffic from sw3 and make sure it is relayed by rtr.
-# and ports that joined.
+# Send traffic from sw3 and make sure it is relayed by rtr
+# to ports that 

Re: [ovs-dev] [dpdk-latest PATCH] netdev-dpdk: add custom vhost statistics to count IRQs

2019-10-16 Thread Ilya Maximets

On 15.10.2019 13:20, Eelco Chaudron wrote:

When the dpdk vhost library executes an eventfd_write() call,
i.e. waking up the guest, a new callback will be called.

This patch adds the callback to count the number of
interrupts sent to the VM to track the number of times
interrupts where generated.

This might be of interest to find out system-calls were
called in the DPDK fast path.

The custom statistics can be seen with the following command:

   ovs-ofctl -O OpenFlow14 dump-ports  {}

Signed-off-by: Eelco Chaudron 
---
  lib/netdev-dpdk.c |   27 ++-
  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ba92e89..7ebbcdc 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -165,6 +165,8 @@ static int new_device(int vid);
  static void destroy_device(int vid);
  static int vring_state_changed(int vid, uint16_t queue_id, int enable);
  static void destroy_connection(int vid);
+static void vhost_guest_notified(int vid);
+
  static const struct vhost_device_ops virtio_net_device_ops =
  {
  .new_device =  new_device,
@@ -173,6 +175,7 @@ static const struct vhost_device_ops virtio_net_device_ops =
  .features_changed = NULL,
  .new_connection = NULL,
  .destroy_connection = destroy_connection,
+.guest_notified = vhost_guest_notified,
  };
  
  enum { DPDK_RING_SIZE = 256 };

@@ -416,6 +419,8 @@ struct netdev_dpdk {
  struct netdev_stats stats;
  /* Custom stat for retries when unable to transmit. */
  uint64_t tx_retries;
+/* Custom stat counting number of times a vhost remote was woken up */
+uint64_t vhost_irqs;
  /* Protects stats */
  rte_spinlock_t stats_lock;
  /* 4 pad bytes here. */
@@ -2826,7 +2831,8 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev 
*netdev,
  int i;
  
  #define VHOST_CSTATS \

-VHOST_CSTAT(tx_retries)
+VHOST_CSTAT(tx_retries) \
+VHOST_CSTAT(vhost_irqs)
  
  #define VHOST_CSTAT(NAME) + 1

  custom_stats->size = VHOST_CSTATS;
@@ -3746,6 +3752,25 @@ destroy_connection(int vid)
  }
  }
  
+static

+void vhost_guest_notified(int vid)
+{
+struct netdev_dpdk *dev;
+
+ovs_mutex_lock(_mutex);
+LIST_FOR_EACH (dev, list_node, _list) {
+if (netdev_dpdk_get_vid(dev) == vid) {
+ovs_mutex_lock(>mutex);
+rte_spinlock_lock(>stats_lock);
+dev->vhost_irqs++;
+rte_spinlock_unlock(>stats_lock);
+ovs_mutex_unlock(>mutex);
+break;
+}
+}
+ovs_mutex_unlock(_mutex);


So, for every single eventfd_write() we're taking the global mutex,
traversing list of all the devices, taking one more mutex and finally
taking the spinlock just to increment a single counter.
I think, it's too much.

Wouldn't it significantly affect interrupt based virtio drivers in
terms of performance?  How frequently 2 vhost-user ports will lock
each other on the global device mutex?  How frequently 2 PMD threads
will lock each other on rx/tx to different vrings of the same vhost
port?

I see that 'guest_notified' patch is already in dpdk master, but
wouldn't it be better to accumulate this statistics inside DPDK
and return it with some new API like rte_vhost_get_vring_xstats()
if required?  I don't see any usecase for this callback beside
counting some stats.

For the current patch I could only suggest to convert it to simple
coverage counter like it done in 'vhost_tx_contention' patch here:
https://patchwork.ozlabs.org/patch/1175849/

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 10/10] net: openvswitch: simplify the ovs_dp_cmd_new

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

use the specified functions to init resource.

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/datapath.c | 60 +-
 1 file changed, 38 insertions(+), 22 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index aeb76e4..4d48e48 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1576,6 +1576,31 @@ static int ovs_dp_change(struct datapath *dp, struct 
nlattr *a[])
return 0;
 }
 
+static int ovs_dp_stats_init(struct datapath *dp)
+{
+   dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
+   if (!dp->stats_percpu)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static int ovs_dp_vport_init(struct datapath *dp)
+{
+   int i;
+
+   dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
+ sizeof(struct hlist_head),
+ GFP_KERNEL);
+   if (!dp->ports)
+   return -ENOMEM;
+
+   for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
+   INIT_HLIST_HEAD(>ports[i]);
+
+   return 0;
+}
+
 static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
struct nlattr **a = info->attrs;
@@ -1584,7 +1609,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
struct datapath *dp;
struct vport *vport;
struct ovs_net *ovs_net;
-   int err, i;
+   int err;
 
err = -EINVAL;
if (!a[OVS_DP_ATTR_NAME] || !a[OVS_DP_ATTR_UPCALL_PID])
@@ -1597,35 +1622,26 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
err = -ENOMEM;
dp = kzalloc(sizeof(*dp), GFP_KERNEL);
if (dp == NULL)
-   goto err_free_reply;
+   goto err_destroy_reply;
 
ovs_dp_set_net(dp, sock_net(skb->sk));
 
/* Allocate table. */
err = ovs_flow_tbl_init(>table);
if (err)
-   goto err_free_dp;
+   goto err_destroy_dp;
 
-   dp->stats_percpu = netdev_alloc_pcpu_stats(struct dp_stats_percpu);
-   if (!dp->stats_percpu) {
-   err = -ENOMEM;
+   err = ovs_dp_stats_init(dp);
+   if (err)
goto err_destroy_table;
-   }
 
-   dp->ports = kmalloc_array(DP_VPORT_HASH_BUCKETS,
- sizeof(struct hlist_head),
- GFP_KERNEL);
-   if (!dp->ports) {
-   err = -ENOMEM;
-   goto err_destroy_percpu;
-   }
-
-   for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
-   INIT_HLIST_HEAD(>ports[i]);
+   err = ovs_dp_vport_init(dp);
+   if (err)
+   goto err_destroy_stats;
 
err = ovs_meters_init(dp);
if (err)
-   goto err_destroy_ports_array;
+   goto err_destroy_ports;
 
/* Set up our datapath device. */
parms.name = nla_data(a[OVS_DP_ATTR_NAME]);
@@ -1675,15 +1691,15 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
 
 err_destroy_meters:
ovs_meters_exit(dp);
-err_destroy_ports_array:
+err_destroy_ports:
kfree(dp->ports);
-err_destroy_percpu:
+err_destroy_stats:
free_percpu(dp->stats_percpu);
 err_destroy_table:
ovs_flow_tbl_destroy(>table);
-err_free_dp:
+err_destroy_dp:
kfree(dp);
-err_free_reply:
+err_destroy_reply:
kfree_skb(reply);
 err:
return err;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 09/10] net: openvswitch: don't unlock mutex when changing the user_features fails

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

Unlocking of a not locked mutex is not allowed.
Other kernel thread may be in critical section while
we unlock it because of setting user_feature fail.

Fixes: 95a7233c4 ("net: openvswitch: Set OvS recirc_id from tc chain index")
Cc: Paul Blakey 
Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/datapath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 9fea7e1..aeb76e4 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1657,6 +1657,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
ovs_dp_reset_user_features(skb, info);
}
 
+   ovs_unlock();
goto err_destroy_meters;
}
 
@@ -1673,7 +1674,6 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
return 0;
 
 err_destroy_meters:
-   ovs_unlock();
ovs_meters_exit(dp);
 err_destroy_ports_array:
kfree(dp->ports);
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 08/10] net: openvswitch: fix possible memleak on destroy flow-table

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

When we destroy the flow tables which may contain the flow_mask,
so release the flow mask struct.

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 5df5182..d5d768e 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -295,6 +295,18 @@ static void table_instance_destroy(struct table_instance 
*ti,
}
 }
 
+static void tbl_mask_array_destroy(struct flow_table *tbl)
+{
+   struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+   int i;
+
+   /* Free the flow-mask and kfree_rcu the NULL is allowed. */
+   for (i = 0; i < ma->max; i++)
+   kfree_rcu(rcu_dereference_raw(ma->masks[i]), rcu);
+
+   kfree_rcu(rcu_dereference_raw(tbl->mask_array), rcu);
+}
+
 /* No need for locking this function is called from RCU callback or
  * error path.
  */
@@ -304,7 +316,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
free_percpu(table->mask_cache);
-   kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
+   tbl_mask_array_destroy(table);
table_instance_destroy(ti, ufid_ti, false);
 }
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 07/10] net: openvswitch: add likely in flow_lookup

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

The most case *index < ma->max, and flow-mask is not NULL.
We add un/likely for performance.

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 3e3d345..5df5182 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -518,7 +518,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
struct sw_flow_mask *mask;
int i;
 
-   if (*index < ma->max) {
+   if (likely(*index < ma->max)) {
mask = rcu_dereference_ovsl(ma->masks[*index]);
if (mask) {
flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
@@ -533,7 +533,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
continue;
 
mask = rcu_dereference_ovsl(ma->masks[i]);
-   if (!mask)
+   if (unlikely(!mask))
break;
 
flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 06/10] net: openvswitch: simplify the flow_hash

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

Simplify the code and remove the unnecessary BUILD_BUG_ON.

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index a10d421..3e3d345 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -432,13 +432,9 @@ int ovs_flow_tbl_flush(struct flow_table *flow_table)
 static u32 flow_hash(const struct sw_flow_key *key,
 const struct sw_flow_key_range *range)
 {
-   int key_start = range->start;
-   int key_end = range->end;
-   const u32 *hash_key = (const u32 *)((const u8 *)key + key_start);
-   int hash_u32s = (key_end - key_start) >> 2;
-
+   const u32 *hash_key = (const u32 *)((const u8 *)key + range->start);
/* Make sure number of hash bytes are multiple of u32. */
-   BUILD_BUG_ON(sizeof(long) % sizeof(u32));
+   int hash_u32s = range_n_bytes(range) >> 2;
 
return jhash2(hash_key, hash_u32s, 0);
 }
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 05/10] net: openvswitch: optimize flow-mask looking up

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

The full looking up on flow table traverses all mask array.
If mask-array is too large, the number of invalid flow-mask
increase, performance will be drop.

One bad case, for example: M means flow-mask is valid and NULL
of flow-mask means deleted.

+---+
| M | NULL | ...  | NULL | M|
+---+

In that case, without this patch, openvswitch will traverses all
mask array, because there will be one flow-mask in the tail. This
patch changes the way of flow-mask inserting and deleting, and the
mask array will be keep as below: there is not a NULL hole. In the
fast path, we can "break" "for" (not "continue") in flow_lookup
when we get a NULL flow-mask.

 "break"
v
+---+
| M | M |  NULL |...   | NULL | NULL|
+---+

This patch don't optimize slow or control path, still using ma->max
to traverse. Slow path:
* tbl_mask_array_realloc
* ovs_flow_tbl_lookup_exact
* flow_mask_find

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 101 ++-
 1 file changed, 52 insertions(+), 49 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 8d4f50d..a10d421 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -538,7 +538,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 
mask = rcu_dereference_ovsl(ma->masks[i]);
if (!mask)
-   continue;
+   break;
 
flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
if (flow) { /* Found */
@@ -695,7 +695,7 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table 
*tbl,
 int ovs_flow_tbl_num_masks(const struct flow_table *table)
 {
struct mask_array *ma = rcu_dereference_ovsl(table->mask_array);
-   return ma->count;
+   return READ_ONCE(ma->count);
 }
 
 static struct table_instance *table_instance_expand(struct table_instance *ti,
@@ -704,21 +704,33 @@ static struct table_instance 
*table_instance_expand(struct table_instance *ti,
return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
-static void tbl_mask_array_delete_mask(struct mask_array *ma,
-  struct sw_flow_mask *mask)
+static void tbl_mask_array_del_mask(struct flow_table *tbl,
+   struct sw_flow_mask *mask)
 {
-   int i;
+   struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+   int i, ma_count = READ_ONCE(ma->count);
 
/* Remove the deleted mask pointers from the array */
-   for (i = 0; i < ma->max; i++) {
-   if (mask == ovsl_dereference(ma->masks[i])) {
-   RCU_INIT_POINTER(ma->masks[i], NULL);
-   ma->count--;
-   kfree_rcu(mask, rcu);
-   return;
-   }
+   for (i = 0; i < ma_count; i++) {
+   if (mask == ovsl_dereference(ma->masks[i]))
+   goto found;
}
+
BUG();
+   return;
+
+found:
+   WRITE_ONCE(ma->count, ma_count -1);
+
+   rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
+   RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
+
+   kfree_rcu(mask, rcu);
+
+   /* Shrink the mask array if necessary. */
+   if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+   ma_count <= (ma->max / 3))
+   tbl_mask_array_realloc(tbl, ma->max / 2);
 }
 
 /* Remove 'mask' from the mask list, if it is not needed any more. */
@@ -732,17 +744,8 @@ static void flow_mask_remove(struct flow_table *tbl, 
struct sw_flow_mask *mask)
BUG_ON(!mask->ref_count);
mask->ref_count--;
 
-   if (!mask->ref_count) {
-   struct mask_array *ma;
-
-   ma = ovsl_dereference(tbl->mask_array);
-   tbl_mask_array_delete_mask(ma, mask);
-
-   /* Shrink the mask array if necessary. */
-   if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
-   ma->count <= (ma->max / 3))
-   tbl_mask_array_realloc(tbl, ma->max / 2);
-   }
+   if (!mask->ref_count)
+   tbl_mask_array_del_mask(tbl, mask);
}
 }
 
@@ -806,6 +809,29 @@ static struct sw_flow_mask *flow_mask_find(const struct 
flow_table *tbl,
return NULL;
 }
 
+static int tbl_mask_array_add_mask(struct flow_table *tbl,
+  struct sw_flow_mask *new)
+{
+   struct mask_array *ma = ovsl_dereference(tbl->mask_array);
+   int err, ma_count = READ_ONCE(ma->count);
+
+   if (ma_count >= ma->max) {
+   err = 

[ovs-dev] [PATCH net-next v4 04/10] net: openvswitch: optimize flow mask cache hash collision

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

Port the codes to linux upstream and with little changes.

Pravin B Shelar, says:
| In case hash collision on mask cache, OVS does extra flow
| lookup. Following patch avoid it.

Link: 
https://github.com/openvswitch/ovs/commit/0e6efbe2712da03522532dc5e84806a96f6a0dd1
Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 95 
 1 file changed, 53 insertions(+), 42 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 237cf85..8d4f50d 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -508,6 +508,9 @@ static struct sw_flow *masked_flow_lookup(struct 
table_instance *ti,
return NULL;
 }
 
+/* Flow lookup does full lookup on flow table. It starts with
+ * mask from index passed in *index.
+ */
 static struct sw_flow *flow_lookup(struct flow_table *tbl,
   struct table_instance *ti,
   struct mask_array *ma,
@@ -516,18 +519,31 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
   u32 *index)
 {
struct sw_flow *flow;
+   struct sw_flow_mask *mask;
int i;
 
-   for (i = 0; i < ma->max; i++)  {
-   struct sw_flow_mask *mask;
-
-   mask = rcu_dereference_ovsl(ma->masks[i]);
+   if (*index < ma->max) {
+   mask = rcu_dereference_ovsl(ma->masks[*index]);
if (mask) {
flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-   if (flow) { /* Found */
-   *index = i;
+   if (flow)
return flow;
-   }
+   }
+   }
+
+   for (i = 0; i < ma->max; i++)  {
+
+   if (i == *index)
+   continue;
+
+   mask = rcu_dereference_ovsl(ma->masks[i]);
+   if (!mask)
+   continue;
+
+   flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
+   if (flow) { /* Found */
+   *index = i;
+   return flow;
}
}
 
@@ -546,58 +562,54 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct 
flow_table *tbl,
  u32 skb_hash,
  u32 *n_mask_hit)
 {
-   struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
-   struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
-   struct mask_cache_entry  *entries, *ce, *del;
+   struct mask_array *ma = rcu_dereference(tbl->mask_array);
+   struct table_instance *ti = rcu_dereference(tbl->ti);
+   struct mask_cache_entry *entries, *ce;
struct sw_flow *flow;
-   u32 hash = skb_hash;
+   u32 hash;
int seg;
 
*n_mask_hit = 0;
if (unlikely(!skb_hash)) {
-   u32 __always_unused mask_index;
+   u32 mask_index = 0;
 
return flow_lookup(tbl, ti, ma, key, n_mask_hit, _index);
}
 
-   del = NULL;
+   /* Pre and post recirulation flows usually have the same skb_hash
+* value. To avoid hash collisions, rehash the 'skb_hash' with
+* 'recirc_id'.  */
+   if (key->recirc_id)
+   skb_hash = jhash_1word(skb_hash, key->recirc_id);
+
+   ce = NULL;
+   hash = skb_hash;
entries = this_cpu_ptr(tbl->mask_cache);
 
+   /* Find the cache entry 'ce' to operate on. */
for (seg = 0; seg < MC_HASH_SEGS; seg++) {
-   int index;
-
-   index = hash & (MC_HASH_ENTRIES - 1);
-   ce = [index];
-
-   if (ce->skb_hash == skb_hash) {
-   struct sw_flow_mask *mask;
-   struct sw_flow *flow;
-
-   mask = rcu_dereference_ovsl(ma->masks[ce->mask_index]);
-   if (mask) {
-   flow = masked_flow_lookup(ti, key, mask,
- n_mask_hit);
-   if (flow)  /* Found */
-   return flow;
-   }
-
-   del = ce;
-   break;
+   int index = hash & (MC_HASH_ENTRIES - 1);
+   struct mask_cache_entry *e;
+
+   e = [index];
+   if (e->skb_hash == skb_hash) {
+   flow = flow_lookup(tbl, ti, ma, key, n_mask_hit,
+  >mask_index);
+   if (!flow)
+   e->skb_hash = 0;
+   return flow;
}
 
-   if (!del || (del->skb_hash && !ce->skb_hash) ||
-   (rcu_dereference_ovsl(ma->masks[del->mask_index]) &&
-   

[ovs-dev] [PATCH net-next v4 03/10] net: openvswitch: shrink the mask array if necessary

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

When creating and inserting flow-mask, if there is no available
flow-mask, we realloc the mask array. When removing flow-mask,
if necessary, we shrink mask array.

Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow_table.c | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 0d1df53..237cf85 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -693,6 +693,23 @@ static struct table_instance *table_instance_expand(struct 
table_instance *ti,
return table_instance_rehash(ti, ti->n_buckets * 2, ufid);
 }
 
+static void tbl_mask_array_delete_mask(struct mask_array *ma,
+  struct sw_flow_mask *mask)
+{
+   int i;
+
+   /* Remove the deleted mask pointers from the array */
+   for (i = 0; i < ma->max; i++) {
+   if (mask == ovsl_dereference(ma->masks[i])) {
+   RCU_INIT_POINTER(ma->masks[i], NULL);
+   ma->count--;
+   kfree_rcu(mask, rcu);
+   return;
+   }
+   }
+   BUG();
+}
+
 /* Remove 'mask' from the mask list, if it is not needed any more. */
 static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 {
@@ -706,18 +723,14 @@ static void flow_mask_remove(struct flow_table *tbl, 
struct sw_flow_mask *mask)
 
if (!mask->ref_count) {
struct mask_array *ma;
-   int i;
 
ma = ovsl_dereference(tbl->mask_array);
-   for (i = 0; i < ma->max; i++) {
-   if (mask == ovsl_dereference(ma->masks[i])) {
-   RCU_INIT_POINTER(ma->masks[i], NULL);
-   ma->count--;
-   kfree_rcu(mask, rcu);
-   return;
-   }
-   }
-   BUG();
+   tbl_mask_array_delete_mask(ma, mask);
+
+   /* Shrink the mask array if necessary. */
+   if (ma->max >= (MASK_ARRAY_SIZE_MIN * 2) &&
+   ma->count <= (ma->max / 3))
+   tbl_mask_array_realloc(tbl, ma->max / 2);
}
}
 }
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v4 02/10] net: openvswitch: convert mask list in mask array

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

Port the codes to linux upstream and with little changes.

Pravin B Shelar, says:
| mask caches index of mask in mask_list. On packet recv OVS
| need to traverse mask-list to get cached mask. Therefore array
| is better for retrieving cached mask. This also allows better
| cache replacement algorithm by directly checking mask's existence.

Link: 
https://github.com/openvswitch/ovs/commit/d49fc3ff53c65e4eca9cabd52ac63396746a7ef5
Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/flow.h   |   1 -
 net/openvswitch/flow_table.c | 210 ---
 net/openvswitch/flow_table.h |   8 +-
 3 files changed, 167 insertions(+), 52 deletions(-)

diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index b830d5f..8080518 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -166,7 +166,6 @@ struct sw_flow_key_range {
 struct sw_flow_mask {
int ref_count;
struct rcu_head rcu;
-   struct list_head list;
struct sw_flow_key_range range;
struct sw_flow_key key;
 };
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 3d515c0..0d1df53 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -34,6 +34,7 @@
 #include 
 
 #define TBL_MIN_BUCKETS1024
+#define MASK_ARRAY_SIZE_MIN16
 #define REHASH_INTERVAL(10 * 60 * HZ)
 
 #define MC_HASH_SHIFT  8
@@ -168,9 +169,51 @@ static struct table_instance *table_instance_alloc(int 
new_size)
return ti;
 }
 
+static struct mask_array *tbl_mask_array_alloc(int size)
+{
+   struct mask_array *new;
+
+   size = max(MASK_ARRAY_SIZE_MIN, size);
+   new = kzalloc(sizeof(struct mask_array) +
+ sizeof(struct sw_flow_mask *) * size, GFP_KERNEL);
+   if (!new)
+   return NULL;
+
+   new->count = 0;
+   new->max = size;
+
+   return new;
+}
+
+static int tbl_mask_array_realloc(struct flow_table *tbl, int size)
+{
+   struct mask_array *old;
+   struct mask_array *new;
+
+   new = tbl_mask_array_alloc(size);
+   if (!new)
+   return -ENOMEM;
+
+   old = ovsl_dereference(tbl->mask_array);
+   if (old) {
+   int i;
+
+   for (i = 0; i < old->max; i++) {
+   if (ovsl_dereference(old->masks[i]))
+   new->masks[new->count++] = old->masks[i];
+   }
+   }
+
+   rcu_assign_pointer(tbl->mask_array, new);
+   kfree_rcu(old, rcu);
+
+   return 0;
+}
+
 int ovs_flow_tbl_init(struct flow_table *table)
 {
struct table_instance *ti, *ufid_ti;
+   struct mask_array *ma;
 
table->mask_cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
   MC_HASH_ENTRIES,
@@ -178,9 +221,13 @@ int ovs_flow_tbl_init(struct flow_table *table)
if (!table->mask_cache)
return -ENOMEM;
 
+   ma = tbl_mask_array_alloc(MASK_ARRAY_SIZE_MIN);
+   if (!ma)
+   goto free_mask_cache;
+
ti = table_instance_alloc(TBL_MIN_BUCKETS);
if (!ti)
-   goto free_mask_cache;
+   goto free_mask_array;
 
ufid_ti = table_instance_alloc(TBL_MIN_BUCKETS);
if (!ufid_ti)
@@ -188,7 +235,7 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
rcu_assign_pointer(table->ti, ti);
rcu_assign_pointer(table->ufid_ti, ufid_ti);
-   INIT_LIST_HEAD(>mask_list);
+   rcu_assign_pointer(table->mask_array, ma);
table->last_rehash = jiffies;
table->count = 0;
table->ufid_count = 0;
@@ -196,6 +243,8 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
 free_ti:
__table_instance_destroy(ti);
+free_mask_array:
+   kfree(ma);
 free_mask_cache:
free_percpu(table->mask_cache);
return -ENOMEM;
@@ -255,6 +304,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
free_percpu(table->mask_cache);
+   kfree_rcu(rcu_dereference_raw(table->mask_array), rcu);
table_instance_destroy(ti, ufid_ti, false);
 }
 
@@ -460,17 +510,27 @@ static struct sw_flow *masked_flow_lookup(struct 
table_instance *ti,
 
 static struct sw_flow *flow_lookup(struct flow_table *tbl,
   struct table_instance *ti,
+  struct mask_array *ma,
   const struct sw_flow_key *key,
-  u32 *n_mask_hit)
+  u32 *n_mask_hit,
+  u32 *index)
 {
-   struct sw_flow_mask *mask;
struct sw_flow *flow;
+   int i;
 
-   list_for_each_entry_rcu(mask, >mask_list, list) {
-   flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
-   if (flow)  /* Found 

[ovs-dev] [PATCH net-next v4 01/10] net: openvswitch: add flow-mask cache for performance

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

The idea of this optimization comes from a patch which
is committed in 2014, openvswitch community. The author
is Pravin B Shelar. In order to get high performance, I
implement it again. Later patches will use it.

Pravin B Shelar, says:
| On every packet OVS needs to lookup flow-table with every
| mask until it finds a match. The packet flow-key is first
| masked with mask in the list and then the masked key is
| looked up in flow-table. Therefore number of masks can
| affect packet processing performance.

Link: 
https://github.com/openvswitch/ovs/commit/5604935e4e1cbc16611d2d97f50b717aa31e8ec5
Signed-off-by: Tonghao Zhang 
Tested-by: Greg Rose 
---
 net/openvswitch/datapath.c   |   3 +-
 net/openvswitch/flow_table.c | 109 +--
 net/openvswitch/flow_table.h |  11 -
 3 files changed, 107 insertions(+), 16 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index f30e406..9fea7e1 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -227,7 +227,8 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct 
sw_flow_key *key)
stats = this_cpu_ptr(dp->stats_percpu);
 
/* Look up flow. */
-   flow = ovs_flow_tbl_lookup_stats(>table, key, _mask_hit);
+   flow = ovs_flow_tbl_lookup_stats(>table, key, skb_get_hash(skb),
+_mask_hit);
if (unlikely(!flow)) {
struct dp_upcall_info upcall;
 
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index cf3582c..3d515c0 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -36,6 +36,10 @@
 #define TBL_MIN_BUCKETS1024
 #define REHASH_INTERVAL(10 * 60 * HZ)
 
+#define MC_HASH_SHIFT  8
+#define MC_HASH_ENTRIES(1u << MC_HASH_SHIFT)
+#define MC_HASH_SEGS   ((sizeof(uint32_t) * 8) / MC_HASH_SHIFT)
+
 static struct kmem_cache *flow_cache;
 struct kmem_cache *flow_stats_cache __read_mostly;
 
@@ -168,10 +172,15 @@ int ovs_flow_tbl_init(struct flow_table *table)
 {
struct table_instance *ti, *ufid_ti;
 
-   ti = table_instance_alloc(TBL_MIN_BUCKETS);
+   table->mask_cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
+  MC_HASH_ENTRIES,
+  __alignof__(struct 
mask_cache_entry));
+   if (!table->mask_cache)
+   return -ENOMEM;
 
+   ti = table_instance_alloc(TBL_MIN_BUCKETS);
if (!ti)
-   return -ENOMEM;
+   goto free_mask_cache;
 
ufid_ti = table_instance_alloc(TBL_MIN_BUCKETS);
if (!ufid_ti)
@@ -187,6 +196,8 @@ int ovs_flow_tbl_init(struct flow_table *table)
 
 free_ti:
__table_instance_destroy(ti);
+free_mask_cache:
+   free_percpu(table->mask_cache);
return -ENOMEM;
 }
 
@@ -243,6 +254,7 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
struct table_instance *ti = rcu_dereference_raw(table->ti);
struct table_instance *ufid_ti = rcu_dereference_raw(table->ufid_ti);
 
+   free_percpu(table->mask_cache);
table_instance_destroy(ti, ufid_ti, false);
 }
 
@@ -425,7 +437,8 @@ static bool ovs_flow_cmp_unmasked_key(const struct sw_flow 
*flow,
 
 static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
  const struct sw_flow_key *unmasked,
- const struct sw_flow_mask *mask)
+ const struct sw_flow_mask *mask,
+ u32 *n_mask_hit)
 {
struct sw_flow *flow;
struct hlist_head *head;
@@ -435,6 +448,8 @@ static struct sw_flow *masked_flow_lookup(struct 
table_instance *ti,
ovs_flow_mask_key(_key, unmasked, false, mask);
hash = flow_hash(_key, >range);
head = find_bucket(ti, hash);
+   (*n_mask_hit)++;
+
hlist_for_each_entry_rcu(flow, head, flow_table.node[ti->node_ver]) {
if (flow->mask == mask && flow->flow_table.hash == hash &&
flow_cmp_masked_key(flow, _key, >range))
@@ -443,30 +458,97 @@ static struct sw_flow *masked_flow_lookup(struct 
table_instance *ti,
return NULL;
 }
 
-struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
-   const struct sw_flow_key *key,
-   u32 *n_mask_hit)
+static struct sw_flow *flow_lookup(struct flow_table *tbl,
+  struct table_instance *ti,
+  const struct sw_flow_key *key,
+  u32 *n_mask_hit)
 {
-   struct table_instance *ti = rcu_dereference_ovsl(tbl->ti);
struct sw_flow_mask *mask;
struct sw_flow *flow;
 
-   *n_mask_hit = 0;
list_for_each_entry_rcu(mask, >mask_list, 

[ovs-dev] [PATCH net-next v4 00/10] optimize openvswitch flow looking up

2019-10-16 Thread xiangxia . m . yue
From: Tonghao Zhang 

This series patch optimize openvswitch for performance or simplify
codes.

Patch 1, 2, 4: Port Pravin B Shelar patches to
linux upstream with little changes.

Patch 5, 6, 7: Optimize the flow looking up and
simplify the flow hash.

Patch 8, 9: are bugfix.

The performance test is on Intel Xeon E5-2630 v4.
The test topology is show as below:

+---+
|   +---+   |
|   | eth0   ovs-switcheth1 |   | Host0
|   +---+   |
+---+
  ^   |
  |   |
  |   |
  |   |
  |   v
+-++ ++-+
| netperf  | Host1   | netserver| Host2
+--+ +--+

We use netperf send the 64B packets, and insert 255+ flow-mask:
$ ovs-dpctl add-flow ovs-switch 
"in_port(1),eth(dst=00:01:00:00:00:00/ff:ff:ff:ff:ff:01),eth_type(0x0800),ipv4(frag=no)"
 2
...
$ ovs-dpctl add-flow ovs-switch 
"in_port(1),eth(dst=00:ff:00:00:00:00/ff:ff:ff:ff:ff:ff),eth_type(0x0800),ipv4(frag=no)"
 2
$
$ netperf -t UDP_STREAM -H 2.2.2.200 -l 40 -- -m 18

* Without series patch, throughput 8.28Mbps
* With series patch, throughput 46.05Mbps

v3->v4:
access ma->count with READ_ONCE/WRITE_ONCE API. More information,
see patch 5 comments. 

v2->v3:
update ma point when realloc mask_array in patch 5.

v1->v2:
use kfree_rcu instead of call_rcu

Tonghao Zhang (10):
  net: openvswitch: add flow-mask cache for performance
  net: openvswitch: convert mask list in mask array
  net: openvswitch: shrink the mask array if necessary
  net: openvswitch: optimize flow mask cache hash collision
  net: openvswitch: optimize flow-mask looking up
  net: openvswitch: simplify the flow_hash
  net: openvswitch: add likely in flow_lookup
  net: openvswitch: fix possible memleak on destroy flow-table
  net: openvswitch: don't unlock mutex when changing the user_features
fails
  net: openvswitch: simplify the ovs_dp_cmd_new

 net/openvswitch/datapath.c   |  65 +
 net/openvswitch/flow.h   |   1 -
 net/openvswitch/flow_table.c | 316 +--
 net/openvswitch/flow_table.h |  19 ++-
 4 files changed, 329 insertions(+), 72 deletions(-)

-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Numan Siddique
On Wed, Oct 16, 2019 at 4:11 PM Dumitru Ceara  wrote:

> OVN logical flow "drop" actions can't be combined with other actions.
> Commit 79308138891a created such a scenario if a logical switch has
> mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
> logical router with mcast_relay=enabled.
>
> To fix the issue we now explicitly add a drop flow for unregistered IP
> multicast traffic in a logical switch if mcast_snoop=true,
> mcast_flood_unregistered=false and the switch doesn't have any ports
> with mcast_flood=true and isn't connected to a router with
> mcast_relay=true.
>
> Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood
> configuration")
> Signed-off-by: Dumitru Ceara 
>

Hi Dumitru,

Can you please update the ovn-northd documentation too about the flow
changes ?

Thanks
Numan


> ---
>  northd/ovn-northd.c |  8 +++-
>  tests/ovn.at| 50
> +++---
>  2 files changed, 54 insertions(+), 4 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index e41c9d7..d0844dd 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
>
>  if (mcast_sw_info->flood_static) {
>  ds_put_cstr(, "outport =\""MC_STATIC"\";
> output;");
> -} else {
> +}
> +
> +/* Explicitly drop the traffic if relay or static flooding
> + * is not configured.
> + */
> +if (!mcast_sw_info->flood_relay &&
> +!mcast_sw_info->flood_static) {
>  ds_put_cstr(, "drop;");
>  }
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index df00517..d141367 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -16306,7 +16306,7 @@ sleep 1
>  OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
>  OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
>
> -# Dissable IGMP querier on sw2.
> +# Disable IGMP querier on sw2.
>  ovn-nbctl set Logical_Switch sw2 \
>  other_config:mcast_querier="false"
>
> @@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
>  $(ip_to_hex 239 0 1 68) 04 e9b9 \
>  /dev/null
> +
> +# Check that the IGMP Group is learned by all switches.
> +OVS_WAIT_UNTIL([
> +total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
> +test "${total_entries}" = "2"
> +])
> +
> +# Send traffic from sw3 and make sure it is relayed by rtr.
> +# to ports that joined.
> +truncate -s 0 expected_routed_sw1
> +truncate -s 0 expected_routed_sw2
> +truncate -s 0 expected_empty
> +
> +as hv1 reset_pcap_file hv1-vif1 hv1/vif1
> +as hv1 reset_pcap_file hv1-vif2 hv1/vif2
> +as hv1 reset_pcap_file hv1-vif3 hv1/vif3
> +as hv1 reset_pcap_file hv1-vif4 hv1/vif4
> +as hv2 reset_pcap_file hv2-vif1 hv2/vif1
> +as hv2 reset_pcap_file hv2-vif2 hv2/vif2
> +as hv2 reset_pcap_file hv2-vif3 hv2/vif3
> +as hv2 reset_pcap_file hv2-vif4 hv2/vif4
> +
> +send_ip_multicast_pkt hv2-vif4 hv2 \
> +0001 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
> +e518e518000a3b3a
> +store_ip_multicast_pkt \
> +0100 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
> +e518e518000a3b3a expected_routed_sw1
> +store_ip_multicast_pkt \
> +0200 01005e000144 \
> +$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
> +e518e518000a3b3a expected_routed_sw2
> +
> +OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
> +OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
> +OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty])
> +OVN_CHECK_PACKETS([hv2/vif4-tx.pcap], [expected_empty])
> +
>  # Inject IGMP Join for 239.0.1.68 on sw3-p1.
>  send_igmp_v3_report hv1-vif4 hv1 \
>  0001 $(ip_to_hex 10 0 0 1) f9f8 \
> @@ -16369,8 +16413,8 @@ OVS_WAIT_UNTIL([
>  test "${total_entries}" = "3"
>  ])
>
> -# Send traffic from sw3 and make sure it is relayed by rtr.
> -# and ports that joined.
> +# Send traffic from sw3 and make sure it is relayed by rtr
> +# to ports that joined.
>  truncate -s 0 expected_routed_sw1
>  truncate -s 0 expected_routed_sw2
>  truncate -s 0 expected_switched
> --
> 1.8.3.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Get rid of timeout options for control utilities.

2019-10-16 Thread Ilya Maximets

On 15.10.2019 18:10, Ilya Maximets wrote:

'OVS_CTL_TIMEOUT' environment variable is exported in tests/atlocal.in
and controls timeouts for all OVS utilities in testsuite.

There should be no manual tweaks for each single command.

This helps with running tests under valgrind where commands could
take really long time as you only need to change 'OVS_CTL_TIMEOUT'
in a single place.

Few manual timeouts were left in places where they make sense.

Signed-off-by: Ilya Maximets 
---


Thanks, Ben and Yifeng! Applied to master.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

2019-10-16 Thread Ilya Maximets

Hi Kevin,

Thanks for review. Some comments inline.

On 16.10.2019 12:15, Kevin Traynor wrote:

Hi Sriram,

Thanks for working on making more fine grained stats for debugging. As
mentioned yesterday, this patch needs rebase so I just reviewed docs and
netdev-dpdk.c which applied. Comments below.

On 21/09/2019 03:40, Sriram Vatala wrote:

OVS may be unable to transmit packets for multiple reasons and
today there is a single counter to track packets dropped due to
any of those reasons. The most common reason is that a VM is
unable to read packets fast enough causing the vhostuser port
transmit queue on the OVS side to become full. This manifests
as a problem with VNFs not receiving all packets. Having a
separate drop counter to track packets dropped because the
transmit queue is full will clearly indicate that the problem
is on the VM side and not in OVS. Similarly maintaining separate


It reads like a bit of a contradiction. On one hand the "The *most
common* reason is that a VM is unable to read packets fast enough".
While having a stat "*will clearly indicate* that the problem is on the
VM side".


counters for all possible drops helps in indicating sensible
cause for packet drops.

This patch adds custom software stats counters to track packets
dropped at port level and these counters are displayed along with
other stats in "ovs-vsctl get interface  statistics"
command. The detailed stats will be available for both dpdk and
vhostuser ports.



I think the commit msg could be a bit clearer, suggest something like
below - feel free to modify (or reject):

OVS may be unable to transmit packets for multiple reasons on the DPDK
datapath and today there is a single counter to track packets dropped
due to any of those reasons.

This patch adds custom software stats for the different reasons packets
may be dropped during tx on the DPDK datapath in OVS.

- MTU drops : drops that occur due to a too large packet size
- Qos drops: drops that occur due to egress QOS
- Tx failures: drops as returned by the DPDK PMD send function

Note that the reason for tx failures is not specificied in OVS. In
practice for vhost ports it is most common that tx failures are because
there are not enough available descriptors, which is usually caused by
misconfiguration of the guest queues and/or because the guest is not
consuming packets fast enough from the queues.

These counters are displayed along with other stats in "ovs-vsctl get
interface  statistics" command and are available for dpdk and
vhostuser/vhostuserclient ports.

Signed-off-by: Sriram Vatala 

---

v9:...
v8:...


Note that signed-off, --- and version info should be like this ^^^.
otherwise the review version comments will be in the git commit log when
it is applied.


--
Changes since v8:
Addressed comments given by Ilya.

Signed-off-by: Sriram Vatala 
---
  Documentation/topics/dpdk/vhost-user.rst  | 13 ++-
  lib/netdev-dpdk.c | 81 +++
  utilities/bugtool/automake.mk |  3 +-
  utilities/bugtool/ovs-bugtool-get-port-stats  | 25 ++
  .../plugins/network-status/openvswitch.xml|  1 +
  5 files changed, 105 insertions(+), 18 deletions(-)
  create mode 100755 utilities/bugtool/ovs-bugtool-get-port-stats

diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index 724aa62f6..89388a2bf 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -551,7 +551,18 @@ processing packets at the required rate.
  The amount of Tx retries on a vhost-user or vhost-user-client interface can be
  shown with::
  
-  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries

+  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:netdev_dpdk_tx_retries


I think the "netdev_dpdk_" prefixes should be removed for a few reasons,

- These are custom stats that will only be displayed for the dpdk ports
so they don't need any extra prefix to say they are for dpdk ports

- The 'tx_retries' stat is part of OVS 2.12, I don't like to change its
name to a different one in OVS 2.13 unless there is a very good reason

- The existing stats don't have this type of prefixes (granted most of
them are general stats):


The main reason for the prefix for me is to distinguish those stats
from the stats that comest from the driver/HW.  Our new stats has
very generic names that could be named a lot like or even equal to
HW stats.  This might be not an issue for vhostuser, but for HW
NICs this may be really confusing for users.

I'm not insisting on 'netdev_dpdk_' prefix, but, IMHO, we need a
way to clearly distinguish those stats.  We may use different prefixes
like 'sw_' or just 'ovs_'.


# ovs-vsctl get Interface dpdkvhost0 statistics
{"rx_1024_to_1522_packets"=0, "rx_128_to_255_packets"=0,
"rx_1523_to_max_packets"=0, "rx_1_to_64_packets"=25622176,
"rx_256_to_511_packets"=0, "rx_512_to_1023_packets"=0,
"rx_65_to_127_packets"=0, 

Re: [ovs-dev] [PATCH] tc: Limit the max action number to 16

2019-10-16 Thread Roi Dayan



On 2019-10-16 2:40 PM, Simon Horman wrote:
> On Wed, Oct 16, 2019 at 11:37:14AM +0300, Roi Dayan wrote:
>> From: Chris Mi 
>>
>> Currently, ovs supports to offload max TCA_ACT_MAX_PRIO(32) actions.
>> But net sched api has a limit of 4K message size which is not enough
>> for 32 actions when echo flag is set.
>>
>> After a lot of testing, we find that 16 actions is a reasonable number.
>> So in this commit, we introduced a new define to limit the max actions.
>>
>> Fixes: 0c70132cd288("tc: Make the actions order consistent")
>> Signed-off-by: Chris Mi 
>> Reviewed-by: Roi Dayan 
> 
> Hi Chris,
> 
> I'm unclear on what problem is this patch solving.

Hi Simon,

I can help with the answer here.
When ovs send netlink msg to tc to add a filter we also add
the echo flag to get a reply data. this reply can be big as
it contains everything from the request and if it pass the 4K
size then the kernel will just not fill/send it but the return
status of the tc command is still 0 for success.

In ovs we use that reply to get back the handle id on success but
for this case will fail.
To make sure this ack reply always exists for successful tc calls
we limit the number of actions here to avoid reaching the 4K size limit.

Thanks,
Roi

> 
>> ---
>>  lib/netdev-offload-tc.c | 4 ++--
>>  lib/tc.c| 6 +++---
>>  lib/tc.h| 4 +++-
>>  3 files changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> index 6814390eab2f..f6d1abb2e695 100644
>> --- a/lib/netdev-offload-tc.c
>> +++ b/lib/netdev-offload-tc.c
>> @@ -1360,8 +1360,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
>> *match,
>>  }
>>  
>>  NL_ATTR_FOR_EACH(nla, left, actions, actions_len) {
>> -if (flower.action_count >= TCA_ACT_MAX_PRIO) {
>> -VLOG_DBG_RL(, "Can only support %d actions", 
>> flower.action_count);
>> +if (flower.action_count >= TCA_ACT_MAX_NUM) {
>> +VLOG_DBG_RL(, "Can only support %d actions", 
>> TCA_ACT_MAX_NUM);
>>  return EOPNOTSUPP;
>>  }
>>  action = [flower.action_count];
>> diff --git a/lib/tc.c b/lib/tc.c
>> index 316a0ee5dc7c..d5487097d8ec 100644
>> --- a/lib/tc.c
>> +++ b/lib/tc.c
>> @@ -1458,7 +1458,7 @@ static int
>>  nl_parse_flower_actions(struct nlattr **attrs, struct tc_flower *flower)
>>  {
>>  const struct nlattr *actions = attrs[TCA_FLOWER_ACT];
>> -static struct nl_policy actions_orders_policy[TCA_ACT_MAX_PRIO + 1] = 
>> {};
>> +static struct nl_policy actions_orders_policy[TCA_ACT_MAX_NUM + 1] = {};
>>  struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
>>  const int max_size = ARRAY_SIZE(actions_orders_policy);
>>  
>> @@ -1477,8 +1477,8 @@ nl_parse_flower_actions(struct nlattr **attrs, struct 
>> tc_flower *flower)
>>  if (actions_orders[i]) {
>>  int err;
>>  
>> -if (flower->action_count >= TCA_ACT_MAX_PRIO) {
>> -VLOG_DBG_RL(_rl, "Can only support %d actions", 
>> flower->action_count);
>> +if (flower->action_count >= TCA_ACT_MAX_NUM) {
>> +VLOG_DBG_RL(_rl, "Can only support %d actions", 
>> TCA_ACT_MAX_NUM);
>>  return EOPNOTSUPP;
>>  }
>>  err = nl_parse_single_action(actions_orders[i], flower);
>> diff --git a/lib/tc.h b/lib/tc.h
>> index f4213579a2fd..f4073c6c47e6 100644
>> --- a/lib/tc.h
>> +++ b/lib/tc.h
>> @@ -208,6 +208,8 @@ enum tc_offloaded_state {
>>  TC_OFFLOADED_STATE_NOT_IN_HW,
>>  };
>>  
>> +#define TCA_ACT_MAX_NUM 16
>> +
>>  struct tc_flower {
>>  uint32_t handle;
>>  uint32_t prio;
>> @@ -216,7 +218,7 @@ struct tc_flower {
>>  struct tc_flower_key mask;
>>  
>>  int action_count;
>> -struct tc_action actions[TCA_ACT_MAX_PRIO];
>> +struct tc_action actions[TCA_ACT_MAX_NUM];
>>  
>>  struct ovs_flow_stats stats;
>>  uint64_t lastused;
>> -- 
>> 2.8.4
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-devdata=02%7C01%7Croid%40mellanox.com%7C9097cc90c5ac40952b3f08d7522da0e1%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637068228233302023sdata=vEyT6W3%2Fd%2B3gcyygwFwqXJxQHPsxC7gyzmNTs8FquVw%3Dreserved=0
>>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tc: Limit the max action number to 16

2019-10-16 Thread Simon Horman
On Wed, Oct 16, 2019 at 11:37:14AM +0300, Roi Dayan wrote:
> From: Chris Mi 
> 
> Currently, ovs supports to offload max TCA_ACT_MAX_PRIO(32) actions.
> But net sched api has a limit of 4K message size which is not enough
> for 32 actions when echo flag is set.
> 
> After a lot of testing, we find that 16 actions is a reasonable number.
> So in this commit, we introduced a new define to limit the max actions.
> 
> Fixes: 0c70132cd288("tc: Make the actions order consistent")
> Signed-off-by: Chris Mi 
> Reviewed-by: Roi Dayan 

Hi Chris,

I'm unclear on what problem is this patch solving.

> ---
>  lib/netdev-offload-tc.c | 4 ++--
>  lib/tc.c| 6 +++---
>  lib/tc.h| 4 +++-
>  3 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 6814390eab2f..f6d1abb2e695 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1360,8 +1360,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>  }
>  
>  NL_ATTR_FOR_EACH(nla, left, actions, actions_len) {
> -if (flower.action_count >= TCA_ACT_MAX_PRIO) {
> -VLOG_DBG_RL(, "Can only support %d actions", 
> flower.action_count);
> +if (flower.action_count >= TCA_ACT_MAX_NUM) {
> +VLOG_DBG_RL(, "Can only support %d actions", TCA_ACT_MAX_NUM);
>  return EOPNOTSUPP;
>  }
>  action = [flower.action_count];
> diff --git a/lib/tc.c b/lib/tc.c
> index 316a0ee5dc7c..d5487097d8ec 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -1458,7 +1458,7 @@ static int
>  nl_parse_flower_actions(struct nlattr **attrs, struct tc_flower *flower)
>  {
>  const struct nlattr *actions = attrs[TCA_FLOWER_ACT];
> -static struct nl_policy actions_orders_policy[TCA_ACT_MAX_PRIO + 1] = {};
> +static struct nl_policy actions_orders_policy[TCA_ACT_MAX_NUM + 1] = {};
>  struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
>  const int max_size = ARRAY_SIZE(actions_orders_policy);
>  
> @@ -1477,8 +1477,8 @@ nl_parse_flower_actions(struct nlattr **attrs, struct 
> tc_flower *flower)
>  if (actions_orders[i]) {
>  int err;
>  
> -if (flower->action_count >= TCA_ACT_MAX_PRIO) {
> -VLOG_DBG_RL(_rl, "Can only support %d actions", 
> flower->action_count);
> +if (flower->action_count >= TCA_ACT_MAX_NUM) {
> +VLOG_DBG_RL(_rl, "Can only support %d actions", 
> TCA_ACT_MAX_NUM);
>  return EOPNOTSUPP;
>  }
>  err = nl_parse_single_action(actions_orders[i], flower);
> diff --git a/lib/tc.h b/lib/tc.h
> index f4213579a2fd..f4073c6c47e6 100644
> --- a/lib/tc.h
> +++ b/lib/tc.h
> @@ -208,6 +208,8 @@ enum tc_offloaded_state {
>  TC_OFFLOADED_STATE_NOT_IN_HW,
>  };
>  
> +#define TCA_ACT_MAX_NUM 16
> +
>  struct tc_flower {
>  uint32_t handle;
>  uint32_t prio;
> @@ -216,7 +218,7 @@ struct tc_flower {
>  struct tc_flower_key mask;
>  
>  int action_count;
> -struct tc_action actions[TCA_ACT_MAX_PRIO];
> +struct tc_action actions[TCA_ACT_MAX_NUM];
>  
>  struct ovs_flow_stats stats;
>  uint64_t lastused;
> -- 
> 2.8.4
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovsdb-server: Allow replication from older schema version servers.

2019-10-16 Thread Numan Siddique
On Wed, Oct 16, 2019 at 3:21 AM Ben Pfaff  wrote:

> On Wed, Oct 16, 2019 at 12:20:48AM +0530, nusid...@redhat.com wrote:
> > From: Numan Siddique 
> >
> > Presently, replication is not allowed if there is a schema version
> mismatch between
> > the schema returned by the active ovsdb-server and the local db schema.
> This is
> > causing failures in OVN DB HA deployments during uprades.
> >
> > In the case of OpenStack tripleo deployment with OVN, OVN DB
> ovsdb-servers are
> > deployed on a multi node controller cluster in active/standby mode.
> During
> > minor updates or major upgrades, the cluster is updated one at a time. If
> > a node A is running active OVN DB ovsdb-servers and when it is updated,
> another
> > node B becomes active. After the update when OVN DB ovsdb-servers in A
> are started,
> > these ovsdb-servers fail to replicate from the active if there is a
> schema
> > version mismatch.
> >
> > This patch addresses this issue by allowing replication even if there is
> a
> > schema version mismatch only if
> >   - The standby ovsdb-servers's local db schema version is greater than
> that
> > of the active. The version x should match with the active and the
> version y
> > should be greater than that of the active.
> >   - If all the active ovsdb-server schema tables are present in the
> > local db schema.
> >
> > This should not result in any data loss.
>
> This is interesting.
>
> From a database design perspective, I believe that this introduces the
> first place where the database itself looks at version numbers.  Until
> now, the database has carried the version number and validated that it
> has the right form, but the database server itself has not attached any
> meaning to the version number.  With this change, the version number has
> some significance to the database server, at least when replication is
> in use.
>
> This change is noteworthy.  It should be mentioned in the documentation.
> Probably some small note should be added in places that claim OVSDB does
> not enforce a particular version numbering schema
> (e.g. ovsdb/ovsdb-schemas.man, Documentation/ref/ovsdb.7.rst), since
> this is still *mostly* true but not entirely.
>
> There is a possible alternate way to do this.  The code could actually
> compare the two schemas.  If the newer schema would accept everything
> that the older schema does, then it could accept the substitution.
> (This would only work if both schemas were actually available; I'm not
> sure that's true.)
>

Thanks for the comments. I think the alternative suggested by you makes
more sense and more robust. I will go with this.
Both schemas will be available. So I don't see any issues.

Thanks
Numan


>
> I didn't really review the patch itself.  It looks like it adds some new
> code for ad hoc parsing of versions.  It would be better to use struct
> ovsdb_version and use ovsdb_parse_version() instead.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] ovn-northd: Fix IP multicast flooding to mrouter.

2019-10-16 Thread Dumitru Ceara
OVN logical flow "drop" actions can't be combined with other actions.
Commit 79308138891a created such a scenario if a logical switch has
mcast_snoop=true, mcast_flood_unregistered=false and is connected to a
logical router with mcast_relay=enabled.

To fix the issue we now explicitly add a drop flow for unregistered IP
multicast traffic in a logical switch if mcast_snoop=true,
mcast_flood_unregistered=false and the switch doesn't have any ports
with mcast_flood=true and isn't connected to a router with
mcast_relay=true.

Fixes: 79308138891a ("ovn-northd: Add static IP multicast flood configuration")
Signed-off-by: Dumitru Ceara 
---
 northd/ovn-northd.c |  8 +++-
 tests/ovn.at| 50 +++---
 2 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index e41c9d7..d0844dd 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -5661,7 +5661,13 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 
 if (mcast_sw_info->flood_static) {
 ds_put_cstr(, "outport =\""MC_STATIC"\"; output;");
-} else {
+}
+
+/* Explicitly drop the traffic if relay or static flooding
+ * is not configured.
+ */
+if (!mcast_sw_info->flood_relay &&
+!mcast_sw_info->flood_static) {
 ds_put_cstr(, "drop;");
 }
 
diff --git a/tests/ovn.at b/tests/ovn.at
index df00517..d141367 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -16306,7 +16306,7 @@ sleep 1
 OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected])
 OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected])
 
-# Dissable IGMP querier on sw2.
+# Disable IGMP querier on sw2.
 ovn-nbctl set Logical_Switch sw2 \
 other_config:mcast_querier="false"
 
@@ -16357,6 +16357,50 @@ send_igmp_v3_report hv2-vif3 hv2 \
 0001 $(ip_to_hex 10 0 0 1) f9f8 \
 $(ip_to_hex 239 0 1 68) 04 e9b9 \
 /dev/null
+
+# Check that the IGMP Group is learned by all switches.
+OVS_WAIT_UNTIL([
+total_entries=`ovn-sbctl find IGMP_Group | grep "239.0.1.68" | wc -l`
+test "${total_entries}" = "2"
+])
+
+# Send traffic from sw3 and make sure it is relayed by rtr.
+# to ports that joined.
+truncate -s 0 expected_routed_sw1
+truncate -s 0 expected_routed_sw2
+truncate -s 0 expected_empty
+
+as hv1 reset_pcap_file hv1-vif1 hv1/vif1
+as hv1 reset_pcap_file hv1-vif2 hv1/vif2
+as hv1 reset_pcap_file hv1-vif3 hv1/vif3
+as hv1 reset_pcap_file hv1-vif4 hv1/vif4
+as hv2 reset_pcap_file hv2-vif1 hv2/vif1
+as hv2 reset_pcap_file hv2-vif2 hv2/vif2
+as hv2 reset_pcap_file hv2-vif3 hv2/vif3
+as hv2 reset_pcap_file hv2-vif4 hv2/vif4
+
+send_ip_multicast_pkt hv2-vif4 hv2 \
+0001 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 20 ca70 11 \
+e518e518000a3b3a
+store_ip_multicast_pkt \
+0100 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
+e518e518000a3b3a expected_routed_sw1
+store_ip_multicast_pkt \
+0200 01005e000144 \
+$(ip_to_hex 10 0 0 42) $(ip_to_hex 239 0 1 68) 1e 1f cb70 11 \
+e518e518000a3b3a expected_routed_sw2
+
+OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected_routed_sw1])
+OVN_CHECK_PACKETS([hv2/vif3-tx.pcap], [expected_routed_sw2])
+OVN_CHECK_PACKETS([hv1/vif4-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv1/vif3-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif2-tx.pcap], [expected_empty])
+OVN_CHECK_PACKETS([hv2/vif4-tx.pcap], [expected_empty])
+
 # Inject IGMP Join for 239.0.1.68 on sw3-p1.
 send_igmp_v3_report hv1-vif4 hv1 \
 0001 $(ip_to_hex 10 0 0 1) f9f8 \
@@ -16369,8 +16413,8 @@ OVS_WAIT_UNTIL([
 test "${total_entries}" = "3"
 ])
 
-# Send traffic from sw3 and make sure it is relayed by rtr.
-# and ports that joined.
+# Send traffic from sw3 and make sure it is relayed by rtr
+# to ports that joined.
 truncate -s 0 expected_routed_sw1
 truncate -s 0 expected_routed_sw2
 truncate -s 0 expected_switched
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 2/2] netdev-dpdk:Detailed packet drop statistics

2019-10-16 Thread Kevin Traynor
Hi Sriram,

Thanks for working on making more fine grained stats for debugging. As
mentioned yesterday, this patch needs rebase so I just reviewed docs and
netdev-dpdk.c which applied. Comments below.

On 21/09/2019 03:40, Sriram Vatala wrote:
> OVS may be unable to transmit packets for multiple reasons and
> today there is a single counter to track packets dropped due to
> any of those reasons. The most common reason is that a VM is
> unable to read packets fast enough causing the vhostuser port
> transmit queue on the OVS side to become full. This manifests
> as a problem with VNFs not receiving all packets. Having a
> separate drop counter to track packets dropped because the
> transmit queue is full will clearly indicate that the problem
> is on the VM side and not in OVS. Similarly maintaining separate

It reads like a bit of a contradiction. On one hand the "The *most
common* reason is that a VM is unable to read packets fast enough".
While having a stat "*will clearly indicate* that the problem is on the
VM side".

> counters for all possible drops helps in indicating sensible
> cause for packet drops.
> 
> This patch adds custom software stats counters to track packets
> dropped at port level and these counters are displayed along with
> other stats in "ovs-vsctl get interface  statistics"
> command. The detailed stats will be available for both dpdk and
> vhostuser ports.
> 

I think the commit msg could be a bit clearer, suggest something like
below - feel free to modify (or reject):

OVS may be unable to transmit packets for multiple reasons on the DPDK
datapath and today there is a single counter to track packets dropped
due to any of those reasons.

This patch adds custom software stats for the different reasons packets
may be dropped during tx on the DPDK datapath in OVS.

- MTU drops : drops that occur due to a too large packet size
- Qos drops: drops that occur due to egress QOS
- Tx failures: drops as returned by the DPDK PMD send function

Note that the reason for tx failures is not specificied in OVS. In
practice for vhost ports it is most common that tx failures are because
there are not enough available descriptors, which is usually caused by
misconfiguration of the guest queues and/or because the guest is not
consuming packets fast enough from the queues.

These counters are displayed along with other stats in "ovs-vsctl get
interface  statistics" command and are available for dpdk and
vhostuser/vhostuserclient ports.

Signed-off-by: Sriram Vatala 

---

v9:...
v8:...


Note that signed-off, --- and version info should be like this ^^^.
otherwise the review version comments will be in the git commit log when
it is applied.

> --
> Changes since v8:
> Addressed comments given by Ilya.
> 
> Signed-off-by: Sriram Vatala 
> ---
>  Documentation/topics/dpdk/vhost-user.rst  | 13 ++-
>  lib/netdev-dpdk.c | 81 +++
>  utilities/bugtool/automake.mk |  3 +-
>  utilities/bugtool/ovs-bugtool-get-port-stats  | 25 ++
>  .../plugins/network-status/openvswitch.xml|  1 +
>  5 files changed, 105 insertions(+), 18 deletions(-)
>  create mode 100755 utilities/bugtool/ovs-bugtool-get-port-stats
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index 724aa62f6..89388a2bf 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -551,7 +551,18 @@ processing packets at the required rate.
>  The amount of Tx retries on a vhost-user or vhost-user-client interface can 
> be
>  shown with::
>  
> -  $ ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
> +  $ ovs-vsctl get Interface dpdkvhostclient0 
> statistics:netdev_dpdk_tx_retries

I think the "netdev_dpdk_" prefixes should be removed for a few reasons,

- These are custom stats that will only be displayed for the dpdk ports
so they don't need any extra prefix to say they are for dpdk ports

- The 'tx_retries' stat is part of OVS 2.12, I don't like to change its
name to a different one in OVS 2.13 unless there is a very good reason

- The existing stats don't have this type of prefixes (granted most of
them are general stats):
# ovs-vsctl get Interface dpdkvhost0 statistics
{"rx_1024_to_1522_packets"=0, "rx_128_to_255_packets"=0,
"rx_1523_to_max_packets"=0, "rx_1_to_64_packets"=25622176,
"rx_256_to_511_packets"=0, "rx_512_to_1023_packets"=0,
"rx_65_to_127_packets"=0, rx_bytes=1537330560, rx_dropped=0,
rx_errors=0, rx_packets=25622176, tx_bytes=3829825920, tx_dropped=0,
tx_packets=63830432, tx_retries=0}

Also, just to note that if there are changes to existing
interfaces/behaviour it should really mention that in the commit message
so it is highlighted.

> +
> +When the guest is not able to consume the packets fast enough, the transmit
> +queue of the port gets filled up i.e queue runs out of free descriptors.
> +This is the most likely reason why dpdk transmit API 

Re: [ovs-dev] [PATCH] tc: Limit the max action number to 16

2019-10-16 Thread 0-day Robot
Bleep bloop.  Greetings Roi Dayan, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 87 characters long (recommended limit is 79)
#57 FILE: lib/tc.c:1481:
VLOG_DBG_RL(_rl, "Can only support %d actions", 
TCA_ACT_MAX_NUM);

Lines checked: 85, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Infosecurity netherlands 2019 VISITORS DATAINFO

2019-10-16 Thread Kelly Queen
Hello,

 

We are in our final week for our sales of Info Security Expo 2019 attendees 
list and wanted to check if you are still interested in acquiring the entire 
list for a special offer of just €675 USD.


Info Security Expo 2019

Date: - 30 -31 OCTOBER 2019 | UTRECHT ANNUAL FAIR

Counts: - 8500

 

Let me know your interest, so that we can send you additional information.

 

Waiting for your response.

 

Regards,

Kelly Queen| Demand Generation Executive

 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] OVN: Send RARP for vif ports for which OVN does not know the IP.

2019-10-16 Thread Numan Siddique
On Sat, Oct 12, 2019 at 4:53 AM Ankur Sharma 
wrote:

> ISSUE:
> For a VIF port (on a bridged logical switch), OVN sends out
> GARPs, advertising port's mac and IP.
>
> However, if a VIF port (on a bridged logical switch) has not
> been assigned an IP, then OVN does not advertise anything.
> As a result, for such VIF ports basic operations like VM
> migration will not work, since TOR will direct the packets
> destined to this VIF to old chassis.
>
> This patch addresses the same by advertising reverse ARP (RARP)
> for non ip assigned bridged logical switch connected VIF ports.
>
> Signed-off-by: Ankur Sharma 
>

Thanks for the patch.

I applied this patch to the master with below small changes.

For your future patches, If you can add "ovn" to the subject prefix,
ovsrobot can properly apply this patch to ovn repo
and run the tests in travis CI.

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 9d6d9e900..d826da186 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -3940,7 +3940,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_index
*sbrec_port_binding_by_datapath,
_ip_keys, _l3gw_ports,
chassis, active_tunnels,
_addresses);
-/* For deleted ports and deleted nat ips, remove from
send_garp_rarp_data. */
+/* For deleted ports and deleted nat ips, remove from
+ * send_garp_rarp_data. */
 struct shash_node *iter, *next;
 SHASH_FOR_EACH_SAFE (iter, next, _garp_rarp_data) {
 if (!sset_contains(_vifs, iter->name) &&

diff --git a/tests/ovn.at b/tests/ovn.at
index 0c8a05568..df00517e8 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -17378,8 +17378,6 @@ ovn-nbctl lsp-set-port-security lp11
f0:00:00:00:00:11

 OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up lp11` = xup])

-sleep 1
-
 ovn-nbctl --wait=sb sync

 ovn-nbctl show
@@ -17399,9 +17397,9 @@ echo "--- Post Traffic hv1 dump ---"
 as hv1 ovs-ofctl -O OpenFlow13 dump-flows br-int
 as hv1 ovs-appctl fdb/show br-phys

-AT_CHECK([ovs-appctl fdb/show br-phys | grep f0:00:00:00:00:11 | grep 101
| wc -l], [0], [1
-])
-
+OVS_WAIT_UNTIL(
+[test 1 = `ovs-appctl fdb/show br-phys | grep f0:00:00:00:00:11 | \
+grep 101 | wc -l`])

 OVN_CLEANUP([hv1])

Thanks
Numan

---
>  controller/pinctrl.c | 225
> +++
>  tests/ovn.at |  70 +++-
>  2 files changed, 188 insertions(+), 107 deletions(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 3cbfb0f..9d6d9e9 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -129,12 +129,12 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>   *pinctrl_handler thread sends the periodic IPv6 RAs
> using
>   *the shash - 'ipv6_ras'
>   *
> - * gARP handling- pinctrl_run() prepares the gARP information
> - *(see send_garp_prepare()) in the shash
> 'send_garp_data'
> - *by looking into the Southbound DB table
> Port_Binding.
> - *
> - *pinctrl_handler() thread sends these gARPs using the
> - *shash 'send_garp_data'.
> + * g/rARP handling- pinctrl_run() prepares the g/rARP information
> + * (see send_garp_rarp_prepare()) in the shash
> + * 'send_garp_rarp_data' by looking into the
> + * Southbound DB table Port_Binding.
> + * pinctrl_handler() thread sends these gARPs using
> the
> + * shash 'send_garp_rarp_data'.
>   *
>   * IGMP Queries - pinctrl_run() prepares the IGMP queries (at most one
>   *per local datapath) based on the mcast_snoop_map
> @@ -149,7 +149,8 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>   *  and pinctrl_run().
>   *  'pinctrl_handler_seq' is used by pinctrl_run() to
>   *  wake up pinctrl_handler thread from poll_block() if any changes
> happened
> - *  in 'send_garp_data', 'ipv6_ras' and 'buffered_mac_bindings'
> structures.
> + *  in 'send_garp_rarp_data', 'ipv6_ras' and 'buffered_mac_bindings'
> + *  structures.
>   *
>   *  'pinctrl_main_seq' is used by pinctrl_handler() thread to wake up
>   *  the main thread from poll_block() when mac bindings/igmp groups need
> to
> @@ -195,10 +196,10 @@ static void flush_put_mac_bindings(void);
>  static void send_mac_binding_buffered_pkts(struct rconn *swconn)
>  OVS_REQUIRES(pinctrl_mutex);
>
> -static void init_send_garps(void);
> -static void destroy_send_garps(void);
> -static void send_garp_wait(long long int send_garp_time);
> -static void send_garp_prepare(
> +static void init_send_garps_rarps(void);
> +static void destroy_send_garps_rarps(void);
> +static void send_garp_rarp_wait(long long int send_garp_rarp_time);
> +static void send_garp_rarp_prepare(
>  struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>  struct ovsdb_idl_index *sbrec_port_binding_by_name,
>  

[ovs-dev] [PATCH] tc: Limit the max action number to 16

2019-10-16 Thread Roi Dayan
From: Chris Mi 

Currently, ovs supports to offload max TCA_ACT_MAX_PRIO(32) actions.
But net sched api has a limit of 4K message size which is not enough
for 32 actions when echo flag is set.

After a lot of testing, we find that 16 actions is a reasonable number.
So in this commit, we introduced a new define to limit the max actions.

Fixes: 0c70132cd288("tc: Make the actions order consistent")
Signed-off-by: Chris Mi 
Reviewed-by: Roi Dayan 
---
 lib/netdev-offload-tc.c | 4 ++--
 lib/tc.c| 6 +++---
 lib/tc.h| 4 +++-
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 6814390eab2f..f6d1abb2e695 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1360,8 +1360,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
*match,
 }
 
 NL_ATTR_FOR_EACH(nla, left, actions, actions_len) {
-if (flower.action_count >= TCA_ACT_MAX_PRIO) {
-VLOG_DBG_RL(, "Can only support %d actions", 
flower.action_count);
+if (flower.action_count >= TCA_ACT_MAX_NUM) {
+VLOG_DBG_RL(, "Can only support %d actions", TCA_ACT_MAX_NUM);
 return EOPNOTSUPP;
 }
 action = [flower.action_count];
diff --git a/lib/tc.c b/lib/tc.c
index 316a0ee5dc7c..d5487097d8ec 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -1458,7 +1458,7 @@ static int
 nl_parse_flower_actions(struct nlattr **attrs, struct tc_flower *flower)
 {
 const struct nlattr *actions = attrs[TCA_FLOWER_ACT];
-static struct nl_policy actions_orders_policy[TCA_ACT_MAX_PRIO + 1] = {};
+static struct nl_policy actions_orders_policy[TCA_ACT_MAX_NUM + 1] = {};
 struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
 const int max_size = ARRAY_SIZE(actions_orders_policy);
 
@@ -1477,8 +1477,8 @@ nl_parse_flower_actions(struct nlattr **attrs, struct 
tc_flower *flower)
 if (actions_orders[i]) {
 int err;
 
-if (flower->action_count >= TCA_ACT_MAX_PRIO) {
-VLOG_DBG_RL(_rl, "Can only support %d actions", 
flower->action_count);
+if (flower->action_count >= TCA_ACT_MAX_NUM) {
+VLOG_DBG_RL(_rl, "Can only support %d actions", 
TCA_ACT_MAX_NUM);
 return EOPNOTSUPP;
 }
 err = nl_parse_single_action(actions_orders[i], flower);
diff --git a/lib/tc.h b/lib/tc.h
index f4213579a2fd..f4073c6c47e6 100644
--- a/lib/tc.h
+++ b/lib/tc.h
@@ -208,6 +208,8 @@ enum tc_offloaded_state {
 TC_OFFLOADED_STATE_NOT_IN_HW,
 };
 
+#define TCA_ACT_MAX_NUM 16
+
 struct tc_flower {
 uint32_t handle;
 uint32_t prio;
@@ -216,7 +218,7 @@ struct tc_flower {
 struct tc_flower_key mask;
 
 int action_count;
-struct tc_action actions[TCA_ACT_MAX_PRIO];
+struct tc_action actions[TCA_ACT_MAX_NUM];
 
 struct ovs_flow_stats stats;
 uint64_t lastused;
-- 
2.8.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev