Re: [ovs-dev] [PATCH v1 ovn] ovn-ctl: Support passing ssl certs for northd

2019-09-17 Thread Han Zhou
On Mon, Sep 16, 2019 at 3:12 PM  wrote:
>
> From: Aliasgar Ginwala 
>
> When using ssl mode for ovn nb/sb active-standby/cluster db service models,
> northd can use ssl mode too.
> e.g. one can pass --ovn-northd-ssl-key, --ovn-northd-ssl-ca-cert and
> --ovn-northd-ssl-cert to start northd with ssl
>
> Signed-off-by: Aliasgar Ginwala 
> ---
>  utilities/ovn-ctl | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
> index 4242cd2c8..433ee4f50 100755
> --- a/utilities/ovn-ctl
> +++ b/utilities/ovn-ctl
> @@ -344,6 +344,15 @@ start_northd () {
>      if test X"$OVN_NORTHD_LOGFILE" != X; then
>          set "$@" --log-file=$OVN_NORTHD_LOGFILE
>      fi
> +    if test X"$OVN_NORTHD_SSL_KEY" != X; then
> +        set "$@" --private-key=$OVN_NORTHD_SSL_KEY
> +    fi
> +    if test X"$OVN_NORTHD_SSL_CERT" != X; then
> +        set "$@" --certificate=$OVN_NORTHD_SSL_CERT
> +    fi
> +    if test X"$OVN_NORTHD_SSL_CA_CERT" != X; then
> +        set "$@" --ca-cert=$OVN_NORTHD_SSL_CA_CERT
> +    fi
>
>      [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
>
> @@ -513,6 +522,10 @@ set_defaults () {
>      OVN_CONTROLLER_SSL_CA_CERT=""
>      OVN_CONTROLLER_SSL_BOOTSTRAP_CA_CERT=""
>
> +    OVN_NORTHD_SSL_KEY=""
> +    OVN_NORTHD_SSL_CERT=""
> +    OVN_NORTHD_SSL_CA_CERT=""
> +
>      DB_SB_CREATE_INSECURE_REMOTE="no"
>      DB_NB_CREATE_INSECURE_REMOTE="no"
>
> @@ -617,6 +630,9 @@ Options:
>    --ovn-sb-db-ssl-key=KEY OVN Southbound DB SSL private key file
>    --ovn-sb-db-ssl-cert=CERT OVN Southbound DB SSL certificate file
>    --ovn-sb-db-ssl-ca-cert=CERT OVN Southbound DB SSL CA certificate file
> +  --ovn-northd-ssl-key=KEY OVN Northd SSL private key file
> +  --ovn-northd-ssl-cert=CERT OVN Northd SSL certificate file
> +  --ovn-northd-ssl-ca-cert=CERT OVN Northd SSL CA certificate file
>    --ovn-manage-ovsdb=yes|no        Whether or not the OVN databases should be
>                                     automatically started and stopped along
>                                     with ovn-northd. The default is "yes". If
> --
> 2.20.1 (Apple Git-117)
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Thanks Ali.
Acked-by: Han Zhou 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
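
For readers who want the logic of the patch above in isolation, here is a
minimal Python sketch of the same idea: each SSL option is appended to the
ovn-northd command line only when it is non-empty. The function name
northd_ssl_args and the example path are hypothetical and exist only for
this illustration; the real change is the shell code in utilities/ovn-ctl.

```python
def northd_ssl_args(ssl_key="", ssl_cert="", ssl_ca_cert=""):
    """Return the extra ovn-northd arguments for the SSL options that are set.

    Mirrors the three `if test X"$VAR" != X` checks in the patch: an empty
    value contributes nothing to the command line.
    """
    args = []
    if ssl_key:
        args.append("--private-key=" + ssl_key)
    if ssl_cert:
        args.append("--certificate=" + ssl_cert)
    if ssl_ca_cert:
        args.append("--ca-cert=" + ssl_ca_cert)
    return args


if __name__ == "__main__":
    # Hypothetical path, used only to show the shape of the result.
    print(northd_ssl_args(ssl_key="/path/to/northd-privkey.pem"))
```

With all three options left at their empty defaults the function returns an
empty list, which matches the behaviour of set_defaults in the patch.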


Re: [ovs-dev] [PATCH] checkpatch: Ignore utitilies/bugtool.

2019-09-17 Thread Ben Pfaff
On Mon, Sep 16, 2019 at 09:35:48AM -0700, Gregory Rose wrote:
> On 9/12/2019 11:11 AM, William Tu wrote:
> > Signed-off-by: William Tu 
> > ---
> >   utilities/checkpatch.py | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
> > index f8fa00e306a8..a9f27b52f3c8 100755
> > --- a/utilities/checkpatch.py
> > +++ b/utilities/checkpatch.py
> > @@ -844,6 +844,8 @@ def ovs_checkpatch_parse(text, filename, author=None, committer=None):
> >              # for a common style.
> >              if current_file.startswith('include/sparse'):
> >                  continue
> > +            if current_file.startswith('utilities/bugtool'):
> > +                continue
> >              run_checks(current_file, cmp_line, lineno)
> >      run_file_checks(text)
> 
> Seems fine to me.
> 
> Reviewed-by: Greg Rose 

Thanks, William and Greg.  I applied this to master.
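
The skip logic in the hunk above can be sketched on its own as follows.
This is only an illustration: the helper name skips_style_checks and the
SKIPPED_PREFIXES tuple are hypothetical, and the real script tests each
prefix with a separate startswith() call rather than a tuple.

```python
# Path prefixes taken from the patch hunk quoted above.
SKIPPED_PREFIXES = ("include/sparse", "utilities/bugtool")


def skips_style_checks(path):
    """Return True if per-line style checks should be skipped for this file.

    str.startswith() accepts a tuple, so one call covers every prefix.
    """
    return path.startswith(SKIPPED_PREFIXES)


if __name__ == "__main__":
    for name in ("utilities/bugtool/ovs-bugtool.in", "lib/util.c"):
        print(name, skips_style_checks(name))
```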


Re: [ovs-dev] [PATCH] tests: Fix ovs-vsctl unit test failure regression.

2019-09-17 Thread Ben Pfaff
On Tue, Sep 17, 2019 at 04:34:37PM -0700, Justin Pettit wrote:
> 
> > On Sep 17, 2019, at 2:30 PM, Ben Pfaff  wrote:
> > 
> > This worked fine as long as there was only one table whose name started
> > with "C", but now we have three of them.
> > 
> > CC: Justin Pettit 
> > Fixes: 61a5264d60d0 ("ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.")
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Justin Pettit 

Thanks, applied to master.


Re: [ovs-dev] [PATCH] tests: Fix ovs-vsctl unit test failure regression.

2019-09-17 Thread Justin Pettit


> On Sep 17, 2019, at 2:30 PM, Ben Pfaff  wrote:
> 
> This worked fine as long as there was only one table whose name started
> with "C", but now we have three of them.
> 
> CC: Justin Pettit 
> Fixes: 61a5264d60d0 ("ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.")
> Signed-off-by: Ben Pfaff 

Acked-by: Justin Pettit 

Thanks!

--Justin





[ovs-dev] [PATCH] tests: Fix ovs-vsctl unit test failure regression.

2019-09-17 Thread Ben Pfaff
This worked fine as long as there was only one table whose name started
with "C", but now we have three of them.

CC: Justin Pettit 
Fixes: 61a5264d60d0 ("ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.")
Signed-off-by: Ben Pfaff 
---
 tests/ovs-vsctl.at | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index 46fa3c5b1a33..4907be35d342 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -890,10 +890,10 @@ AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=-1])],
 AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=4096])],
   [1], [], [ovs-vsctl: constraint violation: 4096 is not in the valid range 0 to 4095 (inclusive)
 ])
-AT_CHECK([RUN_OVS_VSCTL([set c br1 'connection-mode=xyz'])],
+AT_CHECK([RUN_OVS_VSCTL([set co br1 'connection-mode=xyz'])],
   [1], [], [[ovs-vsctl: constraint violation: xyz is not one of the allowed values ([in-band, out-of-band])
 ]])
-AT_CHECK([RUN_OVS_VSCTL([set c br1 connection-mode:x=y])],
+AT_CHECK([RUN_OVS_VSCTL([set co br1 connection-mode:x=y])],
   [1], [], [ovs-vsctl: cannot specify key to set for non-map column connection_mode
 ])
 AT_CHECK([RUN_OVS_VSCTL([add bridge br1 datapath_id x y])],
-- 
2.21.0
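
The failure fixed by this patch comes from abbreviated table names: "c"
was a unique abbreviation while Controller was the only table starting
with "C", and stopped being unique once CT_Zone and CT_Zone_Policy were
added. The Python sketch below illustrates that idea under simplifying
assumptions. The function resolve_table and the two table lists are
hypothetical, the lists are deliberately incomplete, and the matching is
a plain case-insensitive prefix test rather than the exact rules used by
ovs-vsctl.

```python
def resolve_table(abbrev, tables):
    """Resolve an abbreviated table name to exactly one table.

    An exact (case-insensitive) match wins; otherwise the abbreviation
    must be a prefix of exactly one table name.
    """
    key = abbrev.lower()
    for name in tables:
        if name.lower() == key:
            return name
    matches = [name for name in tables if name.lower().startswith(key)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown table %r" % abbrev)
    raise ValueError("multiple table names match %r: %s"
                     % (abbrev, ", ".join(matches)))


# Illustrative subsets of the schema before and after commit 61a5264d60d0.
OLD_TABLES = ["Bridge", "Controller"]
NEW_TABLES = OLD_TABLES + ["Datapath", "CT_Zone", "CT_Zone_Policy"]

if __name__ == "__main__":
    print(resolve_table("c", OLD_TABLES))
    print(resolve_table("co", NEW_TABLES))
```

In this model, resolving "c" against NEW_TABLES raises ValueError because
three names share the prefix, which is why the test now spells the table
as "co".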



Re: [ovs-dev] [PATCH ovn v2] Learn the mac binding only if required

2019-09-17 Thread Han Zhou
On Mon, Sep 16, 2019 at 10:17 AM  wrote:
>
> From: Numan Siddique 
>
> OVN has the actions - put_arp and put_nd to learn the mac bindings from the
> ARP/ND packets. These actions update the Southbound MAC_Binding table.
> These actions translate to controller actions. Whenever pinctrl thread
> receives such packets, it wakes up the main ovn-controller thread.
> If the MAC_Binding table is already up to date, this results
> in unnecessary CPU cycles. There are some security implications as well.
> A rogue VM can flood broadcast ARP request/reply packets and this
> could cause DoS issues. A physical switch may send periodic GARPs
> and these packets hit ovn-controllers.
>
> This patch solves these problems by learning the mac bindings only if
> required. There is no need to apply the put_arp/put_nd action if the
> Southbound MAC_Binding row is up to date.
>
> New actions - lookup_arp and lookup_nd are added which look up the
> IP, MAC pair in the mac_binding table and store the result in a
> register: 1 if the lookup is successful, 0 otherwise.
>
> ovn-northd adds 2 new stages - lookup_arp and put_arp before ip_input
> in the router ingress pipeline.
>
> The logical flows look something like:
>
> table=1 (lr_in_lookup_arp), priority=100  , match=(arp),
>  action=(reg9[4] = lookup_arp(inport, arp.spa, arp.sha); next;)
>
> table=1 (lr_in_lookup_arp), priority=0, match=(1), action=(next;)
> ...
> table=2 (lr_in_put_arp   ), priority=100  ,
>  match=(arp.op == 2 && reg9[4] == 0),
>  action=(put_arp(inport, arp.spa, arp.sha);)
> table=2 (lr_in_put_arp   ), priority=90   , match=(arp.op == 2), action=(drop;)
> table=2 (lr_in_put_arp   ), priority=0, match=(1), action=(next;)
>
> The lflow module of ovn-controller adds OF flows in table 31 (OFTABLE_MAC_LOOKUP)
> for each mac_binding entry with the match reg0 = ip && eth.src = mac with
> the action - load:1->reg2[0]
>
> Eg:
> table=31, priority=100,arp,reg0=0xaca8006f,reg14=0x3,metadata=0x3,dl_src=00:44:00:00:00:04
>   actions=load:1->NXM_NX_REG2[0]
>
> This patch should also address the issue reported in 'Reported-at'
>
> Reported-at: https://bugzilla.redhat.com/1729846
> Reported-by: Haidong Li 
> CC: Han Zhou 
> CC: Dumitru Ceara 
> Tested-by: Dumitru Ceara 
> Signed-off-by: Numan Siddique 
> ---
>
> v1 -> v2
> ===
>* Addressed review comments from Han - Storing the result
>  of lookup_arp/lookup_nd in a register.
>
>  controller/lflow.c   |  36 -
>  controller/lflow.h   |   1 +
>  include/ovn/actions.h|  13 ++
>  include/ovn/logical-fields.h |   3 +
>  lib/actions.c| 115 ++
>  northd/ovn-northd.8.xml  | 251 --
>  northd/ovn-northd.c  | 205 ++---
>  ovn-sb.xml   |  57 +++
>  tests/ovn.at | 290 ++-
>  tests/test-ovn.c |   1 +
>  utilities/ovn-trace.c|  69 +
>  11 files changed, 861 insertions(+), 180 deletions(-)
>

Hi Numan,

This looks great. I spent more time on the review and here are my comments:

1. #define MFF_LOG_LOOKUP_MAC MFF_REG2

This new logical field overlaps with the logical registers, which are
defined as:

/* Logical registers.
 *
 * Make sure these don't overlap with the logical fields! */
#define MFF_LOG_REG0 MFF_REG0
#define MFF_N_LOG_REGS 10

REG0 - REG9 are already reserved by the above definition. If we have to use
one, it should be REG9, with MFF_N_LOG_REGS updated to 9. However, I
think it is better to use just one bit from MFF_LOG_FLAGS. The bits of the
flags are defined as MLF macros, and we can define a new one,
MLF_LOOKUP_MAC_BIT.
In addition, this change needs to be documented in ovn-architecture.

2. #define OFTABLE_MAC_LOOKUP   31

Table 31 is the last table of logical flows. This mac lookup table is not
part of logical flows, so it is better to use table 67.

3. lookup_arp/lookup_nd need to be documented in ovn-architecture, at the
same place where put_arp/get_arp, etc. are documented.

4. In ovn-northd.xml, the term "MAC learning" would be better changed to
something like "MAC-binding learning", or just "Neighbour learning", because
"MAC learning" usually means populating the MAC-port table of an L2 switch
in networking terminology. It is better to avoid the ambiguity.

5. For the pipeline, introducing the neighbour learning stage makes the
pipeline cleaner. However, I think it could be a little more efficient.
Table 1 matches ARP and ND, and table 2 redoes the match. Would it be
better to have the default flow in table 1 set reg9[4] = 1, so that
table 2 can have a high-priority flow that just matches "reg9[4] == 1" with
action "next"? This way, non-ARP/ND packets, which are the most common case,
are handled more efficiently. Also, this leaves table 2 with just one
task, neighbour learning, and leaves the tasks of dropping packets and
replying to ARP/ND to the next stages.
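
To make the suggestion in point 5 concrete, here is a rough Python model
of the two stages. It is a sketch under stated assumptions, not OVN code:
packets are plain dicts, the key names (is_arp, inport, spa, sha,
lookup_flag) and function names are hypothetical, lookup_flag stands in
for reg9[4], and ND handling is left out.

```python
def lookup_stage(pkt, mac_bindings):
    """Table 1: set the flag to 1 when nothing needs to be learned."""
    if pkt.get("is_arp"):
        key = (pkt["inport"], pkt["spa"])
        pkt["lookup_flag"] = 1 if mac_bindings.get(key) == pkt["sha"] else 0
    else:
        # Proposed default flow: non-ARP/ND traffic never needs learning.
        pkt["lookup_flag"] = 1


def learn_stage(pkt, mac_bindings):
    """Table 2: return True only when a binding was written."""
    if pkt["lookup_flag"] == 1:
        # Single high-priority flow: match on the flag alone, then "next".
        return False
    mac_bindings[(pkt["inport"], pkt["spa"])] = pkt["sha"]
    return True


def run_pipeline(pkt, mac_bindings):
    """Run both stages and report whether the packet caused a learn."""
    lookup_stage(pkt, mac_bindings)
    return learn_stage(pkt, mac_bindings)


if __name__ == "__main__":
    bindings = {}
    arp = {"is_arp": True, "inport": "lrp0",
           "spa": "10.0.0.4", "sha": "00:44:00:00:00:04"}
    print(run_pipeline(dict(arp), bindings))   # first ARP: binding is learned
    print(run_pipeline(dict(arp), bindings))   # repeat: already up to date
```

The point of the model is that the second stage never has to re-match on
ARP/ND: it only looks at the flag set by the first stage.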


[ovs-dev] [PATCH] fatal-signal: Catch SIGSEGV and print backtrace.

2019-09-17 Thread William Tu
The patch catches the SIGSEGV signal and prints the backtrace
using libunwind, which hopefully makes debugging easier.

Signed-off-by: William Tu 
---
 .travis.yml|  1 +
 configure.ac   |  1 +
 lib/fatal-signal.c | 40 +++-
 m4/openvswitch.m4  | 10 ++
 4 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index 370b3d0a6c98..f5d62387c89b 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,6 +25,7 @@ addons:
   - selinux-policy-dev
   - libunbound-dev
   - libunbound-dev:i386
+  - libunwind-dev
 
 before_install: ./.travis/${TRAVIS_OS_NAME}-prepare.sh
 
diff --git a/configure.ac b/configure.ac
index 1d45c4fdd153..15922418062b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -139,6 +139,7 @@ OVS_LIBTOOL_VERSIONS
 OVS_CHECK_CXX
 AX_FUNC_POSIX_MEMALIGN
 OVS_CHECK_UNBOUND
+OVS_CHECK_UNWIND
 
 OVS_CHECK_INCLUDE_NEXT([stdio.h string.h])
 AC_CONFIG_FILES([
diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c
index 3b905b6de766..a7325c1ba37e 100644
--- a/lib/fatal-signal.c
+++ b/lib/fatal-signal.c
@@ -34,6 +34,11 @@
 
 #include "openvswitch/type-props.h"
 
+#ifdef HAVE_UNWIND
+#define UNW_LOCAL_ONLY
+#include <libunwind.h>
+#endif
+
 #ifndef SIG_ATOMIC_MAX
 #define SIG_ATOMIC_MAX TYPE_MAXIMUM(sig_atomic_t)
 #endif
@@ -42,7 +47,8 @@ VLOG_DEFINE_THIS_MODULE(fatal_signal);
 
 /* Signals to catch. */
 #ifndef _WIN32
-static const int fatal_signals[] = { SIGTERM, SIGINT, SIGHUP, SIGALRM };
+static const int fatal_signals[] = { SIGTERM, SIGINT, SIGHUP, SIGALRM,
+                                     SIGSEGV };
 #else
 static const int fatal_signals[] = { SIGTERM };
 #endif
@@ -151,6 +157,32 @@ fatal_signal_add_hook(void (*hook_cb)(void *aux), void (*cancel_cb)(void *aux),
     ovs_mutex_unlock(&mutex);
 }
 
+#ifdef HAVE_UNWIND
+static void
+show_backtrace(void) {
+    char buf[4096];
+    unw_cursor_t cursor;
+    unw_context_t uc;
+    unw_word_t ip, sp;
+    unw_word_t offset;
+
+    unw_getcontext(&uc);
+    unw_init_local(&cursor, &uc);
+
+    while (unw_step(&cursor) > 0) {
+        unw_get_reg(&cursor, UNW_REG_IP, &ip);
+        unw_get_reg(&cursor, UNW_REG_SP, &sp);
+        unw_get_proc_name(&cursor, buf, 4095, &offset);
+        VLOG_WARN("0x%016lx <%s+0x%lx>\n", ip, buf, offset);
+    }
+}
+#else
+static void
+show_backtrace(void) {
+    /* Nothing. */
+}
+#endif
+
 /* Handles fatal signal number 'sig_nr'.
  *
  * Ordinarily this is the actual signal handler.  When other code needs to
@@ -164,6 +196,12 @@ void
 fatal_signal_handler(int sig_nr)
 {
 #ifndef _WIN32
+    if (sig_nr == SIGSEGV) {
+        show_backtrace();
+        fflush(stderr);
+        signal(sig_nr, SIG_DFL);
+        raise(sig_nr);
+    }
     ignore(write(signal_fds[1], "", 1));
 #else
     SetEvent(wevent);
diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4
index cd6b51d86c16..f8bb069e80c9 100644
--- a/m4/openvswitch.m4
+++ b/m4/openvswitch.m4
@@ -705,3 +705,13 @@ AC_DEFUN([OVS_CHECK_UNBOUND],
fi
AM_CONDITIONAL([HAVE_UNBOUND], [test "$HAVE_UNBOUND" = yes])
AC_SUBST([HAVE_UNBOUND])])
+
+dnl Checks for libunwind.
+AC_DEFUN([OVS_CHECK_UNWIND],
+  [AC_CHECK_LIB(unwind, unw_backtrace, [HAVE_UNWIND=yes], [HAVE_UNWIND=no])
+   if test "$HAVE_UNWIND" = yes; then
+ AC_DEFINE([HAVE_UNWIND], [1], [Define to 1 if unwind is detected.])
+ LIBS="$LIBS -lunwind"
+   fi
+   AM_CONDITIONAL([HAVE_UNWIND], [test "$HAVE_UNWIND" = yes])
+   AC_SUBST([HAVE_UNWIND])])
-- 
2.7.4



[ovs-dev] [PATCHv4] netdev-afxdp: Add need_wakeup supprt.

2019-09-17 Thread William Tu
The patch adds support for using need_wakeup flag in AF_XDP rings.
A new option, use_need_wakeup, is added. When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using poll()
syscall and wake up TX, using sendto() syscall. This feature improves
the performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of kernel always busy-spinning on fill queue, OVS wakes
up the kernel RX processing when fill queue is replenished.

The need_wakeup feature is merged into Linux kernel 5.3.0-rc1 and OVS
enables it by default. Running the feature before this version causes
xsk bind to fail; please use options:use_need_wakeup=false to disable it.
If users enable it but run with an older version of libbpf, then the
need_wakeup feature has no effect, and a warning message is logged.

For virtual interfaces, it's better to set use_need_wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters the kernel and processes the TX packet on the tx queue directly.

I tested on kernel 5.3.0-rc3 using its libbpf.  On Intel Xeon E5-2620
v3 2.4GHz system, performance of physical port to physical port improves
from 6.1Mpps to 7.3Mpps. Testing on 5.2.0-rc6 using libbpf from 5.3.0-rc3
does not work due to libbpf API change. Users have to use the older
libbpf for older kernel.

Suggested-by: Ilya Maximets 
Signed-off-by: William Tu 
---
v4:
- move use_need_wakeup check inside xsk_rx_wakeup_if_needed

v3:
- add warning when user enables it but libbpf not support it
- revise documentation

v2:
- address feedbacks from Ilya and Eelco
- add options:use_need_wakeup, default to true
- remove poll timeout=1sec, make poll() return immediately
- naming change: rename to xsk_rx_wakeup_if_needing
- fix indents and return value for errno
---
 Documentation/intro/install/afxdp.rst |  15 -
 acinclude.m4  |   8 +++
 lib/netdev-afxdp.c| 104 ++
 lib/netdev-linux-private.h|   2 +
 vswitchd/vswitch.xml  |  13 +
 5 files changed, 127 insertions(+), 15 deletions(-)

diff --git a/Documentation/intro/install/afxdp.rst 
b/Documentation/intro/install/afxdp.rst
index 820e9d993d8f..545516b2bbec 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -176,9 +176,18 @@ in :doc:`general`::
   ovs-vswitchd ...
   ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 
-Make sure your device driver support AF_XDP, and to use 1 PMD (on core 4)
-on 1 queue (queue 0) device, configure these options: **pmd-cpu-mask,
-pmd-rxq-affinity, and n_rxq**. The **xdpmode** can be "drv" or "skb"::
+Make sure your device driver supports AF_XDP. netdev-afxdp supports
+the following additional options (see man ovs-vswitchd.conf.db for
+more details):
+
+ * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
+
+ * **use_need_wakeup**: disable by setting to "false", otherwise default
+   is "true"
+
+For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
+configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
+The **xdpmode** can be "drv" or "skb"::
 
   ethtool -L enp2s0 combined 1
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
diff --git a/acinclude.m4 b/acinclude.m4
index f0e38898b17a..df1082c455fc 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -276,6 +276,14 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
   [Define to 1 if AF_XDP support is available and enabled.])
 LIBBPF_LDADD=" -lbpf -lelf"
 AC_SUBST([LIBBPF_LDADD])
+
+AC_CHECK_DECL([xsk_ring_prod__needs_wakeup], [
+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [1],
+[XDP need wakeup support detected in xsk.h.])
+], [
+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [0],
+[XDP need wakeup support not detected in xsk.h.])
+  ], [#include ])
   fi
   AM_CONDITIONAL([HAVE_AF_XDP], test "$AF_XDP_ENABLE" = true)
 ])
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index e5b058d08a09..a101a750bc5f 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -82,7 +83,7 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
 #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
 
 static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id,
- int mode);
+ int mode, bool use_need_wakeup);
 static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode);
 static void xsk_destroy(struct xsk_socket_info *xsk);
 static int xsk_configure_all(struct netdev *netdev);
@@ -117,6 +118,54 @@ struct xsk_socket_info {
 atomic_uint64_t tx_dropped;
 };
 
+#ifdef HAVE_XDP_NEED_WAKEUP
+static inline void
+xsk_rx_wakeup_if_needed(struct xsk_umem_info *umem,
+struct netdev *netdev, int fd)
+{
+struct 
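
The commit message above says that need_wakeup improves TX performance by
avoiding unnecessary sendto() syscalls. The toy Python model below only
illustrates that counting argument. It is an assumption-laden sketch: the
function count_tx_syscalls and its inputs are hypothetical, no real
AF_XDP or libbpf call is made, and the per-batch flag values are invented
for the example rather than measured.

```python
def count_tx_syscalls(flag_per_batch, use_need_wakeup):
    """Count the wakeup syscalls a sender would issue for a series of batches.

    flag_per_batch holds, for each submitted TX batch, whether the kernel
    side has set the ring's need_wakeup flag.  Without the feature the
    sender kicks the kernel after every batch; with it, only when asked.
    """
    syscalls = 0
    for kernel_needs_wakeup in flag_per_batch:
        if not use_need_wakeup or kernel_needs_wakeup:
            syscalls += 1  # stands in for one sendto() call
    return syscalls


if __name__ == "__main__":
    # Invented flag pattern: the kernel asks for a kick on 2 of 5 batches.
    flags = [True, False, False, True, False]
    print(count_tx_syscalls(flags, use_need_wakeup=False))
    print(count_tx_syscalls(flags, use_need_wakeup=True))
```

If the flag were set on every batch, as the commit message suggests is
effectively the case for virtual devices with synchronous xmit, both
modes would issue the same number of calls in this model.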

Re: [ovs-dev] [ovs-discuss] ovs-vswitchd exited silently

2019-09-17 Thread Flavio Leitner
On Thu, Sep 05, 2019 at 09:18:22AM +, Frank Wang(王培辉) wrote:
> Hi All

Hi Frank,

Perhaps check the permissions of /var/run/openvswitch ?

If they are correct, you can start ovsdb-server service first and
then start the ovs-vswitchd manually as the service does.

fbl

>
> I encountered a very strange problem: ovs-vswitchd exited silently
> without any exceptions while ovsdb-server kept running normally. I'm
> using openvswitch 2.9.2.
>
> I didn't find any useful log entries except these:
>
> ovs|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> ovs-ctl[303320]: 2019-09-05T03:12:28Z|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> ovs-ctl[303320]: ovs-appctl: cannot connect to "/var/run/openvswitch/ovs-vswitchd.5168.ctl" (Connection refused)
>
> See the details below.
>
> [root@node-34435-75174 ~]# systemctl status ovs-vswitchd -l
> ● ovs-vswitchd.service - Open vSwitch Forwarding Unit
>    Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
>    Active: inactive (dead) since Thu 2019-09-05 11:12:28 CST; 4h 4min ago
>   Process: 303320 ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server stop (code=exited, status=0/SUCCESS)
>
> Aug 27 11:09:13 node-34435-75174 systemd[1]: Starting Open vSwitch Forwarding Unit...
> Aug 27 11:09:13 node-34435-75174 ovs-ctl[5105]: Inserting openvswitch module [  OK  ]
> Aug 27 11:09:13 node-34435-75174 ovs-ctl[5105]: Starting ovs-vswitchd [  OK  ]
> Aug 27 11:09:13 node-34435-75174 ovs-vsctl[5257]: ovs|1|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . external-ids:hostname=node-34435-75174
> Aug 27 11:09:13 node-34435-75174 ovs-ctl[5105]: Enabling remote OVSDB managers [  OK  ]
> Aug 27 11:09:13 node-34435-75174 systemd[1]: Started Open vSwitch Forwarding Unit.
> Sep 05 11:12:28 node-34435-75174 ovs-appctl[30]: ovs|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> Sep 05 11:12:28 node-34435-75174 ovs-ctl[303320]: 2019-09-05T03:12:28Z|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> Sep 05 11:12:28 node-34435-75174 ovs-ctl[303320]: ovs-appctl: cannot connect to "/var/run/openvswitch/ovs-vswitchd.5168.ctl" (Connection refused)
>
> [root@node-34435-75174 ~]# ls /var/run/openvswitch/ -l
> total 12
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 db.sock
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 managevSwitch.mgmt
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 managevSwitch.snoop
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 ovsdb-server.5084.ctl
> -rw-r--r--. 1 openvswitch openvswitch  5 Aug 27 11:09 ovsdb-server.pid
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 ovs-vswitchd.5168.ctl
> -rw-r--r--. 1 openvswitch openvswitch  5 Aug 27 11:09 ovs-vswitchd.pid
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 sw-01.mgmt
> srwxr-x---. 1 openvswitch openvswitch  0 Aug 27 11:09 sw-01.snoop
> -rw-r--r--. 1 root        root        43 Aug 27 11:09 useropts
>
> cat /var/log/message
> Sep  5 11:00:01 node-34435-75174 systemd: Removed slice User Slice of root.
> Sep  5 11:01:01 node-34435-75174 systemd: Created slice User Slice of root.
> Sep  5 11:01:01 node-34435-75174 systemd: Started Session 1530 of user root.
> Sep  5 11:01:01 node-34435-75174 systemd: Removed slice User Slice of root.
> Sep  5 11:10:01 node-34435-75174 systemd: Created slice User Slice of root.
> Sep  5 11:10:01 node-34435-75174 systemd: Started Session 1531 of user root.
> Sep  5 11:10:01 node-34435-75174 systemd: Removed slice User Slice of root.
> Sep  5 11:12:28 node-34435-75174 ovs-appctl: ovs|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> Sep  5 11:12:28 node-34435-75174 ovs-ctl: 2019-09-05T03:12:28Z|1|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.5168.ctl
> Sep  5 11:12:28 node-34435-75174 ovs-ctl: ovs-appctl: cannot connect to "/var/run/openvswitch/ovs-vswitchd.5168.ctl" (Connection refused)
> Sep  5 11:20:01 node-34435-75174 systemd: Created slice User Slice of root.
> Sep  5 11:20:01 node-34435-75174 systemd: Started Session 1532 of user root.
> Sep  5 11:20:01 node-34435-75174 systemd: Removed slice User Slice of root.
> Sep  5 11:24:12 node-34435-75174 systemd: Starting Cleanup of Temporary Directories...
> Sep  5 11:24:12 node-34435-75174 systemd: Started Cleanup of Temporary Directories.
> Sep  5 11:30:01 node-34435-75174 systemd: Created slice User Slice of root.
> Sep  5 11:30:01 node-34435-75174 systemd: Started Session 1533 of user root.
> Sep  5 11:30:01 node-34435-75174 systemd: Removed slice User Slice of root.

Re: [ovs-dev] [PATCHv3] netdev-afxdp: Add need_wakeup supprt.

2019-09-17 Thread William Tu
On Tue, Sep 17, 2019 at 01:41:17PM +0200, Eelco Chaudron wrote:
> Two comments below…
>   
> 
> On 11 Sep 2019, at 19:58, William Tu wrote:
> 
> >The patch adds support for using need_wakeup flag in AF_XDP rings.
> >A new option, use_need_wakeup, is added. When this option is used,
> >it means that OVS has to explicitly wake up the kernel RX, using poll()
> >syscall and wake up TX, using sendto() syscall. This feature improves
> >the performance by avoiding unnecessary sendto syscalls for TX.
> >For RX, instead of kernel always busy-spinning on fill queue, OVS wakes
> >up the kernel RX processing when fill queue is replenished.
> >
> >The need_wakeup feature is merged into Linux kernel 5.3.0-rc1 and OVS
> >enables it by default. Running the feature before this version causes
> >xsk bind to fail; please use options:use_need_wakeup=false to disable it.
> >If users enable it but run with an older version of libbpf, then the
> >need_wakeup feature has no effect, and a warning message is logged.
> >
> >For virtual interfaces, it's better to set use_need_wakeup=false, since
> >the virtual device's AF_XDP xmit is synchronous: the sendto syscall
> >enters the kernel and processes the TX packet on the tx queue directly.
> >
> >I tested on kernel 5.3.0-rc3 using its libbpf.  On Intel Xeon E5-2620
> >v3 2.4GHz system, performance of physical port to physical port improves
> >from 6.1Mpps to 7.3Mpps. Testing on 5.2.0-rc6 using libbpf from 5.3.0-rc3
> >does not work due to libbpf API change. Users have to use the older
> >libbpf for older kernel.
> >
> >Suggested-by: Ilya Maximets 
> >Signed-off-by: William Tu 
> >---
> >v3:
> >- add warning when user enables it but libbpf not support it
> >- revise documentation
> >
> >v2:
> >- address feedbacks from Ilya and Eelco
> >- add options:use_need_wakeup, default to true
> >- remove poll timeout=1sec, make poll() return immediately
> >- naming change: rename to xsk_rx_wakeup_if_needing
> >- fix indents and return value for errno
> >---
> > Documentation/intro/install/afxdp.rst |  15 -
> > acinclude.m4  |   8 +++
> > lib/netdev-afxdp.c| 101
> >++
> > lib/netdev-linux-private.h|   2 +
> > vswitchd/vswitch.xml  |  13 +
> > 5 files changed, 124 insertions(+), 15 deletions(-)
> >
> >diff --git a/Documentation/intro/install/afxdp.rst
> >b/Documentation/intro/install/afxdp.rst
> >index 820e9d993d8f..545516b2bbec 100644
> >--- a/Documentation/intro/install/afxdp.rst
> >+++ b/Documentation/intro/install/afxdp.rst
> >@@ -176,9 +176,18 @@ in :doc:`general`::
> >   ovs-vswitchd ...
> >   ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
> >
> >-Make sure your device driver support AF_XDP, and to use 1 PMD (on core 4)
> >-on 1 queue (queue 0) device, configure these options: **pmd-cpu-mask,
> >-pmd-rxq-affinity, and n_rxq**. The **xdpmode** can be "drv" or "skb"::
> >+Make sure your device driver support AF_XDP, netdev-afxdp supports
> >+the following additional options (see man ovs-vswitchd.conf.db for
> >+more details):
> >+
> >+ * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
> >+
> >+ * **use_need_wakeup**: disable by setting to "false", otherwise default
> >+   is "true"
> >+
> >+For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
> >+configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
> >+The **xdpmode** can be "drv" or "skb"::
> >
> >   ethtool -L enp2s0 combined 1
> >   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
> >diff --git a/acinclude.m4 b/acinclude.m4
> >index f0e38898b17a..df1082c455fc 100644
> >--- a/acinclude.m4
> >+++ b/acinclude.m4
> >@@ -276,6 +276,14 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
> >   [Define to 1 if AF_XDP support is available and enabled.])
> > LIBBPF_LDADD=" -lbpf -lelf"
> > AC_SUBST([LIBBPF_LDADD])
> >+
> >+AC_CHECK_DECL([xsk_ring_prod__needs_wakeup], [
> >+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [1],
> >+[XDP need wakeup support detected in xsk.h.])
> >+], [
> >+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [0],
> >+[XDP need wakeup support not detected in xsk.h.])
> >+  ], [#include ])
> >   fi
> >   AM_CONDITIONAL([HAVE_AF_XDP], test "$AF_XDP_ENABLE" = true)
> > ])
> >diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
> >index e5b058d08a09..4d4c90f91806 100644
> >--- a/lib/netdev-afxdp.c
> >+++ b/lib/netdev-afxdp.c
> >@@ -26,6 +26,7 @@
> > #include 
> > #include 
> > #include 
> >+#include 
> > #include 
> > #include 
> > #include 
> >@@ -82,7 +83,7 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
> > #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
> >
> > static struct xsk_socket_info *xsk_configure(int ifindex, int
> >xdp_queue_id,
> >- int mode);
> >+ int mode, bool
> >use_need_wakeup);
> > 

Re: [ovs-dev] [PATCH 10/10] conntrack: Validate accessing of conntrack data in pkt_metadata

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:36PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1305: ofproto-dpif - conntrack - ipv6
> 
> ==26942== Conditional jump or move depends on uninitialised value(s)
> ==26942==    at 0x587C00: check_orig_tuple (conntrack.c:1006)
> ==26942==    by 0x587C00: process_one (conntrack.c:1141)
> ==26942==    by 0x587C00: conntrack_execute (conntrack.c:1220)
> ==26942==    by 0x47B00F: dp_execute_cb (dpif-netdev.c:7305)
> ==26942==    by 0x4AF756: odp_execute_actions (odp-execute.c:794)
> ==26942==    by 0x477532: dp_netdev_execute_actions (dpif-netdev.c:7349)
> ==26942==    by 0x477532: handle_packet_upcall (dpif-netdev.c:6630)
> ==26942==    by 0x477532: fast_path_processing (dpif-netdev.c:6726)
> ==26942==    by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
> ==26942==    by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
> ==26942==    by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
> ==26942==    by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
> ==26942==    by 0x4324E7: type_run (ofproto-dpif.c:342)
> ==26942==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
> ==26942==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
> ==26942==    by 0x410CF3: bridge_run (bridge.c:3029)
> ==26942==    by 0x407614: main (ovs-vswitchd.c:127)
> ==26942==  Uninitialised value was created by a heap allocation
> ==26942==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==26942==    by 0x532574: xmalloc (util.c:138)
> ==26942==    by 0x46CD62: dp_packet_new (dp-packet.c:153)
> ==26942==    by 0x4A0431: eth_from_flow_str (netdev-dummy.c:1644)
> ==26942==    by 0x4A0431: netdev_dummy_receive (netdev-dummy.c:1783)
> ==26942==    by 0x531990: process_command (unixctl.c:308)
> ==26942==    by 0x531990: run_connection (unixctl.c:342)
> ==26942==    by 0x531990: unixctl_server_run (unixctl.c:393)
> ==26942==    by 0x40761E: main (ovs-vswitchd.c:128)
> 
> 1316: ofproto-dpif - conntrack - tcp port reuse
> 
> ==24039== Conditional jump or move depends on uninitialised value(s)
> ==24039==    at 0x587BF5: check_orig_tuple (conntrack.c:1004)
> ==24039==    by 0x587BF5: process_one (conntrack.c:1141)
> ==24039==    by 0x587BF5: conntrack_execute (conntrack.c:1220)
> ==24039==    by 0x47B02F: dp_execute_cb (dpif-netdev.c:7306)
> ==24039==    by 0x4AF7A6: odp_execute_actions (odp-execute.c:794)
> ==24039==    by 0x47755B: dp_netdev_execute_actions (dpif-netdev.c:7350)
> ==24039==    by 0x47755B: handle_packet_upcall (dpif-netdev.c:6631)
> ==24039==    by 0x47755B: fast_path_processing (dpif-netdev.c:6727)
> ==24039==    by 0x47935C: dp_netdev_input__ (dpif-netdev.c:6815)
> ==24039==    by 0x479AD8: dp_netdev_input (dpif-netdev.c:6853)
> ==24039==    by 0x479AD8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
> ==24039==    by 0x47A6C9: dpif_netdev_run (dpif-netdev.c:5264)
> ==24039==    by 0x4324F7: type_run (ofproto-dpif.c:342)
> ==24039==    by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
> ==24039==    by 0x40BAAC: bridge_run__ (bridge.c:2965)
> ==24039==    by 0x410CF3: bridge_run (bridge.c:3029)
> ==24039==    by 0x407614: main (ovs-vswitchd.c:127)
> ==24039==  Uninitialised value was created by a heap allocation
> ==24039==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==24039==    by 0x5325C4: xmalloc (util.c:138)
> ==24039==    by 0x46D144: dp_packet_new (dp-packet.c:153)
> ==24039==    by 0x46D144: dp_packet_new_with_headroom (dp-packet.c:163)
> ==24039==    by 0x51191E: eth_from_hex (packets.c:498)
> ==24039==    by 0x4A03B9: eth_from_packet (netdev-dummy.c:1609)
> ==24039==    by 0x4A03B9: netdev_dummy_receive (netdev-dummy.c:1765)
> ==24039==    by 0x5319E0: process_command (unixctl.c:308)
> ==24039==    by 0x5319E0: run_connection (unixctl.c:342)
> ==24039==    by 0x5319E0: unixctl_server_run (unixctl.c:393)
> ==24039==    by 0x40761E: main (ovs-vswitchd.c:128)
> 
> According to comments in pkt_metadata_init(), conntrack data is valid
> only if pkt_metadata.ct_state != 0. This patch prevents
> check_orig_tuple() from being called when conntrack data is uninitialized.
> 
> Signed-off-by: Yifeng Sun 

LGTM
Acked-by: William Tu 
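
For readers following along, the guard described in the commit message can be sketched in isolation. This is only an illustration under stated assumptions: struct md_sketch and both helper functions are hypothetical stand-ins, not the real pkt_metadata or check_orig_tuple().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for packet metadata: 'ct_orig_port' is only
 * meaningful when 'ct_state' is nonzero. */
struct md_sketch {
    uint32_t ct_state;      /* 0 means "no conntrack data yet". */
    uint16_t ct_orig_port;  /* May hold garbage when ct_state == 0. */
};

/* Hypothetical check that reads the conntrack-only field. */
static bool
orig_port_matches(const struct md_sketch *md, uint16_t port)
{
    return md->ct_orig_port == port;
}

/* Guarded form: && short-circuits, so the conntrack-only field is never
 * read unless ct_state says it is valid. */
static bool
guarded_orig_port_matches(const struct md_sketch *md, uint16_t port)
{
    return md->ct_state && orig_port_matches(md, port);
}
```

The point of the sketch is the ordering of the two operands: the validity test comes first, so the second operand is not evaluated for packets without conntrack state.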

> ---
>  lib/conntrack.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index e5266e579452..86c16b2fbe77 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -1138,7 +1138,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
>  handle_nat(pkt, conn, zone, ctx->reply, ctx->icmp_related);
>  }
>  
> -} else if (check_orig_tuple(ct, pkt, ctx, now, , nat_action_info)) {
> +} else if (pkt->md.ct_state
> +   && check_orig_tuple(ct, pkt, ctx, now, , 
> nat_action_info)) {
>  create_new_conn = conn_update_state(ct, pkt, ctx, conn, now);
>  } else {
>  if (ctx->icmp_related) {
> -- 
> 2.7.4
> 
> ___
> dev mailing list

Re: [ovs-dev] [PATCH 09/10] db-ctl-base: Free leaked ovsdb_datum

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:35PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 2491: database commands -- negative checks
> 
> ==19245== 36 (32 direct, 4 indirect) bytes in 1 blocks are definitely lost in 
> loss record 36 of 53
> ==19245==at 0x4C2FD5F: realloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==19245==by 0x431AB4: xrealloc (util.c:149)
> ==19245==by 0x41656D: ovsdb_datum_reallocate (ovsdb-data.c:1883)
> ==19245==by 0x41656D: ovsdb_datum_union (ovsdb-data.c:1961)
> ==19245==by 0x4107B2: cmd_add (db-ctl-base.c:1494)
> ==19245==by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
> ==19245==by 0x406E2E: main (ovs-vsctl.c:183)
> 
> ==19252== 16 bytes in 1 blocks are definitely lost in loss record 9 of 52
> ==19252==at 0x4C2DB8F: malloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==19252==by 0x430F74: xmalloc (util.c:138)
> ==19252==by 0x414D07: clone_atoms (ovsdb-data.c:990)
> ==19252==by 0x4153F6: ovsdb_datum_clone (ovsdb-data.c:1012)
> ==19252==by 0x4104D3: cmd_remove (db-ctl-base.c:1564)
> ==19252==by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
> ==19252==by 0x406E2E: main (ovs-vsctl.c:183)
> 
> This patch fixes them.
> 
> Signed-off-by: Yifeng Sun 

LGTM.
Acked-by: William Tu 



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 08/10] ofproto-dpif: Free leaked 'webster'

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:34PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1122: ofproto-dpif - select group with explicit dp_hash selection method
> 
> ==16884== 64 bytes in 1 blocks are definitely lost in loss record 320 of 346
> ==16884==at 0x4C2FB55: calloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==16884==by 0x532512: xcalloc (util.c:121)
> ==16884==by 0x4262B9: group_setup_dp_hash_table (ofproto-dpif.c:4846)
> ==16884==by 0x4267CB: group_set_selection_method (ofproto-dpif.c:4938)
> ==16884==by 0x4267CB: group_construct (ofproto-dpif.c:4984)
> ==16884==by 0x417250: init_group (ofproto.c:7286)
> ==16884==by 0x41B4FC: add_group_start (ofproto.c:7316)
> ==16884==by 0x42247A: ofproto_group_mod_start (ofproto.c:7589)
> ==16884==by 0x4250EC: handle_group_mod (ofproto.c:7744)
> ==16884==by 0x4250EC: handle_single_part_openflow (ofproto.c:8428)
> ==16884==by 0x4250EC: handle_openflow (ofproto.c:8606)
> ==16884==by 0x4579E2: ofconn_run (connmgr.c:1318)
> ==16884==by 0x4579E2: connmgr_run (connmgr.c:355)
> ==16884==by 0x41E0F5: ofproto_run (ofproto.c:1845)
> ==16884==by 0x40BA63: bridge_run__ (bridge.c:2971)
> ==16884==by 0x410CF3: bridge_run (bridge.c:3029)
> ==16884==by 0x407614: main (ovs-vswitchd.c:127)
> 
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 

LGTM.
Acked-by: William Tu 

> ---
>  ofproto/ofproto-dpif.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 46fa1357163b..7bb0f7bdb4f3 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -4871,6 +4871,7 @@ group_setup_dp_hash_table(struct group_dpif *group, 
> size_t max_hash)
>  if (n_hash > MAX_SELECT_GROUP_HASH_VALUES ||
>  (max_hash != 0 && n_hash > max_hash)) {
>  VLOG_DBG("  Too many hash values required: %"PRIu64, n_hash);
> +free(webster);
>  return false;
>  }
>  
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 07/10] dns-resolve: Free 'struct ub_result' when callback returns error results

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:33PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1074: ofproto - flush flows, groups, and meters for controller change
> 
> ==5499== 695 (288 direct, 407 indirect) bytes in 3 blocks are definitely lost 
> in loss record 344 of 355
> ==5499==at 0x4C2FB55: calloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==5499==by 0x5E7F145: ??? (in 
> /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
> ==5499==by 0x5E6EBDE: ub_resolve_async (in 
> /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
> ==5499==by 0x55C739: resolve_async__.part.5 (dns-resolve.c:233)
> ==5499==by 0x55C85C: resolve_async__ (dns-resolve.c:261)
> ==5499==by 0x55C85C: resolve_callback__ (dns-resolve.c:262)
> ==5499==by 0x5E6FEF1: ub_process (in 
> /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
> ==5499==by 0x55CAF3: dns_resolve (dns-resolve.c:153)
> ==5499==by 0x523864: parse_sockaddr_components_dns (socket-util.c:438)
> ==5499==by 0x523864: parse_sockaddr_components (socket-util.c:504)
> ==5499==by 0x524468: inet_parse_active (socket-util.c:541)
> ==5499==by 0x524564: inet_open_active (socket-util.c:579)
> ==5499==by 0x5959F9: tcp_open (stream-tcp.c:56)
> ==5499==by 0x529192: stream_open (stream.c:228)
> ==5499==by 0x529910: stream_open_with_default_port (stream.c:724)
> ==5499==by 0x595FAE: vconn_stream_open (vconn-stream.c:81)
> ==5499==by 0x535C9B: vconn_open (vconn.c:250)
> ==5499==by 0x517C59: reconnect (rconn.c:467)
> ==5499==by 0x5184C7: run_BACKOFF (rconn.c:492)
> ==5499==by 0x5184C7: rconn_run (rconn.c:660)
> ==5499==by 0x457FE8: ofservice_run (connmgr.c:1992)
> ==5499==by 0x457FE8: connmgr_run (connmgr.c:367)
> ==5499==by 0x41E0F5: ofproto_run (ofproto.c:1845)
> ==5499==by 0x40BA63: bridge_run__ (bridge.c:2971)
> 
> In ub_resolve_async's callback function, 'struct ub_result' should
> always be freed, even when resolution fails. This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 
LGTM
Acked-by: William Tu 
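
The ownership rule in the commit message (the callback must release the result on every exit path, including errors) can be sketched as below. All names here are hypothetical; this is not the libunbound API, and the live counter exists only so the sketch can be checked.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result object handed to a callback, which then owns it. */
struct result_sketch {
    bool havedata;
};

static int live_results;    /* Results allocated but not yet freed. */

static struct result_sketch *
result_new(bool havedata)
{
    struct result_sketch *r = malloc(sizeof *r);
    if (!r) {
        abort();
    }
    r->havedata = havedata;
    live_results++;
    return r;
}

static void
result_free(struct result_sketch *r)
{
    if (r) {
        live_results--;
        free(r);
    }
}

/* Returns true on success.  Every exit path releases 'r'. */
static bool
on_resolved(int err, struct result_sketch *r)
{
    if (err != 0 || !r->havedata) {
        result_free(r);     /* Error path: freeing is still our job. */
        return false;
    }
    /* ... a real callback would copy the address out here ... */
    result_free(r);
    return true;
}
```

In this sketch result_free() tolerates NULL, so the error branch is safe even if the caller passes no result together with a nonzero error code.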


> ---
>  lib/dns-resolve.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c
> index e98e65f493ed..1ff58960fe01 100644
> --- a/lib/dns-resolve.c
> +++ b/lib/dns-resolve.c
> @@ -251,6 +251,7 @@ resolve_callback__(void *req_, int err, struct ub_result 
> *result)
>  struct resolve_request *req = req_;
>  
>  if (err != 0 || (result->qtype == ns_t_ && !result->havedata)) {
> +ub_resolve_free(result);
>  req->state = RESOLVE_ERROR;
>  VLOG_ERR_RL(, "%s: failed to resolve", req->name);
>  return;
> @@ -265,6 +266,7 @@ resolve_callback__(void *req_, int err, struct ub_result 
> *result)
>  
>  char *addr;
>  if (!resolve_result_to_addr__(result, )) {
> +ub_resolve_free(result);
>  req->state = RESOLVE_ERROR;
>  VLOG_ERR_RL(, "%s: failed to resolve", req->name);
>  return;
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 06/10] ovsdb-client: Free ovsdb_schema

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:32PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1925: schema conversion online - standalone
> 
> ==10727== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost 
> in loss record 64 of 66
> ==10727==at 0x4C2FB55: calloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==10727==by 0x449D42: xcalloc (util.c:121)
> ==10727==by 0x40F45C: ovsdb_schema_create (ovsdb.c:41)
> ==10727==by 0x40F7F8: ovsdb_schema_from_json (ovsdb.c:217)
> ==10727==by 0x40FB4E: ovsdb_schema_from_file (ovsdb.c:101)
> ==10727==by 0x40B156: do_convert (ovsdb-client.c:1639)
> ==10727==by 0x4061C6: main (ovsdb-client.c:282)
> 
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 
LGTM
Acked-by: William Tu 

> ---
>  ovsdb/ovsdb-client.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
> index 9ae15e557661..bfc90e6f7f85 100644
> --- a/ovsdb/ovsdb-client.c
> +++ b/ovsdb/ovsdb-client.c
> @@ -1654,6 +1654,7 @@ do_convert(struct jsonrpc *rpc, const char *database_ 
> OVS_UNUSED,
>  ovsdb_schema_to_json(new_schema)), NULL);
>  check_txn(jsonrpc_transact_block(rpc, request, ), );
>  jsonrpc_msg_destroy(reply);
> +ovsdb_schema_destroy(new_schema);
>  }
>  
>  static void
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 05/10] trigger: Free leaked ovsdb_schema

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:31PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1925: schema conversion online - standalone
> 
> ==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost 
> in loss record 384 of 420
> ==10884==at 0x4C2FB55: calloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==10884==by 0x44A592: xcalloc (util.c:121)
> ==10884==by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41)
> ==10884==by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217)
> ==10884==by 0x416C6F: ovsdb_trigger_try (trigger.c:246)
> ==10884==by 0x40D4DE: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_got_request 
> (jsonrpc-server.c:986)
> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
> ==10884==by 0x40D4DE: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
> ==10884==by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
> ==10884==by 0x406A6E: main_loop (ovsdb-server.c:209)
> ==10884==by 0x406A6E: main (ovsdb-server.c:460)
> 
> 'new_schema' should also be freed when there is no error.
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 
LGTM
Acked-by: William Tu 
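
As a rough illustration of the change (destroy the temporary schema before looking at the error, so that both outcomes release it), here is a small sketch. The types and functions are hypothetical and much simpler than the ovsdb ones; the assumption is that the conversion step only reads the schema and does not keep a reference to it.

```c
#include <stdlib.h>
#include <string.h>

struct schema_sketch {
    char name[16];
};

static int live_schemas;    /* Schemas allocated but not yet destroyed. */

static struct schema_sketch *
schema_create(const char *name)
{
    struct schema_sketch *s = calloc(1, sizeof *s);
    if (!s) {
        abort();
    }
    strncpy(s->name, name, sizeof s->name - 1);
    live_schemas++;
    return s;
}

static void
schema_destroy(struct schema_sketch *s)
{
    if (s) {
        live_schemas--;
        free(s);
    }
}

/* Hypothetical conversion: returns 0 on success, nonzero on error.
 * It only reads the schema. */
static int
convert_sketch(const struct schema_sketch *s)
{
    return s->name[0] ? 0 : -1;
}

static int
try_convert(const char *name)
{
    struct schema_sketch *s = schema_create(name);
    int error = convert_sketch(s);

    schema_destroy(s);      /* Needed on success as well as on error. */
    return error;
}
```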

> ---
>  ovsdb/trigger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/ovsdb/trigger.c b/ovsdb/trigger.c
> index 6f4ed96b000b..7e62e90ae381 100644
> --- a/ovsdb/trigger.c
> +++ b/ovsdb/trigger.c
> @@ -254,8 +254,8 @@ ovsdb_trigger_try(struct ovsdb_trigger *t, long long int 
> now)
>  if (!error) {
>  error = ovsdb_convert(t->db, new_schema, );
>  }
> +ovsdb_schema_destroy(new_schema);
>  if (error) {
> -ovsdb_schema_destroy(new_schema);
>  trigger_convert_error(t, error);
>  return false;
>  }
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 04/10] ovs-ofctl: Free leaked minimatch

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:30PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1056: ofproto - bundle with multiple flow mods (OpenFlow 1.4)
> 
> ==19220== 160 bytes in 2 blocks are definitely lost in loss record 24 of 34
> ==19220==at 0x4C2DB8F: malloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==19220==by 0x4979A4: xmalloc (util.c:138)
> ==19220==by 0x42407D: miniflow_alloc (flow.c:3340)
> ==19220==by 0x4296CF: minimatch_init (match.c:1758)
> ==19220==by 0x46273D: parse_ofp_str__ (ofp-flow.c:1759)
> ==19220==by 0x465B9E: parse_ofp_str (ofp-flow.c:1790)
> ==19220==by 0x465CE0: parse_ofp_flow_mod_str (ofp-flow.c:1817)
> ==19220==by 0x465DF6: parse_ofp_flow_mod_file (ofp-flow.c:1876)
> ==19220==by 0x410BA3: ofctl_flow_mod_file.isra.19 (ovs-ofctl.c:1773)
> ==19220==by 0x417933: ovs_cmdl_run_command__ (command-line.c:223)
> ==19220==by 0x406F68: main (ovs-ofctl.c:179)
> 
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 
Looks good to me.

Acked-by: William Tu 
> ---
>  utilities/ovs-ofctl.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/utilities/ovs-ofctl.c b/utilities/ovs-ofctl.c
> index 754629d3dfbb..06289d296573 100644
> --- a/utilities/ovs-ofctl.c
> +++ b/utilities/ovs-ofctl.c
> @@ -1724,6 +1724,7 @@ bundle_flow_mod__(const char *remote, struct 
> ofputil_flow_mod *fms,
>  
>  ovs_list_push_back(, >list_node);
>  free(CONST_CAST(struct ofpact *, fm->ofpacts));
> +minimatch_destroy(&fm->match);
>  }
>  
>  bundle_transact(vconn, , OFPBF_ORDERED | OFPBF_ATOMIC);
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH ovn] Exclude inport and outport symbol tables from conjunction

2019-09-17 Thread Han Zhou
On Tue, Sep 17, 2019 at 5:21 AM Mark Michelson  wrote:
>
> On 9/16/19 12:04 PM, Han Zhou wrote:
> >
> >
> > On Mon, Sep 16, 2019 at 4:15 AM Dumitru Ceara wrote:
> >  >
> >  > On Sat, Sep 14, 2019 at 7:16 PM Han Zhou wrote:
> >  > >
> >  > >
> >  > >
> >  > > On Sat, Sep 14, 2019 at 9:09 AM Han Zhou wrote:
> >  > > >
> >  > > >
> >  > > >
> >  > > > On Sat, Sep 14, 2019 at 12:40 AM Numan Siddique <nusid...@redhat.com> wrote:
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > > On Sat, Sep 14, 2019 at 2:41 AM Daniel Alvarez Sanchez <dalva...@redhat.com> wrote:
> >  > > > >>
> >  > > > >> Acked-by: Daniel Alvarez
> >  > > > >>
> >  > > > >>
> >  > > > >> On Fri, Sep 13, 2019 at 11:02 PM Mark Michelson <mmich...@redhat.com> wrote:
> >  > > > >> >
> >  > > > >> > Acked-by: Mark Michelson
> >  > > > >> >
> >  > > > >> > It sucks that we lose the efficiency of the conjunctive
> > match altogether
> >  > > > >> > on port groups because of this error, but I understand this
> > is a huge
> >  > > > >> > bug and needs fixing.
> >  > > > >> If I'm not mistaken, from OpenStack standpoint conjunction was
> > *only*
> >  > > > >> being used when using port groups and ACLs that matched on
> > port ranges
> >  > > > >> ( e.g tcp.dst >= X && tcp.dst <=Y) which was not working.
> > Therefore
> >  > > > >> we're not losing performance because it was already broken
> > (given that
> >  > > > >> there was more than one ACL like that).
> >  > > > >>
> >  > > > >> >
> >  > > > >> > Perhaps it would be good to start up a discussion on this
> > list about a
> >  > > > >> > more longterm solution that would allow for conjunctive
> > matches with no
> >  > > > >> > ambiguity.
> >  > > > >> Agreed! We already discussed some ideas on IRC but it'd be
> > awesome to
> >  > > > >> have a thread and brainstorm there.
> >  > > > >>
> >  > > > >
> >  > > > > Thanks for the reviews. I applied this to master.
> >  > > > > Agree we can discuss it further and come up with ideas.
> >  > > > >
> >  > > > > I know Dumitru has some idea to make use of conjunctions for
> > port groups.
> >  > > > > CC'ing Han if he has any comments on ideas.
> >  > > > >
> >  > > >
> >  > > > Hi Numan,
> >  > > >
> >  > > > This is a good finding. However, I think it is not specifically a
> > problem of port group. It seems to be a more general problem and this
> > patch fixes only a special case.
> >  > > > For example, would there be similar problem for below ACLs
> > without port groups:
> >  > > >
> >  > > > ip4 && tcp.src >= 1000 && tcp.src <= 1001 && tcp.dst >= 500 &&
> > tcp.dst <= 501
> >  > > > ip4 && tcp.src >= 1000 && tcp.src <= 1001 && tcp.dst >= 600 &&
> > tcp.dst <= 601
> >  > > >
> >  > > > Another example is with address set:
> >  > > >
> >  > > > ip4 && ip4.src == $as1 && tcp.dst >= 500 && tcp.dst <= 501
> >  > > > ip4 && ip4.src == $as1 && tcp.dst >= 600 && tcp.dst <= 601
> >  > > >
> >  > > > Or even without range:
> >  > > > ip4 && tcp.src == {1000, 1001} && tcp.dst == {500, 501}
> >  > > > ip4 && tcp.src == {1000, 1001} && tcp.dst == {600, 601}
> >  > > >
> >  > > > You may think of more examples. Whenever there are multiple
> > conjunctionable ACLs with same match as part of the conjunction, it
> > should result in such problem.
> >  > > >
> >  > > > A quick fix to all these problems may be just abandon
> > conjunction, but I believe there are better ways to address it.
> >  > > >
> >  > > > First of all, these matches can be rewritten by combining them in
> > a single ACL with "OR" operator, e.g.:
> >  > > >
> >  > > > outport == @pg1 && ip4 && tcp.dst >= 500 && tcp.dst <= 501
> >  > > > outport == @pg1 && ip4 && tcp.dst >= 600 && tcp.dst <= 601
> >  > > >
> >  > > > can be rewritten as >
> >  > > >
> >  > > > outport == @pg1 && ip4 && (tcp.dst >= 500 && tcp.dst <= 501 ||
> > tcp.dst >= 600 && tcp.dst <= 601)
> >  > > >
> >  > > > Similar can be done for all above examples. So, a workaround to
> > the problem is from user side (e.g. OpenStack) to make sure always
> > combining ACLs with "OR" if there are common conjunctionable matches
> > between different ACLs. However, a better way would be in ovn-northd
> > itself to detect and combine such ACLs internally, before generating the
> > final logical flows in SB. It may be more convenient to be done in
> > ovn-controller, because we are not even parsing the ACLs in ovn-northd
> > today, but the cost of such pre-processing would be duplicated in all
> > HVs. It surely will increase CPU cost for doing such combination, but
> > I'd not worry too much if we do it properly at each LS level instead of
> > for all ACLs.
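
The rewrite suggested in the quoted text (two ACLs that share a prefix, merged into one ACL with "||") can be checked with a small sketch. It only models the match expressions from the example, assumes both ACLs carry the same action and priority, and uses a hypothetical pkt_sketch struct in place of real packet fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet view; 'to_pg1' stands for "outport == @pg1". */
struct pkt_sketch {
    bool to_pg1;
    bool is_ip4;
    uint16_t tcp_dst;
};

static bool
in_range(uint16_t v, uint16_t lo, uint16_t hi)
{
    return v >= lo && v <= hi;
}

/* outport == @pg1 && ip4 && tcp.dst >= 500 && tcp.dst <= 501 */
static bool
acl_a(const struct pkt_sketch *p)
{
    return p->to_pg1 && p->is_ip4 && in_range(p->tcp_dst, 500, 501);
}

/* outport == @pg1 && ip4 && tcp.dst >= 600 && tcp.dst <= 601 */
static bool
acl_b(const struct pkt_sketch *p)
{
    return p->to_pg1 && p->is_ip4 && in_range(p->tcp_dst, 600, 601);
}

/* The single combined ACL, with the two ranges joined by "||". */
static bool
acl_combined(const struct pkt_sketch *p)
{
    return p->to_pg1 && p->is_ip4
           && (in_range(p->tcp_dst, 500, 501)
               || in_range(p->tcp_dst, 600, 601));
}
```

Under those assumptions the combined ACL matches exactly the packets that either original ACL matches. The sketch says nothing about how the resulting OpenFlow conjunctions are generated.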
> >  > >
> >  > > I just thought a little more about combining the conjunctions. It
> > seems we can do it without pre-processing by just handling duplicated
> > flows in ofctrl_add_flow(). Currently we just drop duplicated 

Re: [ovs-dev] [PATCH 03/10] dpif-netdev: Handle uninitialized value error for 'match.wc'

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:29PM -0700, Yifeng Sun wrote:
> Valgrind reported that match.wc was not initialized, as below:
> 
> 1176: ofproto-dpif - fragment handling - actions
> 
> ==21214== Conditional jump or move depends on uninitialised value(s)
> ==21214==at 0x4B77C1: odp_flow_key_from_flow__ (odp-util.c:6143)
> ==21214==by 0x46DB58: dp_netdev_upcall (dpif-netdev.c:6239)
> ==21214==by 0x4774A7: handle_packet_upcall (dpif-netdev.c:6608)
> ==21214==by 0x4774A7: fast_path_processing (dpif-netdev.c:6726)
> ==21214==by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
> ==21214==by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
> ==21214==by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
> ==21214==by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
> ==21214==by 0x4324E7: type_run (ofproto-dpif.c:342)
> ==21214==by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
> ==21214==by 0x40BAAC: bridge_run__ (bridge.c:2965)
> ==21214==by 0x410CF3: bridge_run (bridge.c:3029)
> ==21214==by 0x407614: main (ovs-vswitchd.c:127)
> ==21214==  Uninitialised value was created by a stack allocation
> ==21214==at 0x4769C3: fast_path_processing (dpif-netdev.c:6672)
> 
> 'match' is allocated on stack but its 'wc' is accessed in
> odp_flow_key_from_flow__ without proper initialization.
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 
LGTM
Acked-by: William Tu 
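
A minimal sketch of the problem and the fix, assuming hypothetical struct layouts that are far smaller than the real 'struct match': a stack variable has one member filled in by an expand step, and the other member must be zeroed explicitly before anything reads it.

```c
#include <stdint.h>
#include <string.h>

struct wc_sketch {
    uint64_t masks[4];
};

struct match_sketch {
    uint64_t flow[4];
    struct wc_sketch wc;
};

/* Fills in 'flow' only, like an expand step that leaves 'wc' untouched. */
static void
expand_flow(struct match_sketch *m, uint64_t seed)
{
    for (int i = 0; i < 4; i++) {
        m->flow[i] = seed + i;
    }
}

/* Reads 'wc'; returns 1 if any mask bit is set, 0 otherwise. */
static int
any_mask_set(const struct match_sketch *m)
{
    for (int i = 0; i < 4; i++) {
        if (m->wc.masks[i]) {
            return 1;
        }
    }
    return 0;
}

static int
upcall_sketch(uint64_t seed)
{
    struct match_sketch match;  /* Stack allocation: indeterminate bytes. */

    expand_flow(&match, seed);
    memset(&match.wc, 0, sizeof match.wc);  /* Without this, the read
                                             * below uses garbage. */
    return any_mask_set(&match);
}
```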

> ---
>  lib/dpif-netdev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a88a78f8a688..6be6e47ed127 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -6600,6 +6600,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>  
>  match.tun_md.valid = false;
>  miniflow_expand(>mf, );
> +memset(&match.wc, 0, sizeof match.wc);
>  
>  ofpbuf_clear(actions);
>  ofpbuf_clear(put_actions);
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 02/10] ofproto-dpif: Uninitialize 'xlate_cache' to free resources

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:28PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1210: ofproto-dpif - continuation after clone
> 
> ==32205== 4,392 (1,440 direct, 2,952 indirect) bytes in 12 blocks are 
> definitely lost in loss record 359 of 362
> ==32205==at 0x4C2DB8F: malloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==32205==by 0x532574: xmalloc (util.c:138)
> ==32205==by 0x4F98CA: ofpbuf_init (ofpbuf.c:123)
> ==32205==by 0x42C07B: nxt_resume (ofproto-dpif.c:5110)
> ==32205==by 0x41796F: handle_nxt_resume (ofproto.c:3677)
> ==32205==by 0x424583: handle_single_part_openflow (ofproto.c:8473)
> ==32205==by 0x424583: handle_openflow (ofproto.c:8606)
> ==32205==by 0x4579E2: ofconn_run (connmgr.c:1318)
> ==32205==by 0x4579E2: connmgr_run (connmgr.c:355)
> ==32205==by 0x41E0F5: ofproto_run (ofproto.c:1845)
> ==32205==by 0x40BA63: bridge_run__ (bridge.c:2971)
> ==32205==by 0x410CF3: bridge_run (bridge.c:3029)
> ==32205==by 0x407614: main (ovs-vswitchd.c:127)
> 
> This is because 'xcache' was not destroyed properly. This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 

LGTM
Acked-by: William Tu 

> ---
>  ofproto/ofproto-dpif.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 751535249e21..46fa1357163b 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -5148,6 +5148,7 @@ nxt_resume(struct ofproto *ofproto_,
>  /* Clean up. */
>  ofpbuf_uninit(_actions);
>  dp_packet_uninit();
> +xlate_cache_uninit(&xcache);
>  
>  return error;
>  }
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH 01/10] raft: Free leaked json data

2019-09-17 Thread William Tu
On Wed, Sep 11, 2019 at 02:18:27PM -0700, Yifeng Sun wrote:
> Valgrind reported:
> 
> 1924: compacting online - cluster
> 
> ==29312== 2,886 (240 direct, 2,646 indirect) bytes in 6 blocks are definitely 
> lost in loss record 406 of 413
> ==29312==at 0x4C2DB8F: malloc (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==29312==by 0x44A5F4: xmalloc (util.c:138)
> ==29312==by 0x4308EA: json_create (json.c:1451)
> ==29312==by 0x4308EA: json_object_create (json.c:254)
> ==29312==by 0x430ED0: json_parser_push_object (json.c:1273)
> ==29312==by 0x430ED0: json_parser_input (json.c:1371)
> ==29312==by 0x431CF1: json_lex_input (json.c:991)
> ==29312==by 0x43233B: json_parser_feed (json.c:1149)
> ==29312==by 0x41D87F: parse_body.isra.0 (log.c:411)
> ==29312==by 0x41E141: ovsdb_log_read (log.c:476)
> ==29312==by 0x42646D: raft_read_log (raft.c:866)
> ==29312==by 0x42646D: raft_open (raft.c:951)
> ==29312==by 0x4151AF: ovsdb_storage_open__ (storage.c:81)
> ==29312==by 0x408FFC: open_db (ovsdb-server.c:642)
> ==29312==by 0x40657F: main (ovsdb-server.c:358)
> 
> This patch fixes it.
> 
> Signed-off-by: Yifeng Sun 

LGTM
Acked-by: William Tu 

> ---
>  ovsdb/raft.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
> index 9eabe2cfeecd..a45c7f8ba998 100644
> --- a/ovsdb/raft.c
> +++ b/ovsdb/raft.c
> @@ -883,6 +883,7 @@ raft_read_log(struct raft *raft)
>  error = raft_apply_record(raft, i, );
>  raft_record_uninit();
>  }
> +json_destroy(json);
>  if (error) {
>  return ovsdb_wrap_error(error, "error reading record %llu from "
>  "%s log", i, raft->name);
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH v6 0/1] Balance-tcp bond mode optimization

2019-09-17 Thread Vishal Deep Ajmera via dev

> >
> Let me check this in my setup. I always used 'netdev' bridges for testing my
> patch.
> May be I need to be check for data path support in the display function as
> well.

Hi,
I have sent v7 version of patch fixing this issue.

Warm Regards,
Vishal Ajmera


Re: [ovs-dev] [PATCH ovn] Exclude inport and outport symbol tables from conjunction

2019-09-17 Thread Mark Michelson

On 9/16/19 12:04 PM, Han Zhou wrote:



On Mon, Sep 16, 2019 at 4:15 AM Dumitru Ceara wrote:
 >
 > On Sat, Sep 14, 2019 at 7:16 PM Han Zhou wrote:
 > >
 > >
 > >
 > > On Sat, Sep 14, 2019 at 9:09 AM Han Zhou wrote:
 > > >
 > > >
 > > >
 > > > On Sat, Sep 14, 2019 at 12:40 AM Numan Siddique <nusid...@redhat.com> wrote:
 > > > >
 > > > >
 > > > >
 > > > > On Sat, Sep 14, 2019 at 2:41 AM Daniel Alvarez Sanchez <dalva...@redhat.com> wrote:
 > > > >>
 > > > >> Acked-by: Daniel Alvarez
 > > > >>
 > > > >>
 > > > >> On Fri, Sep 13, 2019 at 11:02 PM Mark Michelson <mmich...@redhat.com> wrote:

 > > > >> >
 > > > >> > Acked-by: Mark Michelson
 > > > >> >
 > > > >> > It sucks that we lose the efficiency of the conjunctive 
match altogether
 > > > >> > on port groups because of this error, but I understand this 
is a huge

 > > > >> > bug and needs fixing.
 > > > >> If I'm not mistaken, from OpenStack standpoint conjunction was 
*only*
 > > > >> being used when using port groups and ACLs that matched on 
port ranges
 > > > >> ( e.g tcp.dst >= X && tcp.dst <=Y) which was not working. 
Therefore
 > > > >> we're not losing performance because it was already broken 
(given that

 > > > >> there was more than one ACL like that).
 > > > >>
 > > > >> >
 > > > >> > Perhaps it would be good to start up a discussion on this 
list about a
 > > > >> > more longterm solution that would allow for conjunctive 
matches with no

 > > > >> > ambiguity.
 > > > >> Agreed! We already discussed some ideas on IRC but it'd be 
awesome to

 > > > >> have a thread and brainstorm there.
 > > > >>
 > > > >
 > > > > Thanks for the reviews. I applied this to master.
 > > > > Agree we can discuss it further and come up with ideas.
 > > > >
 > > > > I know Dumitru has some idea to make use of conjunctions for 
port groups.

 > > > > CC'ing Han if he has any comments on ideas.
 > > > >
 > > >
 > > > Hi Numan,
 > > >
 > > > This is a good finding. However, I think it is not specifically a 
problem of port group. It seems to be a more general problem and this 
patch fixes only a special case.
 > > > For example, would there be similar problem for below ACLs 
without port groups:

 > > >
 > > > ip4 && tcp.src >= 1000 && tcp.src <= 1001 && tcp.dst >= 500 && 
tcp.dst <= 501
 > > > ip4 && tcp.src >= 1000 && tcp.src <= 1001 && tcp.dst >= 600 && 
tcp.dst <= 601

 > > >
 > > > Another example is with address set:
 > > >
 > > > ip4 && ip4.src == $as1 && tcp.dst >= 500 && tcp.dst <= 501
 > > > ip4 && ip4.src == $as1 && tcp.dst >= 600 && tcp.dst <= 601
 > > >
 > > > Or even without range:
 > > > ip4 && tcp.src == {1000, 1001} && tcp.dst == {500, 501}
 > > > ip4 && tcp.src == {1000, 1001} && tcp.dst == {600, 601}
 > > >
 > > > You may think of more examples. Whenever there are multiple 
conjunctionable ACLs with same match as part of the conjunction, it 
should result in such problem.

 > > >
 > > > A quick fix to all these problems may be just abandon 
conjunction, but I believe there are better ways to address it.

 > > >
 > > > First of all, these matches can be rewritten by combining them in 
a single ACL with "OR" operator, e.g.:

 > > >
 > > > outport == @pg1 && ip4 && tcp.dst >= 500 && tcp.dst <= 501
 > > > outport == @pg1 && ip4 && tcp.dst >= 600 && tcp.dst <= 601
 > > >
 > > > can be rewritten as >
 > > >
 > > > outport == @pg1 && ip4 && (tcp.dst >= 500 && tcp.dst <= 501 || 
tcp.dst >= 600 && tcp.dst <= 601)

 > > >
 > > > Similar can be done for all above examples. So, a workaround to 
the problem is from user side (e.g. OpenStack) to make sure always 
combining ACLs with "OR" if there are common conjunctionable matches 
between different ACLs. However, a better way would be in ovn-northd 
itself to detect and combine such ACLs internally, before generating the 
final logical flows in SB. It may be more convenient to be done in 
ovn-controller, because we are not even parsing the ACLs in ovn-northd 
today, but the cost of such pre-processing would be duplicated in all 
HVs. It surely will increase CPU cost for doing such combination, but 
I'd not worry too much if we do it properly at each LS level instead of 
for all ACLs.

 > >
 > > I just thought a little more about combining the conjunctions. It
 > > seems we can do it without pre-processing by just handling duplicated
 > > flows in ofctrl_add_flow(). Currently we just drop duplicated flows, but
 > > we can check whether the action is a conjunction and the conjunction ID
 > > is different; if so, we can perform a combination by using the existing
 > > flow's conjunction ID to update all the flows related to that
 > > to-be-added duplicated flow. This way, the combination is performed
 > > on-the-fly, without introducing too much cost and without introducing
 > > parsing in ovn-northd either.

 >
 > Hi Han,
 >
 > Will this actually work without a change in OVS? I wonder because in
 > the ovs-fields 

Re: [ovs-dev] [PATCHv3] netdev-afxdp: Add need_wakeup support.

2019-09-17 Thread Eelco Chaudron

Two comments below…


On 11 Sep 2019, at 19:58, William Tu wrote:


The patch adds support for using the need_wakeup flag in AF_XDP rings.
A new option, use_need_wakeup, is added. When this option is used,
it means that OVS has to explicitly wake up the kernel RX, using the poll()
syscall, and wake up TX, using the sendto() syscall. This feature improves
performance by avoiding unnecessary sendto syscalls for TX.
For RX, instead of the kernel always busy-spinning on the fill queue, OVS
wakes up the kernel RX processing when the fill queue is replenished.
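
The TX side of the behaviour described above can be sketched as follows. This is a model only: ring_sketch, RING_NEED_WAKEUP_SKETCH and kick_kernel() are hypothetical stand-ins for the real AF_XDP ring, its flag and the sendto() call, and the kick counter exists just to make the effect visible.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_NEED_WAKEUP_SKETCH 0x1u    /* Hypothetical flag bit. */

struct ring_sketch {
    uint32_t flags;     /* In this sketch, set by the "kernel" side. */
    int n_kicks;        /* Wakeup syscalls we would have issued. */
};

static bool
ring_needs_wakeup(const struct ring_sketch *r)
{
    return (r->flags & RING_NEED_WAKEUP_SKETCH) != 0;
}

/* Stand-in for the syscall that wakes up kernel TX processing. */
static void
kick_kernel(struct ring_sketch *r)
{
    r->n_kicks++;
}

/* With use_need_wakeup enabled, only kick when the ring asks for it;
 * with it disabled, kick every time as before. */
static void
tx_kick(struct ring_sketch *r, bool use_need_wakeup)
{
    if (!use_need_wakeup || ring_needs_wakeup(r)) {
        kick_kernel(r);
    }
}
```

The saving comes from the first case: when the flag is clear, the kernel is already processing the ring and the syscall is skipped.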

The need_wakeup feature is merged into Linux kernel 5.3.0-rc1 and OVS
enables it by default. Running the feature on an older kernel causes the
xsk bind to fail; in that case, use options:use_need_wakeup=false to
disable it. If users enable it but run with an older version of libbpf,
then the need_wakeup feature has no effect, and a warning message is logged.

For virtual interfaces, it's better to set use_need_wakeup=false, since
the virtual device's AF_XDP xmit is synchronous: the sendto syscall
enters the kernel and processes the TX packet on the tx queue directly.

I tested on kernel 5.3.0-rc3 using its libbpf.  On an Intel Xeon E5-2620
v3 2.4GHz system, performance of physical port to physical port improves
from 6.1Mpps to 7.3Mpps. Testing on 5.2.0-rc6 using libbpf from 5.3.0-rc3
does not work due to a libbpf API change. Users have to use the older
libbpf for an older kernel.

Suggested-by: Ilya Maximets 
Signed-off-by: William Tu 
---
v3:
- add warning when user enables it but libbpf not support it
- revise documentation

v2:
- address feedbacks from Ilya and Eelco
- add options:use_need_wakeup, default to true
- remove poll timeout=1sec, make poll() return immediately
- naming change: rename to xsk_rx_wakeup_if_needing
- fix indents and return value for errno
---
 Documentation/intro/install/afxdp.rst |  15 -
 acinclude.m4  |   8 +++
 lib/netdev-afxdp.c| 101 ++
 lib/netdev-linux-private.h|   2 +
 vswitchd/vswitch.xml  |  13 +
 5 files changed, 124 insertions(+), 15 deletions(-)

diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst
index 820e9d993d8f..545516b2bbec 100644
--- a/Documentation/intro/install/afxdp.rst
+++ b/Documentation/intro/install/afxdp.rst
@@ -176,9 +176,18 @@ in :doc:`general`::
   ovs-vswitchd ...
   ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev

-Make sure your device driver support AF_XDP, and to use 1 PMD (on core 4)
-on 1 queue (queue 0) device, configure these options: **pmd-cpu-mask,
-pmd-rxq-affinity, and n_rxq**. The **xdpmode** can be "drv" or "skb"::
+Make sure your device driver support AF_XDP, netdev-afxdp supports
+the following additional options (see man ovs-vswitchd.conf.db for
+more details):
+
+ * **xdpmode**: use "drv" for driver mode, or "skb" for skb mode.
+
+ * **use_need_wakeup**: disable by setting to "false", otherwise default
+   is "true"
+
+For example, to use 1 PMD (on core 4) on 1 queue (queue 0) device,
+configure these options: **pmd-cpu-mask, pmd-rxq-affinity, and n_rxq**.
+The **xdpmode** can be "drv" or "skb"::

   ethtool -L enp2s0 combined 1
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x10
diff --git a/acinclude.m4 b/acinclude.m4
index f0e38898b17a..df1082c455fc 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -276,6 +276,14 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
   [Define to 1 if AF_XDP support is available and enabled.])
 LIBBPF_LDADD=" -lbpf -lelf"
 AC_SUBST([LIBBPF_LDADD])
+
+AC_CHECK_DECL([xsk_ring_prod__needs_wakeup], [
+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [1],
+[XDP need wakeup support detected in xsk.h.])
+], [
+  AC_DEFINE([HAVE_XDP_NEED_WAKEUP], [0],
+[XDP need wakeup support not detected in xsk.h.])
+  ], [#include ])
   fi
   AM_CONDITIONAL([HAVE_AF_XDP], test "$AF_XDP_ENABLE" = true)
 ])
diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index e5b058d08a09..4d4c90f91806 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -82,7 +83,7 @@ BUILD_ASSERT_DECL(PROD_NUM_DESCS == CONS_NUM_DESCS);
 #define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))

 static struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id,
- int mode);
+ int mode, bool use_need_wakeup);
 static void xsk_remove_xdp_program(uint32_t ifindex, int xdpmode);
 static void xsk_destroy(struct xsk_socket_info *xsk);
 static int xsk_configure_all(struct netdev *netdev);
@@ -117,6 +118,49 @@ struct xsk_socket_info {
 atomic_uint64_t tx_dropped;
 };

+#ifdef HAVE_XDP_NEED_WAKEUP
+static inline void
+xsk_rx_wakeup_if_needed(struct xsk_umem_info *umem,
+

Re: [ovs-dev] [PATCH v7 1/1] Avoid dp_hash recirculation for balance-tcp bond selection mode

2019-09-17 Thread 0-day Robot
Bleep bloop.  Greetings Vishal Deep Ajmera via dev, I am a robot and I have 
tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 81 characters long (recommended limit is 79)
#218 FILE: lib/dpif-netdev.c:732:
 * Note: This flag is decoupled from 'reload'

WARNING: Line is 81 characters long (recommended limit is 79)
#219 FILE: lib/dpif-netdev.c:733:
 * flag otherwise full pmd reload will become

Lines checked: 1629, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 1/1] Avoid dp_hash recirculation for balance-tcp bond selection mode

2019-09-17 Thread Vishal Deep Ajmera via dev
Problem:

In OVS-DPDK, flows with output over a bond interface of type “balance-tcp”
(using a hash on TCP/UDP 5-tuple) get translated by the ofproto layer into
"HASH" and "RECIRC" datapath actions. After recirculation, the packet is
forwarded to the bond member port based on 8 bits of the datapath hash
value computed through dp_hash. This causes performance degradation in the
following ways:

1. The recirculation of the packet implies another lookup of the packet’s
flow key in the exact match cache (EMC) and potentially Megaflow classifier
(DPCLS). This is the biggest cost factor.

2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.

3. The 256 extra megaflow entries per bond for dp_hash bond selection put
additional load on the revalidation threads.

Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packets have the same
source MAC address.

Proposed optimization:
--
This proposal introduces a new load-balancing output action instead of
recirculation.

Maintain one table per bond (could just be an array of uint16's) and
program it from the ofproto layer the same way internal flows are created
today, with one entry for each possible hash value (256 entries). Use this
table to load-balance flows as part of output action processing.
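
A minimal sketch of such a per-bond table follows, to make the lookup
concrete. All names here are hypothetical and not taken from the patch,
and the round-robin fill is only an assumption for illustration; in the
proposal the ofproto layer decides which member each bucket maps to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BOND_BUCKETS 256
#define SKETCH_BOND_MASK (SKETCH_BOND_BUCKETS - 1)

/* Hypothetical per-bond table: hash bucket -> member port number. */
struct sketch_bond {
    uint16_t member[SKETCH_BOND_BUCKETS];
};

/* Fills the buckets round-robin over the given member ports.  This
 * assignment policy is an assumption made for the sketch only. */
static void
sketch_bond_init(struct sketch_bond *bond, const uint16_t *ports,
                 size_t n_ports)
{
    assert(bond != NULL && ports != NULL && n_ports > 0);
    for (size_t i = 0; i < SKETCH_BOND_BUCKETS; i++) {
        bond->member[i] = ports[i % n_ports];
    }
}

/* Output-time selection: one masked array index, no second flow lookup. */
static uint16_t
sketch_bond_select(const struct sketch_bond *bond, uint32_t hash)
{
    return bond->member[hash & SKETCH_BOND_MASK];
}
```

The point of the design is that member selection becomes a single array
index on the packet hash instead of a recirculation followed by another
EMC/DPCLS lookup.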

Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules()
-> bond_may_recirc() and compose_output_action__() generate
“dp_hash(hash_l4(0))” and “recirc()” actions. In this case the
RecircID identifies the bond. For the recirculated packets the ofproto layer
installs megaflow entries that match on RecircID and masked dp_hash and send
them to the corresponding output port.

Instead, we will now generate actions as
"hash(l4(0)),lb_output(bond,)"

This combines hash computation (only if needed, else re-use RSS hash) and
inline load-balancing over the bond. This action is used *only* for balance-tcp
bonds in OVS-DPDK datapath (the OVS kernel datapath remains unchanged).

Example:

Current scheme:
---
With 1 IP-UDP flow:

flow-dump from pmd on cpu core: 2
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no),
 packets:2828969, bytes:181054016, used:0.000s, actions:hash(hash_l4(0)),recirc(0x1)

recirc_id(0x1),dp_hash(0x113683bd/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:2828937, bytes:181051968, used:0.000s, actions:2

With 8 IP-UDP flows (with random UDP src port), all hitting the same DPCLS:

flow-dump from pmd on cpu core: 2
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no),
 packets:2674009, bytes:171136576, used:0.000s, actions:hash(hash_l4(0)),recirc(0x1)

recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:377395, bytes:24153280, used:0.000s, actions:2
recirc_id(0x1),dp_hash(0xb236c260/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:333486, bytes:21343104, used:0.000s, actions:1
recirc_id(0x1),dp_hash(0x7d89eb18/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:348461, bytes:22301504, used:0.000s, actions:1
recirc_id(0x1),dp_hash(0xa78d75df/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:633353, bytes:40534592, used:0.000s, actions:2
recirc_id(0x1),dp_hash(0xb58d846f/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:319901, bytes:20473664, used:0.001s, actions:2
recirc_id(0x1),dp_hash(0x24534406/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:334985, bytes:21439040, used:0.001s, actions:1
recirc_id(0x1),dp_hash(0x3cf32550/0xff),in_port(7),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
 packets:326404, bytes:20889856, used:0.001s, actions:1
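
Each recirculated entry in the dump above is printed as
dp_hash(value/0xff), i.e. only the low 8 bits of the hash take part in
the match, which is why up to 256 such entries can exist per bond. The
helper below is a hypothetical illustration of that value/mask
comparison, not code from OVS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if pkt_hash agrees with value on every bit set in mask. */
static bool
sketch_masked_match(uint32_t pkt_hash, uint32_t value, uint32_t mask)
{
    return ((pkt_hash ^ value) & mask) == 0;
}

/* Bucket number implied by an 8-bit mask, e.g. 0x113683bd -> 0xbd. */
static uint8_t
sketch_bucket(uint32_t hash)
{
    assert((hash & 0xffu) <= UINT8_MAX);
    return (uint8_t) (hash & 0xffu);
}
```

For example, any packet whose hash ends in 0xbd matches the
dp_hash(0x113683bd/0xff) entry, regardless of its upper 24 bits.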

New scheme:
---
We can do with a single flow entry (for any number of new flows):

in_port(7),packet_type(ns=0,id=0),eth(src=02:00:00:02:14:01,dst=0c:c4:7a:58:f0:2b),eth_type(0x0800),ipv4(frag=no),
 packets:2674009, bytes:171136576, used:0.000s, actions:hash(l4(0)),lb_output(bond,1)

A new CLI command has been added to dump the datapath bond cache, as shown below.

“sudo ovs-appctl dpif-netdev/dp-bond-show [dp]”

root@ubuntu-190:performance_scripts # sudo ovs-appctl dpif-netdev/dp-bond-show
Bond cache:
bond-id 1 :
bucket 0 - slave 2
bucket 1 - slave 1
bucket 2 - slave 2
bucket 3 - slave 1

Performance improvement:

With a prototype of the proposed idea, the following perf improvement is 

[ovs-dev] [PATCH v7 0/1] Balance-tcp bond mode optimization

2019-09-17 Thread Vishal Deep Ajmera via dev
v6->v7:
 Fixed issue reported by Matteo for bond/show.

v5->v6:
 Addressed comments from Ilya Maximets.
 https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362001.html
 Rebased to OVS master.

v4->v5:
 Support for stats per hash bucket.
 Support for dynamic load balancing.
 Rebased to OVS Master.

v3->v4:
 Addressed Ilya Maximets' comments.
 https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360452.html

v2->v3:
 Rebased to OVS master.
 Fixed git merge issue.

v1->v2:
 Updated datapath action to hash + lb-output.
 Updated throughput test observations.
 Rebased to OVS master.

Vishal Deep Ajmera (1):
  Avoid dp_hash recirculation for balance-tcp bond selection mode

 datapath/linux/compat/include/linux/openvswitch.h |   2 +
 lib/dpif-netdev.c | 515 --
 lib/dpif-netlink.c|   3 +
 lib/dpif-provider.h   |   8 +
 lib/dpif.c|  48 ++
 lib/dpif.h|   7 +
 lib/odp-execute.c |   2 +
 lib/odp-util.c|   4 +
 ofproto/bond.c|  52 ++-
 ofproto/bond.h|   9 +
 ofproto/ofproto-dpif-ipfix.c  |   1 +
 ofproto/ofproto-dpif-sflow.c  |   1 +
 ofproto/ofproto-dpif-xlate.c  |  39 +-
 ofproto/ofproto-dpif.c|  32 ++
 ofproto/ofproto-dpif.h|  12 +-
 tests/lacp.at |   9 +
 vswitchd/bridge.c |   6 +
 vswitchd/vswitch.xml  |  10 +
 18 files changed, 700 insertions(+), 60 deletions(-)

-- 
1.9.1



[ovs-dev] INFO: task hung in ovs_dp_cmd_new

2019-09-17 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:2339cd6c bpf: fix precision tracking of stack slots
git tree:   bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=14707b0160
kernel config:  https://syzkaller.appspot.com/x/.config?x=b89bb446a3faaba4
dashboard link: https://syzkaller.appspot.com/bug?extid=a9d62dbe662772066f3c
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a9d62dbe662772066...@syzkaller.appspotmail.com

INFO: task syz-executor.0:3410 blocked for more than 143 seconds.
  Not tainted 5.3.0-rc7+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D26056  3410  18433 0x0004
Call Trace:
 context_switch kernel/sched/core.c:3254 [inline]
 __schedule+0x755/0x1580 kernel/sched/core.c:3880
 schedule+0xd9/0x260 kernel/sched/core.c:3947
 schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:4006
 __mutex_lock_common kernel/locking/mutex.c:1007 [inline]
 __mutex_lock+0x7b0/0x13c0 kernel/locking/mutex.c:1077
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1092
 ovs_lock net/openvswitch/datapath.c:105 [inline]
 ovs_dp_cmd_new+0x579/0xea0 net/openvswitch/datapath.c:1613
 genl_family_rcv_msg+0x74b/0xf90 net/netlink/genetlink.c:629
 genl_rcv_msg+0xca/0x170 net/netlink/genetlink.c:654
 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
 genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
 netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
 netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x803/0x920 net/socket.c:2311
 __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
 __do_sys_sendmsg net/socket.c:2365 [inline]
 __se_sys_sendmsg net/socket.c:2363 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4598e9
Code: Bad RIP value.
RSP: 002b:7fa2501cdc78 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0003 RCX: 004598e9
RDX:  RSI: 2240 RDI: 0004
RBP: 0075bf20 R08:  R09: 
R10:  R11: 0246 R12: 7fa2501ce6d4
R13: 004c R14: 004dcfd8 R15: 
INFO: task syz-executor.0:3414 blocked for more than 143 seconds.
  Not tainted 5.3.0-rc7+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D26136  3414  18433 0x0004
Call Trace:
 context_switch kernel/sched/core.c:3254 [inline]
 __schedule+0x755/0x1580 kernel/sched/core.c:3880
 schedule+0xd9/0x260 kernel/sched/core.c:3947
 schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:4006
 __mutex_lock_common kernel/locking/mutex.c:1007 [inline]
 __mutex_lock+0x7b0/0x13c0 kernel/locking/mutex.c:1077
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1092
 ovs_lock net/openvswitch/datapath.c:105 [inline]
 ovs_dp_cmd_new+0x579/0xea0 net/openvswitch/datapath.c:1613
 genl_family_rcv_msg+0x74b/0xf90 net/netlink/genetlink.c:629
 genl_rcv_msg+0xca/0x170 net/netlink/genetlink.c:654
 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
 genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
 netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
 netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x803/0x920 net/socket.c:2311
 __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
 __do_sys_sendmsg net/socket.c:2365 [inline]
 __se_sys_sendmsg net/socket.c:2363 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4598e9
Code: Bad RIP value.
RSP: 002b:7fa2501acc78 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0003 RCX: 004598e9
RDX:  RSI: 2240 RDI: 0006
RBP: 0075bfc8 R08:  R09: 
R10:  R11: 0246 R12: 7fa2501ad6d4
R13: 004c R14: 004dcfd8 R15: 
INFO: task syz-executor.0:3416 blocked for more than 144 seconds.
  Not tainted 5.3.0-rc7+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.0  D25968  3416  18433 0x0004
Call Trace:
 context_switch kernel/sched/core.c:3254 [inline]
 __schedule+0x755/0x1580 kernel/sched/core.c:3880
 schedule+0xd9/0x260 kernel/sched/core.c:3947
 schedule_preempt_disabled+0x13/0x20