Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: Remove deprecated ring port type.

2020-03-03 Thread David Marchand
On Mon, Mar 2, 2020 at 1:50 PM Ilya Maximets  wrote:
> 'dpdkr' ring ports was deprecated in 2.13 release and was not
> actually used for a long time.  Remove support now.
>
> More details in
> commit b4c5f00c339b ("netdev-dpdk: Deprecate ring ports.")
>
> Signed-off-by: Ilya Maximets 

Acked-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] dpdk: Remove deprecated pdump support.

2020-03-03 Thread David Marchand
On Mon, Mar 2, 2020 at 1:50 PM Ilya Maximets  wrote:
> DPDK pdump was deprecated in 2.13 release and didn't actually
> work since 2.11.  Removing it.
>
> More details in commit 4ae8c4617fd3 ("dpdk: Deprecate pdump support.")
>
> Signed-off-by: Ilya Maximets 

Acked-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Broadcast DHCPREPLY when BROADCAST flag is set

2020-03-03 Thread 0-day Robot
Bleep bloop.  Greetings Ihar Hrachyshka, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 93 characters long (recommended limit is 79)
#47 FILE: controller/pinctrl.c:1197:
uint32_t ip_dst = DHCP_FLAGS_BROADCAST(dhcp_data->flags) ? 
htonl(0x) : *offer_ip;

Lines checked: 345, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] rhel: Fix make rpm-fedora by unified sphinx

2020-03-03 Thread Tao YunXiang
If OVS>=2.13, we need '/usr/bin/sphinx-build-3' to bulid[0]. So it could be
better to unify the version of "sphinx-build" between OVS and OVN, to avoid
erros during make.

[0] 
https://github.com/openvswitch/ovs/blob/master/rhel/openvswitch-fedora.spec.in#L65

Co-authored-by: Liu Chang 
Co-authored-by: Rong Yin 
Signed-off-by: Tao YunXiang 
Signed-off-by: Liu Chang 
Signed-off-by: Rong Yin 

---
 rhel/ovn-fedora.spec.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rhel/ovn-fedora.spec.in b/rhel/ovn-fedora.spec.in
index 6a59ccc40..421d83869 100644
--- a/rhel/ovn-fedora.spec.in
+++ b/rhel/ovn-fedora.spec.in
@@ -50,7 +50,7 @@ BuildRequires: python3-devel
 BuildRequires: desktop-file-utils
 BuildRequires: groff graphviz
 BuildRequires: checkpolicy, selinux-policy-devel
-BuildRequires: /usr/bin/sphinx-build
+BuildRequires: /usr/bin/sphinx-build-3
 # make check dependencies
 BuildRequires: procps-ng
 %if %{with libcapng}
-- 
2.17.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] Broadcast DHCPREPLY when BROADCAST flag is set

2020-03-03 Thread Ihar Hrachyshka
As per RFC2131, section 4.1:
   A server or relay agent sending or relaying a DHCP message directly
   to a DHCP client (i.e., not to a relay agent specified in the
   'giaddr' field) SHOULD examine the BROADCAST bit in the 'flags'
   field.  If this bit is set to 1, the DHCP message SHOULD be sent as
   an IP broadcast using an IP broadcast address (preferably 0x)
   as the IP destination address and the link-layer broadcast address as
   the link-layer destination address.

This patch changes destination IP address to 255.255.255.255 when client
set BROADCAST flag in their DHCPREQUEST. Note: the offered IP address is
still part of the DHCP payload.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1801006

Signed-off-by: Ihar Hrachyshka 
---
 controller/pinctrl.c |  6 +++
 northd/ovn-northd.c  |  7 ++--
 tests/ovn.at | 98 
 3 files changed, 80 insertions(+), 31 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index d06915a65..63ce35bce 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -966,6 +966,8 @@ pinctrl_handle_tcp_reset(struct rconn *swconn, const struct 
flow *ip_flow,
 dp_packet_uninit();
 }
 
+#define DHCP_FLAGS_BROADCAST(flags) (((flags) & (1UL << 7)) != 0)
+
 /* Called with in the pinctrl_handler thread context. */
 static void
 pinctrl_handle_put_dhcp_opts(
@@ -1190,7 +1192,11 @@ pinctrl_handle_put_dhcp_opts(
 
 udp->udp_len = htons(new_l4_size);
 
+/* Send a broadcast IP frame when BROADCAST flag is set */
 struct ip_header *out_ip = dp_packet_l3(_out);
+uint32_t ip_dst = DHCP_FLAGS_BROADCAST(dhcp_data->flags) ? 
htonl(0x) : *offer_ip;
+put_16aligned_be32(_ip->ip_dst, ip_dst);
+
 out_ip->ip_tot_len = htons(pkt_out.l4_ofs - pkt_out.l3_ofs + new_l4_size);
 udp->udp_csum = 0;
 /* Checksum needs to be initialized to zero. */
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 3aba0487d..58dfee617 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4271,10 +4271,9 @@ build_dhcpv4_action(struct ovn_port *op, ovs_be32 
offer_ip,
 ds_put_cstr(options_action, "); next;");
 
 ds_put_format(response_action, "eth.dst = eth.src; eth.src = %s; "
-  "ip4.dst = "IP_FMT"; ip4.src = %s; udp.src = 67; "
-  "udp.dst = 68; outport = inport; flags.loopback = 1; "
-  "output;",
-  server_mac, IP_ARGS(offer_ip), server_ip);
+  "ip4.src = %s; udp.src = 67; udp.dst = 68; "
+  "outport = inport; flags.loopback = 1; output;",
+  server_mac, server_ip);
 
 ds_put_format(ipv4_addr_match,
   "ip4.src == "IP_FMT" && ip4.dst == {%s, 255.255.255.255}",
diff --git a/tests/ovn.at b/tests/ovn.at
index cbaa6d4a2..a7842a35e 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -4595,10 +4595,11 @@ sleep 2
 as hv1 ovs-vsctl show
 
 # This shell function sends a DHCP request packet
-# test_dhcp INPORT SRC_MAC DHCP_TYPE OFFER_IP REQUEST_IP ...
+# test_dhcp INPORT SRC_MAC DHCP_TYPE BROADCAST CIADDR OFFER_IP REQUEST_IP 
USE_IP ...
 test_dhcp() {
-local inport=$1 src_mac=$2 dhcp_type=$3 ciaddr=$4 offer_ip=$5 
request_ip=$6 use_ip=$7
-shift; shift; shift; shift; shift; shift; shift;
+local inport=$1 src_mac=$2 dhcp_type=$3 broadcast=$4 ciaddr=$5 offer_ip=$6 
request_ip=$7 use_ip=$8
+shift; shift; shift; shift; shift; shift; shift; shift;
+
 if test $use_ip != 0; then
 src_ip=$1
 dst_ip=$2
@@ -4607,6 +4608,7 @@ test_dhcp() {
 src_ip=`ip_to_hex 0 0 0 0`
 dst_ip=`ip_to_hex 255 255 255 255`
 fi
+
 if test $request_ip != 0; then
 ip_len=0120
 udp_len=010b
@@ -4614,10 +4616,19 @@ test_dhcp() {
 ip_len=011a
 udp_len=0106
 fi
+
+if test $broadcast != 0; then
+flags=8000
+reply_dst_ip=`ip_to_hex 255 255 255 255`
+else
+flags=
+reply_dst_ip=${offer_ip}
+fi
+
 local 
request=${src_mac}08004510${ip_len}8011${src_ip}${dst_ip}
 # udp header and dhcp header
 request=${request}00440043${udp_len}
-
request=${request}010106006359aa76${ciaddr}${src_mac}
+
request=${request}010106006359aa76${flags}${ciaddr}${src_mac}
 # client hardware padding
 request=${request}
 # server hostname
@@ -4655,10 +4666,10 @@ test_dhcp() {
 ip_len=$(printf "%x" $ip_len)
 udp_len=$(printf "%x" $udp_len)
 # $ip_len var will be in 3 digits i.e 134. So adding a '0' before 
$ip_len
-local 
reply=${src_mac}${srv_mac}080045100${ip_len}8011${srv_ip}${offer_ip}
+local 
reply=${src_mac}${srv_mac}080045100${ip_len}8011${srv_ip}${reply_dst_ip}
 # udp header and dhcp header.
 # $udp_len var will be in 3 digits. So adding a '0' before 

Re: [ovs-dev] [PATCH V2 02/11] compat: Fix up changes to inet frags in 5.1+

2020-03-03 Thread Gregory Rose


On 2/28/2020 9:02 AM, Gregory Rose wrote:

On 2/27/2020 3:54 PM, Yi-Hung Wei wrote:

On Wed, Feb 26, 2020 at 9:41 AM Greg Rose  wrote:
Since Linux kernel release 5.1 the fragments field of the 
inet_frag_queue

structure is removed and now only the rb_fragments structure with an
rb_node pointer is used for both ipv4 and ipv6.  In addition, the
atomic_sub and atomic_add functions are replaced with their
equivalent long counterparts.

Signed-off-by: Greg Rose 
---
  acinclude.m4  |  2 ++
  datapath/linux/compat/include/net/inet_frag.h | 21 
+

  2 files changed, 23 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index db64267..cad76c7 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -1067,6 +1067,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
[OVS_DEFINE([HAVE_RBTREE_RB_LINK_NODE_RCU])])
    OVS_GREP_IFELSE([$KSRC/include/net/dst_ops.h], [bool 
confirm_neigh],

[OVS_DEFINE([HAVE_DST_OPS_CONFIRM_NEIGH])])
+  OVS_GREP_IFELSE([$KSRC/include/net/inet_frag.h], [fqdir],
+  [OVS_DEFINE([HAVE_INET_FRAG_FQDIR])])

    if cmp -s datapath/linux/kcompat.h.new \
  datapath/linux/kcompat.h >/dev/null 2>&1; then
diff --git a/datapath/linux/compat/include/net/inet_frag.h 
b/datapath/linux/compat/include/net/inet_frag.h

index 124c8be..e3c6df3 100644
--- a/datapath/linux/compat/include/net/inet_frag.h
+++ b/datapath/linux/compat/include/net/inet_frag.h
@@ -18,7 +18,16 @@ static inline bool inet_frag_evicting(struct 
inet_frag_queue *q)

  #ifdef HAVE_INET_FRAG_QUEUE_WITH_LIST_EVICTOR
 return !hlist_unhashed(>list_evictor);
  #else
+/*
+ * We can't use acinclude.m4 to check this as the field 'fragments'
+ * also matches 'rb_fragments'.
+ */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(5,1,0)
 return (q_flags(q) & INET_FRAG_FIRST_IN) && q->fragments != 
NULL;

+#else
+   return (q_flags(q) & INET_FRAG_FIRST_IN) &&
+   q->rb_fragments.rb_node != NULL;
+#endif
  #endif /* HAVE_INET_FRAG_QUEUE_WITH_LIST_EVICTOR */
  }
  #endif /* HAVE_INET_FRAG_EVICTING */

Yes, it's a bummer if we can not check some fields in the acinclude.m4.

It looks like inet_frag_evicting() has been removed by upstream commit
399d1404be66 ("inet: frags: get rif of inet_frag_evicting()") since
kernel 4.17, and the only place that we use is in rpl_ipfrag_init() ->
ip_expire() when HAVE_CORRECT_MRU_HANDLING is not available.
Therefore, how about the following approach that avoids the less
reliable kernel version check.

--- a/datapath/linux/compat/include/net/inet_frag.h
+++ b/datapath/linux/compat/include/net/inet_frag.h
@@ -12,6 +12,7 @@
  #define qp_flags(qp) (qp->q.flags)
  #endif

+#ifndef HAVE_CORRECT_MRU_HANDLING
  #ifndef HAVE_INET_FRAG_EVICTING
  static inline bool inet_frag_evicting(struct inet_frag_queue *q)
  {
@@ -22,6 +23,7 @@ static inline bool inet_frag_evicting(struct
inet_frag_queue *q)
  #endif /* HAVE_INET_FRAG_QUEUE_WITH_LIST_EVICTOR */
  }
  #endif /* HAVE_INET_FRAG_EVICTING */
+#endif /* HAVE_CORRECT_MRU_HANDLING */


I am good with the rest of the backports in this patch.


Hi Yi-Hung,

I have a V3 set of patches for this series but I found that the 
inet_frags changes in
this patch (with or without your suggested change) are failing tests 59 
and 60 in

check-kmod on a Ubuntu 18.04 system with the 4.15.0-88-generic kernel.  I'm
investigating that issue and should have the updated V3 patches ready as 
soon

as I sort this out.

Thanks,

- Greg

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] dpdk: Remove deprecated pdump support.

2020-03-03 Thread Aaron Conole
Ilya Maximets  writes:

> DPDK pdump was deprecated in 2.13 release and didn't actually
> work since 2.11.  Removing it.
>
> More details in commit 4ae8c4617fd3 ("dpdk: Deprecate pdump support.")
>
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] netdev-dpdk: Remove deprecated ring port type.

2020-03-03 Thread Aaron Conole
Ilya Maximets  writes:

> 'dpdkr' ring ports was deprecated in 2.13 release and was not
> actually used for a long time.  Remove support now.
>
> More details in
> commit b4c5f00c339b ("netdev-dpdk: Deprecate ring ports.")
>
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 08/16] openvswitch: add missing attribute validation for hash

2020-03-03 Thread Gregory Rose



On 3/2/2020 9:05 PM, Jakub Kicinski wrote:

Add missing attribute validation for OVS_PACKET_ATTR_HASH
to the netlink policy.

Fixes: bd1903b7c459 ("net: openvswitch: add hash info to upcall")
Signed-off-by: Jakub Kicinski 
---
CC: Pravin B Shelar 
CC: Tonghao Zhang 
CC: d...@openvswitch.org
---
  net/openvswitch/datapath.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index c047afd12116..07a7dd185995 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -645,6 +645,7 @@ static const struct nla_policy 
packet_policy[OVS_PACKET_ATTR_MAX + 1] = {
[OVS_PACKET_ATTR_ACTIONS] = { .type = NLA_NESTED },
[OVS_PACKET_ATTR_PROBE] = { .type = NLA_FLAG },
[OVS_PACKET_ATTR_MRU] = { .type = NLA_U16 },
+   [OVS_PACKET_ATTR_HASH] = { .type = NLA_U64 },
  };
  
  static const struct genl_ops dp_packet_genl_ops[] = {


LGTM

Reviewed-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] packets: Fix typo in comment.

2020-03-03 Thread Ben Pfaff
Reported-by: Toms Atteka 
Signed-off-by: Ben Pfaff 
---
 lib/packets.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/packets.h b/lib/packets.h
index 5d7f82c45b6a..4c1e91dee254 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -963,7 +963,7 @@ union ovs_16aligned_in6_addr {
 ovs_16aligned_be32 be32[4];
 };
 
-/* Like struct in6_hdr, but whereas that struct requires 32-bit alignment, this
+/* Like struct ip6_hdr, but whereas that struct requires 32-bit alignment, this
  * one only requires 16-bit alignment. */
 struct ovs_16aligned_ip6_hdr {
 union {
-- 
2.24.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-offload-tc: Flush rules on ingress block when init tc flow api

2020-03-03 Thread Dmytro Linkin
On Tue, Mar 03, 2020 at 04:42:12PM +0100, Ilya Maximets wrote:
> On 3/3/20 4:29 PM, Dmytro Linkin wrote:
> > On Tue, Mar 03, 2020 at 02:15:21PM +0100, Ilya Maximets wrote:
> >> On 3/3/20 8:58 AM, Roi Dayan wrote:
> >>>
> >>>
> >>> On 2020-02-27 5:22 PM, Roi Dayan wrote:
>  From: Dmytro Linkin 
> 
>  OVS can fail to attach ingress block on iface when init tc flow api,
>  if block already exist with rules on it and is shared with other iface.
>  Fix by flush all existing rules on the ingress block prior to deleting
>  it.
> 
>  Fixes: 093c9458fb02 ("tc: allow offloading of block ids")
>  Signed-off-by: Dmytro Linkin 
>  Acked-by: Raed Salem 
>  Acked-by: Roi Dayan 
>  ---
>   lib/netdev-offload-tc.c | 10 +-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
>  diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>  index 550e440b3a45..b5ff6ccca55e 100644
>  --- a/lib/netdev-offload-tc.c
>  +++ b/lib/netdev-offload-tc.c
>  @@ -1907,6 +1907,7 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>   static struct ovsthread_once block_once = 
>  OVSTHREAD_ONCE_INITIALIZER;
>   enum tc_qdisc_hook hook = get_tc_qdisc_hook(netdev);
>   uint32_t block_id = 0;
>  +struct tcf_id id;
>   int ifindex;
>   int error;
>   
>  @@ -1917,6 +1918,14 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>   return -ifindex;
>   }
>   
>  +block_id = get_block_id_from_netdev(netdev);
>  +
>  +/* Flush rules explicitly needed when we work with ingress_block,
>  + * so we will not fail with reattaching block to bond iface, for ex.
>  + */
>  +id = tc_make_tcf_id(ifindex, block_id, 0, hook);
>  +tc_del_filter();
>  +
>   /* make sure there is no ingress/egress qdisc */
>   tc_add_del_qdisc(ifindex, false, 0, hook);
>   
>  @@ -1930,7 +1939,6 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>   ovsthread_once_done(_mask_once);
>   }
>   
>  -block_id = get_block_id_from_netdev(netdev);
>   error = tc_add_del_qdisc(ifindex, true, block_id, hook);
>   
>   if (error && error != EEXIST) {
> 
> >>>
> >>> +ilya
> >>>
> >>> Hi Ilya,
> >>>
> >>> can you help review/ack this?
> >>
> >> Hi.  I'm not an expert in linux tc, but since you're asking...
> >>
> >> IIUC, the issue is that tc_add_del_qdisc(ifindex, true, block_id, hook)
> >> fails on bond iface.  Is it correct?
> > 
> > At first tc_add_del_qdisc() fails on deletion, because qdisc, which is
> > added to block, is in use (rules are exist). We just don't care about
> > any error returned. Then tc_add_del_qdisc() fail to add it, because it
> > wasn't deleted and in use.
> > 
> >> In this case I see one strange thing.  We're clearing ingress qdisk
> >> for block_id == 0, but after that trying to create new one with
> >> block_id == ifindex (for LAG interface).  Will it help if we delete
> >> ingress qdisk providing correct block_id?  This sounds like something
> >> sane to do.
> >  
> > Deleting block_id != 0 will fail, due to existing rules. But actually,
> > deleting qdisc with block_id == 0 still delete correct block.
> > 
> > Anyway, the point here is to flush rules on specified ingress block.
> 
> Hmm.  OK.
> 
> Shouldn't we unconditionally flush rules from qdisk inside the
> tc_add_del_qdisc() if deletion is requested?  From your explanation
> it seems like a prerequisite for that and qdisk will not be deleted
> if we will not flush rules.

Flushing rules needed only for shared block, 'cause like in our case,
there are ifaces not attached to the bridge, so OVS don't ask kernel to
delete qdisc. In other known cases kernel flush rules by itself on qdisc
deletion, so flush inside tc_add_del_qdisc() mostly not needed.
Mb, better do it for shared blocks only.

> BTW, one thing in this patch that doesn't look good is that you're
> using block_id before probing the support.

Probably yes. What You suggest here? By case, it work. Probing triggers
on internal OVS ifaces at first, so later calls to get block_id return
valid id.

Regards, Dmytro
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC v2 ovn 6/6] ovn-controller: I-P for ct zone and OVS interface changes

2020-03-03 Thread numans
From: Numan Siddique 

Signed-off-by: Numan Siddique 
---
 controller/binding.c| 22 --
 controller/ovn-controller.c | 86 -
 controller/ovn-controller.h | 22 ++
 controller/physical.c   | 47 
 controller/physical.h   |  5 ++-
 5 files changed, 157 insertions(+), 25 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 236db32c0..22c99c60b 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -501,22 +501,6 @@ remove_local_lport_ids(const struct sbrec_port_binding 
*binding_rec,
 sset_find_and_delete(local_lport_ids, buf);
 }
 
-enum local_binding_type {
-BT_VIF,
-BT_CHILD,
-BT_VIRTUAL
-};
-
-struct local_binding {
-struct ovs_list node;   /* In parent if any. */
-char *name;
-enum local_binding_type type;
-const struct ovsrec_interface *iface;
-const struct sbrec_port_binding *pb;
-
-struct ovs_list children;
-};
-
 static struct local_binding *
 local_binding_create(const char *name, const struct ovsrec_interface *iface,
  const struct sbrec_port_binding *pb,
@@ -537,12 +521,6 @@ local_binding_add(struct shash *local_bindings, struct 
local_binding *lbinding)
 shash_add(local_bindings, lbinding->name, lbinding);
 }
 
-static struct local_binding *
-local_binding_find(struct shash *local_bindings, const char *name)
-{
-return shash_find_data(local_bindings, name);
-}
-
 static void
 local_binding_destroy(struct local_binding *lbinding)
 {
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index ec365dd74..105f109d0 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1355,6 +1355,10 @@ static void init_physical_ctx(struct engine_node *node,
 
 ovs_assert(br_int && chassis);
 
+struct ovsrec_interface_table *iface_table =
+(struct ovsrec_interface_table *)EN_OVSDB_GET(
+engine_get_input("OVS_interface", node));
+
 struct ed_type_ct_zones *ct_zones_data =
 engine_get_input_data("ct_zones", node);
 struct simap *ct_zones = _zones_data->current;
@@ -1364,12 +1368,14 @@ static void init_physical_ctx(struct engine_node *node,
 p_ctx->mc_group_table = multicast_group_table;
 p_ctx->br_int = br_int;
 p_ctx->chassis_table = chassis_table;
+p_ctx->iface_table = iface_table;
 p_ctx->chassis = chassis;
 p_ctx->active_tunnels = _data->active_tunnels;
 p_ctx->local_datapaths = _data->local_datapaths;
 p_ctx->local_lports = _data->local_lports;
 p_ctx->ct_zones = ct_zones;
 p_ctx->mff_ovn_geneve = ed_mff_ovn_geneve->mff_ovn_geneve;
+p_ctx->local_bindings = _data->local_bindings;
 }
 
 static void init_lflow_ctx(struct engine_node *node,
@@ -1777,6 +1783,70 @@ flow_output_port_groups_handler(struct engine_node 
*node, void *data)
 return _flow_output_resource_ref_handler(node, data, REF_TYPE_PORTGROUP);
 }
 
+struct ed_type_physical_flow_changes {
+bool ct_zones_changed;
+};
+
+static bool
+flow_output_physical_flow_changes_handler(struct engine_node *node, void *data)
+{
+struct ed_type_physical_flow_changes *pfc =
+engine_get_input_data("physical_flow_changes", node);
+struct ed_type_runtime_data *rt_data =
+engine_get_input_data("runtime_data", node);
+
+struct ed_type_flow_output *fo = data;
+struct physical_ctx p_ctx;
+init_physical_ctx(node, rt_data, _ctx);
+
+engine_set_node_state(node, EN_UPDATED);
+if (pfc->ct_zones_changed) {
+pfc->ct_zones_changed = false;
+physical_run(_ctx, >flow_table);
+return true;
+}
+
+return physical_handle_ovs_iface_changes(_ctx, >flow_table);
+}
+
+static void *
+en_physical_flow_changes_init(struct engine_node *node OVS_UNUSED,
+ struct engine_arg *arg OVS_UNUSED)
+{
+struct ed_type_physical_flow_changes *data = xzalloc(sizeof *data);
+return data;
+}
+
+static void
+en_physical_flow_changes_cleanup(void *data OVS_UNUSED)
+{
+
+}
+
+static void
+en_physical_flow_changes_run(struct engine_node *node, void *data OVS_UNUSED)
+{
+engine_set_node_state(node, EN_UPDATED);
+}
+
+static bool
+physical_flow_changes_ct_zones_handler(struct engine_node *node OVS_UNUSED,
+  void *data OVS_UNUSED)
+{
+struct ed_type_physical_flow_changes *pfc = data;
+pfc->ct_zones_changed = true;
+engine_set_node_state(node, EN_UPDATED);
+return false;
+}
+
+static bool
+flow_output_noop_handler(struct engine_node *node OVS_UNUSED,
+  void *data OVS_UNUSED)
+{
+return true;
+}
+
+
 struct ovn_controller_exit_args {
 bool *exiting;
 bool *restart;
@@ -1899,6 +1969,7 @@ main(int argc, char *argv[])
 ENGINE_NODE(runtime_data, "runtime_data");
 ENGINE_NODE(mff_ovn_geneve, "mff_ovn_geneve");
 ENGINE_NODE(ofctrl_is_connected, "ofctrl_is_connected");
+

[ovs-dev] [RFC v2 ovn 5/6] ofctrl_check_and_add_flow: Replace the actions of an existing flow if actions have changed.

2020-03-03 Thread numans
From: Numan Siddique 

If ofctrl_check_and_add_flow(F') is called where flow F' has match-actions (M, 
A2)
and if there already exists a flow F with match-actions (M, A1) in the desired 
flow
table, then this function  doesn't update the existing flow F with new actions
actions A2.

This patch fixes it. Presently we don't see any issues because of this 
behaviour.
But it's better to fix it.

Signed-off-by: Numan Siddique 
---
 controller/ofctrl.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index 36e39ba06..2df7400be 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -667,14 +667,23 @@ ofctrl_check_and_add_flow(struct ovn_desired_flow_table 
*flow_table,
 
 ovn_flow_log(f, "ofctrl_add_flow");
 
-if (ovn_flow_lookup(_table->match_flow_table, f, true)) {
-if (log_duplicate_flow) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
-if (!VLOG_DROP_DBG()) {
-char *s = ovn_flow_to_string(f);
-VLOG_DBG("dropping duplicate flow: %s", s);
-free(s);
+struct ovn_flow *existing_f =
+ovn_flow_lookup(_table->match_flow_table, f, true);
+if (existing_f) {
+if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
+  existing_f->ofpacts, existing_f->ofpacts_len)) {
+if (log_duplicate_flow) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
+if (!VLOG_DROP_DBG()) {
+char *s = ovn_flow_to_string(f);
+VLOG_DBG("dropping duplicate flow: %s", s);
+free(s);
+}
 }
+} else {
+free(existing_f->ofpacts);
+existing_f->ofpacts = xmemdup(f->ofpacts, f->ofpacts_len);
+existing_f->ofpacts_len = f->ofpacts_len;
 }
 ovn_flow_destroy(f);
 return;
-- 
2.24.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC v2 ovn 1/6] Refactor binding_run()to take two context argument - binding_ctx_in and binding_ctx_out.

2020-03-03 Thread numans
From: Numan Siddique 

No functional changes are introduced in this patch.

Signed-off-by: Numan Siddique 
---
 controller/binding.c| 260 +---
 controller/binding.h|  39 +++---
 controller/ovn-controller.c | 120 ++---
 3 files changed, 214 insertions(+), 205 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index c3376e25e..5742e5111 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -441,12 +441,9 @@ static bool
 is_our_chassis(const struct sbrec_chassis *chassis_rec,
const struct sbrec_port_binding *binding_rec,
const struct sset *active_tunnels,
-   const struct shash *lport_to_iface,
+   const struct ovsrec_interface *iface_rec,
const struct sset *local_lports)
 {
-const struct ovsrec_interface *iface_rec
-= shash_find_data(lport_to_iface, binding_rec->logical_port);
-
 bool our_chassis = false;
 if (iface_rec
 || (binding_rec->parent_port && binding_rec->parent_port[0] &&
@@ -478,78 +475,73 @@ is_our_chassis(const struct sbrec_chassis *chassis_rec,
  * additional processing.
  */
 static const struct sbrec_port_binding **
-consider_local_datapath(struct ovsdb_idl_txn *ovnsb_idl_txn,
-struct ovsdb_idl_txn *ovs_idl_txn,
-struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
-struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
-struct ovsdb_idl_index *sbrec_port_binding_by_name,
-const struct sset *active_tunnels,
-const struct sbrec_chassis *chassis_rec,
-const struct sbrec_port_binding *binding_rec,
+consider_local_datapath(const struct sbrec_port_binding *binding_rec,
 struct hmap *qos_map,
-struct hmap *local_datapaths,
-struct shash *lport_to_iface,
-struct sset *local_lports,
-struct sset *local_lport_ids,
+const struct ovsrec_interface *iface_rec,
+struct binding_ctx_in *b_ctx_in,
+struct binding_ctx_out *b_ctx_out,
 const struct sbrec_port_binding **vpbs,
 size_t *n_vpbs, size_t *n_allocated_vpbs)
 {
-const struct ovsrec_interface *iface_rec
-= shash_find_data(lport_to_iface, binding_rec->logical_port);
-
-bool our_chassis = is_our_chassis(chassis_rec, binding_rec, active_tunnels,
-  lport_to_iface, local_lports);
+bool our_chassis = is_our_chassis(b_ctx_in->chassis_rec, binding_rec,
+  b_ctx_in->active_tunnels, iface_rec,
+  b_ctx_out->local_lports);
 if (iface_rec
 || (binding_rec->parent_port && binding_rec->parent_port[0] &&
-sset_contains(local_lports, binding_rec->parent_port))) {
+sset_contains(b_ctx_out->local_lports, binding_rec->parent_port))) 
{
 if (binding_rec->parent_port && binding_rec->parent_port[0]) {
 /* Add child logical port to the set of all local ports. */
-sset_add(local_lports, binding_rec->logical_port);
+sset_add(b_ctx_out->local_lports, binding_rec->logical_port);
 }
-add_local_datapath(sbrec_datapath_binding_by_key,
-   sbrec_port_binding_by_datapath,
-   sbrec_port_binding_by_name,
-   binding_rec->datapath, false, local_datapaths);
-if (iface_rec && qos_map && ovs_idl_txn) {
+add_local_datapath(b_ctx_in->sbrec_datapath_binding_by_key,
+   b_ctx_in->sbrec_port_binding_by_datapath,
+   b_ctx_in->sbrec_port_binding_by_name,
+   binding_rec->datapath, false,
+   b_ctx_out->local_datapaths);
+if (iface_rec && qos_map && b_ctx_in->ovs_idl_txn) {
 get_qos_params(binding_rec, qos_map);
 }
 } else if (!strcmp(binding_rec->type, "l2gateway")) {
 if (our_chassis) {
-sset_add(local_lports, binding_rec->logical_port);
-add_local_datapath(sbrec_datapath_binding_by_key,
-   sbrec_port_binding_by_datapath,
-   sbrec_port_binding_by_name,
-   binding_rec->datapath, false, local_datapaths);
+sset_add(b_ctx_out->local_lports, binding_rec->logical_port);
+add_local_datapath(b_ctx_in->sbrec_datapath_binding_by_key,
+   b_ctx_in->sbrec_port_binding_by_datapath,
+   b_ctx_in->sbrec_port_binding_by_name,
+   binding_rec->datapath, false,
+

[ovs-dev] [RFC v2 ovn 3/6] ovn-controller: I-P for port binding in runtime_data stage

2020-03-03 Thread numans
From: Numan Siddique 

This patch handles the port binding changes incrementally in the runtime_data
stage.

Signed-off-by: Numan Siddique 
---
 controller/binding.c| 455 ++--
 controller/binding.h|   7 +-
 controller/ovn-controller.c |  50 +++-
 3 files changed, 431 insertions(+), 81 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 38b426bff..236db32c0 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -349,17 +349,6 @@ setup_qos(const char *egress_iface, struct hmap *queue_map)
 netdev_close(netdev_phy);
 }
 
-static void
-update_local_lport_ids(struct sset *local_lport_ids,
-   const struct sbrec_port_binding *binding_rec)
-{
-char buf[16];
-snprintf(buf, sizeof(buf), "%"PRId64"_%"PRId64,
- binding_rec->datapath->tunnel_key,
- binding_rec->tunnel_key);
-sset_add(local_lport_ids, buf);
-}
-
 /*
  * Get the encap from the chassis for this port. The interface
  * may have an external_ids:encap-ip= set; if so we
@@ -490,6 +479,28 @@ consider_localnet_port(const struct sbrec_port_binding 
*binding_rec,
 ld->localnet_port = binding_rec;
 }
 
+static void
+update_local_lport_ids(struct sset *local_lport_ids,
+   const struct sbrec_port_binding *binding_rec)
+{
+char buf[16];
+snprintf(buf, sizeof(buf), "%"PRId64"_%"PRId64,
+ binding_rec->datapath->tunnel_key,
+ binding_rec->tunnel_key);
+sset_add(local_lport_ids, buf);
+}
+
+static void
+remove_local_lport_ids(const struct sbrec_port_binding *binding_rec,
+   struct sset *local_lport_ids)
+{
+char buf[16];
+snprintf(buf, sizeof(buf), "%"PRId64"_%"PRId64,
+ binding_rec->datapath->tunnel_key,
+ binding_rec->tunnel_key);
+sset_find_and_delete(local_lport_ids, buf);
+}
+
 enum local_binding_type {
 BT_VIF,
 BT_CHILD,
@@ -545,18 +556,21 @@ local_binding_destroy(struct local_binding *lbinding)
 free(lbinding);
 }
 
+static
+void local_binding_delete(struct shash *local_bindings,
+  struct local_binding *lbinding)
+{
+shash_find_and_delete(local_bindings, lbinding->name);
+local_binding_destroy(lbinding);
+}
+
 void
 local_bindings_destroy(struct shash *local_bindings)
 {
 struct shash_node *node, *next;
 SHASH_FOR_EACH_SAFE (node, next, local_bindings) {
 struct local_binding *lbinding = node->data;
-struct local_binding *c, *n;
-LIST_FOR_EACH_SAFE (c, n, node, >children) {
-ovs_list_remove(>node);
-free(c->name);
-free(c);
-}
+local_binding_destroy(lbinding);
 }
 
 shash_destroy(local_bindings);
@@ -651,6 +665,22 @@ release_lport(const struct sbrec_port_binding *pb)
 }
 }
 
+static void
+release_local_binding_children(struct local_binding *lbinding)
+{
+struct local_binding *l;
+LIST_FOR_EACH (l, node, >children) {
+release_lport(l->pb);
+}
+}
+
+static void
+release_local_binding(struct local_binding *lbinding)
+{
+release_local_binding_children(lbinding);
+release_lport(lbinding->pb);
+}
+
 static void
 consider_port_binding_for_vif(const struct sbrec_port_binding *pb,
   struct binding_ctx_in *b_ctx_in,
@@ -725,7 +755,6 @@ consider_port_binding(const struct sbrec_port_binding *pb,
   struct binding_ctx_out *b_ctx_out,
   struct hmap *qos_map)
 {
-
 bool our_chassis = is_our_chassis(b_ctx_in->chassis_rec, pb,
   b_ctx_in->active_tunnels, NULL,
   b_ctx_out->local_lports);
@@ -818,6 +847,8 @@ build_local_bindings_from_local_ifaces(struct 
binding_ctx_in *b_ctx_in,
 lbinding->pb = pb;
 }
 sset_add(b_ctx_out->local_lports, iface_id);
+smap_replace(b_ctx_out->local_iface_ids, iface_rec->name,
+ iface_id);
 }
 
 /* Check if this is a tunnel interface. */
@@ -935,62 +966,6 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct 
binding_ctx_out *b_ctx_out)
 hmap_destroy(_map);
 }
 
-/* Returns true if port-binding changes potentially require flow changes on
- * the current chassis. Returns false if we are sure there is no impact. */
-bool
-binding_evaluate_port_binding_changes(struct binding_ctx_in *b_ctx_in,
-  struct binding_ctx_out *b_ctx_out)
-{
-if (!b_ctx_in->chassis_rec) {
-return true;
-}
-
-bool changed = false;
-
-const struct sbrec_port_binding *binding_rec;
-SBREC_PORT_BINDING_TABLE_FOR_EACH_TRACKED (binding_rec,
-   b_ctx_in->port_binding_table) {
-/* XXX: currently OVSDB change tracking doesn't 

[ovs-dev] [RFC v2 ovn 4/6] ovn-controller: I-P for datapath binding

2020-03-03 Thread numans
From: Numan Siddique 

This patch adds partial support of incremental processing of datapath binding.
If a datapath is deleted, then a full recompute is triggered.

Signed-off-by: Numan Siddique 
---
 controller/ovn-controller.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 8502b17ed..ec365dd74 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -1189,6 +1189,29 @@ runtime_data_sb_port_binding_handler(struct engine_node 
*node, void *data)
 return binding_handle_port_binding_changes(_ctx_in, _ctx_out);
 }
 
+static bool
+runtime_data_sb_datapath_binding_handler(struct engine_node *node OVS_UNUSED,
+ void *data OVS_UNUSED)
+{
+struct sbrec_datapath_binding_table *dp_table =
+(struct sbrec_datapath_binding_table *)EN_OVSDB_GET(
+engine_get_input("SB_datapath_binding", node));
+const struct sbrec_datapath_binding *dp;
+struct ed_type_runtime_data *rt_data = data;
+
+SBREC_DATAPATH_BINDING_TABLE_FOR_EACH_TRACKED (dp, dp_table) {
+if (sbrec_datapath_binding_is_deleted(dp)) {
+if (get_local_datapath(_data->local_datapaths,
+   dp->tunnel_key)) {
+return false;
+}
+}
+}
+
+engine_set_node_state(node, EN_VALID);
+return true;
+}
+
 /* Connection tracking zones. */
 struct ed_type_ct_zones {
 unsigned long bitmap[BITMAP_N_LONGS(MAX_CT_ZONES)];
@@ -1935,7 +1958,8 @@ main(int argc, char *argv[])
 engine_add_input(_runtime_data, _ovs_qos, NULL);
 
 engine_add_input(_runtime_data, _sb_chassis, NULL);
-engine_add_input(_runtime_data, _sb_datapath_binding, NULL);
+engine_add_input(_runtime_data, _sb_datapath_binding,
+ runtime_data_sb_datapath_binding_handler);
 engine_add_input(_runtime_data, _sb_port_binding,
  runtime_data_sb_port_binding_handler);
 
-- 
2.24.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC v2 ovn 2/6] ovn-controller: Refactor binding.c

2020-03-03 Thread numans
From: Numan Siddique 

Signed-off-by: Numan Siddique 
---
 controller/binding.c| 682 ++--
 controller/binding.h|  16 +-
 controller/ovn-controller.c |  49 ++-
 controller/pinctrl.c|  19 +-
 controller/pinctrl.h|   4 +-
 5 files changed, 466 insertions(+), 304 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 5742e5111..38b426bff 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -69,47 +69,6 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 ovsdb_idl_add_column(ovs_idl, _qos_col_type);
 }
 
-static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int,
-struct shash *lport_to_iface,
-struct sset *local_lports,
-struct sset *egress_ifaces)
-{
-int i;
-
-for (i = 0; i < br_int->n_ports; i++) {
-const struct ovsrec_port *port_rec = br_int->ports[i];
-const char *iface_id;
-int j;
-
-if (!strcmp(port_rec->name, br_int->name)) {
-continue;
-}
-
-for (j = 0; j < port_rec->n_interfaces; j++) {
-const struct ovsrec_interface *iface_rec;
-
-iface_rec = port_rec->interfaces[j];
-iface_id = smap_get(_rec->external_ids, "iface-id");
-int64_t ofport = iface_rec->n_ofport ? *iface_rec->ofport : 0;
-
-if (iface_id && ofport > 0) {
-shash_add(lport_to_iface, iface_id, iface_rec);
-sset_add(local_lports, iface_id);
-}
-
-/* Check if this is a tunnel interface. */
-if (smap_get(_rec->options, "remote_ip")) {
-const char *tunnel_iface
-= smap_get(_rec->status, "tunnel_egress_iface");
-if (tunnel_iface) {
-sset_add(egress_ifaces, tunnel_iface);
-}
-}
-}
-}
-}
-
 static void
 add_local_datapath__(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
  struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
@@ -469,170 +428,6 @@ is_our_chassis(const struct sbrec_chassis *chassis_rec,
 return our_chassis;
 }
 
-/* Updates 'binding_rec' and if the port binding is local also updates the
- * local datapaths and ports.
- * Updates and returns the array of local virtual ports that will require
- * additional processing.
- */
-static const struct sbrec_port_binding **
-consider_local_datapath(const struct sbrec_port_binding *binding_rec,
-struct hmap *qos_map,
-const struct ovsrec_interface *iface_rec,
-struct binding_ctx_in *b_ctx_in,
-struct binding_ctx_out *b_ctx_out,
-const struct sbrec_port_binding **vpbs,
-size_t *n_vpbs, size_t *n_allocated_vpbs)
-{
-bool our_chassis = is_our_chassis(b_ctx_in->chassis_rec, binding_rec,
-  b_ctx_in->active_tunnels, iface_rec,
-  b_ctx_out->local_lports);
-if (iface_rec
-|| (binding_rec->parent_port && binding_rec->parent_port[0] &&
-sset_contains(b_ctx_out->local_lports, binding_rec->parent_port))) 
{
-if (binding_rec->parent_port && binding_rec->parent_port[0]) {
-/* Add child logical port to the set of all local ports. */
-sset_add(b_ctx_out->local_lports, binding_rec->logical_port);
-}
-add_local_datapath(b_ctx_in->sbrec_datapath_binding_by_key,
-   b_ctx_in->sbrec_port_binding_by_datapath,
-   b_ctx_in->sbrec_port_binding_by_name,
-   binding_rec->datapath, false,
-   b_ctx_out->local_datapaths);
-if (iface_rec && qos_map && b_ctx_in->ovs_idl_txn) {
-get_qos_params(binding_rec, qos_map);
-}
-} else if (!strcmp(binding_rec->type, "l2gateway")) {
-if (our_chassis) {
-sset_add(b_ctx_out->local_lports, binding_rec->logical_port);
-add_local_datapath(b_ctx_in->sbrec_datapath_binding_by_key,
-   b_ctx_in->sbrec_port_binding_by_datapath,
-   b_ctx_in->sbrec_port_binding_by_name,
-   binding_rec->datapath, false,
-   b_ctx_out->local_datapaths);
-}
-} else if (!strcmp(binding_rec->type, "chassisredirect")) {
-if (ha_chassis_group_contains(binding_rec->ha_chassis_group,
-  b_ctx_in->chassis_rec)) {
-add_local_datapath(b_ctx_in->sbrec_datapath_binding_by_key,
-   b_ctx_in->sbrec_port_binding_by_datapath,
-   b_ctx_in->sbrec_port_binding_by_name,
-   binding_rec->datapath, false,
-   

[ovs-dev] [RFC v2 ovn 0/6] Incremental processing improvements.

2020-03-03 Thread numans
From: Numan Siddique 

This patch series tries to handle port binding, datapath binding
and ovs interface changes incrementally.

First 2 patches does some refactoring. Submitting this as RFC to get
some comments.

Still need to update the proper commit messages and fix one test case
related to Incremental processing which is failing. Another test case
related to Interconnection is failing.

Below are the results of some testing I did with ovn-fake-multinode
setup

Test setup
--
 1. ovn-central fake node running OVN dbs and 2 compute nodes running
ovn-controller.

 2. Before running the tests, used an existing OVN db with the below
resources
   No of logical switches - 53
   No of logical ports- 1256
   No of logical routers  - 9
   No of logical router ports - 56
   No of port groups  - 152
   No of logical flows- 45447

   Port bindings on compute-1 -  19
   Port bindings on compute-2 -  18
   No of OF flows on compute-1 - 84996
   No of OF flows on compute-2 - 84901

 3. The test does the following
- Creates 2 logical switches (one for each compute node) and connect to a
  logical router for each compute node.
- 100 logical ports are created (50 per lswitch), a simple ACL is added and 
the address
  set is created for each port.
- Each port is bound on the respective compute node and the test
  pings the IP of the port (from another port belonging to the same
  lswitch created earlier).


Below are the results with OVN master

+--+
|   Response Times (sec)
   |
+--+++++++-+---+
| action   | min| median | 90%ile | 95%ile | max
| avg| success | count |
+--+++++++-+---+
| ovn.create_or_update_address_set | 0.484  | 0.507  | 0.526  | 0.537  | 0.547  
| 0.51   | 100.0%  | 100   |
| ovn.create_port_acls | 0.948  | 1.014  | 1.05   | 1.06   | 1.089  
| 1.016  | 100.0%  | 100   |
| ovn_network.bind_port| 1.244  | 1.294  | 1.333  | 1.346  | 1.397  
| 1.301  | 100.0%  | 100   |
| ovn.bind_ovs_vm  | 0.379  | 0.432  | 0.463  | 0.469  | 0.472  
| 0.434  | 100.0%  | 100   |
| ovn.bind_internal_vm | 0.824  | 0.861  | 0.897  | 0.906  | 0.941  
| 0.868  | 100.0%  | 100   |
| ovn_network.wait_port_ping   | 7.153  | 7.183  | 7.214  | 7.22   | 7.237  
| 7.185  | 100.0%  | 100   |
| total| 10.373 | 10.479 | 10.544 | 10.551 | 10.581 
| 10.483 | 100.0%  | 100   |
+--+++++++-+---+
Load duration: 1048.8911039829254
Full duration: 1050.4055325984955

Below are the results with these patches

+---+
| Response Times (sec)  
|
+--+---++++---+---+-+---+
| action   | min   | median | 90%ile | 95%ile | max   | 
avg   | success | count |
+--+---++++---+---+-+---+
| ovn.create_or_update_address_set | 0.479 | 0.492  | 0.513  | 0.52   | 0.531 | 
0.495 | 100.0%  | 100   |
| ovn.create_port_acls | 0.946 | 0.964  | 0.998  | 1.001  | 1.041 | 
0.969 | 100.0%  | 100   |
| ovn_network.bind_port| 1.22  | 1.297  | 1.342  | 1.36   | 1.402 | 
1.298 | 100.0%  | 100   |
| ovn.bind_ovs_vm  | 0.373 | 0.418  | 0.436  | 0.455  | 0.472 | 
0.42  | 100.0%  | 100   |
| ovn.bind_internal_vm | 0.818 | 0.88   | 0.911  | 0.93   | 0.982 | 
0.878 | 100.0%  | 100   |
| ovn_network.wait_port_ping   | 3.812 | 3.854  | 3.906  | 3.934  | 4.015 | 
3.861 | 100.0%  | 100   |
| total| 6.977 | 7.165  | 7.277  | 7.308  | 7.401 | 
7.176 | 100.0%  | 100   |
+--+---++++---+---+-+---+
Load duration: 718.401807308197
Full duration: 719.8932237625122


Testing the same with fresh OVN dbs didn't show any improvements in the
timings.


v1 -> v2
-
 * Added 2 new patches
 * Patch 5 (ofctrl_check_and_add_flow) was submitted earlier too and
   the previous discussion is here - https://patchwork.ozlabs.org/patch/1202417/
 * Patch 6 handles I-P for ct_zone and OVS interface changes in
   flow_output_run stage.
Numan Siddique (6):
  Refactor binding_run()to take two context argument - binding_ctx_in
and binding_ctx_out.
  

[ovs-dev] YOUR APPROVED REWARD:| REF:- CCMML/LON/709/SGT43658/2020

2020-03-03 Thread Mrs. GRACE DAVIS (U.K COMMUNICATIONS DEPT).,










Sir/Madam, 
Attn; The Manager / Director, CEO 


Ref SUB| COCA COLA REWARD LONDON, UNITED KINGDOM. 


REF STATUS: YOUR EMAIL DATA-PROFILE AND/OR NAME HAS BEEN SHORT-LISTED AMONGST 
OTHER 54 BENEFICIARIES; 
FOR YOU TO RECEIVE THE CASH REWARD SUM OF 1,265,000.00 MILLION GBP FROM THE 
COCA-COLA MEGA MILLIONS SWEEPSTAKES. 

THIS REWARD IS FOR OUR 10 YEARS ANNIVERSARY CELEBRATION OF SOUTH AFRICA'S 
SUCCESSFUL WORLD CUP HOSTED IN 2010 
AND FOR OUR OFFICIAL SPONSORSHIP AGAIN IN THE COMING YEAR, 2022 IN QATAR: 
https://www.fifa.com/worldcup/qatar2022/ 
W HICH WAS LAUGHED LAST WEEKEND HERE IN LONDON, UNITED KINGDOM . 
FOR CLAMS, Your Should Submit Your Details Here In Our Office To Receive The 
Reward Certified Bank Draft Of Your Funds, 
Delivered To You In Your Country, Through The Office Of MR.. SMITH OTIS of 
FTMGTLLC S/A., The Accredite d Fiduciary 

Consultant In South Africa Who Shall Contact You For Your Reward Prize Within 2 
- 3 Working Days. 

* TITLE & YOUR NAMES IN FULL:- 

* YOUR MOBILE PHONE No.: OR WHATSAPP PHONE No.:- 

* FULL RESIDENTIAL ADDRESS (WITH SUBURB, CITY or STATE AND COUNTRY):- * YOUR 
REWARD REF: CCMML/LON/709/SGT43658/2020 

PAYMENT OPTIONS: - (1) (EFT) ELECTRONIC FUNDS TRANSFER:- 
(2) VISA GOLD ATM BANK CARD (STRICTLY COLLECTIONS IN PERSONA):- 
THEREFORE, YOU SHOULD CONTACT OUR FIDUCIARY CONSULTANT; 

MR. SMITH OTIS 

EMAIL: jsmithsqag...@englandmail.com 

TEL/FAX: 0027(086)8573119 



THANKS IN ANTICIPATION OF YOUR PUNCTUALITY REPLY THROUGH OUR OFFICE EMAIL 
BELOW. 
WARMEST REGARDS, 

/ 

-- 
MRS. GRACE DAVIS | U.K COMMUNICATIONS DEPT., 
(OFFICE TEL: 0044-(0743)-8573119 | EMAIL: ADD EMAIL HERE 
U.K HEAD OF COMMUNICATIONS | UXBRIDGE UB8 1EZ, UNITED KINGDOM 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-offload-tc: Flush rules on ingress block when init tc flow api

2020-03-03 Thread Ilya Maximets
On 3/3/20 4:29 PM, Dmytro Linkin wrote:
> On Tue, Mar 03, 2020 at 02:15:21PM +0100, Ilya Maximets wrote:
>> On 3/3/20 8:58 AM, Roi Dayan wrote:
>>>
>>>
>>> On 2020-02-27 5:22 PM, Roi Dayan wrote:
 From: Dmytro Linkin 

 OVS can fail to attach ingress block on iface when init tc flow api,
 if block already exist with rules on it and is shared with other iface.
 Fix by flush all existing rules on the ingress block prior to deleting
 it.

 Fixes: 093c9458fb02 ("tc: allow offloading of block ids")
 Signed-off-by: Dmytro Linkin 
 Acked-by: Raed Salem 
 Acked-by: Roi Dayan 
 ---
  lib/netdev-offload-tc.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

 diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
 index 550e440b3a45..b5ff6ccca55e 100644
 --- a/lib/netdev-offload-tc.c
 +++ b/lib/netdev-offload-tc.c
 @@ -1907,6 +1907,7 @@ netdev_tc_init_flow_api(struct netdev *netdev)
  static struct ovsthread_once block_once = OVSTHREAD_ONCE_INITIALIZER;
  enum tc_qdisc_hook hook = get_tc_qdisc_hook(netdev);
  uint32_t block_id = 0;
 +struct tcf_id id;
  int ifindex;
  int error;
  
 @@ -1917,6 +1918,14 @@ netdev_tc_init_flow_api(struct netdev *netdev)
  return -ifindex;
  }
  
 +block_id = get_block_id_from_netdev(netdev);
 +
 +/* Flush rules explicitly needed when we work with ingress_block,
 + * so we will not fail with reattaching block to bond iface, for ex.
 + */
 +id = tc_make_tcf_id(ifindex, block_id, 0, hook);
 +tc_del_filter();
 +
  /* make sure there is no ingress/egress qdisc */
  tc_add_del_qdisc(ifindex, false, 0, hook);
  
 @@ -1930,7 +1939,6 @@ netdev_tc_init_flow_api(struct netdev *netdev)
  ovsthread_once_done(_mask_once);
  }
  
 -block_id = get_block_id_from_netdev(netdev);
  error = tc_add_del_qdisc(ifindex, true, block_id, hook);
  
  if (error && error != EEXIST) {

>>>
>>> +ilya
>>>
>>> Hi Ilya,
>>>
>>> can you help review/ack this?
>>
>> Hi.  I'm not an expert in linux tc, but since you're asking...
>>
>> IIUC, the issue is that tc_add_del_qdisc(ifindex, true, block_id, hook)
>> fails on bond iface.  Is it correct?
> 
> At first tc_add_del_qdisc() fails on deletion, because qdisc, which is
> added to block, is in use (rules are exist). We just don't care about
> any error returned. Then tc_add_del_qdisc() fail to add it, because it
> wasn't deleted and in use.
> 
>> In this case I see one strange thing.  We're clearing ingress qdisk
>> for block_id == 0, but after that trying to create new one with
>> block_id == ifindex (for LAG interface).  Will it help if we delete
>> ingress qdisk providing correct block_id?  This sounds like something
>> sane to do.
>  
> Deleting block_id != 0 will fail, due to existing rules. But actually,
> deleting qdisc with block_id == 0 still delete correct block.
> 
> Anyway, the point here is to flush rules on specified ingress block.

Hmm.  OK.

Shouldn't we unconditionally flush rules from qdisk inside the
tc_add_del_qdisc() if deletion is requested?  From your explanation
it seems like a prerequisite for that and qdisk will not be deleted
if we will not flush rules.

BTW, one thing in this patch that doesn't look good is that you're
using block_id before probing the support.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-offload-tc: Flush rules on ingress block when init tc flow api

2020-03-03 Thread Dmytro Linkin
On Tue, Mar 03, 2020 at 02:15:21PM +0100, Ilya Maximets wrote:
> On 3/3/20 8:58 AM, Roi Dayan wrote:
> > 
> > 
> > On 2020-02-27 5:22 PM, Roi Dayan wrote:
> >> From: Dmytro Linkin 
> >>
> >> OVS can fail to attach ingress block on iface when init tc flow api,
> >> if block already exist with rules on it and is shared with other iface.
> >> Fix by flush all existing rules on the ingress block prior to deleting
> >> it.
> >>
> >> Fixes: 093c9458fb02 ("tc: allow offloading of block ids")
> >> Signed-off-by: Dmytro Linkin 
> >> Acked-by: Raed Salem 
> >> Acked-by: Roi Dayan 
> >> ---
> >>  lib/netdev-offload-tc.c | 10 +-
> >>  1 file changed, 9 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> >> index 550e440b3a45..b5ff6ccca55e 100644
> >> --- a/lib/netdev-offload-tc.c
> >> +++ b/lib/netdev-offload-tc.c
> >> @@ -1907,6 +1907,7 @@ netdev_tc_init_flow_api(struct netdev *netdev)
> >>  static struct ovsthread_once block_once = OVSTHREAD_ONCE_INITIALIZER;
> >>  enum tc_qdisc_hook hook = get_tc_qdisc_hook(netdev);
> >>  uint32_t block_id = 0;
> >> +struct tcf_id id;
> >>  int ifindex;
> >>  int error;
> >>  
> >> @@ -1917,6 +1918,14 @@ netdev_tc_init_flow_api(struct netdev *netdev)
> >>  return -ifindex;
> >>  }
> >>  
> >> +block_id = get_block_id_from_netdev(netdev);
> >> +
> >> +/* Flush rules explicitly needed when we work with ingress_block,
> >> + * so we will not fail with reattaching block to bond iface, for ex.
> >> + */
> >> +id = tc_make_tcf_id(ifindex, block_id, 0, hook);
> >> +tc_del_filter();
> >> +
> >>  /* make sure there is no ingress/egress qdisc */
> >>  tc_add_del_qdisc(ifindex, false, 0, hook);
> >>  
> >> @@ -1930,7 +1939,6 @@ netdev_tc_init_flow_api(struct netdev *netdev)
> >>  ovsthread_once_done(_mask_once);
> >>  }
> >>  
> >> -block_id = get_block_id_from_netdev(netdev);
> >>  error = tc_add_del_qdisc(ifindex, true, block_id, hook);
> >>  
> >>  if (error && error != EEXIST) {
> >>
> > 
> > +ilya
> > 
> > Hi Ilya,
> > 
> > can you help review/ack this?
> 
> Hi.  I'm not an expert in linux tc, but since you're asking...
> 
> IIUC, the issue is that tc_add_del_qdisc(ifindex, true, block_id, hook)
> fails on bond iface.  Is it correct?

At first tc_add_del_qdisc() fails on deletion, because qdisc, which is
added to block, is in use (rules are exist). We just don't care about
any error returned. Then tc_add_del_qdisc() fail to add it, because it
wasn't deleted and in use.

> In this case I see one strange thing.  We're clearing ingress qdisk
> for block_id == 0, but after that trying to create new one with
> block_id == ifindex (for LAG interface).  Will it help if we delete
> ingress qdisk providing correct block_id?  This sounds like something
> sane to do.
 
Deleting block_id != 0 will fail, due to existing rules. But actually,
deleting qdisc with block_id == 0 still delete correct block.

Anyway, the point here is to flush rules on specified ingress block.


Regards, Dmytro.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] lib: use acquire-release semantics for pvector size

2020-03-03 Thread Yanqin Wei
Hi Ilya,

Thanks for your comments. V4 has been updated, could you please check it again?

https://patchwork.ozlabs.org/patch/1248440/

Best Regards,
Wei Yanqin

> -Original Message-
> From: Ilya Maximets 
> Sent: Friday, February 28, 2020 9:40 PM
> To: Yanqin Wei ; d...@openvswitch.org
> Cc: nd ; Lijian Zhang ; Gavin Hu
> ; i.maxim...@ovn.org
> Subject: Re: [ovs-dev] [PATCH v3] lib: use acquire-release semantics for 
> pvector
> size
> 
> On 2/27/20 5:12 PM, Yanqin Wei wrote:
> > Read/write concurrency of pvector library is implemented by a temp
> > vector and RCU protection. Considering performance reason, insertion
> > does not follow this scheme.
> > In insertion function, a thread fence ensures size incrementation is
> > done after new entry is stored. But there is no barrier in the
> > iteration fuction(pvector_cursor_init). Entry point access may be
> > reorderd before
> 
> s/reorderd/reordered/
> 
> > loading vector size, so the invalid entry point may be loaded when
> > vector iteration.
> > This patch fixes it by acquire-release pair. It can guarantee new size
> > is observed by reader after new entry stored by writer. And this is
> > implemented by one-way barrier instead of two-way memory fence.
> >
> 
> I believe we need a 'Fixes' tag here.
> 
> > Reviewed-by: Gavin Hu 
> > Reviewed-by: Lijian Zhang 
> > Signed-off-by: Yanqin Wei 
> > ---
> >  lib/pvector.c | 18 +++---  lib/pvector.h | 12
> > 
> >  2 files changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/pvector.c b/lib/pvector.c index aaeee9214..f557b0559
> > 100644
> > --- a/lib/pvector.c
> > +++ b/lib/pvector.c
> > @@ -33,7 +33,7 @@ pvector_impl_alloc(size_t size)
> >  struct pvector_impl *impl;
> >
> >  impl = xmalloc(sizeof *impl + size * sizeof impl->vector[0]);
> > -impl->size = 0;
> > +atomic_init(>size, 0);
> >  impl->allocated = size;
> >
> >  return impl;
> > @@ -117,18 +117,22 @@ pvector_insert(struct pvector *pvec, void *ptr,
> > int priority)  {
> >  struct pvector_impl *temp = pvec->temp;
> >  struct pvector_impl *old = pvector_impl_get(pvec);
> > +size_t size;
> >
> >  ovs_assert(ptr != NULL);
> >
> > +/* There is no possible concurrent writer. Insertions must be protected
> > + * by mutex or be always excuted from the same thread */
> 
> Please, add a period to the end of the comment.
> 
> > +atomic_read_relaxed(>size, );
> > +
> >  /* Check if can add to the end without reallocation. */
> > -if (!temp && old->allocated > old->size &&
> > -(!old->size || priority <= old->vector[old->size - 1].priority)) {
> > -old->vector[old->size].ptr = ptr;
> > -old->vector[old->size].priority = priority;
> > +if (!temp && old->allocated > size &&
> > +(!size || priority <= old->vector[size - 1].priority)) {
> > +old->vector[size].ptr = ptr;
> > +old->vector[size].priority = priority;
> >  /* Size increment must not be visible to the readers before the new
> >   * entry is stored. */
> > -atomic_thread_fence(memory_order_release);
> > -++old->size;
> > +atomic_store_explicit(>size, size + 1,
> > + memory_order_release);
> >  } else {
> >  if (!temp) {
> >  temp = pvector_impl_dup(old); diff --git a/lib/pvector.h
> > b/lib/pvector.h index b990ed9d5..55b725ba7 100644
> > --- a/lib/pvector.h
> > +++ b/lib/pvector.h
> > @@ -69,8 +69,8 @@ struct pvector_entry {  };
> >
> >  struct pvector_impl {
> > -size_t size;   /* Number of entries in the vector. */
> > -size_t allocated;  /* Number of allocated entries. */
> > +atomic_size_t size;   /* Number of entries in the vector. */
> > +size_t allocated; /* Number of allocated entries. */
> >  struct pvector_entry vector[];
> >  };
> >
> > @@ -181,12 +181,16 @@ pvector_cursor_init(const struct pvector *pvec,
> > {
> >  const struct pvector_impl *impl;
> >  struct pvector_cursor cursor;
> > +size_t size;
> >
> >  impl = ovsrcu_get(struct pvector_impl *, >impl);
> >
> > -ovs_prefetch_range(impl->vector, impl->size * sizeof impl->vector[0]);
> > +/* Use memory_order_acquire to ensure entry access can not be
> > + * reordered to happen before size read */
> 
> Ditto.
> 
> > +atomic_read_explicit(>size, , memory_order_acquire);
> 
> sparse complains here about 'const' qualifier:
> 
> lib/pvector.h:190:5: error: incorrect type in argument 1 (different modifiers)
> lib/pvector.h:190:5:expected void *
> lib/pvector.h:190:5:got unsigned long const *
> 
> You need something like this:
> 
> atomic_read_explicit(_CAST(struct pvector_impl *, impl)->size,
>  , memory_order_acquire);
> 
> > +ovs_prefetch_range(impl->vector, size * sizeof impl->vector[0]);
> >
> > -cursor.size = impl->size;
> > +cursor.size = size;
> >  cursor.vector = impl->vector;
> >  cursor.entry_idx = -1;
> 

[ovs-dev] [PATCH v4] lib: use acquire-release semantics for pvector size

2020-03-03 Thread Yanqin Wei
Read/write concurrency of pvector library is implemented by a temp vector
and RCU protection. Considering performance reason, insertion does not
follow this scheme.
In insertion function, a thread fence ensures size incrementation is done
after new entry is stored. But there is no barrier in the iteration
fuction(pvector_cursor_init). Entry point access may be reordered before
loading vector size, so the invalid entry point may be loaded when vector
iteration.
This patch fixes it by acquire-release pair. It can guarantee new size is
observed by reader after new entry stored by writer. And this is
implemented by one-way barrier instead of two-way memory fence.

Fixes: fe7cfa5c3f19 ("lib/pvector: Non-intrusive RCU priority vector.")
Reviewed-by: Gavin Hu 
Reviewed-by: Lijian Zhang 
Signed-off-by: Yanqin Wei 
---
 lib/pvector.c | 18 +++---
 lib/pvector.h | 13 +
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/lib/pvector.c b/lib/pvector.c
index aaeee9214..cc527fdc4 100644
--- a/lib/pvector.c
+++ b/lib/pvector.c
@@ -33,7 +33,7 @@ pvector_impl_alloc(size_t size)
 struct pvector_impl *impl;
 
 impl = xmalloc(sizeof *impl + size * sizeof impl->vector[0]);
-impl->size = 0;
+atomic_init(>size, 0);
 impl->allocated = size;
 
 return impl;
@@ -117,18 +117,22 @@ pvector_insert(struct pvector *pvec, void *ptr, int 
priority)
 {
 struct pvector_impl *temp = pvec->temp;
 struct pvector_impl *old = pvector_impl_get(pvec);
+size_t size;
 
 ovs_assert(ptr != NULL);
 
+/* There is no possible concurrent writer. Insertions must be protected
+ * by mutex or be always excuted from the same thread. */
+atomic_read_relaxed(>size, );
+
 /* Check if can add to the end without reallocation. */
-if (!temp && old->allocated > old->size &&
-(!old->size || priority <= old->vector[old->size - 1].priority)) {
-old->vector[old->size].ptr = ptr;
-old->vector[old->size].priority = priority;
+if (!temp && old->allocated > size &&
+(!size || priority <= old->vector[size - 1].priority)) {
+old->vector[size].ptr = ptr;
+old->vector[size].priority = priority;
 /* Size increment must not be visible to the readers before the new
  * entry is stored. */
-atomic_thread_fence(memory_order_release);
-++old->size;
+atomic_store_explicit(>size, size + 1, memory_order_release);
 } else {
 if (!temp) {
 temp = pvector_impl_dup(old);
diff --git a/lib/pvector.h b/lib/pvector.h
index b990ed9d5..c5024487f 100644
--- a/lib/pvector.h
+++ b/lib/pvector.h
@@ -69,8 +69,8 @@ struct pvector_entry {
 };
 
 struct pvector_impl {
-size_t size;   /* Number of entries in the vector. */
-size_t allocated;  /* Number of allocated entries. */
+atomic_size_t size;   /* Number of entries in the vector. */
+size_t allocated; /* Number of allocated entries. */
 struct pvector_entry vector[];
 };
 
@@ -181,12 +181,17 @@ pvector_cursor_init(const struct pvector *pvec,
 {
 const struct pvector_impl *impl;
 struct pvector_cursor cursor;
+size_t size;
 
 impl = ovsrcu_get(struct pvector_impl *, >impl);
 
-ovs_prefetch_range(impl->vector, impl->size * sizeof impl->vector[0]);
+/* Use memory_order_acquire to ensure entry access can not be
+ * reordered to happen before size read. */
+atomic_read_explicit(_CAST(struct pvector_impl *, impl)->size,
+, memory_order_acquire);
+ovs_prefetch_range(impl->vector, size * sizeof impl->vector[0]);
 
-cursor.size = impl->size;
+cursor.size = size;
 cursor.vector = impl->vector;
 cursor.entry_idx = -1;
 
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Enter quiescent state after each offloading operation.

2020-03-03 Thread Eli Britstein


On 2/28/2020 11:03 AM, Ilya Maximets wrote:

On 2/27/20 10:00 AM, Eli Britstein wrote:

On 2/26/2020 2:31 PM, Ilya Maximets wrote:

On 2/26/20 1:05 PM, Eli Britstein wrote:

On 2/24/2020 1:06 PM, Ilya Maximets wrote:

On 2/23/20 3:32 PM, Eli Britstein wrote:

On 2/21/2020 4:54 PM, Ilya Maximets wrote:

If the offloading queue is big and filled continuously, offloading
thread may have no chance to quiesce blocking rcu callbacks and
other threads waiting for synchronization.

Fix that by entering momentary quiescent state after each operation
since we're not holding any rcu-protected memory here.

Fixes: 02bb2824e51d ("dpif-netdev: do hw flow offload in a thread")
Reported-by: Eli Britstein 
Reported-at: 
https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-discuss%2F2020-February%2F049768.htmldata=02%7C01%7Celibr%40mellanox.com%7Ceb2c70875cb14a4d4b4108d7bc2d2676%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637184774411852418sdata=8YgXyAJ248Mqhiq4tbUcAb357Dqdu1ue%2BBXp%2F6pX6tY%3Dreserved=0
Signed-off-by: Ilya Maximets 
---
     lib/dpif-netdev.c | 1 +
     1 file changed, 1 insertion(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index d393aab5e..a798db45d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2512,6 +2512,7 @@ dp_netdev_flow_offload_main(void *data OVS_UNUSED)
     VLOG_DBG("%s to %s netdev flow\n",
  ret == 0 ? "succeed" : "failed", op);
     dp_netdev_free_flow_offload(offload);
+    ovsrcu_quiesce();

This seems to solve the issue of responsiveness, but I have encountered a crash 
using it, while there is a lot of flow deletions.

#0  0x01e462dc in get_unaligned_u32 (p=0x7f57f7813500) at 
lib/unaligned.h:86
#1  0x01e46388 in hash_bytes (p_=0x7f57f7813500, n=16, basis=0) at 
lib/hash.c:38
#2  0x01f7f836 in ufid_to_rte_flow_data_find (ufid=0x7f57f7813500) at 
lib/netdev-offload-dpdk.c:134
#3  0x01f88693 in netdev_offload_dpdk_flow_del (netdev=0x42260da0, 
ufid=0x7f57f7813500, stats=0x0) at lib/netdev-offload-dpdk.c:3361
#4  0x01e6a00d in netdev_flow_del (netdev=0x42260da0, 
ufid=0x7f57f7813500, stats=0x0) at lib/netdev-offload.c:296

This might mean 2 things:
1. Incorrect usage of RCU protected data inside netdev-offload-dpdk.
2. We're not allowed to quiesce if we have any item in offloading queue,
  i.e. bad RCU usage scheme in dpif-netdev.

Both cases are bad and I'm not sure if we can easily fix case #2.
I'll take a look and try to find a root cause.

It was my bad. Sorry. This is another issue, that exists in my branch only (CT 
offloads), so it's not related. I was able to fix it.

OK.  Good to know.


However, I wanted to reproduce with master branch, but then I cannot create a lot of 
flows as I get: "upcall: datapath flow limit reached".

I tried setting large number using "ovs-appctl upcall/set-flow-limit 40", 
but it didn't help.

Number of flows in dpif-netdev is artificially limited by MAX_FLOWS (65536)
value.  There was a recent discussion about this limit and I think we could
just remove it.  I have a small patch for this, will send soon.
For now, you can just raise the hardcoded value of MAX_FLOWS in dpif-netdev.c.

The issue is not in dpif-netdev.c. It's in ofproto/ofproto-dpif-upcall.c. If 
I'm not wrong, when it happens, there is not attempt to install the flow, so 
anyway MAX_FLOWS in dpif-netdev.c is not the blocker.

There are couple of things that controls the maximum number of datapath flows.
First is the other_config:flow-limit that defaults to 20.  Second is the
datapath internal MAX_FLOWS define.  And also, "ovs-appctl 
upcall/set-flow-limit"
should be able to modify this value.  Sounds strange.


BTW, my setup is the PF and VF rep in a bridge. I run testpmd with macswap on 
the VF, and send 10k different src MAC packets (OpenFlow configuration is 
NORMAL). So, I expect total 20k flows, also less than 64k of MAX_FLOWS.

Are you sure that packets are not corrupted?  Could you dump datapath flows
and verify that these flows are what you've expected?


This is for example for 4 different MACs 
(e4:10:00:00:00:01-e4:10:00:00:00:04). As expected:


$ ovs-appctl dpctl/dump-flows -m
flow-dump from pmd on cpu core: 8
ufid:65aae19f-06a9-4887-8d2d-60b04187f7cd, 
skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(vm1),packet_type(ns=0,id=0),eth(src=e4:11:00:00:00:01,dst=e4:10:00:00:00:01),eth_type(0x0800),ipv4(src=1.1.1.1/0.0.0.0,dst=1.1.1.2/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=1025/0,dst=1025/0), 
packets:11316766, bytes:1244844260, used:0.013s, offloaded:yes, dp:dpdk, 
actions:pf, dp-extra-info:miniflow_bits(5,1)
ufid:a0e4b9bc-8973-41d1-b2cf-29f2a14935d8, 

Re: [ovs-dev] [PATCH] netdev-offload-tc: Flush rules on ingress block when init tc flow api

2020-03-03 Thread Ilya Maximets
On 3/3/20 8:58 AM, Roi Dayan wrote:
> 
> 
> On 2020-02-27 5:22 PM, Roi Dayan wrote:
>> From: Dmytro Linkin 
>>
>> OVS can fail to attach ingress block on iface when init tc flow api,
>> if block already exist with rules on it and is shared with other iface.
>> Fix by flush all existing rules on the ingress block prior to deleting
>> it.
>>
>> Fixes: 093c9458fb02 ("tc: allow offloading of block ids")
>> Signed-off-by: Dmytro Linkin 
>> Acked-by: Raed Salem 
>> Acked-by: Roi Dayan 
>> ---
>>  lib/netdev-offload-tc.c | 10 +-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> index 550e440b3a45..b5ff6ccca55e 100644
>> --- a/lib/netdev-offload-tc.c
>> +++ b/lib/netdev-offload-tc.c
>> @@ -1907,6 +1907,7 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>>  static struct ovsthread_once block_once = OVSTHREAD_ONCE_INITIALIZER;
>>  enum tc_qdisc_hook hook = get_tc_qdisc_hook(netdev);
>>  uint32_t block_id = 0;
>> +struct tcf_id id;
>>  int ifindex;
>>  int error;
>>  
>> @@ -1917,6 +1918,14 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>>  return -ifindex;
>>  }
>>  
>> +block_id = get_block_id_from_netdev(netdev);
>> +
>> +/* Flush rules explicitly needed when we work with ingress_block,
>> + * so we will not fail with reattaching block to bond iface, for ex.
>> + */
>> +id = tc_make_tcf_id(ifindex, block_id, 0, hook);
>> +tc_del_filter();
>> +
>>  /* make sure there is no ingress/egress qdisc */
>>  tc_add_del_qdisc(ifindex, false, 0, hook);
>>  
>> @@ -1930,7 +1939,6 @@ netdev_tc_init_flow_api(struct netdev *netdev)
>>  ovsthread_once_done(_mask_once);
>>  }
>>  
>> -block_id = get_block_id_from_netdev(netdev);
>>  error = tc_add_del_qdisc(ifindex, true, block_id, hook);
>>  
>>  if (error && error != EEXIST) {
>>
> 
> +ilya
> 
> Hi Ilya,
> 
> can you help review/ack this?

Hi.  I'm not an expert in linux tc, but since you're asking...

IIUC, the issue is that tc_add_del_qdisc(ifindex, true, block_id, hook)
fails on bond iface.  Is it correct?

In this case I see one strange thing.  We're clearing ingress qdisk
for block_id == 0, but after that trying to create new one with
block_id == ifindex (for LAG interface).  Will it help if we delete
ingress qdisk providing correct block_id?  This sounds like something
sane to do.

What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovsdb-idl.c: Clear conditions when clearing IDL.

2020-03-03 Thread Dumitru Ceara
CC-ing Andy and Ben.

On 3/3/20 7:43 AM, Han Zhou wrote:
> 
> 
> On Mon, Mar 2, 2020 at 7:55 AM Dumitru Ceara  > wrote:
>>
>> On 2/29/20 12:00 AM, Dumitru Ceara wrote:
>> > If the ovsdb-server reply to "monitor_cond_since" requests has
>> > "found" == false then ovsdb_idl_db_parse_monitor_reply() calls
>> > ovsdb_idl_db_clear() which iterates through all tables and
>> > unconditionally sets table->cond_changed to false.
>> >
>> > However, if the client had already set a new condition for some of the
>> > tables, this new condition request will never be sent to ovsdb-server
>> > until the condition is reset to a different value. This is due to the
>> > check in ovsdb_idl_db_set_condition().
>> >
>> > In order to fix this we now also call ovsdb_idl_condition_clear() for
>> > each table condition in ovsdb_idl_db_clear(). This ensures that
>> > resetting the condition to the same value as before the
>> > "monitor_cond_since" reply will trigger a new request.
>> >
>> > One way to replicate the issue is described in the bugzilla reporting
>> > the bug, when ovn-controller is configured to use "ovn-monitor-all":
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1808125#c6
>> >
>> > Reported-by: Dan Williams mailto:d...@redhat.com>>
>> > Reported-at: https://bugzilla.redhat.com/1808125
>> > CC: Han Zhou mailto:hz...@ovn.org>>
>> > Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when
> connection reset.")
>> > Signed-off-by: Dumitru Ceara  >
>>
>> Hi Han,
>>
>> I was wondering if you could review this change because ovn-kubernetes
>> is trying to enable ovn-monitor-all for scalability reasons but their CI
>> fails due to the bug fixed by the patch below.
>>
>> Thanks,
>> Dumitru
>>
>> > ---
>> >  lib/ovsdb-idl.c | 6 +-
>> >  1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
>> > index 190143f..64721ca 100644
>> > --- a/lib/ovsdb-idl.c
>> > +++ b/lib/ovsdb-idl.c
>> > @@ -610,7 +610,11 @@ ovsdb_idl_db_clear(struct ovsdb_idl_db *db)
>> >          struct ovsdb_idl_table *table = >tables[i];
>> >          struct ovsdb_idl_row *row, *next_row;
>> >
>> > -        table->cond_changed = false;
>> > +        if (table->cond_changed) {
>> > +            table->cond_changed = false;
>> > +            ovsdb_idl_condition_clear(>condition);
>> > +        }
>> > +
>> >          if (hmap_is_empty(>rows)) {
>> >              continue;
>> >          }
>> >
>>
> 
> Hi Dumitru>
> If I understand the problem, it is when the client updated monitor
> condition and at the same time it received monitor_cond_since reply from
> server for it's previous conditional monitor request, and the handling
> of the reply overwrite the flag table->cond_changed from true to false,
> and because of this, the new condition is not sent to server.
>
Hi Han,

Yes, that's what's happening.

> However, I don't understand the fix. It seems that the fix should be
> "don't overwrite the flag", instead of clearing the conditions as well.
> Also, by simply looking at this code I don't understand why the flag
> needs to be set to false when clearing data in IDL. What would be the
> problem if we keep it as what it is? If you understand the details,
> could you explain more about the fix?
> 
> In addition, I didn't change this logic when implementing
> monitor_cond_since in 403a6a0cb003 ("ovsdb-idl: Fast resync from server
> when connection reset."). In that commit it only reduced the chance of
> calling ovsdb_idl_db_clear() as an optimization. Before that commit,
> ovsdb_idl_db_parse_monitor_reply() always calls ovsdb_idl_db_clear() and
> it would result in same problem.
You're right, my bad, I messed up the "git blame", sorry about that. It
was actually introduced by:

https://github.com/openvswitch/ovs/commit/5351980b047f4dd40be7a59a1e4b910df21eca0a

I'll fix up the commit message in V2 once we decide on the best approach.

Hi Andy, Ben,

As far as I understand, whenever ovsdb_idl_db_clear() is called during
reconnect, any buffered but unsent conditions should be cleared because
they will be anyway resent with the following monitoring condition
update message after reconnect has completed.

However, clearing the "table->cond_changed" flag is not enough because
when the IDL client (ovn-controller in my case) will try to set the same
condition as before (e.g., "true" if ovn-monitor-all=true)
ovsdb_idl_db_set_condition() will return early:

https://github.com/openvswitch/ovs/blob/master/lib/ovsdb-idl.c#L1518

Therefore in the cases when ovsdb_idl_db_clear() is called, we should
also clear the existing conditions on the DB tables.

As Han suggested, we could also revert the changes from 5351980b047f
("ovsdb-idl: Avoid sending redundant conditional monitoring updates")
and we wouldn't hit the issue anymore. I'm not sure however if this has
any other major implications.

What do you think?

Thanks,
Dumitru

> 
> Thanks,
> Han