[ovs-dev] [PATCH ovn 6/7] ovn-northd.c: Remove the use of the REGBIT_SKIP_LOOKUP_NEIGHBOR bit.

2020-07-22 Thread Han Zhou
In LR ingress stage LOOKUP_NEIGHBOR and LEARN_NEIGHBOR, the flag
REGBIT_SKIP_LOOKUP_NEIGHBOR was used to indicate if mac-binding
lookup can be skipped. This patch avoid using the bit by combining
it with the REGBIT_LOOKUP_NEIGHBOR_RESULT bit, and assigning 1
to REGBIT_LOOKUP_NEIGHBOR_RESULT serves same purpose of skipping
the lookup. There will be a new bit needed in a future patch, and
this change can avoid using too many bits unnecessarily.

Signed-off-by: Han Zhou 
---
 northd/ovn-northd.8.xml | 11 ---
 northd/ovn-northd.c |  6 ++
 2 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 741c928..9f2c70f 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1595,7 +1595,7 @@ next;
 
   
 A priority-0 fallback flow that matches all packets and applies
-the action reg9[3] = 1; next;
+the action reg9[2] = 1; next;
 advancing the packet to the next table.
   
 
@@ -1610,17 +1610,14 @@ next;
 
 
   reg9[2] will be 1 if the lookup_arp/lookup_nd
-  in the previous table was successful.
-
-
-
-  reg9[3] will be 1 if there was no need to do the lookup.
+  in the previous table was successful, or if there was no need to do the
+  lookup.
 
 
 
   
 A priority-100 flow with the match
-reg9[2] == 1 || reg9[3] == 1 and advances the packet
+reg9[2] == 1 and advances the packet
 to the next table as there is no need to learn the neighbor.
   
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 4e11a25..425f522 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -221,7 +221,6 @@ enum ovn_stage {
 /* Register to store the result of check_pkt_larger action. */
 #define REGBIT_PKT_LARGER"reg9[1]"
 #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[2]"
-#define REGBIT_SKIP_LOOKUP_NEIGHBOR "reg9[3]"
 
 /* Register to store the eth address associated to a router port for packets
  * received in S_ROUTER_IN_ADMISSION.
@@ -8275,14 +8274,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
   "lookup_nd(inport, ip6.src, nd.sll); next;");
 
 /* For other packet types, we can skip neighbor learning.
- * So set REGBIT_SKIP_LOOKUP_NEIGHBOR to 1. */
+ * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
 ovn_lflow_add(lflows, od, S_ROUTER_IN_LOOKUP_NEIGHBOR, 0, "1",
-  REGBIT_SKIP_LOOKUP_NEIGHBOR" = 1; next;");
+  REGBIT_LOOKUP_NEIGHBOR_RESULT" = 1; next;");
 
 /* Flows for LEARN_NEIGHBOR. */
 /* Skip Neighbor learning if not required. */
 ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 100,
-  REGBIT_SKIP_LOOKUP_NEIGHBOR" == 1 || "
   REGBIT_LOOKUP_NEIGHBOR_RESULT" == 1", "next;");
 
 ovn_lflow_add(lflows, od, S_ROUTER_IN_LEARN_NEIGHBOR, 90,
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 7/7] ovn-northd.c: Support optionally disabling neighbor learning from ARP request/NS.

2020-07-22 Thread Han Zhou
Support a new logical router option "always_learn_from_arp_request" that 
controls
behavior when handling ARP requests or IPv4 ND-NS packets.

"true" - Always learn the MAC/IP binding and add a new MAC_Binding entry
(default behavior)

"false" - If there is a MAC_binding for that IP and the MAC is different, or,
if TPA of ARP request belongs to any router port on this router, then
update/add that MAC/IP binding. Otherwise, don't update/add entries.

Reported-by: Girish Moodalbail 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
Signed-off-by: Han Zhou 
---
 northd/ovn-northd.8.xml | 109 
 northd/ovn-northd.c |  96 +++---
 ovn-nb.xml  |  27 
 tests/ovn.at|  18 
 4 files changed, 217 insertions(+), 33 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 9f2c70f..30936ab 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1537,58 +1537,126 @@ output;
 
   For each router port P that owns IP address A,
   which belongs to subnet S with prefix length L,
-  a priority-100 flow is added which matches
-  inport == P 
-  arp.spa == S/L  arp.op == 1
-  (ARP request) with the
-  following actions:
+  if the option always_learn_from_arp_request is
+  true for this router, a priority-100 flow is added which
+  matches inport == P  arp.spa ==
+  S/L  arp.op == 1 (ARP request)
+  with the following actions:
+
+
+
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+next;
+
+
+
+  If the option always_learn_from_arp_request is
+  false, the following two flows are added.
+
+
+
+  A priority-110 flow is added which matches inport ==
+  P  arp.spa == S/L
+   arp.tpa == A  arp.op == 1
+  (ARP request) with the following actions:
 
 
 
 reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = 1;
+next;
+
+
+
+  A priority-100 flow is added which matches inport ==
+  P  arp.spa == S/L
+   arp.op == 1 (ARP request) with the following
+  actions:
+
+
+
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = lookup_arp_ip(inport, arp.spa);
 next;
 
 
 
   If the logical router port P is a distributed gateway
   router port, additional match
-  is_chassis_resident(cr-P) is added so that
-  the resident gateway chassis handles the neighbor lookup.
+  is_chassis_resident(cr-P) is added for all
+  these flows.
 
   
 
   
 
   A priority-100 flow which matches on ARP reply packets and applies
-  the actions:
+  the actions if the option always_learn_from_arp_request
+  is true:
 
 
 
 reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
 next;
 
+
+
+  If the option always_learn_from_arp_request
+  is false, the above actions will be:
+
+
+
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[3] = 1;
+next;
+
+
   
 
   
 
   A priority-100 flow which matches on IPv6 Neighbor Discovery
-  advertisement packet and applies the actions:
+  advertisement packet and applies the actions if the option
+  always_learn_from_arp_request is true:
 
 
 
 reg9[2] = lookup_nd(inport, nd.target, nd.tll);
 next;
 
+
+
+  If the option always_learn_from_arp_request
+  is false, the above actions will be:
+
+
+
+reg9[2] = lookup_nd(inport, nd.target, nd.tll);
+reg9[3] = 1;
+next;
+
   
 
   
 
   A priority-100 flow which matches on IPv6 Neighbor Discovery
-  solicitation packet and applies the actions:
+  solicitation packet and applies the actions if the option
+  always_learn_from_arp_request is true:
+
+
+
+reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
+next;
+
+
+
+  If the option always_learn_from_arp_request
+  is false, the above actions will be:
 
 
 
 reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
+reg9[3] = lookup_nd_ip(inport, ip6.src);
 next;
 
   
@@ -1604,21 +1672,28 @@ next;
 
 
   This table adds flows to learn the mac bindings from the ARP and
-  IPv6 Neighbor Solicitation/Advertisement packets if ARP/ND lookup
-  failed in the previous table.
+  IPv6 Neighbor Solicitation/Advertisement packets if it is needed
+  according to the lookup results from the previous stage.
 
 
 
   reg9[2] will be 1 if the lookup_arp/lookup_nd
-  in the previous table was successful, or if there 

[ovs-dev] [PATCH ovn 0/7] Avoid ARP flow explosion.

2020-07-22 Thread Han Zhou
This patch series addresses the problem discussed in [0]. Below options need to
be configured for the gateway routers:

options:always_learn_from_arp_request = false
options:dynamic_neigh_routers = true

[0] - https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
Han Zhou (7):
  ovn-northd: Support optionally avoid static neighbor flows in routers.
  tests: Fix get_arp/get_nd tests mac-binding table id.
  actions: Rename xxx_lookup_mac to xxx_lookup_mac_bind.
  actions: Implement new actions lookup_arp_ip and lookup_nd_ip.
  ovn-northd.8.xml: Fix reg9 bits documentation.
  ovn-northd.c: Remove the use of the REGBIT_SKIP_LOOKUP_NEIGHBOR bit.
  ovn-northd.c: Support optionally disabling neighbor learning from ARP 
request/NS.

 controller/lflow.c  |   4 +-
 include/ovn/actions.h   |  10 
 lib/actions.c   | 133 
 northd/ovn-northd.8.xml | 123 +++-
 northd/ovn-northd.c | 108 +++
 ovn-nb.xml  |  40 +++
 ovn-sb.xml  |  55 
 tests/ovn.at|  76 +--
 tests/test-ovn.c|  11 ++--
 utilities/ovn-trace.c   |  67 
 10 files changed, 554 insertions(+), 73 deletions(-)

-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/7] tests: Fix get_arp/get_nd tests mac-binding table id.

2020-07-22 Thread Han Zhou
The table id used in test is not the same as the one used in
real implementation. Although it doesn't affect correctness, it
may cause confusion when people are studying test cases.

Signed-off-by: Han Zhou 
---
 tests/ovn.at |  8 
 tests/test-ovn.c | 11 ++-
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index e19efaf..b06642b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1181,10 +1181,10 @@ arp { };
 
 # get_arp
 get_arp(outport, ip4.dst);
-encodes as 
push:NXM_NX_REG0[],push:NXM_OF_IP_DST[],pop:NXM_NX_REG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,65),pop:NXM_NX_REG0[]
+encodes as 
push:NXM_NX_REG0[],push:NXM_OF_IP_DST[],pop:NXM_NX_REG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,66),pop:NXM_NX_REG0[]
 has prereqs eth.type == 0x800
 get_arp(inport, reg0);
-encodes as 
push:NXM_NX_REG15[],push:NXM_NX_REG0[],push:NXM_NX_XXREG0[96..127],push:NXM_NX_REG14[],pop:NXM_NX_REG15[],pop:NXM_NX_REG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,65),pop:NXM_NX_REG0[],pop:NXM_NX_REG15[]
+encodes as 
push:NXM_NX_REG15[],push:NXM_NX_REG0[],push:NXM_NX_XXREG0[96..127],push:NXM_NX_REG14[],pop:NXM_NX_REG15[],pop:NXM_NX_REG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,66),pop:NXM_NX_REG0[],pop:NXM_NX_REG15[]
 
 get_arp;
 Syntax error at `;' expecting `('.
@@ -1299,10 +1299,10 @@ nd_na_router { eth.src = 12:34:56:78:9a:bc; nd.tll = 
12:34:56:78:9a:bc; outport
 
 # get_nd
 get_nd(outport, ip6.dst);
-encodes as 
push:NXM_NX_XXREG0[],push:NXM_NX_IPV6_DST[],pop:NXM_NX_XXREG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,65),pop:NXM_NX_XXREG0[]
+encodes as 
push:NXM_NX_XXREG0[],push:NXM_NX_IPV6_DST[],pop:NXM_NX_XXREG0[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,66),pop:NXM_NX_XXREG0[]
 has prereqs eth.type == 0x86dd
 get_nd(inport, xxreg0);
-encodes as 
push:NXM_NX_REG15[],push:NXM_NX_REG14[],pop:NXM_NX_REG15[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,65),pop:NXM_NX_REG15[]
+encodes as 
push:NXM_NX_REG15[],push:NXM_NX_REG14[],pop:NXM_NX_REG15[],set_field:00:00:00:00:00:00->eth_dst,resubmit(,66),pop:NXM_NX_REG15[]
 get_nd;
 Syntax error at `;' expecting `('.
 get_nd();
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index c3bfd20..ba62828 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -39,6 +39,7 @@
 #include "openvswitch/shash.h"
 #include "simap.h"
 #include "util.h"
+#include "controller/lflow.h"
 
 /* --relops: Bitmap of the relational operators to test, in exhaustive test. */
 static unsigned int test_relops;
@@ -1333,11 +1334,11 @@ test_parse_actions(struct ovs_cmdl_context *ctx 
OVS_UNUSED)
 .meter_table = _table,
 
 .pipeline = OVNACT_P_INGRESS,
-.ingress_ptable = 8,
-.egress_ptable = 40,
-.output_ptable = 64,
-.mac_bind_ptable = 65,
-.mac_lookup_ptable = 67,
+.ingress_ptable = OFTABLE_LOG_INGRESS_PIPELINE,
+.egress_ptable = OFTABLE_LOG_EGRESS_PIPELINE,
+.output_ptable = OFTABLE_SAVE_INPORT,
+.mac_bind_ptable = OFTABLE_MAC_BINDING,
+.mac_lookup_ptable = OFTABLE_MAC_LOOKUP,
 };
 struct ofpbuf ofpacts;
 ofpbuf_init(, 0);
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 1/7] ovn-northd: Support optionally avoid static neighbor flows in routers.

2020-07-22 Thread Han Zhou
Support option:dynamic_neigh_routers for logical routers, so that in
particular use cases static neighbor flows are not prepopulated IP
addresses belonging to neighbor router ports, to avoid flow exploding
problem reported for ovn-kubernetes large scale setup.

Reported-by: Girish Moodalbail 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049995.html
Signed-off-by: Han Zhou 
---
 northd/ovn-northd.8.xml |  5 -
 northd/ovn-northd.c |  6 ++
 ovn-nb.xml  | 13 +
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index eb2514f..9b1bcec 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2701,7 +2701,10 @@ outport = P;
   Logical_Switch_Port table.  For router ports
   connected to other logical routers, MAC bindings can be known
   statically from the mac and networks
-  column in the Logical_Router_Port table.
+  column in the Logical_Router_Port table.  (Note: the
+  flow is NOT installed for the IP addresses that belong to a neighbor
+  logical router port if the current router has the
+  options:dynamic_neigh_routers set to true)
 
 
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 1921982..4e11a25 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -10202,6 +10202,12 @@ build_lrouter_flows(struct hmap *datapaths, struct 
hmap *ports,
 continue;
 }
 
+if (peer->od->nbr &&
+smap_get_bool(>od->nbr->options,
+  "dynamic_neigh_routers", false)) {
+continue;
+}
+
 for (size_t i = 0; i < op->od->n_router_ports; i++) {
 const char *router_port_name = smap_get(
 >od->router_ports[i]->nbsp->options,
diff --git a/ovn-nb.xml b/ovn-nb.xml
index db5908c..e8450aa 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1846,6 +1846,19 @@
   connected to the logical router. Default: False.
 
   
+  
+
+  If set to true, the router will resolve neighbor
+  routers' MAC addresses only by dynamic ARP/ND, instead of
+  prepopulating static mappings for all neighbor routers in the ARP/ND
+  Resolution stage.  This reduces number of flows, but requires ARP/ND
+  messages to resolve the IP-MAC bindings when needed.  It is
+  false by default.  It is recommended to set to
+  true when a large number of logical routers are
+  connected to the same logical switch but most of them never need to
+  send traffic between each other.
+
+  
 
 
 
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 4/7] actions: Implement new actions lookup_arp_ip and lookup_nd_ip.

2020-07-22 Thread Han Zhou
lookup_arp_ip and lookup_nd_ip are added to lookup if an entry exists
in MAC bindings for a given IP address, for IPv4 and IPv6 respectively.

Signed-off-by: Han Zhou 
---
 controller/lflow.c|   4 +-
 include/ovn/actions.h |  10 +
 lib/actions.c | 112 ++
 ovn-sb.xml|  55 +
 tests/ovn.at  |  50 ++
 utilities/ovn-trace.c |  54 ++--
 6 files changed, 281 insertions(+), 4 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index b2f5857..1515612 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -782,13 +782,15 @@ consider_neighbor_flow(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
 
 uint64_t stub[1024 / 8];
 struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(stub);
+uint8_t value = 1;
 put_load(mac.ea, sizeof mac.ea, MFF_ETH_DST, 0, 48, );
+put_load(, sizeof value, MFF_LOG_FLAGS, MLF_LOOKUP_MAC_BIT, 1,
+ );
 ofctrl_add_flow(flow_table, OFTABLE_MAC_BINDING, 100,
 b->header_.uuid.parts[0], _arp_match,
 , >header_.uuid);
 
 ofpbuf_clear();
-uint8_t value = 1;
 put_load(, sizeof value, MFF_LOG_FLAGS, MLF_LOOKUP_MAC_BIT, 1,
  );
 match_set_dl_src(_arp_match, mac);
diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 0f7f4cd..34ba0d8 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -75,9 +75,11 @@ struct ovn_extend_table;
 OVNACT(GET_ARP,   ovnact_get_mac_bind)\
 OVNACT(PUT_ARP,   ovnact_put_mac_bind)\
 OVNACT(LOOKUP_ARP,ovnact_lookup_mac_bind) \
+OVNACT(LOOKUP_ARP_IP, ovnact_lookup_mac_bind_ip) \
 OVNACT(GET_ND,ovnact_get_mac_bind)\
 OVNACT(PUT_ND,ovnact_put_mac_bind)\
 OVNACT(LOOKUP_ND, ovnact_lookup_mac_bind) \
+OVNACT(LOOKUP_ND_IP,  ovnact_lookup_mac_bind_ip) \
 OVNACT(PUT_DHCPV4_OPTS,   ovnact_put_opts)\
 OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_opts)\
 OVNACT(SET_QUEUE, ovnact_set_queue)   \
@@ -308,6 +310,14 @@ struct ovnact_lookup_mac_bind {
 struct expr_field mac;  /* 48-bit Ethernet address. */
 };
 
+/* OVNACT_LOOKUP_ARP_IP, OVNACT_LOOKUP_ND_IP. */
+struct ovnact_lookup_mac_bind_ip {
+struct ovnact ovnact;
+struct expr_field dst;  /* 1-bit destination field. */
+struct expr_field port; /* Logical port name. */
+struct expr_field ip;   /* 32-bit or 128-bit IP address. */
+};
+
 struct ovnact_gen_option {
 const struct gen_opts_map *option;
 struct expr_constant_set value;
diff --git a/lib/actions.c b/lib/actions.c
index 82463fa..e84dae7 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -1973,6 +1973,110 @@ ovnact_lookup_mac_bind_free(
 
 }
 
+
+static void format_lookup_mac_bind_ip(
+const struct ovnact_lookup_mac_bind_ip *lookup_mac,
+struct ds *s, const char *name)
+{
+expr_field_format(_mac->dst, s);
+ds_put_format(s, " = %s(", name);
+expr_field_format(_mac->port, s);
+ds_put_cstr(s, ", ");
+expr_field_format(_mac->ip, s);
+ds_put_cstr(s, ");");
+}
+
+static void
+format_LOOKUP_ARP_IP(const struct ovnact_lookup_mac_bind_ip *lookup_mac,
+ struct ds *s)
+{
+format_lookup_mac_bind_ip(lookup_mac, s, "lookup_arp_ip");
+}
+
+static void
+format_LOOKUP_ND_IP(const struct ovnact_lookup_mac_bind_ip *lookup_mac,
+struct ds *s)
+{
+format_lookup_mac_bind_ip(lookup_mac, s, "lookup_nd_ip");
+}
+
+static void
+encode_lookup_mac_bind_ip(const struct ovnact_lookup_mac_bind_ip *lookup_mac,
+  enum mf_field_id ip_field,
+  const struct ovnact_encode_params *ep,
+  struct ofpbuf *ofpacts)
+{
+const struct arg args[] = {
+{ expr_resolve_field(_mac->port), MFF_LOG_OUTPORT },
+{ expr_resolve_field(_mac->ip), ip_field },
+};
+
+encode_setup_args(args, ARRAY_SIZE(args), ofpacts);
+init_stack(ofpact_put_STACK_PUSH(ofpacts), MFF_ETH_DST);
+
+struct mf_subfield dst = expr_resolve_field(_mac->dst);
+ovs_assert(dst.field);
+
+put_load(0, MFF_LOG_FLAGS, MLF_LOOKUP_MAC_BIT, 1, ofpacts);
+emit_resubmit(ofpacts, ep->mac_bind_ptable);
+
+struct ofpact_reg_move *orm = ofpact_put_REG_MOVE(ofpacts);
+orm->dst = dst;
+orm->src.field = mf_from_id(MFF_LOG_FLAGS);
+orm->src.ofs = MLF_LOOKUP_MAC_BIT;
+orm->src.n_bits = 1;
+
+init_stack(ofpact_put_STACK_POP(ofpacts), MFF_ETH_DST);
+encode_restore_args(args, ARRAY_SIZE(args), ofpacts);
+}
+
+static void
+encode_LOOKUP_ARP_IP(const struct ovnact_lookup_mac_bind_ip *lookup_mac,
+ const struct ovnact_encode_params *ep,
+ struct ofpbuf *ofpacts)
+{
+encode_lookup_mac_bind_ip(lookup_mac, MFF_REG0, ep, ofpacts);
+}
+
+static void

[ovs-dev] [PATCH ovn 5/7] ovn-northd.8.xml: Fix reg9 bits documentation.

2020-07-22 Thread Han Zhou
Update the reg9 bits according to current implementation.

Fixes: 2dc7869436de ("ovn-northd: Address scale issues with DNAT flows.")
Signed-off-by: Han Zhou 
---
 northd/ovn-northd.8.xml | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 9b1bcec..741c928 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1545,7 +1545,7 @@ output;
 
 
 
-reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
 next;
 
 
@@ -1564,7 +1564,7 @@ next;
 
 
 
-reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
+reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
 next;
 
   
@@ -1576,7 +1576,7 @@ next;
 
 
 
-reg9[4] = lookup_nd(inport, nd.target, nd.tll);
+reg9[2] = lookup_nd(inport, nd.target, nd.tll);
 next;
 
   
@@ -1588,14 +1588,14 @@ next;
 
 
 
-reg9[4] = lookup_nd(inport, ip6.src, nd.sll);
+reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
 next;
 
   
 
   
 A priority-0 fallback flow that matches all packets and applies
-the action reg9[5] = 1; next;
+the action reg9[3] = 1; next;
 advancing the packet to the next table.
   
 
@@ -1609,18 +1609,18 @@ next;
 
 
 
-  reg9[4] will be 1 if the lookup_arp/lookup_nd
+  reg9[2] will be 1 if the lookup_arp/lookup_nd
   in the previous table was successful.
 
 
 
-  reg9[5] will be 1 if there was no need to do the lookup.
+  reg9[3] will be 1 if there was no need to do the lookup.
 
 
 
   
 A priority-100 flow with the match
-reg9[4] == 1 || reg9[5] == 1 and advances the packet
+reg9[2] == 1 || reg9[3] == 1 and advances the packet
 to the next table as there is no need to learn the neighbor.
   
 
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 3/7] actions: Rename xxx_lookup_mac to xxx_lookup_mac_bind.

2020-07-22 Thread Han Zhou
For the functions related to lookup_arp/lookup_nd, renaming them to
avoid confusion, because those functions checks both mac and ip in
mac-bindings. This patch renames them so that a future patch can
add a function that only looks up by ip without confusing names.

This patch also removes the unnecessary OVS_UNUSED for the function
execute_lookup_mac() in ovn-trace.c.

Signed-off-by: Han Zhou 
---
 lib/actions.c | 21 +++--
 utilities/ovn-trace.c | 13 +++--
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/lib/actions.c b/lib/actions.c
index e14907e..82463fa 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -1867,8 +1867,9 @@ ovnact_put_mac_bind_free(struct ovnact_put_mac_bind 
*put_mac OVS_UNUSED)
 {
 }
 
-static void format_lookup_mac(const struct ovnact_lookup_mac_bind *lookup_mac,
-  struct ds *s, const char *name)
+static void format_lookup_mac_bind(
+const struct ovnact_lookup_mac_bind *lookup_mac,
+struct ds *s, const char *name)
 {
 expr_field_format(_mac->dst, s);
 ds_put_format(s, " = %s(", name);
@@ -1884,21 +1885,21 @@ static void
 format_LOOKUP_ARP(const struct ovnact_lookup_mac_bind *lookup_mac,
  struct ds *s)
 {
-format_lookup_mac(lookup_mac, s, "lookup_arp");
+format_lookup_mac_bind(lookup_mac, s, "lookup_arp");
 }
 
 static void
 format_LOOKUP_ND(const struct ovnact_lookup_mac_bind *lookup_mac,
 struct ds *s)
 {
-format_lookup_mac(lookup_mac, s, "lookup_nd");
+format_lookup_mac_bind(lookup_mac, s, "lookup_nd");
 }
 
 static void
-encode_lookup_mac(const struct ovnact_lookup_mac_bind *lookup_mac,
-  enum mf_field_id ip_field,
-  const struct ovnact_encode_params *ep,
-  struct ofpbuf *ofpacts)
+encode_lookup_mac_bind(const struct ovnact_lookup_mac_bind *lookup_mac,
+   enum mf_field_id ip_field,
+   const struct ovnact_encode_params *ep,
+   struct ofpbuf *ofpacts)
 {
 const struct arg args[] = {
 { expr_resolve_field(_mac->port), MFF_LOG_INPORT },
@@ -1928,7 +1929,7 @@ encode_LOOKUP_ARP(const struct ovnact_lookup_mac_bind 
*lookup_mac,
   const struct ovnact_encode_params *ep,
   struct ofpbuf *ofpacts)
 {
-encode_lookup_mac(lookup_mac, MFF_REG0, ep, ofpacts);
+encode_lookup_mac_bind(lookup_mac, MFF_REG0, ep, ofpacts);
 }
 
 static void
@@ -1936,7 +1937,7 @@ encode_LOOKUP_ND(const struct ovnact_lookup_mac_bind 
*lookup_mac,
 const struct ovnact_encode_params *ep,
 struct ofpbuf *ofpacts)
 {
-encode_lookup_mac(lookup_mac, MFF_XXREG0, ep, ofpacts);
+encode_lookup_mac_bind(lookup_mac, MFF_XXREG0, ep, ofpacts);
 }
 
 static void
diff --git a/utilities/ovn-trace.c b/utilities/ovn-trace.c
index de75088..2c432ac 100644
--- a/utilities/ovn-trace.c
+++ b/utilities/ovn-trace.c
@@ -1746,10 +1746,10 @@ execute_get_mac_bind(const struct ovnact_get_mac_bind 
*bind,
 }
 
 static void
-execute_lookup_mac(const struct ovnact_lookup_mac_bind *bind OVS_UNUSED,
-   const struct ovntrace_datapath *dp OVS_UNUSED,
-   struct flow *uflow OVS_UNUSED,
-   struct ovs_list *super OVS_UNUSED)
+execute_lookup_mac_bind(const struct ovnact_lookup_mac_bind *bind,
+const struct ovntrace_datapath *dp,
+struct flow *uflow,
+struct ovs_list *super)
 {
 /* Get logical port number.*/
 struct mf_subfield port_sf = expr_resolve_field(>port);
@@ -2214,11 +2214,12 @@ trace_actions(const struct ovnact *ovnacts, size_t 
ovnacts_len,
 break;
 
 case OVNACT_LOOKUP_ARP:
-execute_lookup_mac(ovnact_get_LOOKUP_ARP(a), dp, uflow, super);
+execute_lookup_mac_bind(ovnact_get_LOOKUP_ARP(a), dp, uflow,
+super);
 break;
 
 case OVNACT_LOOKUP_ND:
-execute_lookup_mac(ovnact_get_LOOKUP_ND(a), dp, uflow, super);
+execute_lookup_mac_bind(ovnact_get_LOOKUP_ND(a), dp, uflow, super);
 break;
 
 case OVNACT_PUT_DHCPV4_OPTS:
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] datapath-windows: Reset ct_mark/ct_label to support ALG

2020-07-22 Thread Jinjun Gao
The ct_mark/ct_label setting on related connection keep the same
behavior with Linux datapath. If one CT entry has parent/master entry,
its ct_mark and ct_label should inherit from the corresponding part
of parent/master entry at initialization.

Signed-off-by: Jinjun Gao 
---
v3: Resolve Alin's comment and use RtlCopyMemory to replace memcpy
v2: Resolve Anand's comment to add braces for if-block.
---
 datapath-windows/ovsext/Conntrack.c | 86 +++--
 1 file changed, 54 insertions(+), 32 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index d065591..2610d62 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -789,60 +789,82 @@ OvsProcessConntrackEntry(OvsForwardingContext *fwdCtx,
 static __inline VOID
 OvsConntrackSetMark(OvsFlowKey *key,
 POVS_CT_ENTRY entry,
-UINT32 value,
-UINT32 mask,
+MD_MARK *mark,
 BOOLEAN *markChanged)
 {
-UINT32 newMark;
-newMark = value | (entry->mark & ~(mask));
-if (entry->mark != newMark) {
+POVS_CT_ENTRY parent = entry->parent;
+BOOLEAN changed = FALSE;
+UINT32 newMark = 0;
+
+if (parent && parent->mark) {
+newMark = parent->mark;
+changed = TRUE;
+} else if (mark) {
+newMark = mark->value | (entry->mark & ~(mark->mask));
+changed = TRUE;
+}
+
+if (changed && entry->mark != newMark) {
 entry->mark = newMark;
 key->ct.mark = newMark;
 *markChanged = TRUE;
 }
 }

+static __inline BOOLEAN
+OvsConntrackIsLabelsNonZero(const struct ovs_key_ct_labels *labels)
+{
+UINT8 i;
+
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+if (labels->ct_labels_32[i]) {
+return TRUE;
+}
+}
+
+return FALSE;
+}
+
 static __inline void
 OvsConntrackSetLabels(OvsFlowKey *key,
   POVS_CT_ENTRY entry,
-  struct ovs_key_ct_labels *val,
-  struct ovs_key_ct_labels *mask,
+  MD_LABELS *labels,
   BOOLEAN *labelChanged)
 {
-ovs_u128 v, m, pktMdLabel = {0};
-memcpy(, val, sizeof v);
-memcpy(, mask, sizeof m);
-memcpy(, >labels, sizeof(struct ovs_key_ct_labels));
+POVS_CT_ENTRY parent = entry->parent;

-pktMdLabel.u64.lo = v.u64.lo | (pktMdLabel.u64.lo & ~(m.u64.lo));
-pktMdLabel.u64.hi = v.u64.hi | (pktMdLabel.u64.hi & ~(m.u64.hi));
+/* Inherit master's labels at labels initialization, if any. */
+if (!OvsConntrackIsLabelsNonZero(>labels) &&
+parent && OvsConntrackIsLabelsNonZero(>labels)) {
+RtlCopyMemory(>labels, >labels, OVS_CT_LABELS_LEN);
+*labelChanged = TRUE;
+}
+
+/* Update labels according to value of ct_label in ct commit */
+if (labels && OvsConntrackIsLabelsNonZero(>mask)) {
+UINT8 i;
+UINT32 *dst = entry->labels.ct_labels_32;
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+dst[i] = (dst[i] & ~(labels->mask.ct_labels_32[i])) |
+ (labels->value.ct_labels_32[i] & 
labels->mask.ct_labels_32[i]);
+}

-if (!NdisEqualMemory(>labels, ,
- sizeof(struct ovs_key_ct_labels))) {
 *labelChanged = TRUE;
 }
-NdisMoveMemory(>labels, ,
-   sizeof(struct ovs_key_ct_labels));
-NdisMoveMemory(>ct.labels, ,
-   sizeof(struct ovs_key_ct_labels));
+
+/* Update flow key's ct labels */
+NdisMoveMemory(>ct.labels, >labels, OVS_CT_LABELS_LEN);
 }

 static void
 OvsCtSetMarkLabel(OvsFlowKey *key,
-   POVS_CT_ENTRY entry,
-   MD_MARK *mark,
-   MD_LABELS *labels,
-   BOOLEAN *triggerUpdateEvent)
+  POVS_CT_ENTRY entry,
+  MD_MARK *mark,
+  MD_LABELS *labels,
+  BOOLEAN *triggerUpdateEvent)
 {
-if (mark) {
-OvsConntrackSetMark(key, entry, mark->value, mark->mask,
-triggerUpdateEvent);
-}
-
-if (labels) {
-OvsConntrackSetLabels(key, entry, >value, >mask,
-  triggerUpdateEvent);
-}
+OvsConntrackSetMark(key, entry, mark, triggerUpdateEvent);
+OvsConntrackSetLabels(key, entry, labels, triggerUpdateEvent);
 }

 /*
--
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-dev v2] datapath-windows: Reset ct_mark/ct_label to support ALG

2020-07-22 Thread Jinjun Gao
Alin, thanks for review. I have updated the patch to resolve your comment and 
use RtlCopyMemory to replace memcpy. The PR also is updated: 
https://github.com/openvswitch/ovs/pull/324 .

- Jinjun


From: Alin Serdean 
Date: Thursday, July 23, 2020 at 3:47 AM
To: Jinjun Gao , "d...@openvswitch.org" 
, Anand Kumar 
Cc: Jinjun Gao 
Subject: RE: [ovs-dev v2] datapath-windows: Reset ct_mark/ct_label to support 
ALG

Thanks a lot for the patch.

In general please prefer to use RtlCopyMemory instead of memcpy: 
https://community.osr.com/discussion/242667/rtlcopymemory-vs-memcpy

Just a few more nits inlined.

Alin.

From: Jinjun Gao
Sent: Monday, July 20, 2020 6:00 PM
To: d...@openvswitch.org; Alin 
Serdean; 
kumaran...@vmware.com
Cc: Jinjun Gao
Subject: [ovs-dev v2] datapath-windows: Reset ct_mark/ct_label to support ALG

The ct_mark/ct_label setting on related connection keep the same
behavior with Linux datapath. If one CT entry has parent/master entry,
its ct_mark and ct_label should inherit from the corresponding part
of parent/master entry at initialization.

Signed-off-by: Jinjun Gao 
---
v2: Resolve Anand's comment to add braces for if-block.
---
 datapath-windows/ovsext/Conntrack.c | 88 +++--
 1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index d065591..cda6d8b 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -789,60 +789,84 @@ OvsProcessConntrackEntry(OvsForwardingContext *fwdCtx,
 static __inline VOID
 OvsConntrackSetMark(OvsFlowKey *key,
 POVS_CT_ENTRY entry,
-UINT32 value,
-UINT32 mask,
+MD_MARK *mark,
 BOOLEAN *markChanged)
 {
-UINT32 newMark;
-newMark = value | (entry->mark & ~(mask));
-if (entry->mark != newMark) {
+POVS_CT_ENTRY parent = entry->parent;
+BOOLEAN changed = FALSE;
+UINT32 newMark = 0;
+
+if (parent && parent->mark) {
+newMark = parent->mark;
+changed = TRUE;
+} else if (mark) {
+newMark = mark->value | (entry->mark & ~(mark->mask));
+changed = TRUE;
+}
+
+if (changed && entry->mark != newMark) {
 entry->mark = newMark;
 key->ct.mark = newMark;
 *markChanged = TRUE;
 }
 }

+static __inline BOOLEAN
+OvsConntrackIsLabelsNonZero(const struct ovs_key_ct_labels *labels)
+{
+UINT8 i;
+
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+if (labels->ct_labels_32[i]) {
+return TRUE;
+}
+}
+
+return FALSE;
+}
+
 static __inline void
 OvsConntrackSetLabels(OvsFlowKey *key,
   POVS_CT_ENTRY entry,
-  struct ovs_key_ct_labels *val,
-  struct ovs_key_ct_labels *mask,
+  MD_LABELS *labels,
   BOOLEAN *labelChanged)
 {
-ovs_u128 v, m, pktMdLabel = {0};
-memcpy(, val, sizeof v);
-memcpy(, mask, sizeof m);
-memcpy(, >labels, sizeof(struct ovs_key_ct_labels));
+POVS_CT_ENTRY parent = entry->parent;

-pktMdLabel.u64.lo = v.u64.lo | (pktMdLabel.u64.lo & ~(m.u64.lo));
-pktMdLabel.u64.hi = v.u64.hi | (pktMdLabel.u64.hi & ~(m.u64.hi));
+/* Inherit master's labels at labels initialization, if any. */
+if (!OvsConntrackIsLabelsNonZero(>labels) &&
+parent && OvsConntrackIsLabelsNonZero(>labels)) {
+memcpy(>labels, >labels, OVS_CT_LABELS_LEN);
+*labelChanged = TRUE;
+}
+
+/* Use the same computing method with Linux kernel datapath.
+ * It is more clean and easy understanding.
+ */
[Alin]Nit: You should drop the comment above. It is more useful to
say what you are actually computing.
+if (labels && OvsConntrackIsLabelsNonZero(>mask)) {
+UINT8 i;
+UINT32 * dst = entry->labels.ct_labels_32;
[Alin] Nit: Remove the ‘ ‘. UINT32 *dst
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+dst[i] = (dst[i] & ~(labels->mask.ct_labels_32[i])) |
+ (labels->value.ct_labels_32[i] & 
labels->mask.ct_labels_32[i]);
+}

-if (!NdisEqualMemory(>labels, ,
- sizeof(struct ovs_key_ct_labels))) {
 *labelChanged = TRUE;
 }
-NdisMoveMemory(>labels, ,
-   sizeof(struct ovs_key_ct_labels));
-NdisMoveMemory(>ct.labels, ,
-   sizeof(struct ovs_key_ct_labels));
+
+ 

Re: [ovs-dev] OVN nb-db and sb-db out of sync

2020-07-22 Thread Tony Liu
Hi,

I see why sb-db broke at 1568th port-binding.
The 1568th datapath-binding in sb-db references the same

_uuid   : 108cf745-db82-43c0-a9d3-afe27a41e4aa
external_ids: {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473", 
name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
tunnel_key  : 1567

_uuid   : d934ed92-2f3c-4b31-8a76-2a5047a3bb46
external_ids: {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473", 
name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
tunnel_key  : 1568

I don't believe this is supposed to happen. Any idea how could it happen?
Then ovn-northd is stuck in trying to delete this duplication, and it ignores 
all the following updates.
That caused out-of-sync between nb-db and sb-db.
Any way I can fix it manually, like with ovn-sbctl to delete it?


Thanks!

Tony


From: dev  on behalf of Tony Liu 

Sent: July 22, 2020 11:33 AM
To: ovs-dev 
Subject: [ovs-dev] OVN nb-db and sb-db out of sync

Hi,

During a scaling test where 4000 networks are created from OpenStack, I see that
nb-db and sb-db are out of sync. All 4000 logical switches and 8000 LS ports
(GW port and service port of each network) are created in nb-db. In sb-db,
only 1567 port-bindings, 4000 is expected.

[root@ovn-db-2 ~]# ovn-nbctl list nb_global
_uuid   : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
connections : []
external_ids: {"neutron:liveness_check_at"="2020-07-22 
04:03:17.726917+00:00"}
hv_cfg  : 312
ipsec   : false
name: ""
nb_cfg  : 2636
options : {mac_prefix="ca:e8:07", 
svc_monitor_mac="4e:d0:3a:80:d4:b7"}
sb_cfg  : 2005
ssl : []

[root@ovn-db-2 ~]# ovn-sbctl list sb_global
_uuid   : 3720bc1d-b0da-47ce-85ca-96fa8d398489
connections : []
external_ids: {}
ipsec   : false
nb_cfg  : 312
options : {mac_prefix="ca:e8:07", 
svc_monitor_mac="4e:d0:3a:80:d4:b7"}
ssl : []

Is there any way to force ovn-northd to rebuild sb-db to sync with nb-db,
like manipulating nb_cfg or anything else? Note, it's 3-node RAFT cluster for 
both
nb-db and sb-db.

Is that "incremental update" implemented in 20.03?
If not, in which release it's going to be available?


Thanks!

Tony

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Claves para una implementación efectiva

2020-07-22 Thread NOM 035
Buenos día 
Quise aprovechar la oportunidad de hacerte una invitación para tomar nuestro:
 
Nombre: NOM 035. Claves para una implementación efectiva.
Fecha: 13 de Agosto
Horario: 10:00 am a 13:00 pm 
Formato: En línea con interacción en vivo.
Lugar: En Vivo desde su computadora
Instructor:Consultora de Recursos Humanos con 14 años de experiencia trabajando 
en proyectos para impulsar el capital humano en las organizaciones.

Este curso permite a los participantes identificar los principales retos en la 
implementación de la NOM 035 dentro de sus centros de trabajo, así como claves 
prácticas para superarlos y obtener los beneficios que esta norma brinda.

Objetivos Específicos:

- Será capaz de identificar los conceptos relacionados con la norma. 
-  Conocerá de manera general los requisitos base de cumplimiento, así como el 
impacto de estos en la productividad y desarrollo de sus organizaciones.
-  Descubrirá una alternativa para facilitar la implementación de la NOM 035 en 
sus organizaciones.

Solicita información respondiendo a este correo con la palabra NOM, junto con 
los siguientes datos:

Nombre:
Correo electrónico:
Número telefónico:
Email Alterno:

Para información inmediata llamar al:
(+52) 55 15 54 66 30 - (+52) 55 30 16 70 85

o puede enviarnos un Whatsapp. 

Qué tengas un gran día.
Saludos.

Innova Learn México - innovalearn. mx - Mérida, Yucatán, México


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCHv2] python: fixed package dependency

2020-07-22 Thread Gregory Rose



On 7/22/2020 2:25 PM, Toms Atteka wrote:

Python3 does not have python3-twisted-web. Required codebase is inside
python3-twisted.

Signed-off-by: Toms Atteka 
---
  debian/control | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/control b/debian/control
index 0646b22a1..6420b9d3e 100644
--- a/debian/control
+++ b/debian/control
@@ -188,7 +188,7 @@ Description: Python bindings for Open vSwitch
  Package: openvswitch-test
  Architecture: all
  Depends: python3,
- python3-twisted-web,
+ python3-twisted,
   ${misc:Depends},
   ${python3:Depends}
  Description: Open vSwitch test package



LGTM
Acked-by: Greg Rose 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCHv2] python: fixed package dependency

2020-07-22 Thread Toms Atteka
Python3 does not have python3-twisted-web. Required codebase is inside
python3-twisted.

Signed-off-by: Toms Atteka 
---
 debian/control | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/control b/debian/control
index 0646b22a1..6420b9d3e 100644
--- a/debian/control
+++ b/debian/control
@@ -188,7 +188,7 @@ Description: Python bindings for Open vSwitch
 Package: openvswitch-test
 Architecture: all
 Depends: python3,
- python3-twisted-web,
+ python3-twisted,
  ${misc:Depends},
  ${python3:Depends}
 Description: Open vSwitch test package
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: fixed package dependency

2020-07-22 Thread 0-day Robot
Bleep bloop.  Greetings Toms Atteka, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author Toms Atteka  needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Toms Atteka 
Lines checked: 29, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] INFO: task hung in ovs_dp_masks_rebalance

2020-07-22 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:fa56a987 Merge branch 'ionic-updates'
git tree:   net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1105f05310
kernel config:  https://syzkaller.appspot.com/x/.config?x=2b7b67c0c1819c87
dashboard link: https://syzkaller.appspot.com/bug?extid=bad6507e5db05017b008
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bad6507e5db05017b...@syzkaller.appspotmail.com

INFO: task kworker/1:8:10136 blocked for more than 143 seconds.
  Not tainted 5.8.0-rc4-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/1:8 D27336 10136  2 0x4000
Workqueue: events ovs_dp_masks_rebalance
Call Trace:
 context_switch kernel/sched/core.c:3453 [inline]
 __schedule+0x8e1/0x1eb0 kernel/sched/core.c:4178
 schedule+0xd0/0x2a0 kernel/sched/core.c:4253
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4312
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10d0 kernel/locking/mutex.c:1103
 ovs_lock net/openvswitch/datapath.c:105 [inline]
 ovs_dp_masks_rebalance+0x18/0x80 net/openvswitch/datapath.c:2355
 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
INFO: task kworker/0:9:8606 blocked for more than 143 seconds.
  Not tainted 5.8.0-rc4-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:9 D27792  8606  2 0x4000
Workqueue: events ovs_dp_masks_rebalance
Call Trace:
 context_switch kernel/sched/core.c:3453 [inline]
 __schedule+0x8e1/0x1eb0 kernel/sched/core.c:4178
 schedule+0xd0/0x2a0 kernel/sched/core.c:4253
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4312
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10d0 kernel/locking/mutex.c:1103
 ovs_lock net/openvswitch/datapath.c:105 [inline]
 ovs_dp_masks_rebalance+0x18/0x80 net/openvswitch/datapath.c:2355
 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x3b5/0x4a0 kernel/kthread.c:291
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
INFO: task syz-executor.2:9414 blocked for more than 143 seconds.
  Not tainted 5.8.0-rc4-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D27832  9414   7005 0x4004
Call Trace:
 context_switch kernel/sched/core.c:3453 [inline]
 __schedule+0x8e1/0x1eb0 kernel/sched/core.c:4178
 schedule+0xd0/0x2a0 kernel/sched/core.c:4253
 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4312
 __mutex_lock_common kernel/locking/mutex.c:1033 [inline]
 __mutex_lock+0x3e2/0x10d0 kernel/locking/mutex.c:1103
 ovs_lock net/openvswitch/datapath.c:105 [inline]
 ovs_dp_cmd_new+0x69d/0x1160 net/openvswitch/datapath.c:1690
 genl_family_rcv_msg_doit net/netlink/genetlink.c:669 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:714 [inline]
 genl_rcv_msg+0x61d/0x980 net/netlink/genetlink.c:731
 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:742
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:671
 sys_sendmsg+0x6e8/0x810 net/socket.c:2363
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2417
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c1f9
Code: Bad RIP value.
RSP: 002b:7f864e02dc78 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 0002b400 RCX: 0045c1f9
RDX:  RSI: 20c0 RDI: 0003
RBP: 0078bf40 R08:  R09: 
R10:  R11: 0246 R12: 0078bf0c
R13: 7ffc8c137ccf R14: 7f864e02e9c0 R15: 0078bf0c
INFO: task syz-executor.2:9422 blocked for more than 144 seconds.
  Not tainted 5.8.0-rc4-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D27040  9422   7005 0x4004
Call Trace:
 context_switch kernel/sched/core.c:3453 [inline]
 __schedule+0x8e1/0x1eb0 kernel/sched/core.c:4178
 schedule+0xd0/0x2a0 kernel/sched/core.c:4253
 schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1873
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x163/0x260 

[ovs-dev] [PATCH v4] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
Current OVS intercepts and processes all BFD packets, thus VM-2-VM
BFD packets get lost and the recipient VM never sees them.

This patch fixes it by only intercepting and processing BFD packets
destined to a configured BFD instance, and other BFD packets are made
available to the OVS flow table for forwarding.

This patch keeps BFD's backward compatibility.

VMWare-BZ: 2579326
Signed-off-by: Yifeng Sun 
---
v1->v2: Add test by William's suggestion.
v2->v3: Fix BFD config, thanks William.
v3->v4: Test will fail at second run, fixed it.

 lib/bfd.c   | 16 +---
 tests/system-traffic.at | 44 
 vswitchd/vswitch.xml|  7 +++
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cc8c6857afa4..3c965699ace3 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
 #define FLAGS_MASK 0x3f
 #define DEFAULT_MULT 3
 
+#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
+#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
+
 struct bfd {
 struct hmap_node node;/* In 'all_bfds'. */
 uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. */
@@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const 
struct smap *cfg,
  >rmt_eth_dst);
 
 bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
-  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
+  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
 bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
-  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
+  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
 
 forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
 if (bfd->forwarding_if_rx != forwarding_if_rx) {
@@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const 
struct flow *flow,
 memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
 if (flow->nw_proto == IPPROTO_UDP
 && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
-&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
+&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
+&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
+|| bfd->ip_src == flow->nw_dst)) {
+
+if (bfd->ip_src == flow->nw_dst) {
+memset(>masks.nw_dst, 0x, sizeof wc->masks.nw_dst);
+}
+
 bool check_tnl_key;
 
 atomic_read_relaxed(>check_tnl_key, _tnl_key);
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 2a0fbadff4a1..ea72f155782f 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6289,3 +6289,47 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * 
*5002 *2000 *b85e *
 
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([bfd - BFD overlay])
+OVS_CHECK_GENEVE()
+
+OVS_TRAFFIC_VSWITCHD_START()
+
+AT_CHECK([ovs-vsctl -- set bridge br0 
other-config:hwaddr=\"f2:ff:00:00:00:01\"])
+ADD_BR([br-underlay], [set bridge br-underlay 
other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
+
+ADD_NAMESPACES(at_ns0)
+
+dnl Set up underlay link from host into the namespace using veth pair.
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d)
+AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
+AT_CHECK([ip link set dev br-underlay up])
+
+dnl Set up tunnel endpoints on OVS outside the namespace.
+ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], 
[192.168.10.100/24],
+[options:packet_type=ptap])
+AT_CHECK([ovs-vsctl -- set Interface at_gnv0 ofport_request=1])
+AT_CHECK([ovs-vsctl -- set Interface at_gnv0 bfd:enable=true 
bfd:bfd_src_ip="172.16.180.106"])
+
+dnl Certain Linux distributions, like CentOS, have default iptable rules
+dnl to reject input traffic from br-underlay. Here we add a rule to walk
+dnl around it.
+iptables -I INPUT 1 -i br-underlay -j ACCEPT
+on_exit 'iptables -D INPUT 1'
+
+dnl Firstly, test normal BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d080045c00066ea4e400040118e83ac10b469ac10b46a356e17c10052cb5500806558002320018ad232b919c4080045c00034ff11fa03ac10b469ac10b46acec8002020800318035cd00e000f424f4240
 actions=NORMAL"
+dnl Next we test overlay BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
 actions=NORMAL"
+
+ovs-dpctl dump-flows > datapath-flows.txt
+dnl BFD packet was handled by BFD flow. Ipv4.dst is matched.
+AT_FAIL_IF([cat datapath-flows.txt | 

Re: [ovs-dev] [ovs-dev v2] datapath-windows: Reset ct_mark/ct_label to support ALG

2020-07-22 Thread Alin Serdean
Thanks a lot for the patch.

In general please prefer to use RtlCopyMemory instead of memcpy: 
https://community.osr.com/discussion/242667/rtlcopymemory-vs-memcpy

Just a few more nits inlined.

Alin.

From: Jinjun Gao
Sent: Monday, July 20, 2020 6:00 PM
To: d...@openvswitch.org; Alin 
Serdean; 
kumaran...@vmware.com
Cc: Jinjun Gao
Subject: [ovs-dev v2] datapath-windows: Reset ct_mark/ct_label to support ALG

The ct_mark/ct_label setting on related connection keep the same
behavior with Linux datapath. If one CT entry has parent/master entry,
its ct_mark and ct_label should inherit from the corresponding part
of parent/master entry at initialization.

Signed-off-by: Jinjun Gao 
---
v2: Resolve Anand's comment to add braces for if-block.
---
 datapath-windows/ovsext/Conntrack.c | 88 +++--
 1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index d065591..cda6d8b 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -789,60 +789,84 @@ OvsProcessConntrackEntry(OvsForwardingContext *fwdCtx,
 static __inline VOID
 OvsConntrackSetMark(OvsFlowKey *key,
 POVS_CT_ENTRY entry,
-UINT32 value,
-UINT32 mask,
+MD_MARK *mark,
 BOOLEAN *markChanged)
 {
-UINT32 newMark;
-newMark = value | (entry->mark & ~(mask));
-if (entry->mark != newMark) {
+POVS_CT_ENTRY parent = entry->parent;
+BOOLEAN changed = FALSE;
+UINT32 newMark = 0;
+
+if (parent && parent->mark) {
+newMark = parent->mark;
+changed = TRUE;
+} else if (mark) {
+newMark = mark->value | (entry->mark & ~(mark->mask));
+changed = TRUE;
+}
+
+if (changed && entry->mark != newMark) {
 entry->mark = newMark;
 key->ct.mark = newMark;
 *markChanged = TRUE;
 }
 }

+static __inline BOOLEAN
+OvsConntrackIsLabelsNonZero(const struct ovs_key_ct_labels *labels)
+{
+UINT8 i;
+
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+if (labels->ct_labels_32[i]) {
+return TRUE;
+}
+}
+
+return FALSE;
+}
+
 static __inline void
 OvsConntrackSetLabels(OvsFlowKey *key,
   POVS_CT_ENTRY entry,
-  struct ovs_key_ct_labels *val,
-  struct ovs_key_ct_labels *mask,
+  MD_LABELS *labels,
   BOOLEAN *labelChanged)
 {
-ovs_u128 v, m, pktMdLabel = {0};
-memcpy(, val, sizeof v);
-memcpy(, mask, sizeof m);
-memcpy(, >labels, sizeof(struct ovs_key_ct_labels));
+POVS_CT_ENTRY parent = entry->parent;

-pktMdLabel.u64.lo = v.u64.lo | (pktMdLabel.u64.lo & ~(m.u64.lo));
-pktMdLabel.u64.hi = v.u64.hi | (pktMdLabel.u64.hi & ~(m.u64.hi));
+/* Inherit master's labels at labels initialization, if any. */
+if (!OvsConntrackIsLabelsNonZero(>labels) &&
+parent && OvsConntrackIsLabelsNonZero(>labels)) {
+memcpy(>labels, >labels, OVS_CT_LABELS_LEN);
+*labelChanged = TRUE;
+}
+
+/* Use the same computing method with Linux kernel datapath.
+ * It is more clean and easy understanding.
+ */
[Alin]Nit: You should drop the comment above. It is more useful to
say what you are actually computing.
+if (labels && OvsConntrackIsLabelsNonZero(>mask)) {
+UINT8 i;
+UINT32 * dst = entry->labels.ct_labels_32;
[Alin] Nit: Remove the ‘ ‘. UINT32 *dst
+for (i = 0; i < OVS_CT_LABELS_LEN_32; i++) {
+dst[i] = (dst[i] & ~(labels->mask.ct_labels_32[i])) |
+ (labels->value.ct_labels_32[i] & 
labels->mask.ct_labels_32[i]);
+}

-if (!NdisEqualMemory(>labels, ,
- sizeof(struct ovs_key_ct_labels))) {
 *labelChanged = TRUE;
 }
-NdisMoveMemory(>labels, ,
-   sizeof(struct ovs_key_ct_labels));
-NdisMoveMemory(>ct.labels, ,
-   sizeof(struct ovs_key_ct_labels));
+
+/* Update flow key's ct labels */
+NdisMoveMemory(>ct.labels, >labels, OVS_CT_LABELS_LEN);
 }

 static void
 OvsCtSetMarkLabel(OvsFlowKey *key,
-   POVS_CT_ENTRY entry,
-   MD_MARK *mark,
-   MD_LABELS *labels,
-   BOOLEAN *triggerUpdateEvent)
+  POVS_CT_ENTRY entry,
+  MD_MARK *mark,
+  MD_LABELS *labels,
+  BOOLEAN *triggerUpdateEvent)
 {
-if (mark) {
-OvsConntrackSetMark(key, entry, mark->value, mark->mask,
-triggerUpdateEvent);
-}
-
-if (labels) {
-OvsConntrackSetLabels(key, entry, >value, >mask,
-

Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: make masks cache size configurable

2020-07-22 Thread Florian Westphal
Eelco Chaudron  wrote:
> This patch makes the masks cache size configurable, or with
> a size of 0, disable it.
> 
> Reviewed-by: Paolo Abeni 
> Signed-off-by: Eelco Chaudron 
> ---
>  include/uapi/linux/openvswitch.h |1 
>  net/openvswitch/datapath.c   |   11 +
>  net/openvswitch/flow_table.c |   86 
> ++
>  net/openvswitch/flow_table.h |   10 
>  4 files changed, 98 insertions(+), 10 deletions(-)
> 
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index 7cb76e5ca7cf..8300cc29dec8 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -86,6 +86,7 @@ enum ovs_datapath_attr {
>   OVS_DP_ATTR_MEGAFLOW_STATS, /* struct ovs_dp_megaflow_stats */
>   OVS_DP_ATTR_USER_FEATURES,  /* OVS_DP_F_*  */
>   OVS_DP_ATTR_PAD,
> + OVS_DP_ATTR_MASKS_CACHE_SIZE,

This new attr should probably get an entry in
datapath.c datapath_policy[].

> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1535,6 +1535,10 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, 
> struct sk_buff *skb,
>   if (nla_put_u32(skb, OVS_DP_ATTR_USER_FEATURES, dp->user_features))
>   goto nla_put_failure;
>  
> + if (nla_put_u32(skb, OVS_DP_ATTR_MASKS_CACHE_SIZE,
> + ovs_flow_tbl_masks_cache_size(>table)))
> + goto nla_put_failure;
> +
>   genlmsg_end(skb, ovs_header);
>   return 0;


ovs_dp_cmd_msg_size() should add another nla_total_size(sizeof(u32))
to make sure there is enough space.

> + if (a[OVS_DP_ATTR_MASKS_CACHE_SIZE]) {
> + u32 cache_size;
> +
> + cache_size = nla_get_u32(a[OVS_DP_ATTR_MASKS_CACHE_SIZE]);
> + ovs_flow_tbl_masks_cache_resize(>table, cache_size);
> + }

I see a 0 cache size is legal (turns it off) and that the allocation
path has a few sanity checks as well.

Would it make sense to add min/max policy to datapath_policy[] for this
as well?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] python: fixed package dependency

2020-07-22 Thread Toms Atteka
From: Toms Atteka 

Python3 does not have python3-twisted-web. Required codebase is inside
python3-twisted.

Signed-off-by: Toms Atteka 
---
 debian/control | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/control b/debian/control
index 0646b22a1..6420b9d3e 100644
--- a/debian/control
+++ b/debian/control
@@ -188,7 +188,7 @@ Description: Python bindings for Open vSwitch
 Package: openvswitch-test
 Architecture: all
 Depends: python3,
- python3-twisted-web,
+ python3-twisted,
  ${misc:Depends},
  ${python3:Depends}
 Description: Open vSwitch test package
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] Fix the routing for external logical ports of bridged logical switches.

2020-07-22 Thread Numan Siddique
On Mon, Jul 13, 2020 at 11:56 AM Numan Siddique  wrote:

> +Daniel Alvarez Sanchez 
> +Lucas Alvares Gomes Martins 
>
>
> On Mon, Jul 13, 2020 at 11:29 AM Ankur Sharma 
> wrote:
>
>> Hi Numan,
>>
>> Thank you so much for the details.
>>
>>
> Hi Ankur,
>
> Thanks for the detailed email. Your analysis is correct. I have few
> comments.
> Please see below.
>
>
>

Hi Ankur,

Did you get any chance to look into my comments ? Just checking.

Thanks
Numan


> Following is my analysis on the feature:
>> a. Port of type EXTERNAL means that we create a logical switch port in
>> OVN without a VIF backing.
>> b. i.e the physical port corresponding to external port is NOT behind OVN
>> managed vswitch (for SRIOV specific case it is not behind any vswitch,
>> since PNIC sends the packet directly yo guest driver).
>> Just for the sake of further discussion we will refer the PHYSICAL
>> PORT/VM corresponding to external port as SRIOV PORT/VM.
>> c. Now from OVN perspective, packets from SRIOV VM will enter the OVN
>> flow pipeline via localnet port (on the Active HA Chassis).
>> d. For DHCP requests, the logical switch pipeline responds.
>> e. Now, we were trying to get the routing working.
>>
>> Based on the understanding mentioned above, i tried following scenario
>> and observed following:
>> a. SRIOV VM could talk to endpoints on other LS attached to LR via Active
>> gateway/HA chassis.
>>
>>
>> https://docs.google.com/document/d/117yskeP1S3qHmkNrBrZ0PxXCvMJCCVJhcPG4phw1Qls/edit?usp=sharing
>> [
>> https://lh5.googleusercontent.com/vhEZ3Ws4BXXkqoxc7naEXKPFC9qMqhJxKhumRzSyXsw9L1jbDc4B4r8cHBZZfOl9J6vYwgtyVA=w1200-h630-p
>> ]<
>> https://docs.google.com/document/d/117yskeP1S3qHmkNrBrZ0PxXCvMJCCVJhcPG4phw1Qls/edit?usp=sharing
>> >
>> OVN EXTERNAL PORT ROUTING<
>> https://docs.google.com/document/d/117yskeP1S3qHmkNrBrZ0PxXCvMJCCVJhcPG4phw1Qls/edit?usp=sharing
>> >
>> OVN EXTERNAL PORT EW ROUTING LOGICAL TOPOLOGY CONFIGURATION LOGICAL
>> ROUTER router 2bd894b1-81a0-4095-9c58-0472aa5de19d (router) port
>> router-to-gvlan1 mac: "00:00:01:01:02:03" networks: ["20.0.0.1/24"] port
>> router-to-underlay mac: "00:00:01:01:02:...
>> docs.google.com
>>
>> Explanation:
>> a. Since packets are coming through localnet port, hence from Router
>> perspective, it is an external endpoint, i.e NS.
>> b. Now for such cases, Router is designed to respond to ARP requests ONLY
>> on gateway chassis and since we have centralized a chassis for NS now,
>> hence from the Gateway chassis there is no MAC replacement.
>> c. Based on a. and b. above, i attached the same gateway chassis to LRP
>> (Logical Router Port) connecting to SRIOV PORT's LS.
>> d. And routing across LR connected Switches worked fine.
>> e. Mac table on TOR and ARP table on SRIOV VM was also fine, i.e ARP
>> cache had  and Mac table had an entry for
>> router port mac.
>>
>>
>> Inference:
>> a. For Routing, traffic from localnet ports has to enter via gateway node.
>> b. Hence, the Router port which connects to SR IOV VM Logical Switch has
>> to be on same gateway node as corresponding external port (Which can be
>> achieved easily by attaching same HA chassis group to both).
>>
>
> If you see the BZ here -
> https://bugzilla.redhat.com/show_bug.cgi?id=1829762,  Openstack
> networking ovn folks want this to be decoupled. According to them, it will
> be hard to maintain the logic to collocate the
> ha chassis group of external ports with the gateway chassis ports. I have
> CC'd them to get more comments from them.
>
> There is another problem:
> Let's say we have 3 logical switches - ls_vlan4 (10.0.0.0/24) , ls_vlan5 (
> 20.0.0.0/24) and ls_public (172.168.0.0/24) connected to a logical router
> lr0. And all are VLAN bridge networks
> but only ls_public provides N/S traffic. ls_vlan4 and ls_vlan5 are tenant
> bridged networks. I think this is a very common use case.
>
> In this scenario, you will have one l3gateway port - lr0-public
> connecting to logical switch ls_public. So there will be a set of
> gateway_chassis created for this.
>
> To solve this issue, we can also make the router ports - lr0-vlan4 and
> lr0-vlan5 (connecting to logical switches ls_vlan4 and ls_vlan5) as gateway
> ports by (once we enhance OVN
> to support multiple gateway ports), but if you see the logical
> swithes ls_vlan4 and ls_vlan5 in this example don't provide any N/S
> connectivity and hence it may not be accurate
> to consider them as having gateway ports.
>
> And the subnets (here 10.0.0.0/24 and 20.0.0.0/24) of ls_vlan4 and
> ls_vlan5 are internal to OVN and ideally there should be no external entity
> sharing the IP address from these subnets and using the same
> VLAN. Although practically it's possible to do so. If there is such a
> requirement, I think CMS should consider using 'external' port for this. I
> had noticed the issue you mentioned below in (c) long time back,
> I didn't think to bother much because of the reason I mentioned now. (i.e
> these subnets are internal to OVN)
>
> 

[ovs-dev] [PATCH ovn 2/2] ovn-northd: Don't send the pkt to conntrack for NAT if its not destined for LB VIP.

2020-07-22 Thread numans
From: Numan Siddique 

Presently when a logical switch has load balancer(s) associated to it, then the
packet is still sent to conntrack with the action ct_lb on both the ingress
and egress logical switch pipeline even if the destination IP is not LB VIP.

This is because below logical flows are hit:

In the ingress logical switch pipeline:
  - table=9 (ls_in_lb   ), priority=65535, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv), action=(reg0[2] = 1; next;)
  - table=10(ls_in_stateful ), priority=100  , match=(reg0[2] == 1), 
action=(ct_lb;)

In the egress logical switch pipeline:
  - table=3 (ls_out_lb  ), priority=65535, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv), action=(reg0[2] = 1; next;)
  - table=7 (ls_out_stateful), priority=100  , match=(reg0[2] == 1), 
action=(ct_lb;)

This patch avoid unnecessary ct actions by setting the ct_mark to 0x1/0x1 when 
the ct_lb(backends=...) action
is applied for NEW connections and updating the above logical flows to check 
for this mark:

 - table=9 (ls_in_lb), priority=65535, match=(ct.est && !ct.rel && !ct.new && 
!ct.inv && ct.mark == 1/1),
   action=(reg0[2] = 1; next;)

 - table=3 (ls_out_lb), priority=65535, match=(ct.est && !ct.rel && !ct.new && 
!ct.inv && ct.mark == 1/1),
   action=(reg0[2] = 1; next;)

Signed-off-by: Numan Siddique 
---
 lib/actions.c|   3 +-
 lib/logical-fields.c |   1 +
 northd/ovn-northd.c  |   6 ++-
 tests/ovn.at |  17 +++---
 tests/system-ovn.at  | 122 +--
 5 files changed, 77 insertions(+), 72 deletions(-)

diff --git a/lib/actions.c b/lib/actions.c
index e14907e3d4..ae61ec4d84 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -1159,7 +1159,8 @@ encode_CT_LB(const struct ovnact_ct_lb *cl,
 if (dst->port) {
 ds_put_format(, ":%"PRIu16, dst->port);
 }
-ds_put_format(, "),commit,table=%d,zone=NXM_NX_REG%d[0..15])",
+ds_put_format(, "),commit,table=%d,zone=NXM_NX_REG%d[0..15],"
+  "exec(set_field:2/3->ct_label))",
   recirc_table, zone_reg);
 }
 
diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 8639523ea1..a592479423 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -127,6 +127,7 @@ ovn_init_symtab(struct shash *symtab)
 
 expr_symtab_add_field(symtab, "ct_label", MFF_CT_LABEL, NULL, false);
 expr_symtab_add_subfield(symtab, "ct_label.blocked", NULL, "ct_label[0]");
+expr_symtab_add_subfield(symtab, "ct_label.natted", NULL, "ct_label[1]");
 
 expr_symtab_add_field(symtab, "ct_state", MFF_CT_STATE, NULL, false);
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index ac1da73342..ce4432318d 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -5750,10 +5750,12 @@ build_lb(struct ovn_datapath *od, struct hmap *lflows)
  *
  * Send established traffic through conntrack for just NAT. */
 ovn_lflow_add(lflows, od, S_SWITCH_IN_LB, UINT16_MAX - 1,
-  "ct.est && !ct.rel && !ct.new && !ct.inv",
+  "ct.est && !ct.rel && !ct.new && !ct.inv && "
+  "ct_label.natted == 1",
   REGBIT_CONNTRACK_NAT" = 1; next;");
 ovn_lflow_add(lflows, od, S_SWITCH_OUT_LB, UINT16_MAX - 1,
-  "ct.est && !ct.rel && !ct.new && !ct.inv",
+  "ct.est && !ct.rel && !ct.new && !ct.inv && "
+  "ct_label.natted == 1",
   REGBIT_CONNTRACK_NAT" = 1; next;");
 }
 }
diff --git a/tests/ovn.at b/tests/ovn.at
index e19efafbe2..2930721d7c 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -195,6 +195,7 @@ ct.snat = ct_state[6]
 ct.trk = ct_state[5]
 ct_label = NXM_NX_CT_LABEL
 ct_label.blocked = ct_label[0]
+ct_label.natted = ct_label[1]
 ct_mark = NXM_NX_CT_MARK
 ct_state = NXM_NX_CT_STATE
 ]])
@@ -997,17 +998,17 @@ ct_lb(192.168.1.2:80, 192.168.1.3:80);
 Syntax error at `192.168.1.2' expecting backends.
 ct_lb(backends=192.168.1.2:80,192.168.1.3:80);
 encodes as group:1
-uses group: id(1), 
name(type=select,selection_method=dp_hash,bucket=bucket_id=0,weight:100,actions=ct(nat(dst=192.168.1.2:80),commit,table=19,zone=NXM_NX_REG13[0..15]),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=192.168.1.3:80),commit,table=19,zone=NXM_NX_REG13[0..15]))
+uses group: id(1), 
name(type=select,selection_method=dp_hash,bucket=bucket_id=0,weight:100,actions=ct(nat(dst=192.168.1.2:80),commit,table=19,zone=NXM_NX_REG13[0..15],exec(set_field:2/3->ct_label)),bucket=bucket_id=1,weight:100,actions=ct(nat(dst=192.168.1.3:80),commit,table=19,zone=NXM_NX_REG13[0..15],exec(set_field:2/3->ct_label)))
 has prereqs ip
 ct_lb(backends=192.168.1.2, 192.168.1.3, );
 formats as ct_lb(backends=192.168.1.2,192.168.1.3);
 encodes as group:2
-uses group: id(2), 

[ovs-dev] [PATCH ovn 1/2] ovn-northd: Don't send the pkt to conntrack if it is to be routed in egress stage.

2020-07-22 Thread numans
From: Numan Siddique 

If there is a logical port 'P1' with the IP - 10.0.0.3 and a logical port 'P2' 
with
the IP 20.0.0.3 and if the logical switch of 'P1' has atleast one load balancer
associated with it and atleast one ACL with allow-related action associated 
with it.
Then for every packet from 'P1' to 'P2' after the TCP connection
is established we see a total of 4 recirculations in the datapath on the chassis
claiming 'P1'. This is because,

In the ingress logical switch pipeline, below logical flows are hit
  - table=9 (ls_in_lb   ), priority=65535, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv), action=(reg0[2] = 1; next;)
  - table=10(ls_in_stateful ), priority=100  , match=(reg0[2] == 1), 
action=(ct_lb;)

And in the egress logical switch pipeline, below logical flows are hit
 - table=0 (ls_out_pre_lb  ), priority=100  , match=(ip), action=(reg0[0] = 
1; next;)
 - table=2 (ls_out_pre_stateful), priority=100  , match=(reg0[0] == 1), 
action=(ct_next;)
 - table=3 (ls_out_lb  ), priority=65535, match=(ct.est && !ct.rel && 
!ct.new && !ct.inv), action=(reg0[2] = 1; next;)
 - table=7 (ls_out_stateful), priority=100  , match=(reg0[2] == 1), 
action=(ct_lb;)

In the above example, when the packet enters the egress pipeline and since it 
needs to
enter the router pipeline, we can skip setting reg0[0] if outport is peer port 
of
logical router port. There is no need to send the packet to conntrack in this 
case.

This patch handles this case for router ports. Next patch in the series avoids 
sending to
conntrack with the action - ct_lb if the packet is not destined to the LB VIP.

With the present master for the above example, we see total of 4 recirculations 
on the
chassis claiming the lport 'P1'. With this patch we see only 2 recirculations.

Signed-off-by: Numan Siddique 
---
 northd/ovn-northd.8.xml | 33 -
 northd/ovn-northd.c | 39 ++-
 2 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index eb2514f15c..3a34e0ad7f 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -366,6 +366,15 @@
   db="OVN_Northbound"/> table.
 
 
+
+  This table also has a priority-110 flow with the match
+  inport == I for all logical switch
+  datapaths to move traffic to the next table. Where I
+  is the peer of a logical router port. This flow is added to
+  skip the connection tracking of packets which enter from
+  logical router datapath to logical switch datapath.
+
+
 Ingress Table 5: Pre-stateful
 
 
@@ -533,7 +542,20 @@
 
 
   It contains a priority-0 flow that simply moves traffic to the next
-  table.  For established connections a priority 100 flow matches on
+  table.
+
+
+
+  A priority-65535 flow with the match
+  inport == I for all logical switch
+  datapaths to move traffic to the next table. Where I
+  is the peer of a logical router port. This flow is added to
+  skip the connection tracking of packets which enter from
+  logical router datapath to logical switch datapath.
+
+
+
+  For established connections a priority 65534 flow matches on
   ct.est  !ct.rel  !ct.new 
   !ct.inv and sets an action reg0[2] = 1; next; to act
   as a hint for table Stateful to send packets through
@@ -1359,6 +1381,15 @@ output;
   db="OVN_Northbound"/> table.
 
 
+
+  This table also has a priority-110 flow with the match
+  outport == I for all logical switch
+  datapaths to move traffic to the next table. Where I
+  is the peer of a logical router port. This flow is added to
+  skip the connection tracking of packets which will be entering
+  logical router datapath from logical switch datapath for routing.
+
+
 Egress Table 2: Pre-stateful
 
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 192198272a..ac1da73342 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4850,8 +4850,9 @@ build_lswitch_output_port_sec(struct hmap *ports, struct 
hmap *datapaths,
 }
 
 static void
-build_pre_acl_flows(struct ovn_datapath *od, struct ovn_port *op,
-struct hmap *lflows)
+skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op,
+ enum ovn_stage in_stage, enum ovn_stage out_stage,
+ uint16_t priority, struct hmap *lflows)
 {
 /* Can't use ct() for router ports. Consider the following configuration:
  * lp1(10.0.0.2) on hostA--ls1--lr0--ls2--lp2(10.0.1.2) on hostB, For a
@@ -4867,10 +4868,10 @@ build_pre_acl_flows(struct ovn_datapath *od, struct 
ovn_port *op,
 
 ds_put_format(_in, "ip && inport == %s", op->json_key);
 ds_put_format(_out, "ip && outport == %s", op->json_key);
-ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_PRE_ACL, 110,
+

[ovs-dev] [PATCH v3] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
Current OVS intercepts and processes all BFD packets, thus VM-2-VM
BFD packets get lost and the recipient VM never sees them.

This patch fixes it by only intercepting and processing BFD packets
destined to a configured BFD instance, and other BFD packets are made
available to the OVS flow table for forwarding.

This patch keeps BFD's backward compatibility.

VMWare-BZ: 2579326
Signed-off-by: Yifeng Sun 
---
v1->v2: Add test by William's suggestion.
v2->v3: Fix BFD config, thanks William.

 lib/bfd.c   | 16 +---
 tests/system-traffic.at | 43 +++
 vswitchd/vswitch.xml|  7 +++
 3 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cc8c6857afa4..3c965699ace3 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
 #define FLAGS_MASK 0x3f
 #define DEFAULT_MULT 3
 
+#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
+#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
+
 struct bfd {
 struct hmap_node node;/* In 'all_bfds'. */
 uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. */
@@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const 
struct smap *cfg,
  >rmt_eth_dst);
 
 bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
-  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
+  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
 bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
-  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
+  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
 
 forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
 if (bfd->forwarding_if_rx != forwarding_if_rx) {
@@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const 
struct flow *flow,
 memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
 if (flow->nw_proto == IPPROTO_UDP
 && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
-&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
+&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
+&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
+|| bfd->ip_src == flow->nw_dst)) {
+
+if (bfd->ip_src == flow->nw_dst) {
+memset(>masks.nw_dst, 0x, sizeof wc->masks.nw_dst);
+}
+
 bool check_tnl_key;
 
 atomic_read_relaxed(>check_tnl_key, _tnl_key);
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 2a0fbadff4a1..5c1aee49aba2 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6289,3 +6289,46 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * 
*5002 *2000 *b85e *
 
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([bfd - BFD overlay])
+OVS_CHECK_GENEVE()
+
+OVS_TRAFFIC_VSWITCHD_START()
+
+AT_CHECK([ovs-vsctl -- set bridge br0 
other-config:hwaddr=\"f2:ff:00:00:00:01\"])
+ADD_BR([br-underlay], [set bridge br-underlay 
other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
+
+ADD_NAMESPACES(at_ns0)
+
+dnl Set up underlay link from host into the namespace using veth pair.
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d)
+AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
+AT_CHECK([ip link set dev br-underlay up])
+
+dnl Set up tunnel endpoints on OVS outside the namespace.
+ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], 
[192.168.10.100/24],
+[options:packet_type=ptap])
+AT_CHECK([ovs-vsctl -- set Interface at_gnv0 bfd:enable=true 
bfd:bfd_src_ip=172.16.180.106])
+
+dnl Certain Linux distributions, like CentOS, have default iptable rules
+dnl to reject input traffic from br-underlay. Here we add a rule to walk
+dnl around it.
+iptables -I INPUT 1 -i br-underlay -j ACCEPT
+on_exit 'iptables -D INPUT 1'
+
+dnl Firstly, test normal BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
 actions=NORMAL"
+dnl Next we test overlay BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
 actions=NORMAL"
+
+ovs-dpctl dump-flows > datapath-flows.txt
+dnl BFD packet was handled by BFD flow. Ipv4.dst is matched.
+AT_FAIL_IF([cat datapath-flows.txt | egrep 

Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
You are correct, I will fix BFD config in v3.

For the overlay BFD packet, we don't set up a port to handle packets
targeted at 192.168.10.105. So ovs simply drops them.

On Wed, Jul 22, 2020 at 11:26 AM William Tu  wrote:

> On Wed, Jul 22, 2020 at 11:02:32AM -0700, Yifeng Sun wrote:
> > Thanks for reviewing.
> >
> > For these two packets:
> >
> > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > This one is normal BFD packet, bfd_should_process_flow should return
> > true, as used to.
> >
> > dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
> > This one is overlay BFD packet, bfd_should_process_flow should return
> > false so this packet won't be intercepted by OVS's BFD engine.
>
> So you add an additional condition here:
>  (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
> || bfd->ip_src == flow->nw_dst))
>
> How come the first packt is true, and the second packet is false in
> the above condition?
>
> you didn't set bfd_src_ip in the test, so what's the value of bfd->ip_src?
>
> another question below
>
> > > > --- a/tests/system-traffic.at
> > > > +++ b/tests/system-traffic.at
> > > > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050:
> > > * * *5002 *2000 *b85e *
> > > >
> > > >  OVS_TRAFFIC_VSWITCHD_STOP
> > > >  AT_CLEANUP
> > > > +
> > > > +AT_SETUP([bfd - BFD overlay])
> > > > +OVS_CHECK_GENEVE()
> > > > +
> > > > +OVS_TRAFFIC_VSWITCHD_START()
> > > > +
> > > > +AT_CHECK([ovs-vsctl -- set bridge br0
> > > other-config:hwaddr=\"f2:ff:00:00:00:01\"])
> > > > +ADD_BR([br-underlay], [set bridge br-underlay
> > > other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
> > > > +
> > > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> > > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> > > > +
> > > > +ADD_NAMESPACES(at_ns0)
> > > > +
> > > > +dnl Set up underlay link from host into the namespace using veth
> pair.
> > > > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24",
> > > 4e:12:5d:6c:74:3d)
> > > > +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
> > > > +AT_CHECK([ip link set dev br-underlay up])
> > > > +
> > > > +dnl Set up tunnel endpoints on OVS outside the namespace.
> > > > +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [
> > > 192.168.10.100/24],
> > > > +[options:packet_type=ptap])
> > > > +
> > > > +dnl Certain Linux distributions, like CentOS, have default iptable
> rules
> > > > +dnl to reject input traffic from br-underlay. Here we add a rule to
> walk
> > > > +dnl around it.
> > > > +iptables -I INPUT 1 -i br-underlay -j ACCEPT
> > > > +on_exit 'iptables -D INPUT 1'
> > > > +
> > > > +dnl Firstly, test normal BFD packet.
> > > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1
> > >
> packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
> > > actions=NORMAL"
> > > The packet above is
> > > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > > dnl inner IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > >
> > > And why does this trigger ovs to process it ex:
> bfd_should_process_flow()
> > > return true?
> > > In your patch, you're adding extra check
> > > bfd->ip_src == flow->nw_dst
> > > and here
> > > bfd->ip_src is default 169.254.1.1
> > > flow->nw_dst is 172.16.180.106
> > >
> > >
> > > > +dnl Next we test overlay BFD packet.
> > > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1
> > >
> packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
> > > actions=NORMAL"
>
> the 2nd packet is NORMAL
> > >
> > > The packet above is
> > > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > > dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
> > > So the bfd_should_process_flow returns false.
> > >
> > > > +
> > > > +ovs-dpctl dump-flows > datapath-flows.txt
> > > > +dnl BFD packet was handled by BFD flow.
> > > > +AT_FAIL_IF([cat datapath-flows.txt | egrep
> > >
> "tunnel\(.*,src=172.16.180.105,dst=172.16.180.106,.*,eth\(\),eth_type\(0x0800\),ipv4\(proto=17,.*\),udp\(dst=3784\),
> > > .*, actions:userspace\(pid=.*,slow_path\(bfd\)\)" 2>&1 1>/dev/null])
> > > > +dnl Overlay BFD packet was handled by non-BFD flows.
> > > > +AT_FAIL_IF([cat datapath-flows.txt | egrep
> > >
> "tunnel\(.*,src=172.16.180.105,dst=172.16.180.106,.*,eth\(.*\),eth_type\(0x0800\),ipv4\(src=192.168.10.104,dst=192.168.10.105,proto=17,.*\),udp\(.*,dst=3784\),
> > > .*, actions:drop" 2>&1 1>/dev/null])
>
> But why it is drop here?
>
> > > > +
> > > > +OVS_TRAFFIC_VSWITCHD_STOP
> > > > +AT_CLEANUP
> > >
> > >
>
___

[ovs-dev] OVN nb-db and sb-db out of sync

2020-07-22 Thread Tony Liu
Hi,

During a scaling test where 4000 networks are created from OpenStack, I see that
nb-db and sb-db are out of sync. All 4000 logical switches and 8000 LS ports
(GW port and service port of each network) are created in nb-db. In sb-db,
only 1567 port-bindings, 4000 is expected.

[root@ovn-db-2 ~]# ovn-nbctl list nb_global
_uuid   : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
connections : []
external_ids: {"neutron:liveness_check_at"="2020-07-22 
04:03:17.726917+00:00"}
hv_cfg  : 312
ipsec   : false
name: ""
nb_cfg  : 2636
options : {mac_prefix="ca:e8:07", 
svc_monitor_mac="4e:d0:3a:80:d4:b7"}
sb_cfg  : 2005
ssl : []

[root@ovn-db-2 ~]# ovn-sbctl list sb_global
_uuid   : 3720bc1d-b0da-47ce-85ca-96fa8d398489
connections : []
external_ids: {}
ipsec   : false
nb_cfg  : 312
options : {mac_prefix="ca:e8:07", 
svc_monitor_mac="4e:d0:3a:80:d4:b7"}
ssl : []

Is there any way to force ovn-northd to rebuild sb-db to sync with nb-db,
like manipulating nb_cfg or anything else? Note, it's 3-node RAFT cluster for 
both
nb-db and sb-db.

Is that "incremental update" implemented in 20.03?
If not, in which release it's going to be available?


Thanks!

Tony

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread William Tu
On Wed, Jul 22, 2020 at 11:02:32AM -0700, Yifeng Sun wrote:
> Thanks for reviewing.
> 
> For these two packets:
> 
> dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> This one is normal BFD packet, bfd_should_process_flow should return
> true, as used to.
> 
> dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
> This one is overlay BFD packet, bfd_should_process_flow should return
> false so this packet won't be intercepted by OVS's BFD engine.

So you add an additional condition here:
 (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
|| bfd->ip_src == flow->nw_dst))

How come the first packt is true, and the second packet is false in
the above condition?

you didn't set bfd_src_ip in the test, so what's the value of bfd->ip_src?

another question below

> > > --- a/tests/system-traffic.at
> > > +++ b/tests/system-traffic.at
> > > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050:
> > * * *5002 *2000 *b85e *
> > >
> > >  OVS_TRAFFIC_VSWITCHD_STOP
> > >  AT_CLEANUP
> > > +
> > > +AT_SETUP([bfd - BFD overlay])
> > > +OVS_CHECK_GENEVE()
> > > +
> > > +OVS_TRAFFIC_VSWITCHD_START()
> > > +
> > > +AT_CHECK([ovs-vsctl -- set bridge br0
> > other-config:hwaddr=\"f2:ff:00:00:00:01\"])
> > > +ADD_BR([br-underlay], [set bridge br-underlay
> > other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
> > > +
> > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> > > +
> > > +ADD_NAMESPACES(at_ns0)
> > > +
> > > +dnl Set up underlay link from host into the namespace using veth pair.
> > > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24",
> > 4e:12:5d:6c:74:3d)
> > > +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
> > > +AT_CHECK([ip link set dev br-underlay up])
> > > +
> > > +dnl Set up tunnel endpoints on OVS outside the namespace.
> > > +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [
> > 192.168.10.100/24],
> > > +[options:packet_type=ptap])
> > > +
> > > +dnl Certain Linux distributions, like CentOS, have default iptable rules
> > > +dnl to reject input traffic from br-underlay. Here we add a rule to walk
> > > +dnl around it.
> > > +iptables -I INPUT 1 -i br-underlay -j ACCEPT
> > > +on_exit 'iptables -D INPUT 1'
> > > +
> > > +dnl Firstly, test normal BFD packet.
> > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1
> > packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
> > actions=NORMAL"
> > The packet above is
> > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > dnl inner IP: Source: 172.16.180.105 Destination: 172.16.180.106
> >
> > And why does this trigger ovs to process it ex: bfd_should_process_flow()
> > return true?
> > In your patch, you're adding extra check
> > bfd->ip_src == flow->nw_dst
> > and here
> > bfd->ip_src is default 169.254.1.1
> > flow->nw_dst is 172.16.180.106
> >
> >
> > > +dnl Next we test overlay BFD packet.
> > > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1
> > packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
> > actions=NORMAL"

the 2nd packet is NORMAL
> >
> > The packet above is
> > dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> > dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
> > So the bfd_should_process_flow returns false.
> >
> > > +
> > > +ovs-dpctl dump-flows > datapath-flows.txt
> > > +dnl BFD packet was handled by BFD flow.
> > > +AT_FAIL_IF([cat datapath-flows.txt | egrep
> > "tunnel\(.*,src=172.16.180.105,dst=172.16.180.106,.*,eth\(\),eth_type\(0x0800\),ipv4\(proto=17,.*\),udp\(dst=3784\),
> > .*, actions:userspace\(pid=.*,slow_path\(bfd\)\)" 2>&1 1>/dev/null])
> > > +dnl Overlay BFD packet was handled by non-BFD flows.
> > > +AT_FAIL_IF([cat datapath-flows.txt | egrep
> > "tunnel\(.*,src=172.16.180.105,dst=172.16.180.106,.*,eth\(.*\),eth_type\(0x0800\),ipv4\(src=192.168.10.104,dst=192.168.10.105,proto=17,.*\),udp\(.*,dst=3784\),
> > .*, actions:drop" 2>&1 1>/dev/null])

But why it is drop here?

> > > +
> > > +OVS_TRAFFIC_VSWITCHD_STOP
> > > +AT_CLEANUP
> >
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
Please discard my previous email, I misunderstood your question.

  The packet above is
  dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
  dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
  So the bfd_should_process_flow returns false.

Yes, you are correct and this patch returns false in this case.

For the above packet, outer IP is extracted as tunnel info, flow->nw_dst is
192.168.10.105.
So bfd_should_process_flow returns false.

Thanks,
Yifeng


On Wed, Jul 22, 2020 at 11:02 AM Yifeng Sun  wrote:

> Thanks for reviewing.
>
> For these two packets:
>
> dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
> This one is normal BFD packet, bfd_should_process_flow should return
> true, as used to.
>
> dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
> This one is overlay BFD packet, bfd_should_process_flow should return
> false so this packet won't be intercepted by OVS's BFD engine.
>
> Thanks,
> Yifeng
>
> On Wed, Jul 22, 2020 at 10:54 AM William Tu  wrote:
>
>> On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote:
>> > Current OVS intercepts and processes all BFD packets, thus VM-2-VM
>> > BFD packets get lost and the recipient VM never sees them.
>> >
>> > This patch fixes it by only intercepting and processing BFD packets
>> > destined to a configured BFD instance, and other BFD packets are made
>> > available to the OVS flow table for forwarding.
>> >
>> > This patch adds new test to validate BFD overlay.
>> >
>> > This patch keeps BFD's backward compatibility.
>> >
>> > Signed-off-by: Yifeng Sun 
>> > ---
>> > v1->v2: Add test by William's suggestion.
>> >
>> >  lib/bfd.c   | 16 +---
>> >  tests/system-traffic.at | 42
>> ++
>> >  vswitchd/vswitch.xml|  7 +++
>> >  3 files changed, 62 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/lib/bfd.c b/lib/bfd.c
>> > index cc8c6857afa4..3c965699ace3 100644
>> > --- a/lib/bfd.c
>> > +++ b/lib/bfd.c
>> > @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct
>> msg));
>> >  #define FLAGS_MASK 0x3f
>> >  #define DEFAULT_MULT 3
>> >
>> > +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
>> > +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
>> > +
>> >  struct bfd {
>> >  struct hmap_node node;/* In 'all_bfds'. */
>> >  uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds'
>> hmap. */
>> > @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name,
>> const struct smap *cfg,
>> >   >rmt_eth_dst);
>> >
>> >  bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
>> > -  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
>> > +  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
>> >  bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
>> > -  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
>> > +  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
>> >
>> >  forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
>> >  if (bfd->forwarding_if_rx != forwarding_if_rx) {
>> > @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_,
>> const struct flow *flow,
>> >  memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
>> >  if (flow->nw_proto == IPPROTO_UDP
>> >  && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
>> > -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
>> > +&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
>> > +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
>> > +|| bfd->ip_src == flow->nw_dst)) {
>> > +
>> > +if (bfd->ip_src == flow->nw_dst) {
>> > +memset(>masks.nw_dst, 0x, sizeof
>> wc->masks.nw_dst);
>> > +}
>> > +
>> >  bool check_tnl_key;
>> >
>> >  atomic_read_relaxed(>check_tnl_key, _tnl_key);
>> > diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> > index 2a0fbadff4a1..80b58996d530 100644
>> > --- a/tests/system-traffic.at
>> > +++ b/tests/system-traffic.at
>> > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050:
>> * * *5002 *2000 *b85e *
>> >
>> >  OVS_TRAFFIC_VSWITCHD_STOP
>> >  AT_CLEANUP
>> > +
>> > +AT_SETUP([bfd - BFD overlay])
>> > +OVS_CHECK_GENEVE()
>> > +
>> > +OVS_TRAFFIC_VSWITCHD_START()
>> > +
>> > +AT_CHECK([ovs-vsctl -- set bridge br0
>> other-config:hwaddr=\"f2:ff:00:00:00:01\"])
>> > +ADD_BR([br-underlay], [set bridge br-underlay
>> other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
>> > +
>> > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>> > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
>> > +
>> > +ADD_NAMESPACES(at_ns0)
>> > +
>> > +dnl Set up underlay link from host into the namespace using veth pair.
>> > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24",
>> 4e:12:5d:6c:74:3d)
>> > +AT_CHECK([ip addr add dev br-underlay 

Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
Thanks for reviewing.

For these two packets:

dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
This one is normal BFD packet, bfd_should_process_flow should return
true, as used to.

dnl inner IP: Source: 192.168.10.104 Destination: 192.168.10.105
This one is overlay BFD packet, bfd_should_process_flow should return
false so this packet won't be intercepted by OVS's BFD engine.

Thanks,
Yifeng

On Wed, Jul 22, 2020 at 10:54 AM William Tu  wrote:

> On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote:
> > Current OVS intercepts and processes all BFD packets, thus VM-2-VM
> > BFD packets get lost and the recipient VM never sees them.
> >
> > This patch fixes it by only intercepting and processing BFD packets
> > destined to a configured BFD instance, and other BFD packets are made
> > available to the OVS flow table for forwarding.
> >
> > This patch adds new test to validate BFD overlay.
> >
> > This patch keeps BFD's backward compatibility.
> >
> > Signed-off-by: Yifeng Sun 
> > ---
> > v1->v2: Add test by William's suggestion.
> >
> >  lib/bfd.c   | 16 +---
> >  tests/system-traffic.at | 42 ++
> >  vswitchd/vswitch.xml|  7 +++
> >  3 files changed, 62 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/bfd.c b/lib/bfd.c
> > index cc8c6857afa4..3c965699ace3 100644
> > --- a/lib/bfd.c
> > +++ b/lib/bfd.c
> > @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct
> msg));
> >  #define FLAGS_MASK 0x3f
> >  #define DEFAULT_MULT 3
> >
> > +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
> > +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
> > +
> >  struct bfd {
> >  struct hmap_node node;/* In 'all_bfds'. */
> >  uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds'
> hmap. */
> > @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name,
> const struct smap *cfg,
> >   >rmt_eth_dst);
> >
> >  bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
> > -  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
> > +  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
> >  bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
> > -  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
> > +  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
> >
> >  forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
> >  if (bfd->forwarding_if_rx != forwarding_if_rx) {
> > @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_,
> const struct flow *flow,
> >  memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
> >  if (flow->nw_proto == IPPROTO_UDP
> >  && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
> > -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
> > +&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
> > +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
> > +|| bfd->ip_src == flow->nw_dst)) {
> > +
> > +if (bfd->ip_src == flow->nw_dst) {
> > +memset(>masks.nw_dst, 0x, sizeof
> wc->masks.nw_dst);
> > +}
> > +
> >  bool check_tnl_key;
> >
> >  atomic_read_relaxed(>check_tnl_key, _tnl_key);
> > diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> > index 2a0fbadff4a1..80b58996d530 100644
> > --- a/tests/system-traffic.at
> > +++ b/tests/system-traffic.at
> > @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050:
> * * *5002 *2000 *b85e *
> >
> >  OVS_TRAFFIC_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +AT_SETUP([bfd - BFD overlay])
> > +OVS_CHECK_GENEVE()
> > +
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +AT_CHECK([ovs-vsctl -- set bridge br0
> other-config:hwaddr=\"f2:ff:00:00:00:01\"])
> > +ADD_BR([br-underlay], [set bridge br-underlay
> other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
> > +
> > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> > +
> > +ADD_NAMESPACES(at_ns0)
> > +
> > +dnl Set up underlay link from host into the namespace using veth pair.
> > +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24",
> 4e:12:5d:6c:74:3d)
> > +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
> > +AT_CHECK([ip link set dev br-underlay up])
> > +
> > +dnl Set up tunnel endpoints on OVS outside the namespace.
> > +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], [
> 192.168.10.100/24],
> > +[options:packet_type=ptap])
> > +
> > +dnl Certain Linux distributions, like CentOS, have default iptable rules
> > +dnl to reject input traffic from br-underlay. Here we add a rule to walk
> > +dnl around it.
> > +iptables -I INPUT 1 -i br-underlay -j ACCEPT
> > +on_exit 'iptables -D INPUT 1'
> > +
> > +dnl Firstly, test normal BFD packet.
> > +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1
> 

Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread William Tu
On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote:
> Current OVS intercepts and processes all BFD packets, thus VM-2-VM
> BFD packets get lost and the recipient VM never sees them.
> 
> This patch fixes it by only intercepting and processing BFD packets
> destined to a configured BFD instance, and other BFD packets are made
> available to the OVS flow table for forwarding.
> 
> This patch adds new test to validate BFD overlay.
> 
> This patch keeps BFD's backward compatibility.
> 
> Signed-off-by: Yifeng Sun 
> ---
> v1->v2: Add test by William's suggestion.
> 
>  lib/bfd.c   | 16 +---
>  tests/system-traffic.at | 42 ++
>  vswitchd/vswitch.xml|  7 +++
>  3 files changed, 62 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/bfd.c b/lib/bfd.c
> index cc8c6857afa4..3c965699ace3 100644
> --- a/lib/bfd.c
> +++ b/lib/bfd.c
> @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
>  #define FLAGS_MASK 0x3f
>  #define DEFAULT_MULT 3
>  
> +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
> +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
> +
>  struct bfd {
>  struct hmap_node node;/* In 'all_bfds'. */
>  uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. 
> */
> @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const 
> struct smap *cfg,
>   >rmt_eth_dst);
>  
>  bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
> -  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
> +  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
>  bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
> -  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
> +  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
>  
>  forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
>  if (bfd->forwarding_if_rx != forwarding_if_rx) {
> @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const 
> struct flow *flow,
>  memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
>  if (flow->nw_proto == IPPROTO_UDP
>  && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
> -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
> +&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
> +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
> +|| bfd->ip_src == flow->nw_dst)) {
> +
> +if (bfd->ip_src == flow->nw_dst) {
> +memset(>masks.nw_dst, 0x, sizeof 
> wc->masks.nw_dst);
> +}
> +
>  bool check_tnl_key;
>  
>  atomic_read_relaxed(>check_tnl_key, _tnl_key);
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 2a0fbadff4a1..80b58996d530 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * 
> * *5002 *2000 *b85e *
>  
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([bfd - BFD overlay])
> +OVS_CHECK_GENEVE()
> +
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +AT_CHECK([ovs-vsctl -- set bridge br0 
> other-config:hwaddr=\"f2:ff:00:00:00:01\"])
> +ADD_BR([br-underlay], [set bridge br-underlay 
> other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> +
> +ADD_NAMESPACES(at_ns0)
> +
> +dnl Set up underlay link from host into the namespace using veth pair.
> +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d)
> +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
> +AT_CHECK([ip link set dev br-underlay up])
> +
> +dnl Set up tunnel endpoints on OVS outside the namespace.
> +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], 
> [192.168.10.100/24],
> +[options:packet_type=ptap])
> +
> +dnl Certain Linux distributions, like CentOS, have default iptable rules
> +dnl to reject input traffic from br-underlay. Here we add a rule to walk
> +dnl around it.
> +iptables -I INPUT 1 -i br-underlay -j ACCEPT
> +on_exit 'iptables -D INPUT 1'
> +
> +dnl Firstly, test normal BFD packet.
> +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
> packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
>  actions=NORMAL"
The packet above is
dnl outer IP: Source: 172.16.180.105 Destination: 172.16.180.106
dnl inner IP: Source: 172.16.180.105 Destination: 172.16.180.106

And why does this trigger ovs to process it ex: bfd_should_process_flow() 
return true?
In your patch, you're adding extra check
bfd->ip_src == flow->nw_dst
and here
bfd->ip_src is default 169.254.1.1
flow->nw_dst is 172.16.180.106


> +dnl Next we test overlay BFD 

Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: make masks cache size configurable

2020-07-22 Thread kernel test robot
Hi Eelco,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Eelco-Chaudron/net-openvswitch-masks-cache-enhancements/20200722-163017
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
fa56a987449bcf4c1cb68369a187af3515b85c78
config: x86_64-randconfig-s022-20200719 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-14) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.2-49-g707c5017-dirty
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

>> net/openvswitch/flow_table.c:376:23: sparse: sparse: incorrect type in 
>> assignment (different address spaces) @@ expected struct 
>> mask_cache_entry *cache @@ got void [noderef] __percpu * @@
>> net/openvswitch/flow_table.c:376:23: sparse: expected struct 
>> mask_cache_entry *cache
>> net/openvswitch/flow_table.c:376:23: sparse: got void [noderef] __percpu 
>> *
>> net/openvswitch/flow_table.c:386:25: sparse: sparse: incorrect type in 
>> assignment (different address spaces) @@ expected struct 
>> mask_cache_entry [noderef] __percpu *mask_cache @@ got struct 
>> mask_cache_entry *cache @@
>> net/openvswitch/flow_table.c:386:25: sparse: expected struct 
>> mask_cache_entry [noderef] __percpu *mask_cache
>> net/openvswitch/flow_table.c:386:25: sparse: got struct mask_cache_entry 
>> *cache
>> net/openvswitch/flow_table.c:411:27: sparse: sparse: incorrect type in 
>> assignment (different address spaces) @@ expected struct mask_cache 
>> [noderef] __rcu *mask_cache @@ got struct mask_cache * @@
>> net/openvswitch/flow_table.c:411:27: sparse: expected struct mask_cache 
>> [noderef] __rcu *mask_cache
>> net/openvswitch/flow_table.c:411:27: sparse: got struct mask_cache *
>> net/openvswitch/flow_table.c:440:35: sparse: sparse: incorrect type in 
>> argument 1 (different address spaces) @@ expected struct mask_cache *mc 
>> @@ got struct mask_cache [noderef] __rcu *mask_cache @@
>> net/openvswitch/flow_table.c:440:35: sparse: expected struct mask_cache 
>> *mc
>> net/openvswitch/flow_table.c:440:35: sparse: got struct mask_cache 
>> [noderef] __rcu *mask_cache
   net/openvswitch/flow_table.c:517:24: sparse: sparse: incorrect type in 
argument 1 (different address spaces) @@ expected struct callback_head 
*head @@ got struct callback_head [noderef] __rcu * @@
   net/openvswitch/flow_table.c:517:24: sparse: expected struct 
callback_head *head
   net/openvswitch/flow_table.c:517:24: sparse: got struct callback_head 
[noderef] __rcu *
   net/openvswitch/flow_table.c:518:24: sparse: sparse: incorrect type in 
argument 1 (different address spaces) @@ expected struct callback_head 
*head @@ got struct callback_head [noderef] __rcu * @@
   net/openvswitch/flow_table.c:518:24: sparse: expected struct 
callback_head *head
   net/openvswitch/flow_table.c:518:24: sparse: got struct callback_head 
[noderef] __rcu *
   net/openvswitch/flow_table.c:1166:42: sparse: sparse: incorrect type in 
assignment (different address spaces) @@ expected struct sw_flow_mask 
[noderef] __rcu * @@ got struct sw_flow_mask * @@
   net/openvswitch/flow_table.c:1166:42: sparse: expected struct 
sw_flow_mask [noderef] __rcu *
   net/openvswitch/flow_table.c:1166:42: sparse: got struct sw_flow_mask *

vim +376 net/openvswitch/flow_table.c

   357  
   358  static struct mask_cache *tbl_mask_cache_alloc(u32 size)
   359  {
   360  struct mask_cache_entry *cache = NULL;
   361  struct mask_cache *new;
   362  
   363  /* Only allow size to be 0, or a power of 2, and does not exceed
   364   * percpu allocation size.
   365   */
   366  if ((size & (size - 1)) != 0 ||
   367  (size * sizeof(struct mask_cache_entry)) > 
PCPU_MIN_UNIT_SIZE)
   368  return NULL;
   369  
   370  new = kzalloc(sizeof(*new), GFP_KERNEL);
   371  if (!new)
   372  return NULL;
   373  
   374  new->cache_size = size;
   375  if (new->cache_size > 0) {
 > 376  cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
   377 new->cache_size,
   378 __alignof__(struct 
mask_cache_entry));
   379  
   380  if (!cache) {
   381  kfree(new);
   382  return NULL;
   383

Re: [ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread William Tu
On Wed, Jul 22, 2020 at 01:59:04AM -0700, Yifeng Sun wrote:
> Current OVS intercepts and processes all BFD packets, thus VM-2-VM
> BFD packets get lost and the recipient VM never sees them.
> 
> This patch fixes it by only intercepting and processing BFD packets
> destined to a configured BFD instance, and other BFD packets are made
> available to the OVS flow table for forwarding.
> 
> This patch adds new test to validate BFD overlay.
> 
> This patch keeps BFD's backward compatibility.
> 
is there a VMware Bug id?

> Signed-off-by: Yifeng Sun 
> ---
> v1->v2: Add test by William's suggestion.
> 
>  lib/bfd.c   | 16 +---
>  tests/system-traffic.at | 42 ++
>  vswitchd/vswitch.xml|  7 +++
>  3 files changed, 62 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/bfd.c b/lib/bfd.c
> index cc8c6857afa4..3c965699ace3 100644
> --- a/lib/bfd.c
> +++ b/lib/bfd.c
> @@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
>  #define FLAGS_MASK 0x3f
>  #define DEFAULT_MULT 3
>  
> +#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
> +#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
> +
>  struct bfd {
>  struct hmap_node node;/* In 'all_bfds'. */
>  uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. 
> */
> @@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const 
> struct smap *cfg,
>   >rmt_eth_dst);
>  
>  bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
> -  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
> +  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
>  bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
> -  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
> +  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
>  
>  forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
>  if (bfd->forwarding_if_rx != forwarding_if_rx) {
> @@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const 
> struct flow *flow,
>  memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
>  if (flow->nw_proto == IPPROTO_UDP
>  && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
> -&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
> +&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
> +&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
> +|| bfd->ip_src == flow->nw_dst)) {
> +
> +if (bfd->ip_src == flow->nw_dst) {
> +memset(>masks.nw_dst, 0x, sizeof 
> wc->masks.nw_dst);
> +}
> +
>  bool check_tnl_key;
>  
>  atomic_read_relaxed(>check_tnl_key, _tnl_key);
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 2a0fbadff4a1..80b58996d530 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * 
> * *5002 *2000 *b85e *
>  
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([bfd - BFD overlay])
> +OVS_CHECK_GENEVE()
> +
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +AT_CHECK([ovs-vsctl -- set bridge br0 
> other-config:hwaddr=\"f2:ff:00:00:00:01\"])
> +ADD_BR([br-underlay], [set bridge br-underlay 
> other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> +
> +ADD_NAMESPACES(at_ns0)
> +
> +dnl Set up underlay link from host into the namespace using veth pair.
> +ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d)
> +AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
> +AT_CHECK([ip link set dev br-underlay up])
> +
> +dnl Set up tunnel endpoints on OVS outside the namespace.
> +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], 
> [192.168.10.100/24],
> +[options:packet_type=ptap])
> +
> +dnl Certain Linux distributions, like CentOS, have default iptable rules
> +dnl to reject input traffic from br-underlay. Here we add a rule to walk
> +dnl around it.
> +iptables -I INPUT 1 -i br-underlay -j ACCEPT
> +on_exit 'iptables -D INPUT 1'
Is this only happened to this bfd test, or others also fail?

> +
> +dnl Firstly, test normal BFD packet.
> +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
> packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
>  actions=NORMAL"
> +dnl Next we test overlay BFD packet.
> +ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
> packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
>  

Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: make masks cache size configurable

2020-07-22 Thread Eelco Chaudron



On 22 Jul 2020, at 17:21, Jakub Kicinski wrote:


On Wed, 22 Jul 2020 10:27:52 +0200 Eelco Chaudron wrote:

This patch makes the masks cache size configurable, or with
a size of 0, disable it.

Reviewed-by: Paolo Abeni 
Signed-off-by: Eelco Chaudron 


Hi Elco!

This patch adds a bunch of new sparse warnings:

net/openvswitch/flow_table.c:376:23: warning: incorrect type in 
assignment (different address spaces)
net/openvswitch/flow_table.c:376:23:expected struct 
mask_cache_entry *cache

net/openvswitch/flow_table.c:376:23:got void [noderef] __percpu *
net/openvswitch/flow_table.c:386:25: warning: incorrect type in 
assignment (different address spaces)
net/openvswitch/flow_table.c:386:25:expected struct 
mask_cache_entry [noderef] __percpu *mask_cache
net/openvswitch/flow_table.c:386:25:got struct mask_cache_entry 
*cache
net/openvswitch/flow_table.c:411:27: warning: incorrect type in 
assignment (different address spaces)
net/openvswitch/flow_table.c:411:27:expected struct mask_cache 
[noderef] __rcu *mask_cache

net/openvswitch/flow_table.c:411:27:got struct mask_cache *
net/openvswitch/flow_table.c:440:35: warning: incorrect type in 
argument 1 (different address spaces)

net/openvswitch/flow_table.c:440:35:expected struct mask_cache *mc
net/openvswitch/flow_table.c:440:35:got struct mask_cache 
[noderef] __rcu *mask_cache


Odd, as I’m sure I ran checkpatch :( Will sent an update fixing those!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: make masks cache size configurable

2020-07-22 Thread Jakub Kicinski
On Wed, 22 Jul 2020 10:27:52 +0200 Eelco Chaudron wrote:
> This patch makes the masks cache size configurable, or with
> a size of 0, disable it.
> 
> Reviewed-by: Paolo Abeni 
> Signed-off-by: Eelco Chaudron 

Hi Elco!

This patch adds a bunch of new sparse warnings:

net/openvswitch/flow_table.c:376:23: warning: incorrect type in assignment 
(different address spaces)
net/openvswitch/flow_table.c:376:23:expected struct mask_cache_entry *cache
net/openvswitch/flow_table.c:376:23:got void [noderef] __percpu *
net/openvswitch/flow_table.c:386:25: warning: incorrect type in assignment 
(different address spaces)
net/openvswitch/flow_table.c:386:25:expected struct mask_cache_entry 
[noderef] __percpu *mask_cache
net/openvswitch/flow_table.c:386:25:got struct mask_cache_entry *cache
net/openvswitch/flow_table.c:411:27: warning: incorrect type in assignment 
(different address spaces)
net/openvswitch/flow_table.c:411:27:expected struct mask_cache [noderef] 
__rcu *mask_cache
net/openvswitch/flow_table.c:411:27:got struct mask_cache *
net/openvswitch/flow_table.c:440:35: warning: incorrect type in argument 1 
(different address spaces)
net/openvswitch/flow_table.c:440:35:expected struct mask_cache *mc
net/openvswitch/flow_table.c:440:35:got struct mask_cache [noderef] __rcu 
*mask_cache
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpif-netlink: distribute polling to discreet handlers

2020-07-22 Thread Matteo Croce
On Wed, Jul 22, 2020 at 1:27 AM Aaron Conole  wrote:
> To check:
>
>   ip netns add left
>   ip netns add right
>   ip link add center-left type veth peer name left0
>   ip link add center-right type veth peer name right0
>   ip link set left0 netns left
>   ip link set right0 netns right

Nit: ip can set the peer netns upon veth creation:

  ip link add center-left type veth peer name left0 netns left
  ip link add center-right type veth peer name right0 netns right

> +static uint32_t
> +dpif_handler_port_idx_max(const struct dpif_handler *handler,
> +  struct dpif_channel **out_chn)
> +{
> +uint32_t max = 0;
> +size_t i;
> +
> +for (i = 0; i < handler->n_channels; ++i) {

Any reason for using the prefix operator everywhere?
They should be equal nowadays.

> --
> 2.25.4
>

Good job!
Did you have the chance to run the same tests I did in 69c51582ff78?
I'm curious to see how much it improves.

Bye,
-- 
Matteo Croce

perl -e 'for($t=0;;$t++){print chr($t*($t>>8|$t>>13)&255)}' |aplay
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] 1

2020-07-22 Thread NGENE NKANI


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] bfd: Support overlay BFD

2020-07-22 Thread Yifeng Sun
Current OVS intercepts and processes all BFD packets, thus VM-2-VM
BFD packets get lost and the recipient VM never sees them.

This patch fixes it by only intercepting and processing BFD packets
destined to a configured BFD instance, and other BFD packets are made
available to the OVS flow table for forwarding.

This patch adds new test to validate BFD overlay.

This patch keeps BFD's backward compatibility.

Signed-off-by: Yifeng Sun 
---
v1->v2: Add test by William's suggestion.

 lib/bfd.c   | 16 +---
 tests/system-traffic.at | 42 ++
 vswitchd/vswitch.xml|  7 +++
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/lib/bfd.c b/lib/bfd.c
index cc8c6857afa4..3c965699ace3 100644
--- a/lib/bfd.c
+++ b/lib/bfd.c
@@ -149,6 +149,9 @@ BUILD_ASSERT_DECL(BFD_PACKET_LEN == sizeof(struct msg));
 #define FLAGS_MASK 0x3f
 #define DEFAULT_MULT 3
 
+#define BFD_DEFAULT_SRC_IP 0xA9FE0101 /* 169.254.1.1 */
+#define BFD_DEFAULT_DST_IP 0xA9FE0100 /* 169.254.1.0 */
+
 struct bfd {
 struct hmap_node node;/* In 'all_bfds'. */
 uint32_t disc;/* bfd.LocalDiscr. Key in 'all_bfds' hmap. */
@@ -457,9 +460,9 @@ bfd_configure(struct bfd *bfd, const char *name, const 
struct smap *cfg,
  >rmt_eth_dst);
 
 bfd_lookup_ip(smap_get_def(cfg, "bfd_src_ip", ""),
-  htonl(0xA9FE0101) /* 169.254.1.1 */, >ip_src);
+  htonl(BFD_DEFAULT_SRC_IP), >ip_src);
 bfd_lookup_ip(smap_get_def(cfg, "bfd_dst_ip", ""),
-  htonl(0xA9FE0100) /* 169.254.1.0 */, >ip_dst);
+  htonl(BFD_DEFAULT_DST_IP), >ip_dst);
 
 forwarding_if_rx = smap_get_bool(cfg, "forwarding_if_rx", false);
 if (bfd->forwarding_if_rx != forwarding_if_rx) {
@@ -674,7 +677,14 @@ bfd_should_process_flow(const struct bfd *bfd_, const 
struct flow *flow,
 memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
 if (flow->nw_proto == IPPROTO_UDP
 && !(flow->nw_frag & FLOW_NW_FRAG_LATER)
-&& tp_dst_equals(flow, BFD_DEST_PORT, wc)) {
+&& tp_dst_equals(flow, BFD_DEST_PORT, wc)
+&& (bfd->ip_src == htonl(BFD_DEFAULT_SRC_IP)
+|| bfd->ip_src == flow->nw_dst)) {
+
+if (bfd->ip_src == flow->nw_dst) {
+memset(>masks.nw_dst, 0x, sizeof wc->masks.nw_dst);
+}
+
 bool check_tnl_key;
 
 atomic_read_relaxed(>check_tnl_key, _tnl_key);
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 2a0fbadff4a1..80b58996d530 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -6289,3 +6289,45 @@ OVS_WAIT_UNTIL([cat p2.pcap | egrep "0x0050: * * 
*5002 *2000 *b85e *
 
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([bfd - BFD overlay])
+OVS_CHECK_GENEVE()
+
+OVS_TRAFFIC_VSWITCHD_START()
+
+AT_CHECK([ovs-vsctl -- set bridge br0 
other-config:hwaddr=\"f2:ff:00:00:00:01\"])
+ADD_BR([br-underlay], [set bridge br-underlay 
other-config:hwaddr=\"ee:09:e0:4d:bf:31\"])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
+
+ADD_NAMESPACES(at_ns0)
+
+dnl Set up underlay link from host into the namespace using veth pair.
+ADD_VETH(p0, at_ns0, br-underlay, "172.16.180.105/24", 4e:12:5d:6c:74:3d)
+AT_CHECK([ip addr add dev br-underlay "172.16.180.106/24"])
+AT_CHECK([ip link set dev br-underlay up])
+
+dnl Set up tunnel endpoints on OVS outside the namespace.
+ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.16.180.105], 
[192.168.10.100/24],
+[options:packet_type=ptap])
+
+dnl Certain Linux distributions, like CentOS, have default iptable rules
+dnl to reject input traffic from br-underlay. Here we add a rule to walk
+dnl around it.
+iptables -I INPUT 1 -i br-underlay -j ACCEPT
+on_exit 'iptables -D INPUT 1'
+
+dnl Firstly, test normal BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d080045c0006624d94000401153f9ac10b469ac10b46a739f17c1005247f900806558002320016a0f5a9ed376080045c00034ff11fa03ac10b469ac10b46ac0070ec8002021c003187b3c96ebbc96b962000186af4240
 actions=NORMAL"
+dnl Next we test overlay BFD packet.
+ovs-ofctl -O OpenFlow13 packet-out br-underlay "in_port=1 
packet=ee09e04dbf314e125d6c743d0800455b2558400040115445ac10b469ac10b46a6b1017c1004722c10240655801048001001803005254009d0b6d5254000c8984080045210e22400040119688c0a80a68c0a80a6995cd0ec8000dd342746573740a
 actions=NORMAL"
+
+ovs-dpctl dump-flows > datapath-flows.txt
+dnl BFD packet was handled by BFD flow.
+AT_FAIL_IF([cat datapath-flows.txt | egrep 
"tunnel\(.*,src=172.16.180.105,dst=172.16.180.106,.*,eth\(\),eth_type\(0x0800\),ipv4\(proto=17,.*\),udp\(dst=3784\),
 .*, actions:userspace\(pid=.*,slow_path\(bfd\)\)" 2>&1 1>/dev/null])
+dnl Overlay BFD packet was handled by non-BFD 

Re: [ovs-dev] [PATCH v3 3/4] netdev-offload: Add xdp flow api provider

2020-07-22 Thread Toshiaki Makita

On 2020/07/22 3:10, William Tu wrote:

Thanks for the patch. My comments below:

On Mon, Jun 29, 2020 at 8:30 AM Toshiaki Makita
 wrote:


This provider offloads classifier to software XDP.

It works only when a custom XDP object is loaded by afxdp netdev.
The BPF program needs to implement classifier with array-of-maps for
subtable hashmaps and arraymap for subtable masks. The flow api
provider detects classifier support in the custom XDP program when
loading it.

The requirements for the BPF program is as below:

- A struct named "meta_info" in ".ovs_meta" section
   inform vswitchd of supported keys, actions, and the max number of
   actions.

- A map named "subtbl_masks"
   An array which implements a list. The entry contains struct
   xdp_subtables_mask provided by lib/netdev-offload-xdp.h followed by
   miniflow buf of any length for the mask.

- A map named "subtbl_masks_hd"
   A one entry int array which contains the first entry index of the list
   implemented by subtbl_masks.

- A map named "flow_table"
   An array-of-maps. Each entry points to subtable hash-map instantiated
   with the following "subtbl_template".
   The entry of each index of subtbl_masks has miniflow mask of subtable
   in the corresponding index of flow_table.

- A map named "subtbl_template"
   A template for subtable, used for entries of "flow_table".
   The key must be miniflow buf (without leading map).

- A map named "output_map"
   A devmap. Each entry represents odp port.

- A map named "xsks_map"
   A xskmap. Used for upcall.


Maybe later on, we can add the above to Documentation/ and
explain the purpose of these maps there.


Sure.



For more details, refer to the reference BPF program in next commit.

In the future it may be possible to offload classifier to SmartNIC
through XDP, but currently it's not because map-in-map is not supported
by XDP hw-offload.


I think it will be hard, given that only Netronome supports XDP offload today.


I'll try it if I can get a netronome NIC.


Signed-off-by: Toshiaki Makita 
---
  lib/automake.mk   |6 +-
  lib/bpf-util.c|   38 +
  lib/bpf-util.h|   22 +
  lib/netdev-afxdp.c|  205 -
  lib/netdev-afxdp.h|3 +
  lib/netdev-linux-private.h|2 +
  lib/netdev-offload-provider.h |6 +
  lib/netdev-offload-xdp.c  | 1315 +
  lib/netdev-offload-xdp.h  |   49 ++
  lib/netdev-offload.c  |6 +
  10 files changed, 1650 insertions(+), 2 deletions(-)
  create mode 100644 lib/bpf-util.c
  create mode 100644 lib/bpf-util.h
  create mode 100644 lib/netdev-offload-xdp.c
  create mode 100644 lib/netdev-offload-xdp.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 86940ccd2..1fa1371f3 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -421,10 +421,14 @@ endif

  if HAVE_AF_XDP
  lib_libopenvswitch_la_SOURCES += \
+   lib/bpf-util.c \
+   lib/bpf-util.h \
 lib/netdev-afxdp-pool.c \
 lib/netdev-afxdp-pool.h \
 lib/netdev-afxdp.c \
-   lib/netdev-afxdp.h
+   lib/netdev-afxdp.h \
+   lib/netdev-offload-xdp.c \
+   lib/netdev-offload-xdp.h
  endif


How about doing s.t like:
--enable-afxdp: the current one on master without the xdp offload program
--enable-afxdp-with-bpf: the afxdp one plus your xdp offload program

So that when users only --enable-afxdp, it doesn't need to compile in the
lib/netdev-offload-xdp.* and bpf/*


Let me clarify it.

Currently,

--enable-afxdp: build lib/netdev-afxdp* and lib/netdev-offload-xdp*
--enable-bpf: build bpf/*

Do you sugguest this?

--enable-afxdp: build lib/netdev-afxdp*
--enable-afxdp-with-bpf: build lib/netdev-afxdp*, lib/netdev-offload-xdp*, and 
bpf/*


  if DPDK_NETDEV
diff --git a/lib/bpf-util.c b/lib/bpf-util.c
new file mode 100644
index 0..324cfbe1d
--- /dev/null
+++ b/lib/bpf-util.c
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2020 NTT Corp.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+
+#include "bpf-util.h"
+
+#include 
+
+#include "ovs-thread.h"
+
+DEFINE_STATIC_PER_THREAD_DATA(struct { char s[128]; },
+  libbpf_strerror_buffer,
+  { "" });
+
+const char *
+ovs_libbpf_strerror(int err)
+{
+enum { BUFSIZE = sizeof libbpf_strerror_buffer_get()->s };

just curious:
what's the benefit of using enum {BUFSIZE ...} here?


Just imitated ovs_strerror().
Did not 

Re: [ovs-dev] [PATCH net-next 0/2] net: openvswitch: masks cache enhancements

2020-07-22 Thread Eelco Chaudron




On 22 Jul 2020, at 10:27, Eelco Chaudron wrote:


This patchset adds two enhancements to the Open vSwitch masks cache.

Signed-off-by: Eelco Chaudron 

Eelco Chaudron (2):
  net: openvswitch: add masks cache hit counter
  net: openvswitch: make masks cache size configurable


 include/uapi/linux/openvswitch.h |3 +
 net/openvswitch/datapath.c   |   16 +-
 net/openvswitch/datapath.h   |3 +
 net/openvswitch/flow_table.c |  105 
+-

 net/openvswitch/flow_table.h |   13 -
 5 files changed, 122 insertions(+), 18 deletions(-)


FYI the userspace patch can be found here:

https://mail.openvswitch.org/pipermail/ovs-dev/2020-July/373159.html

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC PATCH 2/2] dpctl: dpif: allow viewing and configuring dp cache sizes

2020-07-22 Thread Eelco Chaudron
This patch adds a general way of viewing/configuring datapath
cache sizes. With an implementation for the netlink interface.

The ovs-dpctl/ovs-appctl show commands will display the
current cache sizes configured:

ovs-dpctl show
system@ovs-system:
  lookups: hit:25 missed:63 lost:0
  flows: 0
  masks: hit:282 total:0 hit/pkt:3.20
  cache: hit:4 hit rate:4.5455%
  caches:
masks cache: size: 256
  port 0: ovs-system (internal)
  port 1: br-int (internal)
  port 2: genev_sys_6081 (geneve: packet_type=ptap)
  port 3: br-ex (internal)
  port 4: eth2
  port 5: sw0p1 (internal)
  port 6: sw0p3 (internal)

A specific cache can be configured as follows:

ovs-appctl dpctl/cache-set-size DP CACHE SIZE
ovs-dpctl cache-set-size DP CACHE SIZE

For example to disable the cache do:

$ ovs-dpctl cache-set-size system@ovs-system "masks cache" 0
Setting cache size successful, new size 0.

Signed-off-by: Eelco Chaudron 
---
 datapath/linux/compat/include/linux/openvswitch.h |1 
 lib/dpctl.c   |  120 +
 lib/dpif-netdev.c |4 +
 lib/dpif-netlink.c|  113 
 lib/dpif-provider.h   |   20 
 lib/dpif.c|   32 ++
 lib/dpif.h|7 +
 utilities/ovs-dpctl.c |4 +
 8 files changed, 301 insertions(+)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index fb73bfa90..4d4ead8af 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -105,6 +105,7 @@ enum ovs_datapath_attr {
OVS_DP_ATTR_MEGAFLOW_STATS, /* struct ovs_dp_megaflow_stats */
OVS_DP_ATTR_USER_FEATURES,  /* OVS_DP_F_*  */
OVS_DP_ATTR_PAD,
+   OVS_DP_ATTR_MASKS_CACHE_SIZE,
__OVS_DP_ATTR_MAX
 };
 
diff --git a/lib/dpctl.c b/lib/dpctl.c
index dac350fb5..c8e8b3cdb 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -585,6 +585,36 @@ compare_port_nos(const void *a_, const void *b_)
 return a < b ? -1 : a > b;
 }
 
+static void
+show_dpif_cache__(struct dpif *dpif, struct dpctl_params *dpctl_p)
+{
+uint32_t nr_caches;
+int error = dpif_cache_get_supported_levels(dpif, _caches);
+
+if (error || nr_caches == 0) {
+return;
+}
+
+dpctl_print(dpctl_p, "  caches:\n");
+for (int i = 0; i < nr_caches; i++) {
+const char *name;
+uint32_t size;
+
+if (dpif_cache_get_name(dpif, i, ) ||
+dpif_cache_get_size(dpif, i, )) {
+continue;
+}
+dpctl_print(dpctl_p, "%s: size: %u\n", name, size);
+}
+}
+
+static void
+show_dpif_cache(struct dpif *dpif, struct dpctl_params *dpctl_p)
+{
+dpctl_print(dpctl_p, "%s:\n", dpif_name(dpif));
+show_dpif_cache__(dpif, dpctl_p);
+}
+
 static void
 show_dpif(struct dpif *dpif, struct dpctl_params *dpctl_p)
 {
@@ -617,6 +647,8 @@ show_dpif(struct dpif *dpif, struct dpctl_params *dpctl_p)
 }
 }
 
+show_dpif_cache__(dpif, dpctl_p);
+
 odp_port_t *port_nos = NULL;
 size_t allocated_port_nos = 0, n_port_nos = 0;
 DPIF_PORT_FOR_EACH (_port, , dpif) {
@@ -2224,6 +2256,92 @@ dpctl_ct_ipf_get_status(int argc, const char *argv[],
 return error;
 }
 
+static int
+dpctl_cache_get_size(int argc, const char *argv[],
+ struct dpctl_params *dpctl_p)
+{
+int error;
+
+if (argc > 1) {
+struct dpif *dpif;
+
+error = parsed_dpif_open(argv[1], false, );
+if (!error) {
+show_dpif_cache(dpif, dpctl_p);
+dpif_close(dpif);
+} else {
+dpctl_error(dpctl_p, error, "opening datapath %s failed", argv[1]);
+}
+} else {
+error = dps_for_each(dpctl_p, show_dpif_cache);
+}
+
+return error;
+}
+
+static int
+dpctl_cache_set_size(int argc, const char *argv[],
+ struct dpctl_params *dpctl_p)
+{
+int i, error = EINVAL;
+uint32_t nr_caches, size;
+struct dpif *dpif;
+
+if (argc != 4) {
+dpctl_error(dpctl_p, error, "Invalid number of arguments passes.");
+return error;
+}
+
+if (!ovs_scan(argv[3], "%"SCNu32, )) {
+dpctl_error(dpctl_p, error, "size seems malformed.");
+return error;
+}
+
+error = parsed_dpif_open(argv[1], false, );
+if (error) {
+dpctl_error(dpctl_p, error, "Opening datapath %s failed.",
+argv[1]);
+return error;
+}
+
+error = dpif_cache_get_supported_levels(dpif, _caches);
+if (error || nr_caches == 0) {
+dpctl_error(dpctl_p, error, "Setting caches not supported.");
+goto exit;
+}
+
+for (i = 0; i < nr_caches; i++) {
+const char *name;
+
+if (dpif_cache_get_name(dpif, i, )) {
+  

[ovs-dev] [RFC PATCH 0/2] dpctl: cache visibility/configuration enhancements

2020-07-22 Thread Eelco Chaudron
This RFC patchset adds two enhancements related to cache
management to dpclt.

It's marked RFC as the required kmod patches have just been
sent upstream:
  
https://lore.kernel.org/netdev/159540642765.619787.548452630292188.stgit@ebuild/T/#t

NOTE: This RFC still needs a dpctl test suite update.

Signed-off-by: Eelco Chaudron 

Eelco Chaudron (2):
  dpctl: dpif: add kernel datapath cache hit output
  dpctl: dpif: allow viewing and configuring dp cache sizes


 datapath/linux/compat/include/linux/openvswitch.h |3 
 lib/dpctl.c   |  129 +
 lib/dpif-netdev.c |5 +
 lib/dpif-netlink.c|  116 +++
 lib/dpif-provider.h   |   20 +++
 lib/dpif.c|   32 +
 lib/dpif.h|9 +
 utilities/ovs-dpctl.c |4 +
 8 files changed, 317 insertions(+), 1 deletion(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC PATCH 1/2] dpctl: dpif: add kernel datapath cache hit output

2020-07-22 Thread Eelco Chaudron
This patch adds cache usage statistics to the output:

$ ovs-dpctl show


ovnhostvm: Tue Jul 21 15:25:14 2020
system@ovs-system:
  lookups: hit:24 missed:71 lost:0
  flows: 0
  masks: hit:334 total:0 hit/pkt:3.52
  cache: hit:4 hit rate:4.2105%
  port 0: ovs-system (internal)
  port 1: genev_sys_6081 (geneve: packet_type=ptap)
  port 2: br-int (internal)
  port 3: br-ex (internal)
  port 4: eth2
  port 5: sw1p1 (internal)
  port 6: sw0p4 (internal)

Signed-off-by: Eelco Chaudron 
---
 datapath/linux/compat/include/linux/openvswitch.h |2 +-
 lib/dpctl.c   |9 +
 lib/dpif-netdev.c |1 +
 lib/dpif-netlink.c|3 +++
 lib/dpif.h|2 ++
 5 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index cc41bbea4..fb73bfa90 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -122,8 +122,8 @@ struct ovs_dp_megaflow_stats {
__u64 n_mask_hit;/* Number of masks used for flow lookups. */
__u32 n_masks;   /* Number of masks for the datapath. */
__u32 pad0;  /* Pad for future expension. */
+   __u64 n_cache_hit;   /* Number of cache matches for flow lookups. */
__u64 pad1;  /* Pad for future expension. */
-   __u64 pad2;  /* Pad for future expension. */
 };
 
 struct ovs_vport_stats {
diff --git a/lib/dpctl.c b/lib/dpctl.c
index db2b1f896..dac350fb5 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -605,6 +605,15 @@ show_dpif(struct dpif *dpif, struct dpctl_params *dpctl_p)
 dpctl_print(dpctl_p, "  masks: hit:%"PRIu64" total:%"PRIu32
  " hit/pkt:%.2f\n",
 stats.n_mask_hit, stats.n_masks, avg);
+
+if (stats.n_cache_hit > 0 && stats.n_cache_hit != UINT64_MAX) {
+double avg_hits = n_pkts ?
+(double) stats.n_cache_hit / n_pkts * 100 : 0.0;
+
+dpctl_print(dpctl_p,
+"  cache: hit:%"PRIu64" hit rate:%.4f%%\n",
+stats.n_cache_hit, avg_hits);
+}
 }
 }
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 2aad24511..4e3bd7519 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2023,6 +2023,7 @@ dpif_netdev_get_stats(const struct dpif *dpif, struct 
dpif_dp_stats *stats)
 }
 stats->n_masks = UINT32_MAX;
 stats->n_mask_hit = UINT64_MAX;
+stats->n_cache_hit = UINT64_MAX;
 
 return 0;
 }
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 7da4fb54d..07f15ac65 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -672,9 +672,12 @@ dpif_netlink_get_stats(const struct dpif *dpif_, struct 
dpif_dp_stats *stats)
 stats->n_masks = dp.megaflow_stats->n_masks;
 stats->n_mask_hit = get_32aligned_u64(
 _stats->n_mask_hit);
+stats->n_cache_hit = get_32aligned_u64(
+_stats->n_cache_hit);
 } else {
 stats->n_masks = UINT32_MAX;
 stats->n_mask_hit = UINT64_MAX;
+stats->n_cache_hit = UINT64_MAX;
 }
 ofpbuf_delete(buf);
 }
diff --git a/lib/dpif.h b/lib/dpif.h
index 2d52f0186..43c1ab998 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -429,6 +429,8 @@ struct dpif_dp_stats {
 uint64_t n_missed;  /* Number of flow table misses. */
 uint64_t n_lost;/* Number of misses not sent to userspace. */
 uint64_t n_flows;   /* Number of flows present. */
+uint64_t n_cache_hit;   /* Number of mega flow mask cache hits for
+   flow table matches. */
 uint64_t n_mask_hit;/* Number of mega flow masks visited for
flow table matches. */
 uint32_t n_masks;   /* Number of mega flow masks. */

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next 1/2] net: openvswitch: add masks cache hit counter

2020-07-22 Thread Eelco Chaudron
Add a counter that counts the number of masks cache hits, and
export it through the megaflow netlink statistics.

Reviewed-by: Paolo Abeni 
Signed-off-by: Eelco Chaudron 
---
 include/uapi/linux/openvswitch.h |2 +-
 net/openvswitch/datapath.c   |5 -
 net/openvswitch/datapath.h   |3 +++
 net/openvswitch/flow_table.c |   19 ++-
 net/openvswitch/flow_table.h |3 ++-
 5 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 9b14519e74d9..7cb76e5ca7cf 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -102,8 +102,8 @@ struct ovs_dp_megaflow_stats {
__u64 n_mask_hit;/* Number of masks used for flow lookups. */
__u32 n_masks;   /* Number of masks for the datapath. */
__u32 pad0;  /* Pad for future expension. */
+   __u64 n_cache_hit;   /* Number of cache matches for flow lookups. */
__u64 pad1;  /* Pad for future expension. */
-   __u64 pad2;  /* Pad for future expension. */
 };
 
 struct ovs_vport_stats {
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 95805f0e27bd..a54df1fe3ec4 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -225,13 +225,14 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct 
sw_flow_key *key)
struct dp_stats_percpu *stats;
u64 *stats_counter;
u32 n_mask_hit;
+   u32 n_cache_hit;
int error;
 
stats = this_cpu_ptr(dp->stats_percpu);
 
/* Look up flow. */
flow = ovs_flow_tbl_lookup_stats(>table, key, skb_get_hash(skb),
-_mask_hit);
+_mask_hit, _cache_hit);
if (unlikely(!flow)) {
struct dp_upcall_info upcall;
 
@@ -262,6 +263,7 @@ void ovs_dp_process_packet(struct sk_buff *skb, struct 
sw_flow_key *key)
u64_stats_update_begin(>syncp);
(*stats_counter)++;
stats->n_mask_hit += n_mask_hit;
+   stats->n_cache_hit += n_cache_hit;
u64_stats_update_end(>syncp);
 }
 
@@ -699,6 +701,7 @@ static void get_dp_stats(const struct datapath *dp, struct 
ovs_dp_stats *stats,
stats->n_missed += local_stats.n_missed;
stats->n_lost += local_stats.n_lost;
mega_stats->n_mask_hit += local_stats.n_mask_hit;
+   mega_stats->n_cache_hit += local_stats.n_cache_hit;
}
 }
 
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 697a2354194b..86d78613edb4 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -38,12 +38,15 @@
  * @n_mask_hit: Number of masks looked up for flow match.
  *   @n_mask_hit / (@n_hit + @n_missed)  will be the average masks looked
  *   up per packet.
+ * @n_cache_hit: The number of received packets that had their mask found using
+ * the mask cache.
  */
 struct dp_stats_percpu {
u64 n_hit;
u64 n_missed;
u64 n_lost;
u64 n_mask_hit;
+   u64 n_cache_hit;
struct u64_stats_sync syncp;
 };
 
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index af22c9ee28dd..a5912ea05352 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -667,6 +667,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
   struct mask_array *ma,
   const struct sw_flow_key *key,
   u32 *n_mask_hit,
+  u32 *n_cache_hit,
   u32 *index)
 {
u64 *usage_counters = this_cpu_ptr(ma->masks_usage_cntr);
@@ -682,6 +683,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
u64_stats_update_begin(>syncp);
usage_counters[*index]++;
u64_stats_update_end(>syncp);
+   (*n_cache_hit)++;
return flow;
}
}
@@ -719,7 +721,8 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
 struct sw_flow *ovs_flow_tbl_lookup_stats(struct flow_table *tbl,
  const struct sw_flow_key *key,
  u32 skb_hash,
- u32 *n_mask_hit)
+ u32 *n_mask_hit,
+ u32 *n_cache_hit)
 {
struct mask_array *ma = rcu_dereference(tbl->mask_array);
struct table_instance *ti = rcu_dereference(tbl->ti);
@@ -729,10 +732,13 @@ struct sw_flow *ovs_flow_tbl_lookup_stats(struct 
flow_table *tbl,
int seg;
 
*n_mask_hit = 0;
+   *n_cache_hit = 0;
if 

[ovs-dev] [PATCH net-next 2/2] net: openvswitch: make masks cache size configurable

2020-07-22 Thread Eelco Chaudron
This patch makes the masks cache size configurable, or with
a size of 0, disable it.

Reviewed-by: Paolo Abeni 
Signed-off-by: Eelco Chaudron 
---
 include/uapi/linux/openvswitch.h |1 
 net/openvswitch/datapath.c   |   11 +
 net/openvswitch/flow_table.c |   86 ++
 net/openvswitch/flow_table.h |   10 
 4 files changed, 98 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 7cb76e5ca7cf..8300cc29dec8 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -86,6 +86,7 @@ enum ovs_datapath_attr {
OVS_DP_ATTR_MEGAFLOW_STATS, /* struct ovs_dp_megaflow_stats */
OVS_DP_ATTR_USER_FEATURES,  /* OVS_DP_F_*  */
OVS_DP_ATTR_PAD,
+   OVS_DP_ATTR_MASKS_CACHE_SIZE,
__OVS_DP_ATTR_MAX
 };
 
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index a54df1fe3ec4..f08aa1fb9f4b 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1535,6 +1535,10 @@ static int ovs_dp_cmd_fill_info(struct datapath *dp, 
struct sk_buff *skb,
if (nla_put_u32(skb, OVS_DP_ATTR_USER_FEATURES, dp->user_features))
goto nla_put_failure;
 
+   if (nla_put_u32(skb, OVS_DP_ATTR_MASKS_CACHE_SIZE,
+   ovs_flow_tbl_masks_cache_size(>table)))
+   goto nla_put_failure;
+
genlmsg_end(skb, ovs_header);
return 0;
 
@@ -1599,6 +1603,13 @@ static int ovs_dp_change(struct datapath *dp, struct 
nlattr *a[])
 #endif
}
 
+   if (a[OVS_DP_ATTR_MASKS_CACHE_SIZE]) {
+   u32 cache_size;
+
+   cache_size = nla_get_u32(a[OVS_DP_ATTR_MASKS_CACHE_SIZE]);
+   ovs_flow_tbl_masks_cache_resize(>table, cache_size);
+   }
+
dp->user_features = user_features;
 
if (dp->user_features & OVS_DP_F_TC_RECIRC_SHARING)
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index a5912ea05352..1c3590bcf28b 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -38,8 +38,8 @@
 #define MASK_ARRAY_SIZE_MIN16
 #define REHASH_INTERVAL(10 * 60 * HZ)
 
+#define MC_DEFAULT_HASH_ENTRIES256
 #define MC_HASH_SHIFT  8
-#define MC_HASH_ENTRIES(1u << MC_HASH_SHIFT)
 #define MC_HASH_SEGS   ((sizeof(uint32_t) * 8) / MC_HASH_SHIFT)
 
 static struct kmem_cache *flow_cache;
@@ -341,14 +341,74 @@ static void flow_mask_remove(struct flow_table *tbl, 
struct sw_flow_mask *mask)
}
 }
 
+static void __mask_cache_destroy(struct mask_cache *mc)
+{
+   if (mc->mask_cache)
+   free_percpu(mc->mask_cache);
+   kfree(mc);
+}
+
+static void mask_cache_rcu_cb(struct rcu_head *rcu)
+{
+   struct mask_cache *mc = container_of(rcu, struct mask_cache, rcu);
+
+   __mask_cache_destroy(mc);
+}
+
+static struct mask_cache *tbl_mask_cache_alloc(u32 size)
+{
+   struct mask_cache_entry *cache = NULL;
+   struct mask_cache *new;
+
+   /* Only allow size to be 0, or a power of 2, and does not exceed
+* percpu allocation size.
+*/
+   if ((size & (size - 1)) != 0 ||
+   (size * sizeof(struct mask_cache_entry)) > PCPU_MIN_UNIT_SIZE)
+   return NULL;
+
+   new = kzalloc(sizeof(*new), GFP_KERNEL);
+   if (!new)
+   return NULL;
+
+   new->cache_size = size;
+   if (new->cache_size > 0) {
+   cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
+  new->cache_size,
+  __alignof__(struct mask_cache_entry));
+
+   if (!cache) {
+   kfree(new);
+   return NULL;
+   }
+   }
+
+   new->mask_cache = cache;
+   return new;
+}
+
+void ovs_flow_tbl_masks_cache_resize(struct flow_table *table, u32 size)
+{
+   struct mask_cache *mc = rcu_dereference(table->mask_cache);
+   struct mask_cache *new;
+
+   if (size == mc->cache_size || (size & (size - 1)) != 0)
+   return;
+
+   new = tbl_mask_cache_alloc(size);
+   if (!new)
+   return;
+
+   rcu_assign_pointer(table->mask_cache, new);
+   call_rcu(>rcu, mask_cache_rcu_cb);
+}
+
 int ovs_flow_tbl_init(struct flow_table *table)
 {
struct table_instance *ti, *ufid_ti;
struct mask_array *ma;
 
-   table->mask_cache = __alloc_percpu(sizeof(struct mask_cache_entry) *
-  MC_HASH_ENTRIES,
-  __alignof__(struct 
mask_cache_entry));
+   table->mask_cache = tbl_mask_cache_alloc(MC_DEFAULT_HASH_ENTRIES);
if (!table->mask_cache)
return -ENOMEM;
 
@@ -377,7 +437,7 @@ int ovs_flow_tbl_init(struct flow_table *table)
 free_mask_array:
__mask_array_destroy(ma);
 free_mask_cache:

[ovs-dev] [PATCH net-next 0/2] net: openvswitch: masks cache enhancements

2020-07-22 Thread Eelco Chaudron
This patchset adds two enhancements to the Open vSwitch masks cache.

Signed-off-by: Eelco Chaudron 

Eelco Chaudron (2):
  net: openvswitch: add masks cache hit counter
  net: openvswitch: make masks cache size configurable


 include/uapi/linux/openvswitch.h |3 +
 net/openvswitch/datapath.c   |   16 +-
 net/openvswitch/datapath.h   |3 +
 net/openvswitch/flow_table.c |  105 +-
 net/openvswitch/flow_table.h |   13 -
 5 files changed, 122 insertions(+), 18 deletions(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 3/4] netdev-offload: Add xdp flow api provider

2020-07-22 Thread Toshiaki Makita

On 2020/07/22 0:34, William Tu wrote:

On Tue, Jul 7, 2020 at 2:07 AM Toshiaki Makita
 wrote:


On 2020/06/30 0:30, Toshiaki Makita wrote:
...

   int netdev_afxdp_init(void)
   {
   libbpf_set_print(libbpf_print);
-return 0;
+return netdev_register_flow_api_provider(_offload_xdp);


This causes duplicate flow api provider error because afxdp and afxdp-nonpmd 
are using
the same init function, and afxdp-nonpmd netdev registration fails.
Probably afxdp-nonpmd does not need to call this init function.


I think we just need to make sure it registers only once.
you can use
ovsthread_once_start()
see example in netdev_dpdk_class_init(void)


OK, will try this.

Toshiaki Makita
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 3/4] netdev-offload: Add xdp flow api provider

2020-07-22 Thread Toshiaki Makita

On 2020/07/22 0:38, Aaron Conole wrote:

William Tu  writes:


On Tue, Jun 30, 2020 at 12:11 AM Toshiaki Makita
 wrote:


On 2020/06/30 1:17, 0-day Robot wrote:

Bleep bloop.  Greetings Toshiaki Makita, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#252 FILE: lib/netdev-afxdp.c:329:
  /* XXX: close output_map_fd somewhere? */

ERROR: Improper whitespace around control block
#734 FILE: lib/netdev-offload-xdp.c:258:
  FLOWMAP_FOR_EACH_INDEX(idx, mf->map) {


Adding a whitespace like

   FLOWMAP_FOR_EACH_INDEX (idx, mf->map) {

fixes the error, but as far as I can see, all existing usage of this macro
does not have this kind of whitespace.

Which is correct, need whitespace or not?


Need whitespace.  The usage of FLOWMAP_FOR_EACH throughout the codebase
predates automated checking.  You'll note that when someone contributes
new versions (ex: tests/oss-fuzz/miniflow_target.c) they have the
whitespace.

Prior to a checkpatch utility, it was more difficult to catch instances
of style violations, and some snuck through.  Even with checkpatch, some
make it through (though hopefully fewer).


I think we need whitespace here.


Yes.


William, Aaron,

Thank you for the confirmation.
Will fix them in the next version.

Toshiaki Makita
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] N/A

2020-07-22 Thread Magdalena Kate
I'm Mrs Nakielska Kate Magdalena from Poland base in Ukraine.
Our business is lending and we have multiple lending operations
in Europe,all round America, Asia and the United States.

If you are interested in applying for a loan please send us the
following information to get in touch with you:

Amount needed as loan,Full name,Country,State,cell or phone
number, Loan amount and duration of the loan.

your Mobile number is necessary, and the Best time to call you,
because l will like to speak with you on phone.

send above information to our email below:

eMail: nkm...@gmail.com

E-mail Directly/ Phone : +38095948300

-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev