Re: [ovs-dev] [PATCH v4 0/9] Support zone-based conntrack timeout policy

2019-08-15 Thread Darrell Ball
Thanks for the patches

I did some quick adhoc testing with 4.4.0-119 and 5.0.0-23.

Some results worth noting:

1/ Failing to delete timeout policies with non-zero refcounts results in
WARN logging

> 2019-08-16T00:21:32.240Z|00153|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.240Z|00154|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.241Z|00155|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.241Z|00156|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.243Z|00157|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00158|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.243Z|00159|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00160|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy

Lets change to INFO level since this can legitimately and expectantly occur

Also eliminate one of the redundant logs - dpif-netlink or ofproto_dpif
> 2019-08-16T00:21:32.243Z|00159|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00160|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
while still maintaining the full profile name (e,g, ovs_tp_0_udp4) in the
kernel.

2/ Also, the cleanup attempts are too aggressive
bridge_run() ... ->->type_run()->ct_zone_timeout_policy_sweep()
IIRC, we had a debounce in previous version(s); maybe I missed it in my
quick check of v4; I'll check more later.
Anyways, we need some debouncing.
Also the vlog rate limiting may need adjustment to throttle more.

Thanks Darrell

On Thu, Aug 15, 2019 at 12:31 PM Yi-Hung Wei  wrote:

> This patch series enables zone-based conntrack timeout policy support in
> OVS.
> Timeout policy is a set of timeout attributes that can be associated with a
> connection when it is committed.  Then, the connection tracking system will
> expire a connection based on its connection state.  For example, one use
> case would be to extend the timeout of TCP connection in the established
> state to avoid re-connect overhead. Other use case may be to shorten the
> connection timeout so that the system can reclaim resources faster.
> The idea of zone-based conntrack timeout policy is to group connections
> with similar characteristics in a conntrack zone, and assign timeout policy
> to the conntrack zone.  In this way, all the connections in that zone will
> share
> the same timeout policy.
>
> For zone-based timeout policy configuration, the association of conntrack
> zone and conntrack timeout policy is defined per datapath in vswitchd ovsdb
> schema.  User can program the database through ovs-vsctl or using ovsdb
> protocol directly.  Once the zone-based timeout policy configuration is
> in the database, vswitchd will read those configuration and organize it
> in internal datapath structure, and push the timeout policy into datapath.
> Currently, only the kernel datapath supports customized timeout policy.
>
> When a packet is committed to connection tracking system, during flow
> translation in ofproto-dpif-xlate, vswitchd will lookup the internal
> data structure to figure out which timeout policy to associate with
> the connection.  If timeout policy is not specified to the committed
> zone, it defaults to the timeout policy in the default zone (zone 0).
> If the timeout policy is not specified in the default zone, it defaults
> to the system default timeouts.
>
> Here are some more details about each patch
> * p01: Introduce ovsdb schema for ct timeout policy.
> * p02: ovs-vsctl commands to configure zone-based ct timeout policy.
> * p03: Expose a utility functions.
> * p04: dpif interface along with dpif-netlink implementation to support
>ct timeout policy.
> * p05: Consume ct timeout policy configuration from ovsdb server,
>keep it in internal data structure, and push configuration to
>datapath.
> * p06: Add utility function to help compare two simaps.
> * p07-08: Kernel datapath support for the new ct action attribute.
> * p09: Translate timeout policy in ofproto-dpif-xlate and system traffic
> test.
>
> v3->v4:
> * ofproto-dpif
> - Probe datapath for timeout policy support.
> - With the probing information only translate timeout policy when
>   the datapath is supported.
> - Resolve the old kernel compatibility issue reported by Darrell.
> * system-traffic
> - Simplify the testing script (diff from Darrell).
> * Address various code changes as in the mailing list discussion.
>
> v2->v3
> * ovsdb schema
> - Fold in changes from Justin.
> - Make ct timeout policy 

[ovs-dev] [PATCH v5 4/4 ovn] OVN: Vlan backed DVR N-S, redirect packet via localnet port

2019-08-15 Thread Ankur Sharma
Background:
With c0974331b7a19a87ab8f1f2cec8fbe366af92fa2, we have added
support for E-W workflow for vlan backed DVRs.

This series enables N-S workflow for vlan backed DVRs.

Key difference between E-W and N-S traffic flow is that
N-S flow requires a gateway chassis. A gateway chassis
will be respondible for following:
a. Doing Network Address Translation (NAT).
b. Becoming entry and exit point for North->South
   and South->North traffic respectively.

OVN by default always uses overlay encapsulation to redirect
the packet to gateway chassis. This series will enable
the redirection to gateway chassis in the absence of encapsulation.

This patch:
Achieves the vlan backed redirection by doing following:

Sender Side:

a. For a remote port of type "chassisredirect" and if it
   has redirect type as "vlan", then do not add tunnel
   based redirection flow in table=32.

b. In table=33, add a flow with priority=100, that would do following:
   i. Change the metadata to that of gateway logical switch
  (i.e logical switch attached to gateway logical router port).
  ii. Change REG15 to point to localnet port of gateway logical switch.
 iii. send to packet to table=15.

c. In Table=65, packet will hit the existing priority=150 flow to send
   the packet to physical bridge, while attaching vlan header and
   changing source mac to chassis mac.

Receiver Side:
--
a. No changes needed

OVERALL PACKET FLOW:
Sender Side:
---
a. logical flow in lr_in_gw_redirect stage will ensure that
   outport of the packet is chassisredirect port.
   For example:
   table=12(lr_in_gw_redirect  ), priority=50   , match=(outport == 
"router-to-underlay"), action=(outport = "cr-router-to-underlay"; next;)

b. After ingress pipeline, packet will enter the table=32, followed by table=33

c. Table=33, will send the packet to table=65.

d. Table=65, will send the packet to uplink bridge
   with destination mac of chassisredirect port and vlan
   id of peer logical switch.

Receiver Side:
-
a. Packet is received by the pipeline of peer logical switch.
b. Since destination mac is that of router port, hence packet will
   enter the logical router pipeline.
c. Now, packet will go through regular logical router pipeline
   (both ingress and egress).

One caveat with the approach is that ttl will be decremented twice,
since the packets are going through logical router ingress pipeline
twice (once on sender chassis and again on gateway chassis).

No changes needed for the reverse path.

Signed-off-by: Ankur Sharma 
---
 controller/physical.c  | 255 +++-
 lib/ovn-util.c |  33 ++
 lib/ovn-util.h |   5 +
 ovn-architecture.7.xml |  64 +++
 tests/ovn.at   | 307 +
 5 files changed, 581 insertions(+), 83 deletions(-)

diff --git a/controller/physical.c b/controller/physical.c
index 5068785..9e56149 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -228,6 +228,165 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 }
 
 static void
+put_remote_port_redirect_vlan(const struct
+  sbrec_port_binding *binding,
+  const struct hmap *local_datapaths,
+  struct local_datapath *ld,
+  struct match *match,
+  struct ofpbuf *ofpacts_p,
+  struct ovn_desired_flow_table *flow_table)
+{
+struct eth_addr binding_mac;
+uint32_t ls_dp_key = 0;
+
+if (strcmp(binding->type, "chassisredirect")) {
+/* VLAN based redirect is only supported for chassisredirect
+ * type remote ports. */
+return;
+}
+
+bool  is_valid_mac = extract_sbrec_binding_first_mac(binding,
+ _mac);
+if (!is_valid_mac) {
+return;
+}
+
+for (int i = 0; i < ld->n_peer_ports; i++) {
+const struct sbrec_port_binding *sport_binding = ld->peer_ports[i];
+const char *sport_peer_name = smap_get(_binding->options,
+   "peer");
+const char *distributed_port = smap_get(>options,
+"distributed-port");
+
+if (!strcmp(sport_peer_name, distributed_port)) {
+ls_dp_key = sport_binding->datapath->tunnel_key;
+break;
+}
+}
+
+if (!ls_dp_key) {
+return;
+}
+
+union mf_value value;
+struct ofpact_mac *src_mac;
+const struct sbrec_port_binding *ls_localnet_port;
+
+ls_localnet_port = get_localnet_port(local_datapaths, ls_dp_key);
+
+src_mac = ofpact_put_SET_ETH_SRC(ofpacts_p);
+src_mac->mac = binding_mac;
+
+value.be64 = 

[ovs-dev] [PATCH v5 2/4 ovn] OVN: Vlan backed DVR N-S, redirect-type option

2019-08-15 Thread Ankur Sharma
Background:
With c0974331b7a19a87ab8f1f2cec8fbe366af92fa2, we have added
support for E-W workflow for vlan backed DVRs.

This series enables N-S workflow for vlan backed DVRs.

Key difference between E-W and N-S traffic flow is that
N-S flow requires a gateway chassis. A gateway chassis
will be respondible for following:
a. Doing Network Address Translation (NAT).
b. Becoming entry and exit point for North->South
   and South->North traffic respectively.

OVN by default always uses overlay encapsulation to redirect
the packet to gateway chassis. This series will enable
the redirection to gateway chassis in the absence of encapsulation.

This patch:
a. Add a new key-value in options of a router port.
b. This new config key will be used by ovn-controller
   to determine if a redirected packet will go out of
   tunnel port or localnet port.
c. key is "redirect-type" and it takes "overlay" and
   "vlan" as values.
d. Added ovn-nbctl command to set and get redirect-type
   option on a router port.
e. This new configuration is added because vlan or overlay
   based forwarding is considered to be a logical switch property,
   hence for a router configuration has to be done at the router port
   level.
f. Restored the function ovsdb_datum_to_smap, which helps in ensuring
   that we do not overwrite existing options, while adding a new
   key-value pair to it. This function exists in 2.8.5, i am not
   able to figure out so far, which release/why it was removed.
   I do not see a harm in adding it back.

Signed-off-by: Ankur Sharma 
---
 northd/ovn-northd.c   |  6 +
 ovn-nb.xml| 43 
 ovs/lib/ovsdb-data.c  | 11 +
 ovs/lib/ovsdb-data.h  |  2 ++
 tests/ovn-nbctl.at| 25 +++
 tests/ovn-northd.at   | 31 +++
 utilities/ovn-nbctl.c | 68 +++
 7 files changed, 186 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index e861344..89ca8df 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -2445,6 +2445,9 @@ ovn_port_update_sbrec(struct northd_context *ctx,
 if (op->derived) {
 const char *redirect_chassis = smap_get(>nbrp->options,
 "redirect-chassis");
+const char *redirect_type = smap_get(>nbrp->options,
+ "redirect-type");
+
 int n_gw_options_set = 0;
 if (op->nbrp->ha_chassis_group) {
 n_gw_options_set++;
@@ -2536,6 +2539,9 @@ ovn_port_update_sbrec(struct northd_context *ctx,
 sbrec_port_binding_set_gateway_chassis(op->sb, NULL, 0);
 }
 smap_add(, "distributed-port", op->nbrp->name);
+if (redirect_type) {
+smap_add(, "redirect-type", redirect_type);
+}
 } else {
 if (op->peer) {
 smap_add(, "peer", op->peer->key);
diff --git a/ovn-nb.xml b/ovn-nb.xml
index e166190..8bb6221 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1948,6 +1948,49 @@
   issues.
 
   
+
+  
+
+  This options dictates if a packet redirected to
+  gateway chassis will be overlay encapsulated
+  or go as a regular vlan packet.
+
+
+
+  Option takes following values
+
+
+
+  
+OVERLAY
+  
+
+  
+VLAN
+  
+
+
+
+  OVERLAY option will ensure that redirected packet goes out as
+  encapsulation via the tunnel port.
+
+
+
+  VLAN option will ensure that redirected packet goes out as vlan
+  tagged via the localnet port.
+
+
+
+  OVERLAY is the default redirection type.
+
+
+
+  Option is applicable only to gateway chassis attached logical
+  router ports.
+
+
+  
+
 
 
 
diff --git a/ovs/lib/ovsdb-data.c b/ovs/lib/ovsdb-data.c
index b0fb20d..c7fcb8a 100644
--- a/ovs/lib/ovsdb-data.c
+++ b/ovs/lib/ovsdb-data.c
@@ -1691,6 +1691,17 @@ ovsdb_datum_from_smap(struct ovsdb_datum *datum, const 
struct smap *smap)
 ovsdb_datum_sort_unique(datum, OVSDB_TYPE_STRING, OVSDB_TYPE_STRING);
 }
 
+/* Initializes smap from a string-to-string datum map. */
+void
+ovsdb_datum_to_smap(struct smap *smap, const struct ovsdb_datum *datum)
+{
+size_t i = 0;
+for (; i < datum->n; i++) {
+smap_add(smap, datum->keys[i].string,  datum->values[i].string);
+}
+ovs_assert(i == smap_count(smap));
+}
+
 struct ovsdb_error * OVS_WARN_UNUSED_RESULT
 ovsdb_datum_convert(struct ovsdb_datum *dst,
 const struct ovsdb_type *dst_type,
diff --git a/ovs/lib/ovsdb-data.h b/ovs/lib/ovsdb-data.h
index c5a80ee..bf2cd8a 100644
--- a/ovs/lib/ovsdb-data.h
+++ b/ovs/lib/ovsdb-data.h
@@ -191,6 +191,8 @@ void 

[ovs-dev] [PATCH v5 3/4 ovn] OVN: Vlan backed DVR N-S, avoid get_arp on non redirect chassis.

2019-08-15 Thread Ankur Sharma
Background:
With c0974331b7a19a87ab8f1f2cec8fbe366af92fa2, we have added
support for E-W workflow for vlan backed DVRs.

This series enables N-S workflow for vlan backed DVRs.

Key difference between E-W and N-S traffic flow is that
N-S flow requires a gateway chassis. A gateway chassis
will be respondible for following:
a. Doing Network Address Translation (NAT).
b. Becoming entry and exit point for North->South
   and South->North traffic respectively.

OVN by default always uses overlay encapsulation to redirect
the packet to gateway chassis. This series will enable
the redirection to gateway chassis in the absence of encapsulation.

This patch:
a. Make sure that ARP request for endpoint behind the gateway
   router port is sent from gateway chassis only and not from
   host(compute) chassis.

b. This is achieved by adding a new logical flow in
   lr_in_arp_resolve at priority=50.

c. This flow run on non gateway chassis and sets the destination
   mac to router port mac, if outport is a gateway chassis attached
   router port and redirect-type is set as "vlan".
   Example logical flow:
   table=9 (lr_in_arp_resolve  ), priority=50   , match=(outport == 
"router-to-underlay" && !is_chassis_resident("cr-router-to-underlay")), 
action=(eth.dst = 00:00:01:01:02:04; next;)

d. This change is needed because other wise for non resolved ARPs,
   we will end up doing get_arp in host chassis. Doing so will
   have following issues:
   i. We want all the interation with North bound endpoints via
  gateway chassis only, doing so on host chassis will violate
  that.

  ii. With get_arp, ovn-controller will generate the ARP using router
  port's mac as source mac, which will lead us to the same issue,
  where router port mac will be going through continous mac moves
  in physical network. Worst, it would affect the redirection,
  since it uses router port mac as destination mac.

Signed-off-by: Ankur Sharma 
---
 northd/ovn-northd.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 89ca8df..e13a5af 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -3516,6 +3516,16 @@ lsp_is_external(const struct nbrec_logical_switch_port 
*nbsp)
 return !strcmp(nbsp->type, "external");
 }
 
+/* Returns true if lrp has either gateway chassis or ha chassis group
+ * attached to it. */
+static bool
+lrp_has_gateway(const struct nbrec_logical_router_port *nbrp)
+{
+return (nbrp->n_gateway_chassis ||
+(nbrp->ha_chassis_group && nbrp->ha_chassis_group->n_ha_chassis))
+? true : false;
+}
+
 static bool
 build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
 struct ds *options_action, struct ds *response_action,
@@ -7568,6 +7578,28 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
   100, ds_cstr(), ds_cstr());
 }
 }
+
+if (!op->derived && lrp_has_gateway(op->nbrp)) {
+const char *redirect_type = smap_get(>nbrp->options,
+ "redirect-type");
+if (redirect_type && !strcasecmp(redirect_type, "vlan")) {
+/* Packet is on a non gateway chassis and
+ * has an unresolved ARP on a network behind gateway
+ * chassis attached router port. Since, redirect type
+ * is set to vlan, hence instead of calling "get_arp"
+ * on this node, we will redirect the packet to gateway
+ * chassis, by setting destination mac router port mac.*/
+ds_clear();
+ds_put_format(, "outport == %s && "
+  "!is_chassis_resident(%s)", op->json_key,
+  op->od->l3redirect_port->json_key);
+ds_clear();
+ds_put_format(, "eth.dst = %s; next;",
+  op->lrp_networks.ea_s);
+ovn_lflow_add(lflows, op->od, S_ROUTER_IN_ARP_RESOLVE,
+  50, ds_cstr(), ds_cstr());
+}
+}
 } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router")
&& strcmp(op->nbsp->type, "virtual")) {
 /* This is a logical switch port that backs a VM or a container.
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 0/4 ovn] OVN: Vlan backed DVR, enable N-S packet flow

2019-08-15 Thread Ankur Sharma
Problem Description:
Redirection to chassisredirect ports happens only via tunnel
encapsulation. As a result, same cannot be leveraged upon for vlan backed
environments.

This series addresses the issue by allowing redirection to happen without
encapsulation.


Design:
===
a. High level design is that CMS/user will provision "redirect-type"
   on a gateway chassis attached logical router port.

b. If the redirect-type is set as "vlan", then redirected packet
   from compute chassis to gateway chassis will go via the localnet
   port.

c. This redirected vlan packet will have following attributes:
   i. Source Mac ==> Chassis mac of compute chassis.
  ii. Destination Mac ==> Mac address of chassisredirect router port.
  i.e cr-lrp-*
 iii. Vlan id  ==> Vlan id of peer logical switch of gateway chassis
   attached logical router port.

d. To attain c. above a logical flow added in table=33 will send the
   packet to table=65, where it will hit regular flow which will
   send the packet out of localnet port.


This Series:


Patch 1
---
It is mostly a bug fix. Fix is to make sure that on gateway chassis
we DO NOT replace source mac with chassis mac. This is done because
this gateway chassis is the entry point for all the North->South traffic
, hence north bound endpoint will use router port mac as destination mac.

If we replace router port mac with chassis mac on its gateway chassis as well,
then router port mac will not learnt by physical network.

Patch 2
---
ovn-nbctl and ovn-northd changes to accept a "redirect-type" option
associated with a logical router port. This configuration
is added so that we have a parameter to decide if want to send
a redirected packet via tunnel port or localnet port.

Patch 3
---
Adding a logical flow in lrp_in_arp_resolve to make sure that ARP requests
from logical router are generated on gateway chassis only. This flow
will make sure that we DO NOT call get_arp on compute chassis.

Patch 4
---
Changes in ovn-controller, to add the redirect related OVS flow based
on configuration parameter added in Patch 2. i.e if redirect-type
is 'overlay', then flow in table=32 will be added (to send the packet
out via tunnel port), if redirect-type is 'vlan', then flow in table=33
will be added, to send the packet out via localnet port.


Existing Efforts:
=
There has been an effort done in solving same problem.
https://patchwork.ozlabs.org/patch/920447/ by vkomm...@redhat.com

This patch differs from above changes in following areas:
a. Existing patch adds an additional flow in lr_in_ip_routing to mark the packet
   as NAT_REDIRECT in compute chassis. This approach looks reasonable,
   especially since through this we can avoid Patch 3 in this series.
   However, using something similar based on our approach lead to following: 
   i. For each route pointing to gateway router port, we will need 2 flows
  in lr_in_ip_routing, one which executes ONLY on compute chassis
  and marks the packet for NAT_REDIRECT and one which executes ONLY
  on gateway chassis and does not mark the packet.

  Since, we cannot change the priority of a route flow
  (prefix length decides the priority), hence we will end up with
  2 logical flows with  same match (with is_chassis_resident being the
  only differentiating factor). OVN controller considered both such
  flows as duplicate and ends up considering only one of them, i.e either
  either only the compute chassis got is relevant flow or only
  gateway chassis.

b. Existing patch considers a tenant as VLAN backed. Whereas this patch
   considers overlay/vlan as the property of gateway logical switch. And
   from a router's perspective, configuration is done on the peer router port.

c. Existing patch sends the packet to gateway router port mac, but uses
   source logical switch (tenant logical switch's) vlan id.
   This will always cause flooding in physical network, because
   gateway router port mac will be learnt on the peer logical switch.
   Using a different vlan id will always cause flooding of
   redirected packets from compute chassis to gateway chassis.

d. Because of c. above, existing patch needs changes on receiving side as well.
   This is because since packet's vlan id is not of the correct logical switch,
   hence on receiving node (gateway chassis), packet has to be forced forwaded 
to
   the logical router pipeline.

v4 -> v5
--
  * Fixed a compilation warning in lport.h

v3 -> v4
--
  * Handle review comments.
  * Rebased to TOT.

v2 -> v3
--
  * Fix merge conflicts.

v1 -> v2
--
  * Added ovn in the comment description prefix.




Ankur Sharma (4):
  OVN: Do not replace router port mac on gateway chassis.
  OVN: Vlan backed DVR N-S, redirect-type option
  OVN: Vlan backed DVR N-S, avoid get_arp on non redirect chassis.
  OVN: Vlan backed DVR N-S, redirect packet via localnet port

 

[ovs-dev] [PATCH v5 1/4 ovn] OVN: Do not replace router port mac on gateway chassis.

2019-08-15 Thread Ankur Sharma
With 795d7f24ce0e2ed5454e193a059451d237289542 we have added
support for E-W routing on vlan backed networks by replacing
router port macs with chassis macs.

This replacement of router port mac need NOT be done on
gateway chassis for following reasons:

a. For N-S traffic, gateway chassis will respond to ARP
for the router port (to which it is attached) and
traffic will be using router port mac as destination mac.

b. Chassis redirect port is a centralized version of distributed
router port, hence we need not replace its mac with chassis mac
on the resident chassis.

This patch addresses the same.

Signed-off-by: Ankur Sharma 
---
 controller/lport.c|  20 
 controller/lport.h|   6 +
 controller/physical.c |  18 ++-
 controller/pinctrl.c  |  20 +---
 tests/ovn.at  | 313 ++
 5 files changed, 356 insertions(+), 21 deletions(-)

diff --git a/controller/lport.c b/controller/lport.c
index 792c825..478fcfd 100644
--- a/controller/lport.c
+++ b/controller/lport.c
@@ -17,6 +17,7 @@
 
 #include "lib/sset.h"
 #include "lport.h"
+#include "ha-chassis.h"
 #include "hash.h"
 #include "openvswitch/vlog.h"
 #include "lib/ovn-sb-idl.h"
@@ -64,6 +65,25 @@ lport_lookup_by_key(struct ovsdb_idl_index 
*sbrec_datapath_binding_by_key,
 return retval;
 }
 
+bool
+lport_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+  const struct sbrec_chassis *chassis,
+  const struct sset *active_tunnels,
+  const char *port_name)
+{
+const struct sbrec_port_binding *pb
+= lport_lookup_by_name(sbrec_port_binding_by_name, port_name);
+if (!pb || !pb->chassis) {
+return false;
+}
+if (strcmp(pb->type, "chassisredirect")) {
+return pb->chassis == chassis;
+} else {
+return ha_chassis_group_is_active(pb->ha_chassis_group,
+  active_tunnels, chassis);
+}
+}
+
 const struct sbrec_datapath_binding *
 datapath_lookup_by_key(struct ovsdb_idl_index *sbrec_datapath_binding_by_key,
uint64_t dp_key)
diff --git a/controller/lport.h b/controller/lport.h
index 2d4bb71..345efc1 100644
--- a/controller/lport.h
+++ b/controller/lport.h
@@ -23,6 +23,7 @@ struct sbrec_chassis;
 struct sbrec_datapath_binding;
 struct sbrec_multicast_group;
 struct sbrec_port_binding;
+struct sset;
 
 
 /* Database indexes.
@@ -48,5 +49,10 @@ const struct sbrec_datapath_binding *datapath_lookup_by_key(
 const struct sbrec_multicast_group *mcgroup_lookup_by_dp_name(
 struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath,
 const struct sbrec_datapath_binding *, const char *name);
+bool
+lport_is_chassis_resident(struct ovsdb_idl_index *sbrec_port_binding_by_name,
+  const struct sbrec_chassis *chassis,
+  const struct sset *active_tunnels,
+  const char *port_name);
 
 #endif /* controller/lport.h */
diff --git a/controller/physical.c b/controller/physical.c
index a05962b..5068785 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -228,9 +228,12 @@ get_zone_ids(const struct sbrec_port_binding *binding,
 }
 
 static void
-put_replace_router_port_mac_flows(const struct
+put_replace_router_port_mac_flows(struct ovsdb_idl_index
+  *sbrec_port_binding_by_name,
+  const struct
   sbrec_port_binding *localnet_port,
   const struct sbrec_chassis *chassis,
+  const struct sset *active_tunnels,
   const struct hmap *local_datapaths,
   struct ofpbuf *ofpacts_p,
   ofp_port_t ofport,
@@ -270,6 +273,16 @@ put_replace_router_port_mac_flows(const struct
 struct eth_addr router_port_mac;
 struct match match;
 struct ofpact_mac *replace_mac;
+char *cr_peer_name = xasprintf("cr-%s", rport_binding->logical_port);
+if (lport_is_chassis_resident(sbrec_port_binding_by_name,
+  chassis, active_tunnels,
+  cr_peer_name)) {
+/* If a router port's chassisredirect port is
+ * resident on this chassis, then we need not do mac replace. */
+free(cr_peer_name);
+continue;
+}
+free(cr_peer_name);
 
 /* Table 65, priority 150.
  * ===
@@ -787,7 +800,8 @@ consider_port_binding(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
 , ofpacts_p, >header_.uuid);
 
 if (!strcmp(binding->type, "localnet")) {
-put_replace_router_port_mac_flows(binding, chassis,
+put_replace_router_port_mac_flows(sbrec_port_binding_by_name,

[ovs-dev] [PATCH v4 9/9] ofproto-dpif-xlate: Translate timeout policy in ct action

2019-08-15 Thread Yi-Hung Wei
This patch derives the timeout policy based on ct zone from the
internal data structure that we maintain on dpif layer.

It also adds a system traffic test to verify the zone-based conntrack
timeout feature.  The test uses ovs-vsctl commands to configure
the customized ICMP and UDP timeout on zone 5 to a shorter period.
It then injects ICMP and UDP traffic to conntrack, and checks if the
corresponding conntrack entry expires after the predefined timeout.

Signed-off-by: Yi-Hung Wei 

ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT

This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this
check, ofproto-dpif-xlate can use this information to decide whether to
translate the ct timeout policy.

Signed-off-by: Yi-Hung Wei 
---
 NEWS |  1 +
 lib/ct-dpif.c| 11 +
 lib/ct-dpif.h|  3 ++
 lib/dpif-netdev.c|  1 +
 lib/dpif-netlink.c   | 12 +
 lib/dpif-provider.h  | 10 +
 ofproto/ofproto-dpif-xlate.c | 23 ++
 ofproto/ofproto-dpif.c   | 96 
 ofproto/ofproto-dpif.h   | 12 -
 tests/system-kmod-macros.at  | 20 +
 tests/system-traffic.at  | 66 +++
 tests/system-userspace-macros.at | 19 
 12 files changed, 272 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index c5caa13d6374..9f7fbb852e08 100644
--- a/NEWS
+++ b/NEWS
@@ -69,6 +69,7 @@ v2.12.0 - xx xxx 
- Linux datapath:
  * Support for the kernel versions 4.19.x and 4.20.x.
  * Support for the kernel version 5.0.x.
+ * Add support for conntrack zone-based timeout policy.
- 'ovs-dpctl dump-flows' is no longer suitable for dumping offloaded flows.
  'ovs-appctl dpctl/dump-flows' should be used instead.
- Add L2 GRE tunnel over IPv6 support.
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index be59f34ff995..2181c83157ad 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -862,3 +862,14 @@ ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void 
*state)
 ? dpif->dpif_class->ct_timeout_policy_dump_done(dpif, state)
 : EOPNOTSUPP);
 }
+
+int
+ct_dpif_get_timeout_policy_name(struct dpif *dpif, uint32_t tp_id,
+uint16_t dl_type, uint8_t nw_proto,
+struct ds *tp_name, bool *is_generic)
+{
+return (dpif->dpif_class->ct_get_timeout_policy_name
+? dpif->dpif_class->ct_get_timeout_policy_name(
+dpif, tp_id, dl_type, nw_proto, tp_name, is_generic)
+: EOPNOTSUPP);
+}
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index aabd6962f2c0..b95e6ac762ab 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -318,5 +318,8 @@ int ct_dpif_timeout_policy_dump_start(struct dpif *dpif, 
void **statep);
 int ct_dpif_timeout_policy_dump_next(struct dpif *dpif, void *state,
  struct ct_dpif_timeout_policy *tp);
 int ct_dpif_timeout_policy_dump_done(struct dpif *dpif, void *state);
+int ct_dpif_get_timeout_policy_name(struct dpif *dpif, uint32_t tp_id,
+uint16_t dl_type, uint8_t nw_proto,
+struct ds *tp_name, bool *is_generic);
 
 #endif /* CT_DPIF_H */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 7240a3e6f3c8..36637052e598 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -7539,6 +7539,7 @@ const struct dpif_class dpif_netdev_class = {
 NULL,   /* ct_timeout_policy_dump_start */
 NULL,   /* ct_timeout_policy_dump_next */
 NULL,   /* ct_timeout_policy_dump_done */
+NULL,   /* ct_get_timeout_policy_name */
 dpif_netdev_ipf_set_enabled,
 dpif_netdev_ipf_set_min_frag,
 dpif_netdev_ipf_set_max_nfrags,
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 1d4ee60bd199..52f911991dc6 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -3072,6 +3072,17 @@ dpif_netlink_format_tp_name(uint32_t id, uint16_t l3num, 
uint8_t l4num,
 ovs_assert(tp_name->length < CTNL_TIMEOUT_NAME_MAX);
 }
 
+static int
+dpif_netlink_ct_get_timeout_policy_name(struct dpif *dpif OVS_UNUSED,
+uint32_t tp_id, uint16_t dl_type, uint8_t nw_proto, struct ds *tp_name,
+bool *is_generic)
+{
+dpif_netlink_format_tp_name(tp_id,
+dl_type == ETH_TYPE_IP ? AF_INET : AF_INET6, nw_proto, tp_name);
+*is_generic = false;
+return 0;
+}
+
 #define CT_DPIF_NL_TP_TCP_MAPPINGS  \
 CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_SENT, SYN_SENT) \
 CT_DPIF_NL_TP_MAPPING(TCP, TCP, SYN_RECV, SYN_RECV) \
@@ -3907,6 +3918,7 @@ const struct dpif_class dpif_netlink_class = {
 dpif_netlink_ct_timeout_policy_dump_start,
 dpif_netlink_ct_timeout_policy_dump_next,
 dpif_netlink_ct_timeout_policy_dump_done,
+

[ovs-dev] [PATCH v4 8/9] datapath: Add support for conntrack timeout policy

2019-08-15 Thread Yi-Hung Wei
This patch adds support for specifying a timeout policy for a
connection in connection tracking system in kernel datapath.
The timeout policy will be attached to a connection when the
connection is committed to conntrack.

This patch introduces a new odp field OVS_CT_ATTR_TIMEOUT in the
ct action that specifies the timeout policy in the datapath.
In the following patch, during the upcall process, the vswitchd will use
the ct_zone to look up the corresponding timeout policy and fill
OVS_CT_ATTR_TIMEOUT if it is available.

The datapath code is from the following two net-next upstream commits.

Upstream commit:
commit 06bd2bdf19d2f3d22731625e1a47fa1dff5ac407
Author: Yi-Hung Wei 
Date:   Tue Mar 26 11:31:14 2019 -0700

openvswitch: Add timeout support to ct action

Add support for fine-grain timeout support to conntrack action.
The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
specifies a timeout to be associated with this connection.
If no timeout is specified, it acts as is, that is the default
timeout for the connection will be automatically applied.

Example usage:
$ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
$ ovs-ofctl add-flow br0 
in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)

CC: Pravin Shelar 
CC: Pablo Neira Ayuso 
Signed-off-by: Yi-Hung Wei 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

commit 6d670497e01803b486aa72cc1a718401ab986896
Author: Dan Carpenter 
Date:   Tue Apr 2 09:53:14 2019 +0300

openvswitch: use after free in __ovs_ct_free_action()

We free "ct_info->ct" and then use it on the next line when we pass it
to nf_ct_destroy_timeout().  This patch swaps the order to avoid the use
after free.

Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Dan Carpenter 
Acked-by: Yi-Hung Wei 
Signed-off-by: David S. Miller 

Signed-off-by: Yi-Hung Wei 
---
 datapath/conntrack.c  | 30 ++-
 datapath/linux/compat/include/linux/openvswitch.h |  4 +++
 lib/dpif-netdev.c |  4 +++
 lib/odp-util.c| 29 +++---
 tests/odp.at  |  1 +
 5 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 292febb3c83e..f85d0a2572f6 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -86,6 +87,7 @@ struct ovs_conntrack_info {
u32 eventmask;  /* Mask of 1 << IPCT_*. */
struct md_mark mark;
struct md_labels labels;
+   char timeout[CTNL_TIMEOUT_NAME_MAX];
 #ifdef CONFIG_NF_NAT_NEEDED
struct nf_nat_range2 range;  /* Only present for SRC NAT and DST NAT. */
 #endif
@@ -1519,6 +1521,8 @@ static const struct ovs_ct_len_tbl 
ovs_ct_attr_lens[OVS_CT_ATTR_MAX + 1] = {
 #endif
[OVS_CT_ATTR_EVENTMASK] = { .minlen = sizeof(u32),
.maxlen = sizeof(u32) },
+   [OVS_CT_ATTR_TIMEOUT] = { .minlen = 1,
+ .maxlen = CTNL_TIMEOUT_NAME_MAX },
 };
 
 static int parse_ct(const struct nlattr *attr, struct ovs_conntrack_info *info,
@@ -1604,6 +1608,15 @@ static int parse_ct(const struct nlattr *attr, struct 
ovs_conntrack_info *info,
info->have_eventmask = true;
info->eventmask = nla_get_u32(a);
break;
+#ifdef CONFIG_NF_CONNTRACK_TIMEOUT
+   case OVS_CT_ATTR_TIMEOUT:
+   memcpy(info->timeout, nla_data(a), nla_len(a));
+   if (!memchr(info->timeout, '\0', nla_len(a))) {
+   OVS_NLERR(log, "Invalid conntrack helper");
+   return -EINVAL;
+   }
+   break;
+#endif
 
default:
OVS_NLERR(log, "Unknown conntrack attr (%d)",
@@ -1685,6 +1698,14 @@ int ovs_ct_copy_action(struct net *net, const struct 
nlattr *attr,
OVS_NLERR(log, "Failed to allocate conntrack template");
return -ENOMEM;
}
+
+   if (ct_info.timeout[0]) {
+   if (nf_ct_set_timeout(net, ct_info.ct, family, key->ip.proto,
+ ct_info.timeout))
+   pr_info_ratelimited("Failed to associated timeout "
+   "policy `%s'\n", ct_info.timeout);
+   }
+
if (helper) {
err = ovs_ct_add_helper(_info, helper, key, log);
if (err)
@@ -1809,6 +1830,10 @@ int ovs_ct_action_to_attr(const struct 
ovs_conntrack_info *ct_info,
if (ct_info->have_eventmask &&
nla_put_u32(skb, OVS_CT_ATTR_EVENTMASK, ct_info->eventmask))
   

[ovs-dev] [PATCH v4 7/9] datapath: compat: Backport nf_conntrack_timeout support

2019-08-15 Thread Yi-Hung Wei
This patch brings in nf_ct_timeout_put() and nf_ct_set_timeout()
when it is not available in the kernel.

Three symbols are created in acinclude.m4.

* HAVE_NF_CT_SET_TIMEOUT is used to determine if upstream net-next commit
717700d183d65 ("netfilter: Export nf_ct_{set,destroy}_timeout()") is
availabe.  If it is defined, the kernel should have all the
nf_conntrack_timeout support that OVS needs.

* HAVE_NF_CT_TIMEOUT is used to check if upstream net-next commit
6c1fd7dc489d9 ("netfilter: cttimeout: decouple timeout policy from
nfnetlink_cttimeout object") is there.  If it is not defined, we
will use the old ctnl_timeout interface rather than the nf_ct_timeout
interface that is introduced in this commit.

* HAVE_NF_CT_TIMEOUT_FIND_GET_HOOK_NET is used to check if upstream
commit 19576c9478682 ("netfilter: cttimeout: add netns support") is
there, so that we pass different arguement based on whether the kernel
has netns support.

Signed-off-by: Yi-Hung Wei 
---
 acinclude.m4   |   7 ++
 datapath/linux/Modules.mk  |   2 +
 .../include/net/netfilter/nf_conntrack_timeout.h   |  34 +++
 datapath/linux/compat/nf_conntrack_timeout.c   | 102 +
 4 files changed, 145 insertions(+)
 create mode 100644 
datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h
 create mode 100644 datapath/linux/compat/nf_conntrack_timeout.c

diff --git a/acinclude.m4 b/acinclude.m4
index 116ffcf9096d..61fe4faa006a 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -714,6 +714,13 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_seqadj.h], 
[nf_ct_seq_adjust])
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_count.h], 
[nf_conncount_gc_list],
   [OVS_DEFINE([HAVE_UPSTREAM_NF_CONNCOUNT])])
+  OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_timeout.h], 
[nf_ct_set_timeout])
+  OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_timeout.h], 
[struct nf_ct_timeout],
+  [OVS_DEFINE([HAVE_NF_CT_TIMEOUT])])
+  OVS_FIND_PARAM_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_timeout.h],
+[\(*nf_ct_timeout_find_get_hook\)], [net],
+[OVS_DEFINE([HAVE_NF_CT_TIMEOUT_FIND_GET_HOOK_NET])])
+
 
   OVS_GREP_IFELSE([$KSRC/include/linux/random.h], [prandom_u32])
   OVS_GREP_IFELSE([$KSRC/include/linux/random.h], [prandom_u32_max])
diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk
index cbb29f1c69d0..f93097b8e0e5 100644
--- a/datapath/linux/Modules.mk
+++ b/datapath/linux/Modules.mk
@@ -21,6 +21,7 @@ openvswitch_sources += \
linux/compat/nf_conntrack_core.c \
linux/compat/nf_conntrack_proto.c \
linux/compat/nf_conntrack_reasm.c \
+   linux/compat/nf_conntrack_timeout.c \
linux/compat/reciprocal_div.c \
linux/compat/skbuff-openvswitch.c \
linux/compat/socket.c \
@@ -108,6 +109,7 @@ openvswitch_headers += \
linux/compat/include/net/netfilter/nf_conntrack_helper.h \
linux/compat/include/net/netfilter/nf_conntrack_labels.h \
linux/compat/include/net/netfilter/nf_conntrack_seqadj.h \
+   linux/compat/include/net/netfilter/nf_conntrack_timeout.h \
linux/compat/include/net/netfilter/nf_conntrack_zones.h \
linux/compat/include/net/netfilter/nf_nat.h \
linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h \
diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h 
b/datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h
new file mode 100644
index ..134e72b8363e
--- /dev/null
+++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h
@@ -0,0 +1,34 @@
+#ifndef _NF_CONNTRACK_TIMEOUT_WRAPPER_H
+#define _NF_CONNTRACK_TIMEOUT_WRAPPER_H
+
+#include_next 
+
+#ifndef HAVE_NF_CT_SET_TIMEOUT
+
+#ifndef HAVE_NF_CT_TIMEOUT
+#define nf_ct_timeout ctnl_timeout
+#endif
+
+#ifdef CONFIG_NF_CONNTRACK_TIMEOUT
+int rpl_nf_ct_set_timeout(struct net *net, struct nf_conn *ct, u8 l3num, u8 
l4num,
+ const char *timeout_name);
+void rpl_nf_ct_destroy_timeout(struct nf_conn *ct);
+#else
+static inline int rpl_nf_ct_set_timeout(struct net *net, struct nf_conn *ct,
+   u8 l3num, u8 l4num,
+   const char *timeout_name)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline void rpl_nf_ct_destroy_timeout(struct nf_conn *ct)
+{
+   return;
+}
+#endif /* CONFIG_NF_CONNTRACK_TIMEOUT */
+
+#define nf_ct_set_timeout rpl_nf_ct_set_timeout
+#define nf_ct_destroy_timeout rpl_nf_ct_destroy_timeout
+
+#endif /* HAVE_NF_CT_SET_TIMEOUT */
+#endif /* _NF_CONNTRACK_TIMEOUT_WRAPPER_H */
diff --git a/datapath/linux/compat/nf_conntrack_timeout.c 
b/datapath/linux/compat/nf_conntrack_timeout.c
new file mode 100644
index ..c02baff5771b
--- /dev/null
+++ 

[ovs-dev] [PATCH v4 6/9] ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables

2019-08-15 Thread Yi-Hung Wei
This patch consumes the CT_Zone and CT_Timeout_Policy tables, maintains
the zone-based configuration in the vswitchd.  Whenever there is a
database change, vswitchd will read the datapath, CT_Zone, and
CT_Timeout_Policy tables from ovsdb, builds an internal snapshot of the
database configuration in bridge.c, and pushes down the change into
ofproto and dpif layer.

If a new zone-based timeout policy is added, it updates the zone to
timeout policy mapping in the per datapath type datapath structure in
dpif-backer, and pushes down the timeout policy into the datapath via
dpif interface.

If a timeout policy is no longer used, for kernel datapath, vswitchd
may not be able to remove it from datapath immediately since
datapath flows can still reference the to-be-deleted timeout policies.
Thus, we keep an timeout policy kill list, that vswitchd will go
back to the list periodically and try to kill the unused timeout policies.

Signed-off-by: Yi-Hung Wei 
---
 ofproto/ofproto-dpif.c | 287 +
 ofproto/ofproto-dpif.h |  10 ++
 ofproto/ofproto-provider.h |  10 ++
 ofproto/ofproto.c  |  26 
 ofproto/ofproto.h  |   5 +
 vswitchd/bridge.c  | 198 +++
 6 files changed, 536 insertions(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 751535249e21..244155a59291 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -156,6 +156,25 @@ struct ofport_dpif {
 size_t n_qdscp;
 };
 
+struct ct_timeout_policy {
+int ref_count;  /* The number of ct zones that use this
+ * timeout policy. */
+uint32_t tp_id; /* Timeout policy id in the datapath. */
+struct simap tp;/* A map from timeout policy attribute to
+ * timeout value. */
+struct hmap_node node;  /* Element in struct dpif_backer's "ct_tps"
+ * cmap. */
+struct ovs_list list_node;  /* Element in struct dpif_backer's
+ * "ct_tp_kill_list" list. */
+};
+
+struct ct_zone {
+uint16_t zone_id;
+struct ct_timeout_policy *ct_tp;
+struct cmap_node node;  /* Element in struct dpif_backer's
+ * "ct_zones" cmap. */
+};
+
 static odp_port_t ofp_port_to_odp_port(const struct ofproto_dpif *,
ofp_port_t);
 
@@ -196,6 +215,9 @@ static struct hmap all_ofproto_dpifs_by_uuid =
 
 static bool ofproto_use_tnl_push_pop = true;
 static void ofproto_unixctl_init(void);
+static void ct_zone_config_init(struct dpif_backer *backer);
+static void ct_zone_config_uninit(struct dpif_backer *backer);
+static void ct_zone_timeout_policy_sweep(struct dpif_backer *backer);
 
 static inline struct ofproto_dpif *
 ofproto_dpif_cast(const struct ofproto *ofproto)
@@ -488,6 +510,7 @@ type_run(const char *type)
 }
 
 process_dpif_port_changes(backer);
+ct_zone_timeout_policy_sweep(backer);
 
 return 0;
 }
@@ -683,6 +706,7 @@ close_dpif_backer(struct dpif_backer *backer, bool del)
 }
 dpif_close(backer->dpif);
 id_pool_destroy(backer->meter_ids);
+ct_zone_config_uninit(backer);
 free(backer);
 }
 
@@ -811,6 +835,8 @@ open_dpif_backer(const char *type, struct dpif_backer 
**backerp)
 backer->meter_ids = NULL;
 }
 
+ct_zone_config_init(backer);
+
 /* Make a pristine snapshot of 'support' into 'boottime_support'.
  * 'boottime_support' can be checked to prevent 'support' to be changed
  * beyond the datapath capabilities. In case 'support' is changed by
@@ -5086,6 +5112,265 @@ ct_flush(const struct ofproto *ofproto_, const uint16_t 
*zone)
 ct_dpif_flush(ofproto->backer->dpif, zone, NULL);
 }
 
+static struct ct_timeout_policy *
+ct_timeout_policy_lookup(const struct hmap *ct_tps, struct simap *tp)
+{
+struct ct_timeout_policy *ct_tp;
+
+HMAP_FOR_EACH_WITH_HASH (ct_tp, node, simap_hash(tp), ct_tps) {
+if (simap_equal(_tp->tp, tp)) {
+return ct_tp;
+}
+}
+return NULL;
+}
+
+static struct ct_timeout_policy *
+ct_timeout_policy_alloc__(void)
+{
+struct ct_timeout_policy *ct_tp = xzalloc(sizeof *ct_tp);
+simap_init(_tp->tp);
+return ct_tp;
+}
+
+static struct ct_timeout_policy *
+ct_timeout_policy_alloc(struct simap *tp, struct id_pool *tp_ids)
+{
+struct simap_node *node;
+
+struct ct_timeout_policy *ct_tp = ct_timeout_policy_alloc__();
+SIMAP_FOR_EACH (node, tp) {
+simap_put(_tp->tp, node->name, node->data);
+}
+
+if (!id_pool_alloc_id(tp_ids, _tp->tp_id)) {
+VLOG_ERR_RL(, "failed to allocate timeout policy id.");
+simap_destroy(_tp->tp);
+free(ct_tp);
+return NULL;
+}
+
+return ct_tp;
+}
+
+static void
+ct_timeout_policy_destroy(struct ct_timeout_policy *ct_tp,
+  struct id_pool *tp_ids)

[ovs-dev] [PATCH v4 4/9] ct-dpif, dpif-netlink: Add conntrack timeout policy support

2019-08-15 Thread Yi-Hung Wei
This patch first defines the dpif interface for a datapath to support
adding, deleting, getting and dumping conntrack timeout policy.
The timeout policy is identified by a 4 bytes unsigned integer in
datapath, and it currently support timeout for TCP, UDP, and ICMP
protocols.

Moreover, this patch provides the implementation for Linux kernel
datapath in dpif-netlink.

In Linux kernel, the timeout policy is maintained per L3/L4 protocol,
and it is identified by 32 bytes null terminated string.  On the other
hand, in vswitchd, the timeout policy is a generic one that consists of
all the supported L4 protocols.  Therefore, one of the main task in
dpif-netlink is to break down the generic timeout policy into 6
sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp),
and push down the configuration using the netlink API in
netlink-conntrack.c.

This patch also adds missing symbols in the windows datapath so
that the build on windows can pass.

Appveyor CI:
* https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754

Signed-off-by: Yi-Hung Wei 
Acked-by: Alin Gabriel Serdean 
---
 Documentation/faq/releases.rst |   3 +-
 datapath-windows/include/OvsDpInterfaceCtExt.h | 114 +
 datapath-windows/ovsext/Netlink/NetlinkProto.h |   8 +-
 include/windows/automake.mk|   1 +
 .../windows/linux/netfilter/nfnetlink_cttimeout.h  |   0
 lib/ct-dpif.c  | 102 +
 lib/ct-dpif.h  |  56 +++
 lib/dpif-netdev.c  |   6 +
 lib/dpif-netlink.c | 478 +
 lib/dpif-netlink.h |   1 -
 lib/dpif-provider.h|  44 ++
 lib/netlink-conntrack.c| 301 +
 lib/netlink-conntrack.h|  27 +-
 lib/netlink-protocol.h |   8 +-
 14 files changed, 1142 insertions(+), 7 deletions(-)
 create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 8daa23bb2d0c..0b7eaab1b143 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -110,8 +110,9 @@ Q: Are all features available with all datapaths?
 == == == = ===
 Connection tracking 4.3YES  YES  YES
 Conntrack Fragment Reass.   4.3YES  YES  YES
+Conntrack Timeout Policies  5.2YES  NO   NO
+Conntrack Zone Limit4.18   YES  NO   YES
 NAT 4.6YES  YES  YES
-Conntrack zone limit4.18   YES  NO   YES
 Tunnel - LISP   NO YES  NO   NO
 Tunnel - STTNO YES  NO   YES
 Tunnel - GRE3.11   YES  YES  YES
diff --git a/datapath-windows/include/OvsDpInterfaceCtExt.h 
b/datapath-windows/include/OvsDpInterfaceCtExt.h
index 3b947782e90c..3379f0a256fa 100644
--- a/datapath-windows/include/OvsDpInterfaceCtExt.h
+++ b/datapath-windows/include/OvsDpInterfaceCtExt.h
@@ -421,4 +421,118 @@ struct nf_ct_tcp_flags {
 UINT8 mask;
 };
 
+/* File: nfnetlink_cttimeout.h. XXX: the following are not implemented */
+enum ctnl_timeout_msg_types {
+IPCTNL_MSG_TIMEOUT_NEW,
+IPCTNL_MSG_TIMEOUT_GET,
+IPCTNL_MSG_TIMEOUT_DELETE,
+IPCTNL_MSG_TIMEOUT_DEFAULT_SET,
+IPCTNL_MSG_TIMEOUT_DEFAULT_GET,
+
+IPCTNL_MSG_TIMEOUT_MAX
+};
+
+enum ctattr_timeout {
+CTA_TIMEOUT_UNSPEC,
+CTA_TIMEOUT_NAME,
+CTA_TIMEOUT_L3PROTO,
+CTA_TIMEOUT_L4PROTO,
+CTA_TIMEOUT_DATA,
+CTA_TIMEOUT_USE,
+__CTA_TIMEOUT_MAX
+};
+#define CTA_TIMEOUT_MAX (__CTA_TIMEOUT_MAX - 1)
+
+enum ctattr_timeout_generic {
+CTA_TIMEOUT_GENERIC_UNSPEC,
+CTA_TIMEOUT_GENERIC_TIMEOUT,
+__CTA_TIMEOUT_GENERIC_MAX
+};
+#define CTA_TIMEOUT_GENERIC_MAX (__CTA_TIMEOUT_GENERIC_MAX - 1)
+
+enum ctattr_timeout_tcp {
+CTA_TIMEOUT_TCP_UNSPEC,
+CTA_TIMEOUT_TCP_SYN_SENT,
+CTA_TIMEOUT_TCP_SYN_RECV,
+CTA_TIMEOUT_TCP_ESTABLISHED,
+CTA_TIMEOUT_TCP_FIN_WAIT,
+CTA_TIMEOUT_TCP_CLOSE_WAIT,
+CTA_TIMEOUT_TCP_LAST_ACK,
+CTA_TIMEOUT_TCP_TIME_WAIT,
+CTA_TIMEOUT_TCP_CLOSE,
+CTA_TIMEOUT_TCP_SYN_SENT2,
+CTA_TIMEOUT_TCP_RETRANS,
+CTA_TIMEOUT_TCP_UNACK,
+__CTA_TIMEOUT_TCP_MAX
+};
+#define CTA_TIMEOUT_TCP_MAX (__CTA_TIMEOUT_TCP_MAX - 1)
+
+enum ctattr_timeout_udp {
+CTA_TIMEOUT_UDP_UNSPEC,
+CTA_TIMEOUT_UDP_UNREPLIED,
+CTA_TIMEOUT_UDP_REPLIED,
+__CTA_TIMEOUT_UDP_MAX
+};
+#define CTA_TIMEOUT_UDP_MAX (__CTA_TIMEOUT_UDP_MAX - 1)
+
+enum ctattr_timeout_udplite {
+CTA_TIMEOUT_UDPLITE_UNSPEC,
+

[ovs-dev] [PATCH v4 5/9] simap: Add utility function to help compare two simaps.

2019-08-15 Thread Yi-Hung Wei
From: Ben Pfaff 

Signed-off-by: Ben Pfaff 
---
 lib/simap.c | 15 ++-
 lib/simap.h |  1 +
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/lib/simap.c b/lib/simap.c
index d634f8ed9eea..f404ece67703 100644
--- a/lib/simap.c
+++ b/lib/simap.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2009, 2010, 2011, 2012, 2017 Nicira, Inc.
+ * Copyright (c) 2009, 2010, 2011, 2012, 2017, 2019 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -242,6 +242,19 @@ simap_equal(const struct simap *a, const struct simap *b)
 return true;
 }
 
+uint32_t
+simap_hash(const struct simap *simap)
+{
+uint32_t hash = 0;
+
+const struct simap_node *node;
+SIMAP_FOR_EACH (node, simap) {
+hash ^= hash_int(node->data,
+ hash_name(node->name, strlen(node->name)));
+}
+return hash;
+}
+
 static size_t
 hash_name(const char *name, size_t length)
 {
diff --git a/lib/simap.h b/lib/simap.h
index 5b4a2f39dca3..5e646e660782 100644
--- a/lib/simap.h
+++ b/lib/simap.h
@@ -70,6 +70,7 @@ bool simap_find_and_delete(struct simap *, const char *);
 
 const struct simap_node **simap_sort(const struct simap *);
 bool simap_equal(const struct simap *, const struct simap *);
+uint32_t simap_hash(const struct simap *);
 
 #ifdef  __cplusplus
 }
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 3/9] ct-dpif: Export ct_dpif_format_ipproto()

2019-08-15 Thread Yi-Hung Wei
This function will be useful for following patches.

Signed-off-by: Yi-Hung Wei 
Acked-by: Justin Pettit 
---
 lib/ct-dpif.c | 3 +--
 lib/ct-dpif.h | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 5d8a75d3a63f..6ea7feb0ee35 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -31,7 +31,6 @@ struct flags {
 const char *name;
 };
 
-static void ct_dpif_format_ipproto(struct ds *, uint16_t ipproto);
 static void ct_dpif_format_counters(struct ds *,
 const struct ct_dpif_counters *);
 static void ct_dpif_format_timestamp(struct ds *,
@@ -315,7 +314,7 @@ ct_dpif_format_entry(const struct ct_dpif_entry *entry, 
struct ds *ds,
 }
 }
 
-static void
+void
 ct_dpif_format_ipproto(struct ds *ds, uint16_t ipproto)
 {
 const char *name;
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index 14178bb7c3f0..2f4906817946 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -250,6 +250,7 @@ int ct_dpif_ipf_dump_done(struct dpif *dpif, void *);
 void ct_dpif_entry_uninit(struct ct_dpif_entry *);
 void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
   bool verbose, bool print_stats);
+void ct_dpif_format_ipproto(struct ds *ds, uint16_t ipproto);
 void ct_dpif_format_tuple(struct ds *, const struct ct_dpif_tuple *);
 uint8_t ct_dpif_coalesce_tcp_state(uint8_t state);
 void ct_dpif_format_tcp_stat(struct ds *, int, int);
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/9] ovs-vsctl: Add conntrack zone commands.

2019-08-15 Thread Yi-Hung Wei
From: William Tu 

The patch adds commands creating/deleting/listing conntrack zone
timeout policies:
  $ ovs-vsctl {add,del,list}-zone-tp dp zone=zone_id ...

Signed-off-by: William Tu 
---
 tests/ovs-vsctl.at   |  34 +++-
 utilities/ovs-vsctl.8.in |  26 ++
 utilities/ovs-vsctl.c| 202 ++-
 3 files changed, 256 insertions(+), 6 deletions(-)

diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index 46fa3c5b1a33..df15fb6901a0 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -805,6 +805,20 @@ AT_CHECK(
   [RUN_OVS_VSCTL([--if-exists remove netflow x targets '"1.2.3.4:567"'])])
 AT_CHECK(
   [RUN_OVS_VSCTL([--if-exists clear netflow x targets])])
+
+AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 -- set 
Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
+AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=1 icmp_first=1 
icmp_reply=2])])
+AT_CHECK([RUN_OVS_VSCTL([--may-exist add-zone-tp netdev zone=1 icmp_first=1 
icmp_reply=2])])
+AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout 
Policies: icmp_first=1 icmp_reply=2
+])
+AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2 
icmp_reply=3])])
+AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout 
Policies: icmp_first=1 icmp_reply=2
+Zone:2, Timeout Policies: icmp_first=2 icmp_reply=3
+])
+AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=1])])
+AT_CHECK([RUN_OVS_VSCTL([--if-exists del-zone-tp netdev zone=1])])
+AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout 
Policies: icmp_first=2 icmp_reply=3
+])
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
@@ -890,10 +904,10 @@ AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=-1])],
 AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=4096])],
   [1], [], [ovs-vsctl: constraint violation: 4096 is not in the valid range 0 
to 4095 (inclusive)
 ])
-AT_CHECK([RUN_OVS_VSCTL([set c br1 'connection-mode=xyz'])],
+AT_CHECK([RUN_OVS_VSCTL([set controller br1 'connection-mode=xyz'])],
   [1], [], [[ovs-vsctl: constraint violation: xyz is not one of the allowed 
values ([in-band, out-of-band])
 ]])
-AT_CHECK([RUN_OVS_VSCTL([set c br1 connection-mode:x=y])],
+AT_CHECK([RUN_OVS_VSCTL([set controller br1 connection-mode:x=y])],
   [1], [], [ovs-vsctl: cannot specify key to set for non-map column 
connection_mode
 ])
 AT_CHECK([RUN_OVS_VSCTL([add bridge br1 datapath_id x y])],
@@ -929,6 +943,22 @@ AT_CHECK([RUN_OVS_VSCTL([remove bridge br1 flood-vlans 
true])],
 AT_CHECK([RUN_OVS_VSCTL([clear bridge br1 name])],
   [1], [], [ovs-vsctl: cannot modify read-only column name in table Bridge
 ])
+
+AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 -- set 
Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
+AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdevxx zone=1 icmp_first=1 
icmp_reply=2])],
+  [1], [], [ovs-vsctl: datapath netdevxx does not exist
+])
+AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2 
icmp_reply=3])])
+AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2 
icmp_reply=3])],
+  [1], [], [ovs-vsctl: zone id 2 already exists
+])
+AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout 
Policies: icmp_first=2 icmp_reply=3
+])
+AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=11])],
+  [1], [], [ovs-vsctl: zone id 11 does not exist
+])
+AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout 
Policies: icmp_first=2 icmp_reply=3
+])
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
index 7c09df79bd29..5b9883ae1c3d 100644
--- a/utilities/ovs-vsctl.8.in
+++ b/utilities/ovs-vsctl.8.in
@@ -353,6 +353,32 @@ list.
 Prints the name of the bridge that contains \fIiface\fR on standard
 output.
 .
+.SS "Conntrack Zone Commands"
+These commands query and modify datapath CT zones and Timeout Policies.
+.
+.IP "[\fB\-\-may\-exist\fR] \fBadd\-zone\-tp \fIdatapath \fBzone=\fIzone_id 
\fIpolicies\fR"
+Creates a conntrack zone timeout policy with \fIzone_id\fR in
+\fIdatapath\fR.  The \fIpolicies\fR consist of \fIkey\fB=\fIvalue\fR
+pairs, separated by spaces.  For example, \fBicmp_first=30
+icmp_reply=60\fR specifies a 30-second timeout policy for the first ICMP
+packet and a 60-second policy for ICMP reply packet.  See the
+\fBCT_Timeout_Policy\fR table in \fBovs-vswitchd.conf.db\fR(5) for the
+supported keys.
+.IP
+Without \fB\-\-may\-exist\fR, attempting to add a \fIzone_id\fR that
+already exists is an error.  With \fB\-\-may\-exist\fR,
+this command does nothing if \fIzone_id\fR is already created\fR.
+.
+.IP "[\fB\-\-if\-exists\fR] \fBdel\-zone\-tp \fIdatapath \fBzone=\fIzone_id\fR"
+Delete the timeout policy associated with \fIzone_id\fR from \fIdatapath\fR.
+.IP
+Without \fB\-\-if\-exists\fR, attempting to delete a zone that
+does not exist is an error.  With \fB\-\-if\-exists\fR, attempting to
+delete a zone that does not exist has no effect.
+.
+.IP 

[ovs-dev] [PATCH v4 1/9] ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.

2019-08-15 Thread Yi-Hung Wei
From: Justin Pettit 

Signed-off-by: Justin Pettit 
Signed-off-by: Yi-Hung Wei 
Co-authored-by: Yi-Hung Wei 
---
 vswitchd/vswitch.ovsschema |  51 -
 vswitchd/vswitch.xml   | 275 +
 2 files changed, 277 insertions(+), 49 deletions(-)

diff --git a/vswitchd/vswitch.ovsschema b/vswitchd/vswitch.ovsschema
index f7c6eb8983cd..c0a2242ad345 100644
--- a/vswitchd/vswitch.ovsschema
+++ b/vswitchd/vswitch.ovsschema
@@ -1,9 +1,14 @@
 {"name": "Open_vSwitch",
- "version": "8.0.0",
- "cksum": "3962141869 23978",
+ "version": "8.1.0",
+ "cksum": "1635647160 26090",
  "tables": {
"Open_vSwitch": {
  "columns": {
+   "datapaths": {
+ "type": {"key": {"type": "string"},
+  "value": {"type": "uuid",
+"refTable": "Datapath"},
+  "min": 0, "max": "unlimited"}},
"bridges": {
  "type": {"key": {"type": "uuid",
   "refTable": "Bridge"},
@@ -629,6 +634,48 @@
   "min": 0, "max": "unlimited"},
  "ephemeral": true}},
  "indexes": [["target"]]},
+   "Datapath": {
+ "columns": {
+   "datapath_version": {
+ "type": "string"},
+   "ct_zones": {
+ "type": {"key": {"type": "integer",
+  "minInteger": 0,
+  "maxInteger": 65535},
+  "value": {"type": "uuid",
+"refTable": "CT_Zone"},
+  "min": 0, "max": "unlimited"}},
+   "external_ids": {
+ "type": {"key": "string", "value": "string",
+  "min": 0, "max": "unlimited",
+   "CT_Zone": {
+ "columns": {
+   "timeout_policy": {
+ "type": {"key": {"type": "uuid",
+  "refTable": "CT_Timeout_Policy"},
+  "min": 0, "max": 1}},
+   "external_ids": {
+ "type": {"key": "string", "value": "string",
+  "min": 0, "max": "unlimited",
+   "CT_Timeout_Policy": {
+ "columns": {
+   "timeouts": {
+ "type": {"key": {"type" : "string",
+  "enum": ["set", ["tcp_syn_sent", "tcp_syn_recv",
+   "tcp_established", "tcp_fin_wait",
+   "tcp_close_wait", "tcp_last_ack",
+   "tcp_time_wait", "tcp_close",
+   "tcp_syn_sent2", "tcp_retransmit",
+   "tcp_unack", "udp_first",
+   "udp_single", "udp_multiple",
+   "icmp_first", "icmp_reply"]]},
+  "value": {"type" : "integer",
+"minInteger" : 0,
+"maxInteger" : 4294967295},
+  "min": 0, "max": "unlimited"}},
+   "external_ids": {
+ "type": {"key": "string", "value": "string",
+  "min": 0, "max": "unlimited",
"SSL": {
  "columns": {
"private_key": {
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 027aee2f523b..00e37c47c075 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -52,6 +52,13 @@
 one record in the  table.
 
 
+  
+Map of datapath types to datapaths.  The
+ column of the 
+table is used as a key for this map.  The value points to a row in
+the  table.
+  
+
   
 Set of bridges managed by the daemon.
   
@@ -1192,53 +1199,11 @@
   
 
   
-
-  Reports the version number of the Open vSwitch datapath in use.
-  This allows management software to detect and report discrepancies
-  between Open vSwitch userspace and datapath versions.  (The  column in the  reports the Open vSwitch userspace version.)
-  The version reported depends on the datapath in use:
-
-
-
-  
-When the kernel module included in the Open vSwitch source tree is
-used, this column reports the Open vSwitch version from which the
-module was taken.
-  
-
-  
-When the kernel module that is part of the upstream Linux kernel is
-used, this column reports unknown.
-  
-
-  
-When the datapath is built into the ovs-vswitchd
-binary, this column reports built-in.  A
-built-in datapath is by definition the same version as the rest of
-the Open VSwitch userspace.
-  
-
-  
-Other datapaths (such as the Hyper-V kernel datapath) currently
-report unknown.
-  
-
-
-
-  A version discrepancy between ovs-vswitchd and the
-  datapath in use is not normally cause for alarm.  The Open vSwitch
-  kernel datapaths for Linux and Hyper-V, in 

[ovs-dev] [PATCH v4 0/9] Support zone-based conntrack timeout policy

2019-08-15 Thread Yi-Hung Wei
This patch series enables zone-based conntrack timeout policy support in OVS.
Timeout policy is a set of timeout attributes that can be associated with a
connection when it is committed.  Then, the connection tracking system will
expire a connection based on its connection state.  For example, one use
case would be to extend the timeout of TCP connection in the established
state to avoid re-connect overhead. Other use case may be to shorten the
connection timeout so that the system can reclaim resources faster.
The idea of zone-based conntrack timeout policy is to group connections
with similar characteristics in a conntrack zone, and assign timeout policy
to the conntrack zone.  In this way, all the connections in that zone will share
the same timeout policy.

For zone-based timeout policy configuration, the association of conntrack
zone and conntrack timeout policy is defined per datapath in vswitchd ovsdb
schema.  User can program the database through ovs-vsctl or using ovsdb
protocol directly.  Once the zone-based timeout policy configuration is
in the database, vswitchd will read those configuration and organize it
in internal datapath structure, and push the timeout policy into datapath.
Currently, only the kernel datapath supports customized timeout policy.

When a packet is committed to connection tracking system, during flow
translation in ofproto-dpif-xlate, vswitchd will lookup the internal
data structure to figure out which timeout policy to associate with
the connection.  If timeout policy is not specified to the committed
zone, it defaults to the timeout policy in the default zone (zone 0).
If the timeout policy is not specified in the default zone, it defaults
to the system default timeouts.

Here are some more details about each patch
* p01: Introduce ovsdb schema for ct timeout policy.
* p02: ovs-vsctl commands to configure zone-based ct timeout policy.
* p03: Expose a utility functions.
* p04: dpif interface along with dpif-netlink implementation to support
   ct timeout policy.
* p05: Consume ct timeout policy configuration from ovsdb server,
   keep it in internal data structure, and push configuration to
   datapath.
* p06: Add utility function to help compare two simaps.
* p07-08: Kernel datapath support for the new ct action attribute.
* p09: Translate timeout policy in ofproto-dpif-xlate and system traffic test.

v3->v4:
* ofproto-dpif
- Probe datapath for timeout policy support.
- With the probing information only translate timeout policy when
  the datapath is supported.
- Resolve the old kernel compatibility issue reported by Darrell.
* system-traffic
- Simplify the testing script (diff from Darrell).
* Address various code changes as in the mailing list discussion.

v2->v3
* ovsdb schema
- Fold in changes from Justin.
- Make ct timeout policy key to be in a pre-defined set.
* ovs-vsctl
- Bug fixes.
* ct-dpif
- Fold in diff suggestion from Justin.
* bridge, ofproto-dpif
- Restruct the ofproto and dpif layer support for zone based timeout
  policy.
* system traffic test
- Fix bug reported by Darrell.
* Address review comments from Justin and Darrell.

v1->v2

* ovs-vsctl
- Remove add-dp,del-dp,list-dp ovs-vsctl commands.
- Add --may-exist and --if-exists to ovs-vsctl add-zone-tp command.
- Improve ovs-vsctl test.
* ct-dpif, dpif-netlink
- Remove support to change default timeout policy in the datapath.
- Squash ct-dpif and dpif-netlink layer implementation altogether.
- Address review comments from William.
* ofproto-dpif
- Remove changes from datapath-config module to ofproto-dpif layer.
- Maintain zone-based timeout policy in dpif-backer since this is
  per datapath type configuration. This will not break the OVS
  hierarchy as Ilya suggested.
- Allocate timeout policy id using id_pool instead of idl_seqno
  as Darrell suggested.
- Add a timeout policy sweep function that clean up unnecessary
  timeout policy regularly in the datapath.
* ofproto-dpif-xlate
- Only translate ct timeout policy if it is a ct commit action
  in kernel datapath.
* system-traffic test
- Update system traffic test with low level ovs-vsctl command.
- Make system traffic test to be datapath type agnostic.
- Improve system traffic test as Darrell suggested.
* Rebase to master

Ben Pfaff (1):
  simap: Add utility function to help compare two simaps.

Justin Pettit (1):
  ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.

William Tu (1):
  ovs-vsctl: Add conntrack zone commands.

Yi-Hung Wei (6):
  ct-dpif: Export ct_dpif_format_ipproto()
  ct-dpif, dpif-netlink: Add conntrack timeout policy support
  ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables
  datapath: compat: Backport nf_conntrack_timeout support
  datapath: Add support for conntrack timeout policy
  ofproto-dpif-xlate: Translate timeout policy in ct action

 Documentation/faq/releases.rst 

Re: [ovs-dev] [PATCH v3 2/9] ovs-vsctl: Add conntrack zone commands.

2019-08-15 Thread Darrell Ball
On Thu, Aug 15, 2019 at 10:53 AM William Tu  wrote:

> Thanks for the review.
>
> On Mon, Aug 12, 2019 at 09:54:42PM -0700, Darrell Ball wrote:
> > Thanks for the patch
> >
> > Thanks for the fixups; mostly minor comments inline.
> >
> > On Mon, Aug 12, 2019 at 5:53 PM Yi-Hung Wei 
> wrote:
> >
> > > From: William Tu 
> > >
> > > The patch adds commands creating/deleting/listing conntrack zone
> > > timeout policies:
> > >   $ ovs-vsctl {add,del,list}-zone-tp dp zone=zone_id ...
> > >
> > > Signed-off-by: William Tu 
> > > ---
> > >  tests/ovs-vsctl.at   |  34 -
> > >  utilities/ovs-vsctl.8.in |  26 +++
> > >  utilities/ovs-vsctl.c| 194
> > > +++
> > >  3 files changed, 252 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
> > > index 46fa3c5b1a33..df15fb6901a0 100644
> > > --- a/tests/ovs-vsctl.at
> > > +++ b/tests/ovs-vsctl.at
> > > @@ -805,6 +805,20 @@ AT_CHECK(
> > >[RUN_OVS_VSCTL([--if-exists remove netflow x targets '"1.2.3.4:567
> > > "'])])
> > >  AT_CHECK(
> > >[RUN_OVS_VSCTL([--if-exists clear netflow x targets])])
> > > +
> > > +AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath
> datapath_version=0 --
> > > set Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
> > > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=1 icmp_first=1
> > > icmp_reply=2])])
> > > +AT_CHECK([RUN_OVS_VSCTL([--may-exist add-zone-tp netdev zone=1
> > > icmp_first=1 icmp_reply=2])])
> > > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout
> > > Policies: icmp_first=1 icmp_reply=2
> > > +])
> > > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > > icmp_reply=3])])
> > > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout
> > > Policies: icmp_first=1 icmp_reply=2
> > > +Zone:2, Timeout Policies: icmp_first=2 icmp_reply=3
> > > +])
> > > +AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=1])])
> > > +AT_CHECK([RUN_OVS_VSCTL([--if-exists del-zone-tp netdev zone=1])])
> > > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout
> > > Policies: icmp_first=2 icmp_reply=3
> > > +])
> > >  OVS_VSCTL_CLEANUP
> > >  AT_CLEANUP
> > >
> > > @@ -890,10 +904,10 @@ AT_CHECK([RUN_OVS_VSCTL([set bridge br0
> > > flood_vlans=-1])],
> > >  AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=4096])],
> > >[1], [], [ovs-vsctl: constraint violation: 4096 is not in the valid
> > > range 0 to 4095 (inclusive)
> > >  ])
> > > -AT_CHECK([RUN_OVS_VSCTL([set c br1 'connection-mode=xyz'])],
> > > +AT_CHECK([RUN_OVS_VSCTL([set controller br1 'connection-mode=xyz'])],
> > >[1], [], [[ovs-vsctl: constraint violation: xyz is not one of the
> > > allowed values ([in-band, out-of-band])
> > >  ]])
> > > -AT_CHECK([RUN_OVS_VSCTL([set c br1 connection-mode:x=y])],
> > > +AT_CHECK([RUN_OVS_VSCTL([set controller br1 connection-mode:x=y])],
> > >[1], [], [ovs-vsctl: cannot specify key to set for non-map column
> > > connection_mode
> > >  ])
> > >  AT_CHECK([RUN_OVS_VSCTL([add bridge br1 datapath_id x y])],
> > > @@ -929,6 +943,22 @@ AT_CHECK([RUN_OVS_VSCTL([remove bridge br1
> > > flood-vlans true])],
> > >  AT_CHECK([RUN_OVS_VSCTL([clear bridge br1 name])],
> > >[1], [], [ovs-vsctl: cannot modify read-only column name in table
> Bridge
> > >  ])
> > > +
> > > +AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath
> datapath_version=0 --
> > > set Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
> > >
> >
> > If I execute
> >
> > AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 --
> > set Open_vSwitch . datapaths:"netdevvv"=@m])], [0], [stdout])
> >
> > it works, but there is no such datapath type 'netdevvv'
> >
> this is using database command, I don't think we should enforce it.
>

JTBC, there is nothing to do for this aspect in terms of the ovs-vsctl
command support.
I mentioned here for context, bcoz the tests are in this patch.
Below paragraph has more context, but don't worry about that either.


>
> > I think it would be better to enforce an enum here as well thru the
> schema,
> > as I mentioned in V2, since this handles errors better.
> > This is actually the same idea as enforcing enum for timeout keys that
> was
> > done for V3 to block bad timeout keys like "foo_bar=3"
> > I commented patch 1 anyways.
> >
> >
> > > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdevxx zone=1 icmp_first=1
> > > icmp_reply=2])],
> > > +  [1], [], [ovs-vsctl: datapath netdevxx does not exist
> > > +])
> > > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > > icmp_reply=3])])
> > > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > > icmp_reply=3])],
> > > +  [1], [], [ovs-vsctl: zone id 2 already exists
> > > +])
> > > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout
> > > Policies: icmp_first=2 icmp_reply=3
> > > +])
> > > +AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev 

[ovs-dev] [PATCH v2.12] OVN: Repair memory leak for OVN controller events.

2019-08-15 Thread Mark Michelson
Controller event action is leaking its genopts. This corrects the error.
---
 ovn/lib/actions.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index b0cb3490b..274297194 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -1760,8 +1760,9 @@ parse_trigger_event(struct action_context *ctx,
 }
 
 static void
-ovnact_controller_event_free(struct ovnact_controller_event *event OVS_UNUSED)
+ovnact_controller_event_free(struct ovnact_controller_event *event)
 {
+free_gen_options(event->options, event->n_options);
 }
 
 static void
-- 
2.14.5

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 2/9] ovs-vsctl: Add conntrack zone commands.

2019-08-15 Thread William Tu
Thanks for the review.

On Mon, Aug 12, 2019 at 09:54:42PM -0700, Darrell Ball wrote:
> Thanks for the patch
> 
> Thanks for the fixups; mostly minor comments inline.
> 
> On Mon, Aug 12, 2019 at 5:53 PM Yi-Hung Wei  wrote:
> 
> > From: William Tu 
> >
> > The patch adds commands creating/deleting/listing conntrack zone
> > timeout policies:
> >   $ ovs-vsctl {add,del,list}-zone-tp dp zone=zone_id ...
> >
> > Signed-off-by: William Tu 
> > ---
> >  tests/ovs-vsctl.at   |  34 -
> >  utilities/ovs-vsctl.8.in |  26 +++
> >  utilities/ovs-vsctl.c| 194
> > +++
> >  3 files changed, 252 insertions(+), 2 deletions(-)
> >
> > diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
> > index 46fa3c5b1a33..df15fb6901a0 100644
> > --- a/tests/ovs-vsctl.at
> > +++ b/tests/ovs-vsctl.at
> > @@ -805,6 +805,20 @@ AT_CHECK(
> >[RUN_OVS_VSCTL([--if-exists remove netflow x targets '"1.2.3.4:567
> > "'])])
> >  AT_CHECK(
> >[RUN_OVS_VSCTL([--if-exists clear netflow x targets])])
> > +
> > +AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 --
> > set Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
> > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=1 icmp_first=1
> > icmp_reply=2])])
> > +AT_CHECK([RUN_OVS_VSCTL([--may-exist add-zone-tp netdev zone=1
> > icmp_first=1 icmp_reply=2])])
> > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout
> > Policies: icmp_first=1 icmp_reply=2
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > icmp_reply=3])])
> > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:1, Timeout
> > Policies: icmp_first=1 icmp_reply=2
> > +Zone:2, Timeout Policies: icmp_first=2 icmp_reply=3
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=1])])
> > +AT_CHECK([RUN_OVS_VSCTL([--if-exists del-zone-tp netdev zone=1])])
> > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout
> > Policies: icmp_first=2 icmp_reply=3
> > +])
> >  OVS_VSCTL_CLEANUP
> >  AT_CLEANUP
> >
> > @@ -890,10 +904,10 @@ AT_CHECK([RUN_OVS_VSCTL([set bridge br0
> > flood_vlans=-1])],
> >  AT_CHECK([RUN_OVS_VSCTL([set bridge br0 flood_vlans=4096])],
> >[1], [], [ovs-vsctl: constraint violation: 4096 is not in the valid
> > range 0 to 4095 (inclusive)
> >  ])
> > -AT_CHECK([RUN_OVS_VSCTL([set c br1 'connection-mode=xyz'])],
> > +AT_CHECK([RUN_OVS_VSCTL([set controller br1 'connection-mode=xyz'])],
> >[1], [], [[ovs-vsctl: constraint violation: xyz is not one of the
> > allowed values ([in-band, out-of-band])
> >  ]])
> > -AT_CHECK([RUN_OVS_VSCTL([set c br1 connection-mode:x=y])],
> > +AT_CHECK([RUN_OVS_VSCTL([set controller br1 connection-mode:x=y])],
> >[1], [], [ovs-vsctl: cannot specify key to set for non-map column
> > connection_mode
> >  ])
> >  AT_CHECK([RUN_OVS_VSCTL([add bridge br1 datapath_id x y])],
> > @@ -929,6 +943,22 @@ AT_CHECK([RUN_OVS_VSCTL([remove bridge br1
> > flood-vlans true])],
> >  AT_CHECK([RUN_OVS_VSCTL([clear bridge br1 name])],
> >[1], [], [ovs-vsctl: cannot modify read-only column name in table Bridge
> >  ])
> > +
> > +AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 --
> > set Open_vSwitch . datapaths:"netdev"=@m])], [0], [stdout])
> >
> 
> If I execute
> 
> AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 --
> set Open_vSwitch . datapaths:"netdevvv"=@m])], [0], [stdout])
> 
> it works, but there is no such datapath type 'netdevvv'
> 
this is using database command, I don't think we should enforce it.

> I think it would be better to enforce an enum here as well thru the schema,
> as I mentioned in V2, since this handles errors better.
> This is actually the same idea as enforcing enum for timeout keys that was
> done for V3 to block bad timeout keys like "foo_bar=3"
> I commented patch 1 anyways.
> 
> 
> > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdevxx zone=1 icmp_first=1
> > icmp_reply=2])],
> > +  [1], [], [ovs-vsctl: datapath netdevxx does not exist
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > icmp_reply=3])])
> > +AT_CHECK([RUN_OVS_VSCTL([add-zone-tp netdev zone=2 icmp_first=2
> > icmp_reply=3])],
> > +  [1], [], [ovs-vsctl: zone id 2 already exists
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout
> > Policies: icmp_first=2 icmp_reply=3
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=11])],
> > +  [1], [], [ovs-vsctl: zone id 11 does not exist
> > +])
> > +AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout
> > Policies: icmp_first=2 icmp_reply=3
> > +])
> >  OVS_VSCTL_CLEANUP
> >  AT_CLEANUP
> >
> > diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
> > index 7c09df79bd29..5b9883ae1c3d 100644
> > --- a/utilities/ovs-vsctl.8.in
> > +++ b/utilities/ovs-vsctl.8.in
> > @@ -353,6 +353,32 @@ list.
> >  Prints the name of the bridge that contains 

[ovs-dev] [PATCH v1] datapath-windows: Fix updating ct label when mask is specified

2019-08-15 Thread Anand Kumar via dev
From: kumaranand 

When an existing label needs to be changed by specifing bits to be
updated using mask, instead of updating only the masked bits,
new label was getting overridden. This patch fixes this issue.

Signed-off-by: Anand Kumar 
---
 datapath-windows/ovsext/Conntrack.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index bc00b60..ba56116 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -807,6 +807,7 @@ OvsConntrackSetLabels(OvsFlowKey *key,
 ovs_u128 v, m, pktMdLabel = {0};
 memcpy(, val, sizeof v);
 memcpy(, mask, sizeof m);
+memcpy(, >labels, sizeof(struct ovs_key_ct_labels));
 
 pktMdLabel.u64.lo = v.u64.lo | (pktMdLabel.u64.lo & ~(m.u64.lo));
 pktMdLabel.u64.hi = v.u64.hi | (pktMdLabel.u64.hi & ~(m.u64.hi));
-- 
2.9.3.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] travis: Build time optimization.

2019-08-15 Thread Ilya Maximets
On 06.08.2019 17:36, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> These are low-hanging optimizations for TravisCI that allows
>> to speed up build without changing the test scope.
>>
>> On 'xenial' images this patch set gives ~30% build time improvement.
>> Example:
>>
>>   Before:
>> Ran for 1 hr 22 min 26 sec
>> Total time 5 hrs 40 min 56 sec
>>
>>   After (second run after the cache population):
>> Ran for 55 min 42 sec
>> Total time 4 hrs 21 min 14 sec
>>
>>   Saved:
>> Run time: ~27 minutes
>> Total time: ~1 hour 20 minutes
>>
>> Ilya Maximets (3):
>>   travis: Cache DPDK build.
>>   travis: Combine kernel builds.
>>   travis: Drop OSX workarounds.
>>
>>  .travis.yml| 22 +
>>  .travis/linux-build.sh | 74 +-
>>  .travis/osx-prepare.sh |  3 --
>>  3 files changed, 67 insertions(+), 32 deletions(-)
> 
> I'm always in favor of saving resources. :)
> 
> It doesn't seem to break anything existing.
> 
> Acked-by: Aaron Conole 

Thanks! Applied to master.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] 株式会社オクト お問い合わせ確認メール(自動返信)

2019-08-15 Thread 株式会社オクト
--
※このメールは「株式会社オクト」
お問い合わせフォームからの自動返信メールです。
--

WalterLek WalterLek 様

お問い合わせありがとうございます。
以下の内容でお問い合わせを承りました。

ご返信は、1週間以内とさせていただいておりますが、
混み合っている場合、又はお問い合わせの内容によりましては、
それ以上のお時間を頂く場合や、ご返信ができない場合もございます。
予めご了承ください。

メール以外でのご連絡をご希望の方は、以下の電話番号をご利用ください。
tel:078-306-0408(平日 9:00〜17:00)

--

■貴社名
google

■お名前
WalterLek WalterLek 様

■電話番号
84867285371

■メールアドレス
ovs-...@mail.linuxfoundation.org

■お問い合わせ内容
Meet a beautiful woman for sex right now: 
http://pluscombackgen.tk/cy8m?=x0l5UoWJsL1

--
株式会社オクト
〒650-0047
神戸市中央区港島南町5-5-2 KIBC南館253

URL:www.octinc.jp
TEL: 078-306-0408
FAX: 078-306-0406
E-Mail: oct...@myad.jp
--

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] be able to tune revalidator timing

2019-08-15 Thread Roi Dayan



On 2019-08-06 7:49 AM, Roi Dayan wrote:
> 
> 
> On 2019-07-22 8:10 PM, Ben Pfaff wrote:
>> On Sun, Jul 21, 2019 at 11:34:20AM +0300, Roi Dayan wrote:
>>> The following series expose configuration options to
>>> be able to tune revalidator timing to different setups
>>> at runtime instead of using hard coded values.
>>
>> These seem fine.
>>
>> I'd like to wait until we've branched for 2.12 before applying them.
>>
> 
> Hi Ben,
> 
> I see there is now a branch for 2.12.
> can you take these commits now?
> 
> Thanks,
> Roi
> 

Hi,

just pinging here.

Thanks,
Roi
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] OVN: Fix learning of neighbors from ARP/ND packets.

2019-08-15 Thread Han Zhou
On Wed, Aug 14, 2019 at 11:11 PM Dumitru Ceara  wrote:
>
> On Thu, Aug 8, 2019 at 1:52 AM Han Zhou  wrote:
> >
> >
> >
> > On Wed, Aug 7, 2019 at 8:12 AM Dumitru Ceara  wrote:
> >>
> >> On Mon, Aug 5, 2019 at 5:34 PM Han Zhou  wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Aug 5, 2019 at 1:06 AM Dumitru Ceara 
wrote:
> >> > >
> >> > > On Mon, Aug 5, 2019 at 1:41 AM Han Zhou  wrote:
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Mon, Jul 29, 2019 at 2:30 AM Dumitru Ceara 
wrote:
> >> > > > >
> >> > > > > Add a restriction on the target protocol addresses to match the
> >> > > > > configured subnets. All other ARP/ND packets are invalid in
this
> >> > > > > context.
> >> > > > >
> >> > > > > One exception is for ARP replies that are received for route
next-hops
> >> > > > > that are only reachable via a port but can't be directly
resolved
> >> > > > > through route lookups. Such support was introduced by commit:
> >> > > > >
> >> > > > > 6b785fd8fe29 ("ovn-util: Allow /32 IP addresses for router
ports.")
> >> > > > >
> >> > > > > Reported-at: https://bugzilla.redhat.com/1729846
> >> > > > > Reported-by: Haidong Li 
> >> > > > > CC: Han Zhou 
> >> > > > > CC: Guru Shetty 
> >> > > > > Fixes: b068454082f5 ("ovn-northd: Support learning neighbor
from ARP request.")
> >> > > > > Signed-off-by: Dumitru Ceara 
> >> > > >
> >> > > > Hi Dumitru,
> >> > >
> >> > > Hi Han,
> >> > >
> >> > > Thanks for reviewing this.
> >> > >
> >> > > >
> >> > > > Sorry for my slow response, and thanks a lot for revising the
patch for a bigger scope of validations. However, the exception of /32
address makes me thinking more about this patch. If ARP replies is allowed
from any nexthop for a LR port with /32, at least ARP request for GARP
purpose should also be allowed. But before asking for a v3, I'd hold on to
rethink the purpose of this patch.
> >> > >
> >> > > Right, we should allow GARP requests too. If we decide to go ahead
> >> > > with this patch I'll add a function in v3 to handle all types of
ARPs
> >> > > and call it both for unreachable static route next-hops and
attached
> >> > > subnets.
> >> > >
> >> > > >
> >> > > > The nexthop specific flows are now from static routes. What if
OVN support dynamical routing protocols in the future? Of course we can
then take those dynamic nexthops into allowed peers. But then I am thinking
what is the real benefit of all these restrictions? Why can't we just have
simpler logic to handle all these situations without validation? I think
the major benefit of the validation is to avoid handling the noise ARP/NDs
when multiple subnets shares same L2, but most cases it is really not a big
deal, right? For the CPU problem caused by ARP flooding as mentioned by the
bug report, it is a real problem, but this patch seems not really helpful
for that, because the attacker can just trigger the same CPU problem with
*valid* packets. So I am not sure if the benefit of the change is worth the
complexity it introduced. Please share your thought and correct me if I
missed something.
> >> > > >
> >> > > > Thanks,
> >> > > > Han
> >> > >
> >> > > I assume the simpler logic to handle all these situations without
> >> > > validation is to add rate limiting for ARP packets that get punted
to
> >> > > the controller. I agree that this should be implemented too.
> >> > >
> >> > > But I think rate limiting all ARP packets regardless of IP
addresses
> >> > > is not enough. In the following scenario and if we would have a
way to
> >> > > rate limit ARP packets:
> >> > > - Subnet 42.42.42.0/24 configured on the OVN
> >> > > - "Invalid" ARP packets are injected at high rate in the network
for 41.41.41.41
> >> > > - Host 42.42.42.42 sends GARP.
> >> > > - Rate limiting of ARP packets towards controller at 100pps
> >> > >
> >> > > With the current code, ARP packets for 41.41.41.41 will be punted
to
> >> > > controller at a rate of at most 100 per second. But the chances
that
> >> > > the valid 42.42.42.42 GARP is dropped is really high.
> >> > >
> >> > > If we instead use the following approach:
> >> > > a. Punt only useful ARPs to controller.
> >> > > b. Rate limit ARPs that are sent to controller.
> >> > >
> >> > > Then ARP packets outside 42.42.42./24 are never punted to the
> >> > > controller and don't consume any rate limiting tokens. For the
second
> >> > > case, when an attacker would flood with valid ARP packets we would
> >> > > have the rate limit in place to protect the controller CPU.
> >> > >
> >> > > My commit addresses point "a" above as I think point "b" should be
> >> > > done in a generic way for all protocol packets that need to reach
the
> >> > > controller.
> >> > >
> >> > > For dynamic routing protocols on the other hand I think we should
not
> >> > > install routes with next-hops that are unreachable through other
> >> > > direct or indirect routes.
> >> > >
> >> > > Thanks,
> >> > > Dumitru
> >> >
> >> > I agree that blindly ARP rate limit is not helpful, but a) is not
really helpful in this case 

Re: [ovs-dev] [PATCH ovn v2] OVN: Fix learning of neighbors from ARP/ND packets.

2019-08-15 Thread Dumitru Ceara
On Thu, Aug 8, 2019 at 1:52 AM Han Zhou  wrote:
>
>
>
> On Wed, Aug 7, 2019 at 8:12 AM Dumitru Ceara  wrote:
>>
>> On Mon, Aug 5, 2019 at 5:34 PM Han Zhou  wrote:
>> >
>> >
>> >
>> > On Mon, Aug 5, 2019 at 1:06 AM Dumitru Ceara  wrote:
>> > >
>> > > On Mon, Aug 5, 2019 at 1:41 AM Han Zhou  wrote:
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Jul 29, 2019 at 2:30 AM Dumitru Ceara  
>> > > > wrote:
>> > > > >
>> > > > > Add a restriction on the target protocol addresses to match the
>> > > > > configured subnets. All other ARP/ND packets are invalid in this
>> > > > > context.
>> > > > >
>> > > > > One exception is for ARP replies that are received for route 
>> > > > > next-hops
>> > > > > that are only reachable via a port but can't be directly resolved
>> > > > > through route lookups. Such support was introduced by commit:
>> > > > >
>> > > > > 6b785fd8fe29 ("ovn-util: Allow /32 IP addresses for router ports.")
>> > > > >
>> > > > > Reported-at: https://bugzilla.redhat.com/1729846
>> > > > > Reported-by: Haidong Li 
>> > > > > CC: Han Zhou 
>> > > > > CC: Guru Shetty 
>> > > > > Fixes: b068454082f5 ("ovn-northd: Support learning neighbor from ARP 
>> > > > > request.")
>> > > > > Signed-off-by: Dumitru Ceara 
>> > > >
>> > > > Hi Dumitru,
>> > >
>> > > Hi Han,
>> > >
>> > > Thanks for reviewing this.
>> > >
>> > > >
>> > > > Sorry for my slow response, and thanks a lot for revising the patch 
>> > > > for a bigger scope of validations. However, the exception of /32 
>> > > > address makes me thinking more about this patch. If ARP replies is 
>> > > > allowed from any nexthop for a LR port with /32, at least ARP request 
>> > > > for GARP purpose should also be allowed. But before asking for a v3, 
>> > > > I'd hold on to rethink the purpose of this patch.
>> > >
>> > > Right, we should allow GARP requests too. If we decide to go ahead
>> > > with this patch I'll add a function in v3 to handle all types of ARPs
>> > > and call it both for unreachable static route next-hops and attached
>> > > subnets.
>> > >
>> > > >
>> > > > The nexthop specific flows are now from static routes. What if OVN 
>> > > > support dynamical routing protocols in the future? Of course we can 
>> > > > then take those dynamic nexthops into allowed peers. But then I am 
>> > > > thinking what is the real benefit of all these restrictions? Why can't 
>> > > > we just have simpler logic to handle all these situations without 
>> > > > validation? I think the major benefit of the validation is to avoid 
>> > > > handling the noise ARP/NDs when multiple subnets shares same L2, but 
>> > > > most cases it is really not a big deal, right? For the CPU problem 
>> > > > caused by ARP flooding as mentioned by the bug report, it is a real 
>> > > > problem, but this patch seems not really helpful for that, because the 
>> > > > attacker can just trigger the same CPU problem with *valid* packets. 
>> > > > So I am not sure if the benefit of the change is worth the complexity 
>> > > > it introduced. Please share your thought and correct me if I missed 
>> > > > something.
>> > > >
>> > > > Thanks,
>> > > > Han
>> > >
>> > > I assume the simpler logic to handle all these situations without
>> > > validation is to add rate limiting for ARP packets that get punted to
>> > > the controller. I agree that this should be implemented too.
>> > >
>> > > But I think rate limiting all ARP packets regardless of IP addresses
>> > > is not enough. In the following scenario and if we would have a way to
>> > > rate limit ARP packets:
>> > > - Subnet 42.42.42.0/24 configured on the OVN
>> > > - "Invalid" ARP packets are injected at high rate in the network for 
>> > > 41.41.41.41
>> > > - Host 42.42.42.42 sends GARP.
>> > > - Rate limiting of ARP packets towards controller at 100pps
>> > >
>> > > With the current code, ARP packets for 41.41.41.41 will be punted to
>> > > controller at a rate of at most 100 per second. But the chances that
>> > > the valid 42.42.42.42 GARP is dropped is really high.
>> > >
>> > > If we instead use the following approach:
>> > > a. Punt only useful ARPs to controller.
>> > > b. Rate limit ARPs that are sent to controller.
>> > >
>> > > Then ARP packets outside 42.42.42./24 are never punted to the
>> > > controller and don't consume any rate limiting tokens. For the second
>> > > case, when an attacker would flood with valid ARP packets we would
>> > > have the rate limit in place to protect the controller CPU.
>> > >
>> > > My commit addresses point "a" above as I think point "b" should be
>> > > done in a generic way for all protocol packets that need to reach the
>> > > controller.
>> > >
>> > > For dynamic routing protocols on the other hand I think we should not
>> > > install routes with next-hops that are unreachable through other
>> > > direct or indirect routes.
>> > >
>> > > Thanks,
>> > > Dumitru
>> >
>> > I agree that blindly ARP rate limit is not helpful, but a) is not really 
>> > helpful in this case either. In