Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function chaining

2017-03-15 Thread Mickey Spiegel
On Mon, Mar 13, 2017 at 1:28 PM, John McDowall <
jmcdow...@paloaltonetworks.com> wrote:

> This patch set is an alternative implementation of service function
> chaining (SFC) for OVS/OVN. The major change from the previous patch is
> that the overloading of the ACL stage in ovn-northd.c has been removed
> and replaced with additional logic in the CHAIN stage.
>
> This was done to improve modularity of the code as it was felt that
> overloading the ACL stage did not add a lot and made it hard to compose
> reusable SFCs.
>
> The new approach can still use the matching logic in OVN as a match
> argument has been added. This has not been fully implemented yet as it
> may need some error checking to ensure the match does not conflicit
> with the lport in the chain and the bi-directional case.
>
> The other major change is the logic for the final destination of the
> flows exiting the service chain. This has been simplified such that the
> rules in the chain stage determine when the flow is exiting the port-chain
> and then the flow just follows the normal path to the src or dst of the
> flow.
>
> Areas to review
>
> 1) Logic for delivering flow after it leaves the chain. I think it is now
> general and should work across subnets etc.
>

This revision does address one of my concerns with the previous revision,
specifically the previous proposal to specify a 'last_hop_port' at the end
of the logical port chain.

However, it does not address my other major concern about the behavior at
the end of the service chain, the interaction with the way OVN forwards
traffic to the outside world.


Repeating my comments from the last revision:

In OVN without SFC, traffic is forwarded through a logical switch to a
logical switch port of type 'localnet' that effectively resides on all
hypervisors, that then forwards to the physical L2 network.  When a VM
residing on that same logical switch originates traffic destined for the
outside world, it is directed to the local instance of the type 'localnet'
logical switch port on the VM's hypervisor.  The physical L2 network learns
that the VM's MAC address resides on that hypervisor, and forwards all
traffic to that MAC address to that specific hypervisor.  Typically the
physical L2 network first learns the VM's MAC address due to a gratuitous
ARP packet sent by the ovn-controller on the VM's hypervisor.

In this SFC proposal, at the end of the logical port chain, traffic is
forwarded directly to the destination.  This might work for destinations
attached directly to OVN, but it would break forwarding to the physical L2
network:

1. If the last port pair group in the logical port chain contains more than
one port pair spread across more than one chassis, then this would
completely break upstream L2 learning.  Traffic would have to be forwarded
to a common point before being sent upstream.

2. What if the SFC classifier granularity is finer than per source VM, and
the multiple logical port chains (that one VM's traffic can be redirected
to) end at different chassis?

3. The gratuitous ARP packets generated by OVN for VIFs residing on the
same logical switch as the 'localnet' port are sent from each VIF's
chassis.  This may be different than the chassis at the end of the logical
port chain.

There are some topologies where this SFC proposal will work even for
traffic destined for the outside world: if there are no VIFs on the same
logical switch as the 'localnet' port, and either a gateway router is used
or a distributed router with only centralized NAT rules is used.  If a
distributed router is specified with distributed NAT rules, then point 3
above gets even worse, since ARP replies for a distributed NAT external IP
address are restricted to the chassis where the corresponding VIF resides.

My preference is to return to the originating hypervisor at the end of the
logical port chain.  With such an approach, L2 and L3 forwarding are
unaffected.
However, there is a question how to determine from a packet at the end of
the logical port chain, what is the originating hypervisor?
The best answer is to use an encapsulation such as Geneve or NSH that can
carry context information.
Another alternative is to do a lookup of the source address when the chain
was entered in the "exit-lport" direction.


An example to clarify the problem:

Single Logical Switch LS1.
VM1 on HV1 with MAC1, IP1.
Logical Port Chain LPC1 with one Logical Port Pair Group LPPG1.
LPPG1 has two Logical Port Pairs LPP1 and LPP2, residing on HV2 and HV3
respectively.

When VM1 comes up, HV1 will send a gratuitous ARP packet out the localnet
interface on HV1 to the physical L2 network, using MAC1 and IP1.
The physical L2 network will learn that MAC1 is on HV1.

VM1 sends a packet to the outside world.
The packet gets redirected down LPC1 to LPP1 on HV2. Using the logic below,
the packet will go through a normal L2 destination lookup on HV2, so it
will be sent out the localnet interface on HV2 to the physical L2 network.

[ovs-dev] [PATCH branch-2.7 25/25] lib: Indicate if netlink message had labels.

2017-03-15 Thread Jarno Rajahalme
Conntrack update events include labels only if they have changed.
Record the presence of labels in the netlink message to OVS internal
representation, so that the user may keep the old labels when an
update does not modify them.

Fixes: 6830a0c0e6bf ("netlink-conntrack: New module.")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 lib/ct-dpif.h   | 1 +
 lib/netlink-conntrack.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index 5da3c2c..e8e159a 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -163,6 +163,7 @@ struct ct_dpif_entry {
 struct ct_dpif_protoinfo protoinfo;
 
 ovs_u128 labels;
+bool have_labels;
 uint32_t status;
 /* Timeout for this entry in seconds */
 uint32_t timeout;
diff --git a/lib/netlink-conntrack.c b/lib/netlink-conntrack.c
index aab5b1f..8b82db2 100644
--- a/lib/netlink-conntrack.c
+++ b/lib/netlink-conntrack.c
@@ -780,6 +780,7 @@ nl_ct_attrs_to_ct_dpif_entry(struct ct_dpif_entry *entry,
 entry->mark = ntohl(nl_attr_get_be32(attrs[CTA_MARK]));
 }
 if (attrs[CTA_LABELS]) {
+entry->have_labels = true;
 memcpy(>labels, nl_attr_get(attrs[CTA_LABELS]),
MIN(sizeof entry->labels, nl_attr_get_size(attrs[CTA_LABELS])));
 }
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 19/25] datapath: Simplify labels length logic.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit b87cec3814ccc7f6afb0a1378ee7e5110d07cdd3
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:56 2017 -0800

openvswitch: Simplify labels length logic.

Since 23014011ba42 ("netfilter: conntrack: support a fixed size of 128
distinct labels"), the size of conntrack labels extension has fixed to
128 bits, so we do not need to check for labels sizes shorter than 128
at run-time.  This patch simplifies labels length logic accordingly,
but allows the conntrack labels size to be increased in the future
without breaking the build.  In the event of conntrack labels
increasing in size OVS would still be able to deal with the 128 first
label bits.

Suggested-by: Joe Stringer 
Signed-off-by: Jarno Rajahalme 
Acked-by: Pravin B Shelar 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index b5c80be..2d095b8 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -145,22 +145,20 @@ static size_t ovs_ct_get_labels_len(struct nf_conn_labels 
*cl)
 #endif
 }
 
+/* Guard against conntrack labels max size shrinking below 128 bits. */
+#if NF_CT_LABELS_MAX_SIZE < 16
+#error NF_CT_LABELS_MAX_SIZE must be at least 16 bytes
+#endif
+
 static void ovs_ct_get_labels(const struct nf_conn *ct,
  struct ovs_key_ct_labels *labels)
 {
struct nf_conn_labels *cl = ct ? nf_ct_labels_find(ct) : NULL;
 
-   if (cl) {
-   size_t len = ovs_ct_get_labels_len(cl);
-
-   if (len > OVS_CT_LABELS_LEN)
-   len = OVS_CT_LABELS_LEN;
-   else if (len < OVS_CT_LABELS_LEN)
-   memset(labels, 0, OVS_CT_LABELS_LEN);
-   memcpy(labels, cl->bits, len);
-   } else {
+   if (cl)
+   memcpy(labels, cl->bits, OVS_CT_LABELS_LEN);
+   else
memset(labels, 0, OVS_CT_LABELS_LEN);
-   }
 }
 
 static void __ovs_ct_update_key(struct sw_flow_key *key, u8 state,
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 23/25] nx-match: Fix oxm decode.

2017-03-15 Thread Jarno Rajahalme
From: Yi-Hung Wei 

decode_nx_packet_in2() may be used by the switch to parse NXT_RESUME messages,
where we need exact match on the oxm header. Therefore, change
oxm_decode_loose() to oxm_decode() that takes an extra argument to indicate 
whether
we want strict or loose match.

Fixes: 7befb20d0f70 ("ofp-util: Ignore unknown fields in 
ofputil_decode_packet_in2()")
Signed-off-by: Yi-Hung Wei 
Signed-off-by: Joe Stringer 
Signed-off-by: Jarno Rajahalme 
---
 lib/nx-match.c | 10 +-
 lib/nx-match.h |  4 ++--
 lib/ofp-util.c |  5 ++---
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/lib/nx-match.c b/lib/nx-match.c
index 2e62e99..43672cb 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -683,14 +683,14 @@ oxm_pull_match_loose(struct ofpbuf *b, const struct 
tun_table *tun_table,
  *
  * Returns 0 if successful, otherwise an OpenFlow error code.
  *
- * Encountering unknown OXM headers or missing field prerequisites are not
- * considered as error conditions.
+ * If 'loose' is true, encountering unknown OXM headers or missing field
+ * prerequisites are not considered as error conditions.
  */
 enum ofperr
-oxm_decode_match_loose(const void *oxm, size_t oxm_len,
-   const struct tun_table *tun_table, struct match *match)
+oxm_decode_match(const void *oxm, size_t oxm_len, bool loose,
+ const struct tun_table *tun_table, struct match *match)
 {
-return nx_pull_raw(oxm, oxm_len, false, match, NULL, NULL, tun_table);
+return nx_pull_raw(oxm, oxm_len, !loose, match, NULL, NULL, tun_table);
 }
 
 /* Verify an array of OXM TLVs treating value of each TLV as a mask,
diff --git a/lib/nx-match.h b/lib/nx-match.h
index cee9e65..e103dd5 100644
--- a/lib/nx-match.h
+++ b/lib/nx-match.h
@@ -61,8 +61,8 @@ enum ofperr oxm_pull_match(struct ofpbuf *, const struct 
tun_table *,
struct match *);
 enum ofperr oxm_pull_match_loose(struct ofpbuf *, const struct tun_table *,
  struct match *);
-enum ofperr oxm_decode_match_loose(const void *, size_t,
-   const struct tun_table *, struct match *);
+enum ofperr oxm_decode_match(const void *, size_t, bool,
+ const struct tun_table *, struct match *);
 enum ofperr oxm_pull_field_array(const void *, size_t fields_len,
  struct field_array *);
 
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 9e8d4d2..d315337 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3397,9 +3397,8 @@ decode_nx_packet_in2(const struct ofp_header *oh, bool 
loose,
 }
 
 case NXPINT_METADATA:
-error = oxm_decode_match_loose(payload.msg,
-   ofpbuf_msgsize(),
-   tun_table, >flow_metadata);
+error = oxm_decode_match(payload.msg, ofpbuf_msgsize(),
+ loose, tun_table, >flow_metadata);
 break;
 
 case NXPINT_USERDATA:
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 22/25] ofp-util: Ignore unknown fields in ofputil_decode_packet_in2().

2017-03-15 Thread Jarno Rajahalme
The decoder of packet_in messages should not fail on encountering
unknown metadata fields.  This allows the switch to add new features
without breaking controllers.  The controllers should, however, copy
the metadata fields from the packet_int to packet_out so that the
switch gets back the full metadata.  OVN is already doing this.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 lib/nx-match.c | 25 -
 lib/nx-match.h |  4 ++--
 lib/ofp-util.c |  5 +++--
 3 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/lib/nx-match.c b/lib/nx-match.c
index 91401e2..2e62e99 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -504,6 +504,9 @@ nx_pull_match_entry(struct ofpbuf *b, bool allow_cookie,
 return 0;
 }
 
+/* Prerequisites will only be checked when 'strict' is 'true'.  This allows
+ * decoding conntrack original direction 5-tuple IP addresses without the
+ * ethertype being present, when decoding metadata only. */
 static enum ofperr
 nx_pull_raw(const uint8_t *p, unsigned int match_len, bool strict,
 struct match *match, ovs_be64 *cookie, ovs_be64 *cookie_mask,
@@ -539,7 +542,7 @@ nx_pull_raw(const uint8_t *p, unsigned int match_len, bool 
strict,
 *cookie = value.be64;
 *cookie_mask = mask.be64;
 }
-} else if (!mf_are_prereqs_ok(field, >flow, NULL)) {
+} else if (strict && !mf_are_prereqs_ok(field, >flow, NULL)) {
 error = OFPERR_OFPBMC_BAD_PREREQ;
 } else if (!mf_is_all_wild(field, >wc)) {
 error = OFPERR_OFPBMC_DUP_FIELD;
@@ -607,7 +610,8 @@ nx_pull_match(struct ofpbuf *b, unsigned int match_len, 
struct match *match,
 }
 
 /* Behaves the same as nx_pull_match(), but skips over unknown NXM headers,
- * instead of failing with an error. */
+ * instead of failing with an error, and does not check for field
+ * prerequisities. */
 enum ofperr
 nx_pull_match_loose(struct ofpbuf *b, unsigned int match_len,
 struct match *match,
@@ -664,8 +668,9 @@ oxm_pull_match(struct ofpbuf *b, const struct tun_table 
*tun_table,
 return oxm_pull_match__(b, true, tun_table, match);
 }
 
-/* Behaves the same as oxm_pull_match() with one exception.  Skips over unknown
- * OXM headers instead of failing with an error when they are encountered. */
+/* Behaves the same as oxm_pull_match() with two exceptions.  Skips over
+ * unknown OXM headers instead of failing with an error when they are
+ * encountered, and does not check for field prerequisities. */
 enum ofperr
 oxm_pull_match_loose(struct ofpbuf *b, const struct tun_table *tun_table,
  struct match *match)
@@ -676,14 +681,16 @@ oxm_pull_match_loose(struct ofpbuf *b, const struct 
tun_table *tun_table,
 /* Parses the OXM match description in the 'oxm_len' bytes in 'oxm'.  Stores
  * the result in 'match'.
  *
- * Fails with an error when encountering unknown OXM headers.
+ * Returns 0 if successful, otherwise an OpenFlow error code.
  *
- * Returns 0 if successful, otherwise an OpenFlow error code. */
+ * Encountering unknown OXM headers or missing field prerequisites are not
+ * considered as error conditions.
+ */
 enum ofperr
-oxm_decode_match(const void *oxm, size_t oxm_len,
- const struct tun_table *tun_table, struct match *match)
+oxm_decode_match_loose(const void *oxm, size_t oxm_len,
+   const struct tun_table *tun_table, struct match *match)
 {
-return nx_pull_raw(oxm, oxm_len, true, match, NULL, NULL, tun_table);
+return nx_pull_raw(oxm, oxm_len, false, match, NULL, NULL, tun_table);
 }
 
 /* Verify an array of OXM TLVs treating value of each TLV as a mask,
diff --git a/lib/nx-match.h b/lib/nx-match.h
index 5dca24a..cee9e65 100644
--- a/lib/nx-match.h
+++ b/lib/nx-match.h
@@ -61,8 +61,8 @@ enum ofperr oxm_pull_match(struct ofpbuf *, const struct 
tun_table *,
struct match *);
 enum ofperr oxm_pull_match_loose(struct ofpbuf *, const struct tun_table *,
  struct match *);
-enum ofperr oxm_decode_match(const void *, size_t, const struct tun_table *,
- struct match *);
+enum ofperr oxm_decode_match_loose(const void *, size_t,
+   const struct tun_table *, struct match *);
 enum ofperr oxm_pull_field_array(const void *, size_t fields_len,
  struct field_array *);
 
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 0c9343e..9e8d4d2 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3397,8 +3397,9 @@ decode_nx_packet_in2(const struct ofp_header *oh, bool 
loose,
 }
 
 case NXPINT_METADATA:
-error = oxm_decode_match(payload.msg, ofpbuf_msgsize(),
- tun_table, >flow_metadata);
+error = oxm_decode_match_loose(payload.msg,
+   

[ovs-dev] [PATCH branch-2.7 21/25] datapath: Inherit master's labels.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit 09aa98ad496d6b11a698b258bc64d7f64c55d682
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:58 2017 -0800

openvswitch: Inherit master's labels.

We avoid calling into nf_conntrack_in() for expected connections, as
that would remove the expectation that we want to stick around until
we are ready to commit the connection.  Instead, we do a lookup in the
expectation table directly.  However, after a successful expectation
lookup we have set the flow key label field from the master
connection, whereas nf_conntrack_in() does not do this.  This leads to
master's labels being inherited after an expectation lookup, but those
labels not being inherited after the corresponding conntrack action
with a commit flag.

This patch resolves the problem by changing the commit code path to
also inherit the master's labels to the expected connection.
Resolving this conflict in favor of inheriting the labels allows more
information be passed from the master connection to related
connections, which would otherwise be much harder if the 32 bits in
the connmark are not enough.  Labels can still be set explicitly, so
this change only affects the default values of the labels in presense
of a master connection.

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Jarno Rajahalme 
Acked-by: Pravin B Shelar 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Fixes: a94ebc39996b ("datapath: Add conntrack action")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c | 45 +++--
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index a56fe07..9428eb2 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -80,6 +80,8 @@ struct ovs_conntrack_info {
 #endif
 };
 
+static bool labels_nonzero(const struct ovs_key_ct_labels *labels);
+
 static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info);
 
 static u16 key_to_nfproto(const struct sw_flow_key *key)
@@ -275,18 +277,32 @@ static int ovs_ct_init_labels(struct nf_conn *ct, struct 
sw_flow_key *key,
  const struct ovs_key_ct_labels *labels,
  const struct ovs_key_ct_labels *mask)
 {
-   struct nf_conn_labels *cl;
-   u32 *dst;
-   int i;
+   struct nf_conn_labels *cl, *master_cl;
+   bool have_mask = labels_nonzero(mask);
+
+   /* Inherit master's labels to the related connection? */
+   master_cl = ct->master ? nf_ct_labels_find(ct->master) : NULL;
+
+   if (!master_cl && !have_mask)
+   return 0;   /* Nothing to do. */
 
cl = ovs_ct_get_conn_labels(ct);
if (!cl)
return -ENOSPC;
 
-   dst = (u32 *)cl->bits;
-   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
-   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
-   (labels->ct_labels_32[i] & mask->ct_labels_32[i]);
+   /* Inherit the master's labels, if any. */
+   if (master_cl)
+   *cl = *master_cl;
+
+   if (have_mask) {
+   u32 *dst = (u32 *)cl->bits;
+   int i;
+
+   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
+   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
+   (labels->ct_labels_32[i]
+& mask->ct_labels_32[i]);
+   }
 
/* Labels are included in the IPCTNL_MSG_CT_NEW event only if the
 * IPCT_LABEL bit it set in the event cache.
@@ -943,13 +959,14 @@ static int ovs_ct_commit(struct net *net, struct 
sw_flow_key *key,
if (err)
return err;
}
-   if (labels_nonzero(>labels.mask)) {
-   if (!nf_ct_is_confirmed(ct))
-   err = ovs_ct_init_labels(ct, key, >labels.value,
->labels.mask);
-   else
-   err = ovs_ct_set_labels(ct, key, >labels.value,
-   >labels.mask);
+   if (!nf_ct_is_confirmed(ct)) {
+   err = ovs_ct_init_labels(ct, key, >labels.value,
+>labels.mask);
+   if (err)
+   return err;
+   } else if (labels_nonzero(>labels.mask)) {
+   err = ovs_ct_set_labels(ct, key, >labels.value,
+   >labels.mask);
if (err)
return err;
}
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 07/25] datapath: remove unnecessary EXPORT_SYMBOLs

2017-03-15 Thread Jarno Rajahalme
From: Jiri Benc 

Upstream commit:
commit 76e4cc7731a1e0c07e202999b9834f9d9be66de4
Author: Jiri Benc 
Date:   Wed Oct 19 11:26:37 2016 +0200

openvswitch: remove unnecessary EXPORT_SYMBOLs

Some symbols exported to other modules are really used only by
openvswitch.ko. Remove the exports.

Tested by loading all 4 openvswitch modules, nothing breaks.

Signed-off-by: Jiri Benc 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/datapath.c | 2 --
 datapath/vport-netdev.c | 1 -
 datapath/vport.c| 1 -
 3 files changed, 4 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index be433ba..64cd781 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -62,7 +62,6 @@
 #include "vport-netdev.h"
 
 int ovs_net_id __read_mostly;
-EXPORT_SYMBOL_GPL(ovs_net_id);
 
 static struct genl_family dp_packet_genl_family;
 static struct genl_family dp_flow_genl_family;
@@ -135,7 +134,6 @@ int lockdep_ovsl_is_held(void)
else
return 1;
 }
-EXPORT_SYMBOL_GPL(lockdep_ovsl_is_held);
 #endif
 
 static int queue_gso_packets(struct datapath *dp, struct sk_buff *,
diff --git a/datapath/vport-netdev.c b/datapath/vport-netdev.c
index 970f7d3..fd97246 100644
--- a/datapath/vport-netdev.c
+++ b/datapath/vport-netdev.c
@@ -167,7 +167,6 @@ void ovs_netdev_detach_dev(struct vport *vport)
netdev_master_upper_dev_get(vport->dev));
dev_set_promiscuity(vport->dev, -1);
 }
-EXPORT_SYMBOL_GPL(ovs_netdev_detach_dev);
 
 static void netdev_destroy(struct vport *vport)
 {
diff --git a/datapath/vport.c b/datapath/vport.c
index 7ac4632..9c8c0f1 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -507,7 +507,6 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff 
*skb,
ovs_dp_process_packet(skb, );
return 0;
 }
-EXPORT_SYMBOL_GPL(ovs_vport_receive);
 
 static unsigned int packet_length(const struct sk_buff *skb)
 {
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 20/25] datapath: Refactor labels initialization.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

Refactoring conntrack labels initialization makes changes in later
patches easier to review.

Signed-off-by: Jarno Rajahalme 
Acked-by: Pravin B Shelar 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c | 120 +++
 1 file changed, 64 insertions(+), 56 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 2d095b8..a56fe07 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -136,15 +136,6 @@ static u32 ovs_ct_get_mark(const struct nf_conn *ct)
 #endif
 }
 
-static size_t ovs_ct_get_labels_len(struct nf_conn_labels *cl)
-{
-#ifdef HAVE_NF_CONN_LABELS_WITH_WORDS
-   return cl->words * sizeof(long);
-#else
-   return sizeof(cl->bits);
-#endif
-}
-
 /* Guard against conntrack labels max size shrinking below 128 bits. */
 #if NF_CT_LABELS_MAX_SIZE < 16
 #error NF_CT_LABELS_MAX_SIZE must be at least 16 bytes
@@ -243,19 +234,12 @@ int ovs_ct_put_key(const struct sw_flow_key *key, struct 
sk_buff *skb)
return 0;
 }
 
-static int ovs_ct_set_mark(struct sk_buff *skb, struct sw_flow_key *key,
+static int ovs_ct_set_mark(struct nf_conn *ct, struct sw_flow_key *key,
   u32 ct_mark, u32 mask)
 {
 #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
-   enum ip_conntrack_info ctinfo;
-   struct nf_conn *ct;
u32 new_mark;
 
-   /* The connection could be invalid, in which case set_mark is no-op. */
-   ct = nf_ct_get(skb, );
-   if (!ct)
-   return 0;
-
new_mark = ct_mark | (ct->mark & ~(mask));
if (ct->mark != new_mark) {
ct->mark = new_mark;
@@ -270,18 +254,9 @@ static int ovs_ct_set_mark(struct sk_buff *skb, struct 
sw_flow_key *key,
 #endif
 }
 
-static int ovs_ct_set_labels(struct sk_buff *skb, struct sw_flow_key *key,
-const struct ovs_key_ct_labels *labels,
-const struct ovs_key_ct_labels *mask)
+static struct nf_conn_labels *ovs_ct_get_conn_labels(struct nf_conn *ct)
 {
-   enum ip_conntrack_info ctinfo;
struct nf_conn_labels *cl;
-   struct nf_conn *ct;
-
-   /* The connection could be invalid, in which case set_label is no-op.*/
-   ct = nf_ct_get(skb, );
-   if (!ct)
-   return 0;
 
cl = nf_ct_labels_find(ct);
if (!cl) {
@@ -289,37 +264,59 @@ static int ovs_ct_set_labels(struct sk_buff *skb, struct 
sw_flow_key *key,
cl = nf_ct_labels_find(ct);
}
 
-   if (!cl || ovs_ct_get_labels_len(cl) < OVS_CT_LABELS_LEN)
+   return cl;
+}
+
+/* Initialize labels for a new, yet to be committed conntrack entry.  Note that
+ * since the new connection is not yet confirmed, and thus no-one else has
+ * access to it's labels, we simply write them over.
+ */
+static int ovs_ct_init_labels(struct nf_conn *ct, struct sw_flow_key *key,
+ const struct ovs_key_ct_labels *labels,
+ const struct ovs_key_ct_labels *mask)
+{
+   struct nf_conn_labels *cl;
+   u32 *dst;
+   int i;
+
+   cl = ovs_ct_get_conn_labels(ct);
+   if (!cl)
return -ENOSPC;
 
-   if (nf_ct_is_confirmed(ct)) {
-   /* Triggers a change event, which makes sense only for
-* confirmed connections.
-*/
-   int err = nf_connlabels_replace(ct, labels->ct_labels_32,
-   mask->ct_labels_32,
-   OVS_CT_LABELS_LEN_32);
-   if (err)
-   return err;
-   } else {
-   u32 *dst = (u32 *)cl->bits;
-   const u32 *msk = mask->ct_labels_32;
-   const u32 *lbl = labels->ct_labels_32;
-   int i;
+   dst = (u32 *)cl->bits;
+   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
+   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
+   (labels->ct_labels_32[i] & mask->ct_labels_32[i]);
 
-   /* No-one else has access to the non-confirmed entry, copy
-* labels over, keeping any bits we are not explicitly setting.
-*/
-   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
-   dst[i] = (dst[i] & ~msk[i]) | (lbl[i] & msk[i]);
+   /* Labels are included in the IPCTNL_MSG_CT_NEW event only if the
+* IPCT_LABEL bit it set in the event cache.
+*/
+   nf_conntrack_event_cache(IPCT_LABEL, ct);
 
-   /* Labels are included in the IPCTNL_MSG_CT_NEW event only if
-* the IPCT_LABEL bit it set in the event cache.
-*/
-   nf_conntrack_event_cache(IPCT_LABEL, ct);
-   }
+  

[ovs-dev] [PATCH branch-2.7 18/25] datapath: Unionize ovs_key_ct_label with a u32 array.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit cb80d58fae76d8ea93555149b2b16e19b89a1f4f
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:55 2017 -0800

openvswitch: Unionize ovs_key_ct_label with a u32 array.

Make the array of labels in struct ovs_key_ct_label an union, adding a
u32 array of the same byte size as the existing u8 array.  It is
faster to loop through the labels 32 bits at the time, which is also
the alignment of netlink attributes.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c  | 15 ---
 datapath/linux/compat/include/linux/openvswitch.h |  8 ++--
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index cee47b7..b5c80be 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -298,20 +298,21 @@ static int ovs_ct_set_labels(struct sk_buff *skb, struct 
sw_flow_key *key,
/* Triggers a change event, which makes sense only for
 * confirmed connections.
 */
-   int err = nf_connlabels_replace(ct, (u32 *)labels, (u32 *)mask,
-   OVS_CT_LABELS_LEN / 
sizeof(u32));
+   int err = nf_connlabels_replace(ct, labels->ct_labels_32,
+   mask->ct_labels_32,
+   OVS_CT_LABELS_LEN_32);
if (err)
return err;
} else {
u32 *dst = (u32 *)cl->bits;
-   const u32 *msk = (const u32 *)mask->ct_labels;
-   const u32 *lbl = (const u32 *)labels->ct_labels;
+   const u32 *msk = mask->ct_labels_32;
+   const u32 *lbl = labels->ct_labels_32;
int i;
 
/* No-one else has access to the non-confirmed entry, copy
 * labels over, keeping any bits we are not explicitly setting.
 */
-   for (i = 0; i < OVS_CT_LABELS_LEN / sizeof(u32); i++)
+   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
dst[i] = (dst[i] & ~msk[i]) | (lbl[i] & msk[i]);
 
/* Labels are included in the IPCTNL_MSG_CT_NEW event only if
@@ -912,8 +913,8 @@ static bool labels_nonzero(const struct ovs_key_ct_labels 
*labels)
 {
size_t i;
 
-   for (i = 0; i < sizeof(*labels); i++)
-   if (labels->ct_labels[i])
+   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
+   if (labels->ct_labels_32[i])
return true;
 
return false;
diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 12260d8..76a19ed 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -472,9 +472,13 @@ struct ovs_key_nd {
__u8nd_tll[ETH_ALEN];
 };
 
-#define OVS_CT_LABELS_LEN  16
+#define OVS_CT_LABELS_LEN_32   4
+#define OVS_CT_LABELS_LEN  (OVS_CT_LABELS_LEN_32 * sizeof(__u32))
 struct ovs_key_ct_labels {
-   __u8ct_labels[OVS_CT_LABELS_LEN];
+   union {
+   __u8ct_labels[OVS_CT_LABELS_LEN];
+   __u32   ct_labels_32[OVS_CT_LABELS_LEN_32];
+   };
 };
 
 /* OVS_KEY_ATTR_CT_STATE flags */
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 11/25] datapath: make ndo_get_stats64 a void function

2017-03-15 Thread Jarno Rajahalme
From: stephen hemminger 

Upstream commit:
commit bc1f44709cf27fb2a5766cadafe7e2ad5e9cb221
Author: stephen hemminger 
Date:   Fri Jan 6 19:12:52 2017 -0800

net: make ndo_get_stats64 a void function

The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger 
Signed-off-by: David S. Miller 

This seems to be fine for all prior Linux versions as well.

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/vport-internal_dev.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/datapath/vport-internal_dev.c b/datapath/vport-internal_dev.c
index 2988e8c..2fa9ab9 100644
--- a/datapath/vport-internal_dev.c
+++ b/datapath/vport-internal_dev.c
@@ -117,7 +117,7 @@ static void internal_dev_destructor(struct net_device *dev)
free_netdev(dev);
 }
 
-static struct rtnl_link_stats64 *
+static void
 internal_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
int i;
@@ -145,8 +145,6 @@ internal_get_stats(struct net_device *dev, struct 
rtnl_link_stats64 *stats)
stats->tx_bytes += local_stats.tx_bytes;
stats->tx_packets   += local_stats.tx_packets;
}
-
-   return stats;
 }
 
 #ifdef HAVE_IFF_PHONY_HEADROOM
@@ -164,7 +162,7 @@ static const struct net_device_ops internal_dev_netdev_ops 
= {
 #ifndef HAVE_NET_DEVICE_WITH_MAX_MTU
.ndo_change_mtu = internal_dev_change_mtu,
 #endif
-   .ndo_get_stats64 = internal_get_stats,
+   .ndo_get_stats64 = (void *)internal_get_stats,
 #ifdef HAVE_IFF_PHONY_HEADROOM
 #ifndef HAVE_NET_DEVICE_OPS_WITH_EXTENDED
.ndo_set_rx_headroom = internal_set_rx_headroom,
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 06/25] datapath: remove unused functions

2017-03-15 Thread Jarno Rajahalme
From: Jiri Benc 

Upstream commit:
commit f33eb0cf9984f79e8643eaac888e4b6a06a8e221
Author: Jiri Benc 
Date:   Wed Oct 19 11:26:36 2016 +0200

openvswitch: remove unused functions

ovs_vport_deferred_free is not used anywhere. It's the only caller of
free_vport_rcu thus this one can be removed, too.

Signed-off-by: Jiri Benc 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 
Signed-off-by: Jarno Rajahalme 

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/vport.c | 16 
 datapath/vport.h |  1 -
 2 files changed, 17 deletions(-)

diff --git a/datapath/vport.c b/datapath/vport.c
index c29f0b0..7ac4632 100644
--- a/datapath/vport.c
+++ b/datapath/vport.c
@@ -509,22 +509,6 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff 
*skb,
 }
 EXPORT_SYMBOL_GPL(ovs_vport_receive);
 
-static void free_vport_rcu(struct rcu_head *rcu)
-{
-   struct vport *vport = container_of(rcu, struct vport, rcu);
-
-   ovs_vport_free(vport);
-}
-
-void ovs_vport_deferred_free(struct vport *vport)
-{
-   if (!vport)
-   return;
-
-   call_rcu(>rcu, free_vport_rcu);
-}
-EXPORT_SYMBOL_GPL(ovs_vport_deferred_free);
-
 static unsigned int packet_length(const struct sk_buff *skb)
 {
unsigned int length = skb->len - ETH_HLEN;
diff --git a/datapath/vport.h b/datapath/vport.h
index 47995be..d14908f 100644
--- a/datapath/vport.h
+++ b/datapath/vport.h
@@ -152,7 +152,6 @@ struct vport_ops {
 struct vport *ovs_vport_alloc(int priv_size, const struct vport_ops *,
  const struct vport_parms *);
 void ovs_vport_free(struct vport *);
-void ovs_vport_deferred_free(struct vport *vport);
 
 #define VPORT_ALIGN 8
 
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 16/25] datapath: Use inverted tuple in ovs_ct_find_existing() if NATted.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit 9ff464db50e437eef131f719cc2e9902eea9c607
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:53 2017 -0800

openvswitch: Use inverted tuple in ovs_ct_find_existing() if NATted.

The conntrack lookup for existing connections fails to invert the
packet 5-tuple for NATted packets, and therefore fails to find the
existing conntrack entry.  Conntrack only stores 5-tuples for incoming
packets, and there are various situations where a lookup on a packet
that has already been transformed by NAT needs to be made.  Looking up
an existing conntrack entry upon executing packet received from the
userspace is one of them.

This patch fixes ovs_ct_find_existing() to invert the packet 5-tuple
for the conntrack lookup whenever the packet has already been
transformed by conntrack from its input form as evidenced by one of
the NAT flags being set in the conntrack state metadata.

Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

This patch also adds a test case to OVS system tests to verify the
behavior.

The following is a more thorough explanation of what is going on:

When we have evidence that an existing conntrack entry could exist, we
must invert the tuple if NAT has already been applied, as the current
packet headers do not match any tuple stored in conntrack.  For
example, if a packet from private address X to a public address B is
source-NATted to A, the conntrack entry will have the following tuples
(ignoring the protocol and port numbers) after the conntrack entry is
committed:

Original direction tuple: (X,B)
Reply direction tuple: (B,A)

Now, if a reply packet is already transformed back to the private
address space (e.g., with a CT(nat) action), the tuple corresponding
to the current packet headers is:

Current packet tuple: (B,X)

This does not match either of the conntrack tuples above.  Normally
this does not matter, as the conntrack lookup was already done using
the tuple (B,A), but if the current packet does not match any flow in
the OVS datapath, the packet is sent to userspace via an upcall,
during which the packet's skb is freed, and the conntrack entry
pointer in the skb is lost.  When the packet is reintroduced to the
datapath, any further conntrack action will need to perform a new
conntrack lookup to find the entry again.  Prior to this patch this
second lookup failed.  The datapath flow setup corresponding to the
upcall can succeed, however, allowing all further packets in the reply
direction to re-use the conntrack entry pointer in the skb, so
typically the lookup failure only causes a packet drop.

The solution is to invert the tuple derived from the current packet
headers in case the conntrack state stored in the packet metadata
indicates that the packet has been transformed by NAT:

Inverted tuple: (X,B)

With this the conntrack entry can be found, matching the original
direction tuple.

This same logic also works for the original direction packets:

Current packet tuple (after reverse NAT): (A,B)
Inverted tuple: (B,A)

While the current packet tuple (A,B) does not match either of the
conntrack tuples, the inverted one (B,A) does match the reply
direction tuple.

Since the inverted tuple matches the reverse direction tuple the
direction of the packet must be reversed as well.

Fixes: c5f6c06b58d6 ("datapath: Interface with NAT.")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c| 24 +--
 tests/system-traffic.at | 52 +
 2 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 9e34af9..08e5eab 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -471,7 +471,7 @@ ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
  */
 static struct nf_conn *
 ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
-u8 l3num, struct sk_buff *skb)
+u8 l3num, struct sk_buff *skb, bool natted)
 {
struct nf_conntrack_l3proto *l3proto;
struct nf_conntrack_l4proto *l4proto;
@@ -494,6 +494,17 @@ ovs_ct_find_existing(struct net *net, const struct 
nf_conntrack_zone *zone,
return NULL;
}
 
+   /* Must invert the tuple if skb has been transformed by NAT. */
+   if (natted) {
+   struct nf_conntrack_tuple inverse;
+
+   if (!nf_ct_invert_tuple(, , l3proto, l4proto)) {
+   pr_debug("ovs_ct_find_existing: Inversion failed!\n");
+   return NULL;
+   }
+   tuple = inverse;
+   }
+
/* look for tuple match */
h = 

[ovs-dev] [PATCH branch-2.7 17/25] datapath: Do not trigger events for unconfirmed connections.

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit 193e30967897f3a8b6f9f137ac30571d832c2c5c
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:54 2017 -0800

openvswitch: Do not trigger events for unconfirmed connections.
Receiving change events before the 'new' event for the connection has
been received can be confusing.  Avoid triggering change events for
setting conntrack mark or labels before the conntrack entry has been
confirmed.

Fixes: 182e3042e15d ("openvswitch: Allow matching on conntrack mark")
Fixes: c2ac66735870 ("openvswitch: Allow matching on conntrack label")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Upstream commit:

commit 2317c6b51e4249dbfa093e1b88cab0a9f0564b7f
Author: Jarno Rajahalme 
Date:   Fri Feb 17 18:11:58 2017 -0800

openvswitch: Set event bit after initializing labels.

Connlabels are included in conntrack netlink event messages only if
the IPCT_LABEL bit is set in the event cache (see
ctnetlink_conntrack_event()).  Set it after initializing labels for a
new connection.

Found upon further system testing, where it was noticed that labels
were missing from the conntrack events.

Fixes: 193e30967897 ("openvswitch: Do not trigger events for unconfirmed con
nections.")
Signed-off-by: Jarno Rajahalme 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Fixes: 372ce9737d2b ("datapath: Allow matching on conntrack mark")
Fixes: 038e34abaa31 ("datapath: Allow matching on conntrack label")
Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 08e5eab..cee47b7 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -261,7 +261,8 @@ static int ovs_ct_set_mark(struct sk_buff *skb, struct 
sw_flow_key *key,
new_mark = ct_mark | (ct->mark & ~(mask));
if (ct->mark != new_mark) {
ct->mark = new_mark;
-   nf_conntrack_event_cache(IPCT_MARK, ct);
+   if (nf_ct_is_confirmed(ct))
+   nf_conntrack_event_cache(IPCT_MARK, ct);
key->ct.mark = new_mark;
}
 
@@ -278,7 +279,6 @@ static int ovs_ct_set_labels(struct sk_buff *skb, struct 
sw_flow_key *key,
enum ip_conntrack_info ctinfo;
struct nf_conn_labels *cl;
struct nf_conn *ct;
-   int err;
 
/* The connection could be invalid, in which case set_label is no-op.*/
ct = nf_ct_get(skb, );
@@ -294,10 +294,31 @@ static int ovs_ct_set_labels(struct sk_buff *skb, struct 
sw_flow_key *key,
if (!cl || ovs_ct_get_labels_len(cl) < OVS_CT_LABELS_LEN)
return -ENOSPC;
 
-   err = nf_connlabels_replace(ct, (u32 *)labels, (u32 *)mask,
-   OVS_CT_LABELS_LEN / sizeof(u32));
-   if (err)
-   return err;
+   if (nf_ct_is_confirmed(ct)) {
+   /* Triggers a change event, which makes sense only for
+* confirmed connections.
+*/
+   int err = nf_connlabels_replace(ct, (u32 *)labels, (u32 *)mask,
+   OVS_CT_LABELS_LEN / 
sizeof(u32));
+   if (err)
+   return err;
+   } else {
+   u32 *dst = (u32 *)cl->bits;
+   const u32 *msk = (const u32 *)mask->ct_labels;
+   const u32 *lbl = (const u32 *)labels->ct_labels;
+   int i;
+
+   /* No-one else has access to the non-confirmed entry, copy
+* labels over, keeping any bits we are not explicitly setting.
+*/
+   for (i = 0; i < OVS_CT_LABELS_LEN / sizeof(u32); i++)
+   dst[i] = (dst[i] & ~msk[i]) | (lbl[i] & msk[i]);
+
+   /* Labels are included in the IPCTNL_MSG_CT_NEW event only if
+* the IPCT_LABEL bit it set in the event cache.
+*/
+   nf_conntrack_event_cache(IPCT_LABEL, ct);
+   }
 
ovs_ct_get_labels(ct, >ct.labels);
return 0;
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 15/25] datapath: Fix comments for skb->_nfct

2017-03-15 Thread Jarno Rajahalme
Upstream commit:

commit 5e17da634a21b1200853fe82ba67d6571f2beabe
Author: Jarno Rajahalme 
Date:   Thu Feb 9 11:21:52 2017 -0800

openvswitch: Fix comments for skb->_nfct

Fix comments referring to skb 'nfct' and 'nfctinfo' fields now that
they are combined into '_nfct'.

Signed-off-by: Jarno Rajahalme 
Acked-by: Pravin B Shelar 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 datapath/conntrack.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 6b8414c..9e34af9 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -173,7 +173,7 @@ static void __ovs_ct_update_key(struct sw_flow_key *key, u8 
state,
ovs_ct_get_labels(ct, >ct.labels);
 }
 
-/* Update 'key' based on skb->nfct.  If 'post_ct' is true, then OVS has
+/* Update 'key' based on skb->_nfct.  If 'post_ct' is true, then OVS has
  * previously sent the packet to conntrack via the ct action.  If
  * 'keep_nat_flags' is true, the existing NAT flags retained, else they are
  * initialized from the connection status.
@@ -462,12 +462,12 @@ ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h)
 
 /* Find an existing connection which this packet belongs to without
  * re-attributing statistics or modifying the connection state.  This allows an
- * skb->nfct lost due to an upcall to be recovered during actions execution.
+ * skb->_nfct lost due to an upcall to be recovered during actions execution.
  *
  * Must be called with rcu_read_lock.
  *
- * On success, populates skb->nfct and skb->nfctinfo, and returns the
- * connection.  Returns NULL if there is no existing entry.
+ * On success, populates skb->_nfct and returns the connection.  Returns NULL
+ * if there is no existing entry.
  */
 static struct nf_conn *
 ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone,
@@ -505,7 +505,7 @@ ovs_ct_find_existing(struct net *net, const struct 
nf_conntrack_zone *zone,
return ct;
 }
 
-/* Determine whether skb->nfct is equal to the result of conntrack lookup. */
+/* Determine whether skb->_nfct is equal to the result of conntrack lookup. */
 static bool skb_nfct_cached(struct net *net,
const struct sw_flow_key *key,
const struct ovs_conntrack_info *info,
@@ -516,7 +516,7 @@ static bool skb_nfct_cached(struct net *net,
 
ct = nf_ct_get(skb, );
/* If no ct, check if we have evidence that an existing conntrack entry
-* might be found for this skb.  This happens when we lose a skb->nfct
+* might be found for this skb.  This happens when we lose a skb->_nfct
 * due to an upcall.  If the connection was not confirmed, it is not
 * cached and needs to be run through conntrack again.
 */
@@ -740,7 +740,7 @@ static int ovs_ct_nat(struct net *net, struct sw_flow_key 
*key,
 /* Pass 'skb' through conntrack in 'net', using zone configured in 'info', if
  * not done already.  Update key with new CT state after passing the packet
  * through conntrack.
- * Note that if the packet is deemed invalid by conntrack, skb->nfct will be
+ * Note that if the packet is deemed invalid by conntrack, skb->_nfct will be
  * set to NULL and 0 will be returned.
  */
 static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key,
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 14/25] datapath: add and use nf_ct_set helper

2017-03-15 Thread Jarno Rajahalme
From: Florian Westphal 

Upstream commit:

commit c74454fadd5ea6fc866ffe2c417a0dba56b2bf1c
Author: Florian Westphal 
Date:   Mon Jan 23 18:21:57 2017 +0100

netfilter: add and use nf_ct_set helper

Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff.
This avoids changing code in followup patch that merges skb->nfct and
skb->nfctinfo into skb->_nfct.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 acinclude.m4   | 2 ++
 datapath/conntrack.c   | 6 ++
 datapath/linux/compat/include/net/netfilter/nf_conntrack.h | 8 
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index ea5dfe9..751dc63 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -531,6 +531,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_FIND_PARAM_IFELSE([$KSRC/include/net/netfilter/nf_conntrack.h],
   [nf_ct_get_tuplepr], [struct.net],
   [OVS_DEFINE([HAVE_NF_CT_GET_TUPLEPR_TAKES_STRUCT_NET])])
+  OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack.h],
+  [nf_ct_set])
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_zones.h],
   [nf_ct_zone_init])
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_labels.h],
diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index ec354c3..6b8414c 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -501,8 +501,7 @@ ovs_ct_find_existing(struct net *net, const struct 
nf_conntrack_zone *zone,
 
ct = nf_ct_tuplehash_to_ctrack(h);
 
-   skb->nfct = >ct_general;
-   skb->nfctinfo = ovs_ct_get_info(h);
+   nf_ct_set(skb, ct, ovs_ct_get_info(h));
return ct;
 }
 
@@ -766,8 +765,7 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
if (skb_nfct(skb))
nf_conntrack_put(skb_nfct(skb));
nf_conntrack_get(>ct_general);
-   skb->nfct = >ct_general;
-   skb->nfctinfo = IP_CT_NEW;
+   nf_ct_set(skb, tmpl, IP_CT_NEW);
}
 
err = nf_conntrack_in(net, info->family,
diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack.h 
b/datapath/linux/compat/include/net/netfilter/nf_conntrack.h
index e02e20b..bb40b0f 100644
--- a/datapath/linux/compat/include/net/netfilter/nf_conntrack.h
+++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack.h
@@ -14,4 +14,12 @@ static inline bool rpl_nf_ct_get_tuplepr(const struct 
sk_buff *skb,
 #define nf_ct_get_tuplepr rpl_nf_ct_get_tuplepr
 #endif
 
+#ifndef HAVE_NF_CT_SET
+static inline void
+nf_ct_set(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info info)
+{
+   skb->nfct = >ct_general;
+   skb->nfctinfo = info;
+}
+#endif
 #endif /* _NF_CONNTRACK_WRAPPER_H */
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 13/25] datapath: add and use skb_nfct helper

2017-03-15 Thread Jarno Rajahalme
From: Florian Westphal 

Upstream commit:

commit cb9c68363efb6d1f950ec55fb06e031ee70db5fc
Author: Florian Westphal 
Date:   Mon Jan 23 18:21:56 2017 +0100

skbuff: add and use skb_nfct helper

Followup patch renames skb->nfct and changes its type so add a helper to
avoid intrusive rename change later.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 acinclude.m4 |  1 +
 datapath/conntrack.c |  6 +++---
 datapath/linux/compat/include/linux/skbuff.h | 11 +++
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index b34c0fd..ea5dfe9 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -606,6 +606,7 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h], [skb_clear_hash_if_not_l4])
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h], [skb_postpush_rcsum])
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h], [lco_csum])
+  OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h], [skb_nfct])
 
   OVS_GREP_IFELSE([$KSRC/include/linux/types.h], [bool],
   [OVS_DEFINE([HAVE_BOOL_TYPE])])
diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 36db32a..ec354c3 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -763,8 +763,8 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 
/* Associate skb with specified zone. */
if (tmpl) {
-   if (skb->nfct)
-   nf_conntrack_put(skb->nfct);
+   if (skb_nfct(skb))
+   nf_conntrack_put(skb_nfct(skb));
nf_conntrack_get(>ct_general);
skb->nfct = >ct_general;
skb->nfctinfo = IP_CT_NEW;
@@ -861,7 +861,7 @@ static int ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
if (err)
return err;
 
-   ct = (struct nf_conn *)skb->nfct;
+   ct = (struct nf_conn *)skb_nfct(skb);
if (ct)
nf_ct_deliver_cached_events(ct);
}
diff --git a/datapath/linux/compat/include/linux/skbuff.h 
b/datapath/linux/compat/include/linux/skbuff.h
index a2cbd78..943d5f8 100644
--- a/datapath/linux/compat/include/linux/skbuff.h
+++ b/datapath/linux/compat/include/linux/skbuff.h
@@ -371,4 +371,15 @@ static inline __wsum lco_csum(struct sk_buff *skb)
return csum_partial(l4_hdr, csum_start - l4_hdr, partial);
 }
 #endif
+
+#ifndef HAVE_SKB_NFCT
+static inline struct nf_conntrack *skb_nfct(const struct sk_buff *skb)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+   return skb->nfct;
+#else
+   return NULL;
+#endif
+}
+#endif
 #endif
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 12/25] datapath: Allow compiling against Linux 4.10

2017-03-15 Thread Jarno Rajahalme
OVS in-tree datapath compiles against Linux 4.10 kernel, so allow it.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
---
 acinclude.m4 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index 052a18f..b34c0fd 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -134,10 +134,10 @@ AC_DEFUN([OVS_CHECK_LINUX], [
 AC_MSG_RESULT([$kversion])
 
 if test "$version" -ge 4; then
-   if test "$version" = 4 && test "$patchlevel" -le 9; then
+   if test "$version" = 4 && test "$patchlevel" -le 10; then
   : # Linux 4.x
else
-  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.9.x is not supported (please refer to the FAQ for advice)])
+  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.10.x is not supported (please refer to the FAQ for advice)])
fi
 elif test "$version" = 3 && test "$patchlevel" -ge 10; then
: # Linux 3.x
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 09/25] datapath: handle NF_REPEAT from nf_conntrack_in()

2017-03-15 Thread Jarno Rajahalme
From: Pablo Neira Ayuso 

Upstream commit:
commit 08733a0cb7decce40bbbd0331a0449465f13c444
Author: Pablo Neira Ayuso 
Date:   Thu Nov 3 10:56:43 2016 +0100

netfilter: handle NF_REPEAT from nf_conntrack_in()

NF_REPEAT is only needed from nf_conntrack_in() under a very specific
case required by the TCP protocol tracker, we can handle this case
without returning to the core hook path. Handling of NF_REPEAT from the
nf_reinject() is left untouched.

Signed-off-by: Pablo Neira Ayuso 

[Committer notes]
Shift the functionality into the compat code, protected by v4.10
version check. This allows the datapath/conntrack.c to match
upstream.

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/conntrack.c|  8 ++--
 .../include/net/netfilter/nf_conntrack_core.h   | 21 +
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index a0c5443..36db32a 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -770,12 +770,8 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
skb->nfctinfo = IP_CT_NEW;
}
 
-   /* Repeat if requested, see nf_iterate(). */
-   do {
-   err = nf_conntrack_in(net, info->family,
- NF_INET_PRE_ROUTING, skb);
-   } while (err == NF_REPEAT);
-
+   err = nf_conntrack_in(net, info->family,
+ NF_INET_PRE_ROUTING, skb);
if (err != NF_ACCEPT)
return -ENOENT;
 
diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h 
b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
index 09a53c3..16b57a6 100644
--- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
+++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_core.h
@@ -67,4 +67,25 @@ static inline bool rpl_nf_ct_get_tuple(const struct sk_buff 
*skb,
 #define nf_ct_get_tuple rpl_nf_ct_get_tuple
 #endif /* HAVE_NF_CT_GET_TUPLEPR_TAKES_STRUCT_NET */
 
+/* Commit 08733a0cb7de ("netfilter: handle NF_REPEAT from nf_conntrack_in()")
+ * introduced behavioural changes to this function which cannot be detected
+ * in the headers. Unconditionally backport to kernels older than the one which
+ * contains this commit. */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,10,0)
+static unsigned int rpl_nf_conntrack_in(struct net *net, u_int8_t pf,
+   unsigned int hooknum,
+   struct sk_buff *skb)
+{
+   int err;
+
+   /* Repeat if requested, see nf_iterate(). */
+   do {
+   err = nf_conntrack_in(net, pf, hooknum, skb);
+   } while (err == NF_REPEAT);
+
+   return err;
+}
+#define nf_conntrack_in rpl_nf_conntrack_in
+#endif /* < 4.10 */
+
 #endif /* _NF_CONNTRACK_CORE_WRAPPER_H */
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 10/25] datapath: netns: make struct pernet_operations::id unsigned int.

2017-03-15 Thread Jarno Rajahalme
From: Alexey Dobriyan 

Upstream commit:
commit c7d03a00b56fc23c3a01a8353789ad257363e281
Author: Alexey Dobriyan 
Date:   Thu Nov 17 04:58:21 2016 +0300

netns: make struct pernet_operations::id unsigned int

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

void f(long *p, int i)
{
g(p[i]);
}

  roughly translates to

movsx   rsi, esi
mov rdi, [rsi+...]
callg

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new   delta
nfsd4_lock  38863959 +73
tipc_link_build_proto_msg   10961140 +44
mac80211_hwsim_new_radio27762808 +32
tipc_mon_rcv10321058 +26
svcauth_gss_legacy_init 14131429 +16
tipc_bcbase_select_primary   379 392 +13
nfsd4_exchange_id   12471260 +13
nfsd4_setclientid_confirm782 793 +11
...
put_client_renew_locked  494 480 -14
ip_set_sockfn_get730 716 -14
geneve_sock_add  829 813 -16
nfsd4_sequence_done  721 703 -18
nlmclnt_lookup_host  708 686 -22
nfsd4_lockt 10851063 -22
nfs_get_client  10771050 -27
tcf_bpf_init11061076 -30
nfsd4_encode_fattr  59975930 -67
Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan 
Signed-off-by: David S. Miller 

[Committer notes]

It looks like changing the type of this doesn't affect the build on older
kernels, so we can just make the change. I didn't go through all of the
compat code to update the net_id variables there as none of that code should
be enabled on kernels with this patch.

Upstream: c7d03a00b56f ("netns: make struct pernet_operations::id unsigned int")
Signed-off-by: Joe Stringer 
Acked-by: Jarno Rajahalme 
Signed-off-by: Jarno Rajahalme 
---
 datapath/datapath.c | 2 +-
 datapath/datapath.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 64cd781..78fcc1c 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -61,7 +61,7 @@
 #include "vport-internal_dev.h"
 #include "vport-netdev.h"
 
-int ovs_net_id __read_mostly;
+unsigned int ovs_net_id __read_mostly;
 
 static struct genl_family dp_packet_genl_family;
 static struct genl_family dp_flow_genl_family;
diff --git a/datapath/datapath.h b/datapath/datapath.h
index 22bbaac..5e5c1c4 100644
--- a/datapath/datapath.h
+++ b/datapath/datapath.h
@@ -145,7 +145,7 @@ struct ovs_net {
bool xt_label;
 };
 
-extern 

[ovs-dev] [PATCH branch-2.7 08/25] datapath: use core MTU range checking in core net infra

2017-03-15 Thread Jarno Rajahalme
From: Jarod Wilson 

Upstream commit:
commit 61e84623ace35ce48975e8f90bbbac7557c43d61
Author: Jarod Wilson 
Date:   Fri Oct 7 22:04:33 2016 -0400

net: centralize net_device min/max MTU checking

While looking into an MTU issue with sfc, I started noticing that almost
every NIC driver with an ndo_change_mtu function implemented almost
exactly the same range checks, and in many cases, that was the only
practical thing their ndo_change_mtu function was doing. Quite a few
drivers have either 68, 64, 60 or 46 as their minimum MTU value checked,
and then various sizes from 1500 to 65535 for their maximum MTU value. We
can remove a whole lot of redundant code here if we simple store min_mtu
and max_mtu in net_device, and check against those in net/core/dev.c's
dev_set_mtu().

In theory, there should be zero functional change with this patch, it just
puts the infrastructure in place. Subsequent patches will attempt to start
using said infrastructure, with theoretically zero change in
functionality.

CC: net...@vger.kernel.org
Signed-off-by: Jarod Wilson 
Signed-off-by: David S. Miller 

Upstream commit:
commit 91572088e3fdbf4fe31cf397926d8b890fdb3237
Author: Jarod Wilson 
Date:   Thu Oct 20 13:55:20 2016 -0400

net: use core MTU range checking in core net infra

...

openvswitch:
- set min/max_mtu, remove internal_dev_change_mtu
- note: max_mtu wasn't checked previously, it's been set to 65535, which
  is the largest possible size supported

...

Signed-off-by: Jarod Wilson 
Signed-off-by: David S. Miller 
Signed-off-by: Jarno Rajahalme 

Upstream commit:
commit 425df17ce3a26d98f76e2b6b0af2acf4aeb0b026
Author: Jarno Rajahalme 
Date:   Tue Feb 14 21:16:28 2017 -0800

openvswitch: Set internal device max mtu to ETH_MAX_MTU.

Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.

This patch fixes this by setting max_mtu to ETH_MAX_MTU after
ether_setup() call.

Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme 
Signed-off-by: David S. Miller 

This backport detects the new max_mtu field in the struct netdevice
and uses the upstream code if it exists, and local backport code if
not.  The latter case is amended with bounds checks with new upstream
macros ETH_MIN_MTU and ETH_MAX_MTU and the corresponding error
messages from the upstream commit.

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 acinclude.m4   |  2 ++
 datapath/linux/compat/include/linux/if_ether.h |  8 
 datapath/vport-internal_dev.c  | 22 +++---
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index e8b64b5..052a18f 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -510,6 +510,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_FIND_PARAM_IFELSE([$KSRC/include/linux/netdevice.h],
 [netdev_master_upper_dev_link], [upper_priv],
 [OVS_DEFINE([HAVE_NETDEV_MASTER_UPPER_DEV_LINK_PRIV])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netdevice.h], [net_device],
+[max_mtu])
 
   OVS_GREP_IFELSE([$KSRC/include/linux/netfilter.h], [nf_hook_state])
   OVS_GREP_IFELSE([$KSRC/include/linux/netfilter.h], [nf_register_net_hook])
diff --git a/datapath/linux/compat/include/linux/if_ether.h 
b/datapath/linux/compat/include/linux/if_ether.h
index ac0f1ed..5eb99bc 100644
--- a/datapath/linux/compat/include/linux/if_ether.h
+++ b/datapath/linux/compat/include/linux/if_ether.h
@@ -3,6 +3,14 @@
 
 #include_next 
 
+#ifndef ETH_MIN_MTU
+#define ETH_MIN_MTU68  /* Min IPv4 MTU per RFC791  */
+#endif
+
+#ifndef ETH_MAX_MTU
+#define ETH_MAX_MTU0xU /* 65535, same as IP_MAX_MTU*/
+#endif
+
 #ifndef ETH_P_802_3_MIN
 #define ETH_P_802_3_MIN0x0600
 #endif
diff --git a/datapath/vport-internal_dev.c b/datapath/vport-internal_dev.c
index f0542a5..2988e8c 100644
--- a/datapath/vport-internal_dev.c
+++ b/datapath/vport-internal_dev.c
@@ -89,14 +89,25 @@ static const struct ethtool_ops internal_dev_ethtool_ops = {
.get_link   = ethtool_op_get_link,
 };
 
-static int internal_dev_change_mtu(struct net_device *netdev, int new_mtu)
+#ifndef HAVE_NET_DEVICE_WITH_MAX_MTU
+static int internal_dev_change_mtu(struct net_device *dev, int new_mtu)
 {
-   

[ovs-dev] [PATCH branch-2.7 05/25] datapath: add NETIF_F_HW_VLAN_STAG_TX to internal dev.

2017-03-15 Thread Jarno Rajahalme
From: Jiri Benc 

Upstream commit:
commit 3145c037e74926dea9241a3f68ada6f294b0119a
Author: Jiri Benc 
Date:   Mon Oct 10 17:02:44 2016 +0200

openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev

The internal device does support 802.1AD offloading since 018c1dda5ff1
("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink
attributes").

Signed-off-by: Jiri Benc 
Acked-by: Pravin B Shelar 
Acked-by: Eric Garver 
Signed-off-by: David S. Miller 

Upstream: 3145c037e749 ("openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal 
dev")
Signed-off-by: Joe Stringer 
Acked-by: Jarno Rajahalme 
Signed-off-by: Jarno Rajahalme 
---
 datapath/vport-internal_dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/datapath/vport-internal_dev.c b/datapath/vport-internal_dev.c
index cc01c9c..f0542a5 100644
--- a/datapath/vport-internal_dev.c
+++ b/datapath/vport-internal_dev.c
@@ -188,7 +188,7 @@ static void do_setup(struct net_device *netdev)
 
netdev->vlan_features = netdev->features;
netdev->hw_enc_features = netdev->features;
-   netdev->features |= NETIF_F_HW_VLAN_CTAG_TX;
+   netdev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
netdev->hw_features = netdev->features & ~NETIF_F_LLTX;
 
eth_hw_addr_random(netdev);
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 03/25] datapath: Fix Frame-size larger than 1024 bytes warning.

2017-03-15 Thread Jarno Rajahalme
From: pravin shelar 

Upstream commit:
commit 190aa3e77880a05332ea1ccb382a51285d57adb5
Author: pravin shelar 
Date:   Mon Sep 19 13:50:59 2016 -0700

openvswitch: Fix Frame-size larger than 1024 bytes warning.

There is no need to declare separate key on stack,
we can just use sw_flow->key to store the key directly.

This commit fixes following warning:

net/openvswitch/datapath.c: In function ‘ovs_flow_cmd_new’:
net/openvswitch/datapath.c:1080:1: warning: the frame size of 1040 bytes
is larger than 1024 bytes [-Wframe-larger-than=]

Signed-off-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/datapath.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index ce2364a..e4089ef 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -939,7 +939,6 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
struct sw_flow_mask mask;
struct sk_buff *reply;
struct datapath *dp;
-   struct sw_flow_key key;
struct sw_flow_actions *acts;
struct sw_flow_match match;
u32 ufid_flags = ovs_nla_get_ufid_flags(a[OVS_FLOW_ATTR_UFID_FLAGS]);
@@ -967,20 +966,24 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
}
 
/* Extract key. */
-   ovs_match_init(, , );
+   ovs_match_init(, _flow->key, );
error = ovs_nla_get_match(net, , a[OVS_FLOW_ATTR_KEY],
  a[OVS_FLOW_ATTR_MASK], log);
if (error)
goto err_kfree_flow;
 
-   ovs_flow_mask_key(_flow->key, , true, );
-
/* Extract flow identifier. */
error = ovs_nla_get_identifier(_flow->id, a[OVS_FLOW_ATTR_UFID],
-  , log);
+  _flow->key, log);
if (error)
goto err_kfree_flow;
 
+   /* unmasked key is needed to match when ufid is not used. */
+   if (ovs_identifier_is_key(_flow->id))
+   match.key = new_flow->id.unmasked_key;
+
+   ovs_flow_mask_key(_flow->key, _flow->key, true, );
+
/* Validate actions. */
error = ovs_nla_copy_actions(net, a[OVS_FLOW_ATTR_ACTIONS],
 _flow->key, , log);
@@ -1007,7 +1010,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
if (ovs_identifier_is_ufid(_flow->id))
flow = ovs_flow_tbl_lookup_ufid(>table, _flow->id);
if (!flow)
-   flow = ovs_flow_tbl_lookup(>table, );
+   flow = ovs_flow_tbl_lookup(>table, _flow->key);
if (likely(!flow)) {
rcu_assign_pointer(new_flow->sf_acts, acts);
 
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 02/25] datapath: use percpu flow stats

2017-03-15 Thread Jarno Rajahalme
From: Thadeu Lima de Souza Cascardo 

Upstream commit:
commit db74a3335e0f645e3139c80bcfc90feb01d8e304
Author: Thadeu Lima de Souza Cascardo 
Date:   Thu Sep 15 19:11:53 2016 -0300

openvswitch: use percpu flow stats

Instead of using flow stats per NUMA node, use it per CPU. When using
megaflows, the stats lock can be a bottleneck in scalability.

On a E5-2690 12-core system, usual throughput went from ~4Mpps to
~15Mpps when forwarding between two 40GbE ports with a single flow
configured on the datapath.

This has been tested on a system with possible CPUs 0-7,16-23. After
module removal, there were no corruption on the slab cache.

Signed-off-by: Thadeu Lima de Souza Cascardo 
Cc: pravin shelar 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/flow.c   | 42 ++
 datapath/flow.h   |  4 ++--
 datapath/flow_table.c | 26 +-
 3 files changed, 33 insertions(+), 39 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 6d56644..58b0e13 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -71,32 +72,33 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
 {
struct flow_stats *stats;
int node = numa_node_id();
+   int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
-   stats = rcu_dereference(flow->stats[node]);
+   stats = rcu_dereference(flow->stats[cpu]);
 
-   /* Check if already have node-specific stats. */
+   /* Check if already have CPU-specific stats. */
if (likely(stats)) {
spin_lock(>lock);
/* Mark if we write on the pre-allocated stats. */
-   if (node == 0 && unlikely(flow->stats_last_writer != node))
-   flow->stats_last_writer = node;
+   if (cpu == 0 && unlikely(flow->stats_last_writer != cpu))
+   flow->stats_last_writer = cpu;
} else {
stats = rcu_dereference(flow->stats[0]); /* Pre-allocated. */
spin_lock(>lock);
 
-   /* If the current NUMA-node is the only writer on the
+   /* If the current CPU is the only writer on the
 * pre-allocated stats keep using them.
 */
-   if (unlikely(flow->stats_last_writer != node)) {
+   if (unlikely(flow->stats_last_writer != cpu)) {
/* A previous locker may have already allocated the
-* stats, so we need to check again.  If node-specific
+* stats, so we need to check again.  If CPU-specific
 * stats were already allocated, we update the pre-
 * allocated stats as we have already locked them.
 */
-   if (likely(flow->stats_last_writer != NUMA_NO_NODE)
-   && likely(!rcu_access_pointer(flow->stats[node]))) {
-   /* Try to allocate node-specific stats. */
+   if (likely(flow->stats_last_writer != -1) &&
+   likely(!rcu_access_pointer(flow->stats[cpu]))) {
+   /* Try to allocate CPU-specific stats. */
struct flow_stats *new_stats;
 
new_stats =
@@ -113,12 +115,12 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
new_stats->tcp_flags = tcp_flags;
spin_lock_init(_stats->lock);
 
-   rcu_assign_pointer(flow->stats[node],
+   rcu_assign_pointer(flow->stats[cpu],
   new_stats);
goto unlock;
}
}
-   flow->stats_last_writer = node;
+   flow->stats_last_writer = cpu;
}
}
 
@@ -135,15 +137,15 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
struct ovs_flow_stats *ovs_stats,
unsigned long *used, __be16 *tcp_flags)
 {
-   int node;
+   int cpu;
 
*used = 0;
*tcp_flags = 0;
memset(ovs_stats, 0, sizeof(*ovs_stats));
 
-   /* We open code this to make sure node 0 is always considered */
-   for (node = 0; node < MAX_NUMNODES; node = next_node(node, 

[ovs-dev] [PATCH branch-2.7 01/25] datapath: fix flow stats accounting when node 0 is not possible

2017-03-15 Thread Jarno Rajahalme
From: Thadeu Lima de Souza Cascardo 

Upstream commit:
commit 40773966ccf1985a1b2bb570a03cbeaf1cbd4e00
Author: Thadeu Lima de Souza Cascardo 
Date:   Thu Sep 15 19:11:52 2016 -0300

openvswitch: fix flow stats accounting when node 0 is not possible

On a system with only node 1 as possible, all statistics is going to be
accounted on node 0 as it will have a single writer.

However, when getting and clearing the statistics, node 0 is not going
to be considered, as it's not a possible node.

Tested that statistics are not zero on a system with only node 1
possible. Also compile-tested with CONFIG_NUMA off.

Signed-off-by: Thadeu Lima de Souza Cascardo 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

This patch contained a memory leak that is fixed in this backport.
The next patch silently fixed that in upstream, too.

Signed-off-by: Jarno Rajahalme 
Signed-off-by: Joe Stringer 
---
 datapath/flow.c   | 6 --
 datapath/flow_table.c | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 390286c..6d56644 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -141,7 +141,8 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
*tcp_flags = 0;
memset(ovs_stats, 0, sizeof(*ovs_stats));
 
-   for_each_node(node) {
+   /* We open code this to make sure node 0 is always considered */
+   for (node = 0; node < MAX_NUMNODES; node = next_node(node, 
node_possible_map)) {
struct flow_stats *stats = 
rcu_dereference_ovsl(flow->stats[node]);
 
if (stats) {
@@ -164,7 +165,8 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
 {
int node;
 
-   for_each_node(node) {
+   /* We open code this to make sure node 0 is always considered */
+   for (node = 0; node < MAX_NUMNODES; node = next_node(node, 
node_possible_map)) {
struct flow_stats *stats = ovsl_dereference(flow->stats[node]);
 
if (stats) {
diff --git a/datapath/flow_table.c b/datapath/flow_table.c
index d4204e5..3829b92 100644
--- a/datapath/flow_table.c
+++ b/datapath/flow_table.c
@@ -154,7 +154,8 @@ static void flow_free(struct sw_flow *flow)
kfree(flow->id.unmasked_key);
if (flow->sf_acts)
ovs_nla_free_flow_actions((struct sw_flow_actions __force 
*)flow->sf_acts);
-   for_each_node(node)
+   /* We open code this to make sure node 0 is always considered */
+   for (node = 0; node < MAX_NUMNODES; node = next_node(node, 
node_possible_map))
if (flow->stats[node])
kmem_cache_free(flow_stats_cache,
rcu_dereference_raw(flow->stats[node]));
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.7 00/25] Backports for branch-2.7

2017-03-15 Thread Jarno Rajahalme
These are (mostly) datapath backports from master that fix existing
features in branch-2.7 and/or make the datapath compilable with later
Linux kernel code.

Alexey Dobriyan (1):
  datapath: netns: make struct pernet_operations::id unsigned int.

Florian Westphal (2):
  datapath: add and use skb_nfct helper
  datapath: add and use nf_ct_set helper

Jarno Rajahalme (11):
  datapath: Allow compiling against Linux 4.10
  datapath: Fix comments for skb->_nfct
  datapath: Use inverted tuple in ovs_ct_find_existing() if NATted.
  datapath: Do not trigger events for unconfirmed connections.
  datapath: Unionize ovs_key_ct_label with a u32 array.
  datapath: Simplify labels length logic.
  datapath: Refactor labels initialization.
  datapath: Inherit master's labels.
  ofp-util: Ignore unknown fields in ofputil_decode_packet_in2().
  compat: nf_ct_delete compat.
  lib: Indicate if netlink message had labels.

Jarod Wilson (1):
  datapath: use core MTU range checking in core net infra

Jiri Benc (3):
  datapath: add NETIF_F_HW_VLAN_STAG_TX to internal dev.
  datapath: remove unused functions
  datapath: remove unnecessary EXPORT_SYMBOLs

Pablo Neira Ayuso (1):
  datapath: handle NF_REPEAT from nf_conntrack_in()

Thadeu Lima de Souza Cascardo (2):
  datapath: fix flow stats accounting when node 0 is not possible
  datapath: use percpu flow stats

Yi-Hung Wei (1):
  nx-match: Fix oxm decode.

pravin shelar (2):
  datapath: Fix Frame-size larger than 1024 bytes warning.
  datapath: avoid resetting flow key while installing new flow.

stephen hemminger (1):
  datapath: make ndo_get_stats64 a void function

 acinclude.m4   |  15 +-
 datapath/conntrack.c   | 195 ++---
 datapath/datapath.c|  25 +--
 datapath/datapath.h|   2 +-
 datapath/flow.c|  42 ++---
 datapath/flow.h|   4 +-
 datapath/flow_netlink.c|   6 +-
 datapath/flow_netlink.h|   3 +-
 datapath/flow_table.c  |  25 +--
 datapath/linux/compat/include/linux/if_ether.h |   8 +
 datapath/linux/compat/include/linux/openvswitch.h  |   8 +-
 datapath/linux/compat/include/linux/skbuff.h   |  11 ++
 .../compat/include/net/netfilter/nf_conntrack.h|   8 +
 .../include/net/netfilter/nf_conntrack_core.h  |  58 ++
 datapath/vport-internal_dev.c  |  30 +++-
 datapath/vport-netdev.c|   1 -
 datapath/vport.c   |  17 --
 datapath/vport.h   |   1 -
 lib/ct-dpif.h  |   1 +
 lib/netlink-conntrack.c|   1 +
 lib/nx-match.c |  23 ++-
 lib/nx-match.h |   4 +-
 lib/ofp-util.c |   2 +-
 tests/system-traffic.at|  52 ++
 24 files changed, 378 insertions(+), 164 deletions(-)

-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [branch-2.7 1/4] nx-match: Fix oxm decode.

2017-03-15 Thread Jarno Rajahalme
IMO we should also backport the patch (“ofp-util: Ignore unknown fields in 
ofputil_decode_packet_in2().”) this patch fixed.

  jarno

> On Mar 15, 2017, at 4:01 PM, Joe Stringer  wrote:
> 
> From: Yi-Hung Wei 
> 
> decode_nx_packet_in2() may be used by the switch to parse NXT_RESUME
> messages, where we need exact match on the oxm header. It's also used by
> OVN to parse NXT_PACKET_IN2 messages. For the switch, strict
> prerequisites should be applied but for the controller, this should not
> be the case. Pass the 'loose' parameter down to oxm_decode() to apply
> these restrictions correctly based on which code is performing decode.
> 
> Signed-off-by: Yi-Hung Wei 
> Signed-off-by: Joe Stringer 
> ---
> lib/nx-match.c | 8 +---
> lib/nx-match.h | 4 ++--
> lib/ofp-util.c | 2 +-
> 3 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/nx-match.c b/lib/nx-match.c
> index 91401e2201c6..c258869eec80 100644
> --- a/lib/nx-match.c
> +++ b/lib/nx-match.c
> @@ -678,12 +678,14 @@ oxm_pull_match_loose(struct ofpbuf *b, const struct 
> tun_table *tun_table,
>  *
>  * Fails with an error when encountering unknown OXM headers.
>  *
> - * Returns 0 if successful, otherwise an OpenFlow error code. */
> + * If 'loose' is true, encountering unknown OXM headers or missing field
> + * prerequisites are not considered as error conditions.
> + */
> enum ofperr
> -oxm_decode_match(const void *oxm, size_t oxm_len,
> +oxm_decode_match(const void *oxm, size_t oxm_len, bool loose,
>  const struct tun_table *tun_table, struct match *match)
> {
> -return nx_pull_raw(oxm, oxm_len, true, match, NULL, NULL, tun_table);
> +return nx_pull_raw(oxm, oxm_len, !loose, match, NULL, NULL, tun_table);
> }
> 
> /* Verify an array of OXM TLVs treating value of each TLV as a mask,
> diff --git a/lib/nx-match.h b/lib/nx-match.h
> index 5dca24a01a49..e103dd5fa74d 100644
> --- a/lib/nx-match.h
> +++ b/lib/nx-match.h
> @@ -61,8 +61,8 @@ enum ofperr oxm_pull_match(struct ofpbuf *, const struct 
> tun_table *,
>struct match *);
> enum ofperr oxm_pull_match_loose(struct ofpbuf *, const struct tun_table *,
>  struct match *);
> -enum ofperr oxm_decode_match(const void *, size_t, const struct tun_table *,
> - struct match *);
> +enum ofperr oxm_decode_match(const void *, size_t, bool,
> + const struct tun_table *, struct match *);
> enum ofperr oxm_pull_field_array(const void *, size_t fields_len,
>  struct field_array *);
> 
> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> index 0c9343ec400b..d3153370f2e6 100644
> --- a/lib/ofp-util.c
> +++ b/lib/ofp-util.c
> @@ -3398,7 +3398,7 @@ decode_nx_packet_in2(const struct ofp_header *oh, bool 
> loose,
> 
> case NXPINT_METADATA:
> error = oxm_decode_match(payload.msg, ofpbuf_msgsize(),
> - tun_table, >flow_metadata);
> + loose, tun_table, >flow_metadata);
> break;
> 
> case NXPINT_USERDATA:
> -- 
> 2.11.1
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [branch-2.7 3/4] ofproto: Add ref counting for variable length mf_fields.

2017-03-15 Thread Joe Stringer
From: Yi-Hung Wei 

Currently, a controller may potentially trigger a segmentation fault if it
accidentally removes a TLV mapping that is still used by an active flow.
To resolve this issue, in this patch, we maintain reference counting for each
dynamically allocated variable length mf_fields, so that vswitchd can use this
information to properly remove a TLV mapping, and to return an error if the
controller tries to remove a TLV mapping that is still used by any active flow.

To keep track of the usage of tun_metadata for each flow, two 'uint64_t'
bitmaps are introduce for the flow match and flow action respectively. We use
'uint64_t' as a bitmap since the 64 geneve TLV tunnel metadata are the only
available variable length mf_fields for now. We shall adopt general bitmap when
more variable length mf_fields are introduced. The bitmaps are configured
during the flow decoding process, and vswitchd use these bitmaps to increase or
decrease the ref counting when the flow is created or deleted.

VMWare-BZ: #1768370
Fixes: 04f48a68c428 ("ofp-actions: Fix variable length meta-flow OXMs.")
Suggested-by: Jarno Rajahalme 
Suggested-by: Joe Stringer 
Signed-off-by: Yi-Hung Wei 
Signed-off-by: Joe Stringer 
---
 build-aux/extract-ofp-actions |   9 +-
 include/openvswitch/ofp-actions.h |   2 +
 include/openvswitch/ofp-errors.h  |   4 +
 include/openvswitch/ofp-util.h|   1 +
 lib/learn.c   |   5 +
 lib/meta-flow.c   | 228 --
 lib/ofp-actions.c | 208 +-
 lib/ofp-util.c|  21 ++--
 lib/vl-mff-map.h  |  17 ++-
 ofproto/ofproto-provider.h|   4 +
 ofproto/ofproto.c |  33 +-
 ovn/controller/pinctrl.c  |   6 +-
 tests/tunnel.at   |  76 -
 utilities/ovs-ofctl.c |   2 +-
 14 files changed, 479 insertions(+), 137 deletions(-)

diff --git a/build-aux/extract-ofp-actions b/build-aux/extract-ofp-actions
index 184447b99422..0062ab881dd5 100755
--- a/build-aux/extract-ofp-actions
+++ b/build-aux/extract-ofp-actions
@@ -322,7 +322,8 @@ def extract_ofp_actions(fn, definitions):
 static enum ofperr
 ofpact_decode(const struct ofp_action_header *a, enum ofp_raw_action_type raw,
   enum ofp_version version, uint64_t arg,
-  const struct vl_mff_map *vl_mff_map, struct ofpbuf *out)
+  const struct vl_mff_map *vl_mff_map,
+  uint64_t *tlv_bitmap, struct ofpbuf *out)
 {
 switch (raw) {\
 """
@@ -343,7 +344,7 @@ ofpact_decode(const struct ofp_action_header *a, enum 
ofp_raw_action_type raw,
 else:
 arg = "arg"
 if arg_vl_mff_map:
-print "return decode_%s(%s, version, vl_mff_map, 
out);" % (enum, arg)
+print "return decode_%s(%s, version, vl_mff_map, 
tlv_bitmap, out);" % (enum, arg)
 else:
 print "return decode_%s(%s, version, out);" % 
(enum, arg)
 print
@@ -365,7 +366,7 @@ ofpact_decode(const struct ofp_action_header *a, enum 
ofp_raw_action_type raw,
 else:
 prototype += "%s, enum ofp_version, " % base_argtype
 if arg_vl_mff_map:
-prototype += 'const struct vl_mff_map *, '
+prototype += 'const struct vl_mff_map *, uint64_t *, '
 prototype += "struct ofpbuf *);"
 print prototype
 
@@ -374,7 +375,7 @@ static enum ofperr ofpact_decode(const struct 
ofp_action_header *,
  enum ofp_raw_action_type raw,
  enum ofp_version version,
  uint64_t arg, const struct vl_mff_map 
*vl_mff_map,
- struct ofpbuf *out);
+ uint64_t *tlv_bitmap, struct ofpbuf *out);
 """
 
 if __name__ == '__main__':
diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index 88f573dcd74e..214dfed3f3bd 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -946,12 +946,14 @@ enum ofperr ofpacts_pull_openflow_actions(struct ofpbuf 
*openflow,
   unsigned int actions_len,
   enum ofp_version version,
   const struct vl_mff_map *,
+  uint64_t *ofpacts_tlv_bitmap,
   struct ofpbuf *ofpacts);
 enum ofperr
 ofpacts_pull_openflow_instructions(struct ofpbuf *openflow,
unsigned int instructions_len,
enum ofp_version version,
  

[ovs-dev] [branch-2.7 4/4] ofproto: Move tun_table and vl_mff_map deletion.

2017-03-15 Thread Joe Stringer
From: Yi-Hung Wei 

In this patch, we move the tun_table and vl_mff_map deletion in
ofproto_destory__() to be in the following order.
1. Delete all the flows.
2. Delete vl_mff_map.
3. Delete tun_table.
The rationale behind this order is that a flow may use a variable length
mf_field, and a variable length mf_field is defined by a TLV mapping
in tun_table.

Signed-off-by: Yi-Hung Wei 
Signed-off-by: Joe Stringer 
---
 ofproto/ofproto.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index b00a4af5e5c7..ceb020778d81 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1574,14 +1574,6 @@ ofproto_destroy__(struct ofproto *ofproto)
 cmap_destroy(>groups);
 
 hmap_remove(_ofprotos, >hmap_node);
-tun_metadata_free(ovsrcu_get_protected(struct tun_table *,
-   >metadata_tab));
-
-ovs_mutex_lock(>vl_mff_map.mutex);
-mf_vl_mff_map_clear(>vl_mff_map, true);
-ovs_mutex_unlock(>vl_mff_map.mutex);
-cmap_destroy(>vl_mff_map.cmap);
-ovs_mutex_destroy(>vl_mff_map.mutex);
 
 free(ofproto->name);
 free(ofproto->type);
@@ -1600,6 +1592,14 @@ ofproto_destroy__(struct ofproto *ofproto)
 }
 free(ofproto->tables);
 
+ovs_mutex_lock(>vl_mff_map.mutex);
+mf_vl_mff_map_clear(>vl_mff_map, true);
+ovs_mutex_unlock(>vl_mff_map.mutex);
+cmap_destroy(>vl_mff_map.cmap);
+ovs_mutex_destroy(>vl_mff_map.mutex);
+tun_metadata_free(ovsrcu_get_protected(struct tun_table *,
+   >metadata_tab));
+
 ovs_assert(hindex_is_empty(>cookies));
 hindex_destroy(>cookies);
 
-- 
2.11.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [branch-2.7 2/4] nx-match: Use vl_mff_map to parse match field.

2017-03-15 Thread Joe Stringer
From: Yi-Hung Wei 

vl_mff_map is introduced in commit 04f48a68c428 ("ofp-actions: Fix variable
length meta-flow OXMs") to account variable length mf_field, and it is used
to decode variable length mf_field in ofp_action. In this patch, vl_mff_map
is further used to decode the variable length match field as well.

Signed-off-by: Yi-Hung Wei 
Signed-off-by: Joe Stringer 
---
 include/openvswitch/ofp-util.h |  6 ++--
 lib/learning-switch.c  |  2 +-
 lib/nx-match.c | 46 
 lib/nx-match.h |  7 ++--
 lib/ofp-print.c|  4 +--
 lib/ofp-util.c | 80 +-
 ofproto/ofproto.c  | 11 +++---
 ovn/controller/pinctrl.c   |  2 +-
 tests/ofproto.at   | 15 +---
 utilities/ovs-ofctl.c  | 13 +++
 10 files changed, 123 insertions(+), 63 deletions(-)

diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index 0c3a10aa4264..e73a942a3e15 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -222,7 +222,7 @@ void ofputil_match_to_ofp10_match(const struct match *, 
struct ofp10_match *);
 
 /* Work with ofp11_match. */
 enum ofperr ofputil_pull_ofp11_match(struct ofpbuf *, const struct tun_table *,
- struct match *,
+ const struct vl_mff_map *, struct match *,
  uint16_t *padded_match_len);
 enum ofperr ofputil_pull_ofp11_mask(struct ofpbuf *, struct match *,
 struct mf_bitmap *bm);
@@ -352,7 +352,7 @@ struct ofputil_flow_stats_request {
 
 enum ofperr ofputil_decode_flow_stats_request(
 struct ofputil_flow_stats_request *, const struct ofp_header *,
-const struct tun_table *);
+const struct tun_table *, const struct vl_mff_map *);
 struct ofpbuf *ofputil_encode_flow_stats_request(
 const struct ofputil_flow_stats_request *, enum ofputil_protocol);
 
@@ -457,6 +457,7 @@ void ofputil_packet_in_destroy(struct ofputil_packet_in *);
 
 enum ofperr ofputil_decode_packet_in(const struct ofp_header *, bool loose,
  const struct tun_table *,
+ const struct vl_mff_map *,
  struct ofputil_packet_in *,
  size_t *total_len, uint32_t *buffer_id,
  struct ofpbuf *continuation);
@@ -509,6 +510,7 @@ struct ofpbuf *ofputil_encode_packet_in_private(
 enum ofperr ofputil_decode_packet_in_private(
 const struct ofp_header *, bool loose,
 const struct tun_table *,
+const struct vl_mff_map *,
 struct ofputil_packet_in_private *,
 size_t *total_len, uint32_t *buffer_id);
 
diff --git a/lib/learning-switch.c b/lib/learning-switch.c
index bc757f46dd7a..77155d04fcc0 100644
--- a/lib/learning-switch.c
+++ b/lib/learning-switch.c
@@ -523,7 +523,7 @@ process_packet_in(struct lswitch *sw, const struct 
ofp_header *oh)
 struct dp_packet pkt;
 struct flow flow;
 
-error = ofputil_decode_packet_in(oh, true, NULL, , NULL,
+error = ofputil_decode_packet_in(oh, true, NULL, NULL, , NULL,
  _id, NULL);
 if (error) {
 VLOG_WARN_RL(, "failed to decode packet-in: %s",
diff --git a/lib/nx-match.c b/lib/nx-match.c
index c258869eec80..124cb71eb7c8 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -480,13 +480,14 @@ nx_pull_header(struct ofpbuf *b, const struct vl_mff_map 
*vl_mff_map,
 
 static enum ofperr
 nx_pull_match_entry(struct ofpbuf *b, bool allow_cookie,
+const struct vl_mff_map *vl_mff_map,
 const struct mf_field **field,
 union mf_value *value, union mf_value *mask)
 {
 enum ofperr error;
 uint64_t header;
 
-error = nx_pull_entry__(b, allow_cookie, NULL, , field, value,
+error = nx_pull_entry__(b, allow_cookie, vl_mff_map, , field, value,
 mask);
 if (error) {
 return error;
@@ -507,7 +508,8 @@ nx_pull_match_entry(struct ofpbuf *b, bool allow_cookie,
 static enum ofperr
 nx_pull_raw(const uint8_t *p, unsigned int match_len, bool strict,
 struct match *match, ovs_be64 *cookie, ovs_be64 *cookie_mask,
-const struct tun_table *tun_table)
+const struct tun_table *tun_table,
+const struct vl_mff_map *vl_mff_map)
 {
 ovs_assert((cookie != NULL) == (cookie_mask != NULL));
 
@@ -525,7 +527,8 @@ nx_pull_raw(const uint8_t *p, unsigned int match_len, bool 
strict,
 union mf_value mask;
 enum ofperr error;
 
-error = nx_pull_match_entry(, cookie != NULL, , , );
+error = nx_pull_match_entry(, cookie != NULL, vl_mff_map, ,
+, );
 if 

[ovs-dev] [branch-2.7 1/4] nx-match: Fix oxm decode.

2017-03-15 Thread Joe Stringer
From: Yi-Hung Wei 

decode_nx_packet_in2() may be used by the switch to parse NXT_RESUME
messages, where we need exact match on the oxm header. It's also used by
OVN to parse NXT_PACKET_IN2 messages. For the switch, strict
prerequisites should be applied but for the controller, this should not
be the case. Pass the 'loose' parameter down to oxm_decode() to apply
these restrictions correctly based on which code is performing decode.

Signed-off-by: Yi-Hung Wei 
Signed-off-by: Joe Stringer 
---
 lib/nx-match.c | 8 +---
 lib/nx-match.h | 4 ++--
 lib/ofp-util.c | 2 +-
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/nx-match.c b/lib/nx-match.c
index 91401e2201c6..c258869eec80 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -678,12 +678,14 @@ oxm_pull_match_loose(struct ofpbuf *b, const struct 
tun_table *tun_table,
  *
  * Fails with an error when encountering unknown OXM headers.
  *
- * Returns 0 if successful, otherwise an OpenFlow error code. */
+ * If 'loose' is true, encountering unknown OXM headers or missing field
+ * prerequisites are not considered as error conditions.
+ */
 enum ofperr
-oxm_decode_match(const void *oxm, size_t oxm_len,
+oxm_decode_match(const void *oxm, size_t oxm_len, bool loose,
  const struct tun_table *tun_table, struct match *match)
 {
-return nx_pull_raw(oxm, oxm_len, true, match, NULL, NULL, tun_table);
+return nx_pull_raw(oxm, oxm_len, !loose, match, NULL, NULL, tun_table);
 }
 
 /* Verify an array of OXM TLVs treating value of each TLV as a mask,
diff --git a/lib/nx-match.h b/lib/nx-match.h
index 5dca24a01a49..e103dd5fa74d 100644
--- a/lib/nx-match.h
+++ b/lib/nx-match.h
@@ -61,8 +61,8 @@ enum ofperr oxm_pull_match(struct ofpbuf *, const struct 
tun_table *,
struct match *);
 enum ofperr oxm_pull_match_loose(struct ofpbuf *, const struct tun_table *,
  struct match *);
-enum ofperr oxm_decode_match(const void *, size_t, const struct tun_table *,
- struct match *);
+enum ofperr oxm_decode_match(const void *, size_t, bool,
+ const struct tun_table *, struct match *);
 enum ofperr oxm_pull_field_array(const void *, size_t fields_len,
  struct field_array *);
 
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 0c9343ec400b..d3153370f2e6 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3398,7 +3398,7 @@ decode_nx_packet_in2(const struct ofp_header *oh, bool 
loose,
 
 case NXPINT_METADATA:
 error = oxm_decode_match(payload.msg, ofpbuf_msgsize(),
- tun_table, >flow_metadata);
+ loose, tun_table, >flow_metadata);
 break;
 
 case NXPINT_USERDATA:
-- 
2.11.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [branch-2.7 0/4] Backport of variable length metaflow field fixes.

2017-03-15 Thread Joe Stringer
Commit 04f48a68c428 ("ofp-actions: Fix variable length meta-flow OXMs."), on
branch-2.7 as 9554b03d6ab7, attempted to address incorrect encode and decode of
variable length metaflow fields where the OXM/NXM encoding of the variable
length fields would incorrectly serialize the length. The patch addresses this
by introducing a new per-bridge structure that adds additional metaflow fields
for the variable-length fields on demand when the TLVs are configured by a
controller.

Unfortunately, in the original patch there was nothing ensuring that flows
referring to variable length fields would retain valid field references when
controllers reconfigure the TLVs. In practice, this could lead to a crash of
ovs-vswitchd by configuring a TLV field, adding a flow which refers to it,
removing the TLV field, then running some traffic that hit the configured flow.

This series looks to remedy the situation by reference counting the variable
length fields and preventing a controller from reconfiguring TLV fields when
there are active flows whose match or actions refer to the field.

This series was applied to master, but given the size of the change and the
minor changes necessary to apply to branch-2.7, I would feel more confident in
backporting it if there was an extra round of review to ensure that nothing was
missed when this series was first applied to master.

Yi-Hung Wei (4):
  nx-match: Fix oxm decode.
  nx-match: Use vl_mff_map to parse match field.
  ofproto: Add ref counting for variable length mf_fields.
  ofproto: Move tun_table and vl_mff_map deletion.

 build-aux/extract-ofp-actions |   9 +-
 include/openvswitch/ofp-actions.h |   2 +
 include/openvswitch/ofp-errors.h  |   4 +
 include/openvswitch/ofp-util.h|   7 +-
 lib/learn.c   |   5 +
 lib/learning-switch.c |   2 +-
 lib/meta-flow.c   | 228 --
 lib/nx-match.c|  52 ++---
 lib/nx-match.h|   9 +-
 lib/ofp-actions.c | 208 +-
 lib/ofp-print.c   |   4 +-
 lib/ofp-util.c| 101 +++--
 lib/vl-mff-map.h  |  17 ++-
 ofproto/ofproto-provider.h|   4 +
 ofproto/ofproto.c |  58 +++---
 ovn/controller/pinctrl.c  |   8 +-
 tests/ofproto.at  |  15 ++-
 tests/tunnel.at   |  76 -
 utilities/ovs-ofctl.c |  15 +--
 19 files changed, 614 insertions(+), 210 deletions(-)

-- 
2.11.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 4/4] ofproto: Move tun_table and vl_mff_map deletion.

2017-03-15 Thread Joe Stringer
On 13 March 2017 at 11:28, Yi-Hung Wei  wrote:
> In this patch, we move the tun_table and vl_mff_map deletion in
> ofproto_destory__() to be in the following order.
> 1. Delete all the flows.
> 2. Delete vl_mff_map.
> 3. Delete tun_table.
> The rationale behind this order is that a flow may use a variable length
> mf_field, and a variable length mf_field is defined by a TLV mapping
> in tun_table.
>
> Signed-off-by: Yi-Hung Wei 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 3/4] ofproto: Add ref counting for variable length mf_fields.

2017-03-15 Thread Joe Stringer
On 13 March 2017 at 11:28, Yi-Hung Wei  wrote:
> Currently, a controller may potentially trigger a segmentation fault if it
> accidentally removes a TLV mapping that is still used by an active flow.
> To resolve this issue, in this patch, we maintain reference counting for each
> dynamically allocated variable length mf_fields, so that vswitchd can use this
> information to properly remove a TLV mapping, and to return an error if the
> controller tries to remove a TLV mapping that is still used by any active 
> flow.
>
> To keep track of the usage of tun_metadata for each flow, two 'uint64_t'
> bitmaps are introduce for the flow match and flow action respectively. We use
> 'uint64_t' as a bitmap since the 64 geneve TLV tunnel metadata are the only
> available variable length mf_fields for now. We shall adopt general bitmap 
> when
> more variable length mf_fields are introduced. The bitmaps are configured
> during the flow decoding process, and vswitchd use these bitmaps to increase 
> or
> decrease the ref counting when the flow is created or deleted.
>
> VMWare-BZ: #1768370
> Fixes: 04f48a68c428 ("ofp-actions: Fix variable length meta-flow OXMs.")
> Suggested-by: Jarno Rajahalme 
> Suggested-by: Joe Stringer 
> Signed-off-by: Yi-Hung Wei 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 2/4] nx-match: Use vl_mff_map to parse match field.

2017-03-15 Thread Joe Stringer
On 13 March 2017 at 11:28, Yi-Hung Wei  wrote:
> vl_mff_map is introduced in commit 04f48a68c428 ("ofp-actions: Fix variable
> length meta-flow OXMs") to account variable length mf_field, and it is used
> to decode variable length mf_field in ofp_action. In this patch, vl_mff_map
> is further used to decode the variable length match field as well.
>
> Signed-off-by: Yi-Hung Wei 

Thanks, I made some minor style changes and applied this to master
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 1/4] nx-match: Fix oxm decode.

2017-03-15 Thread Joe Stringer
On 13 March 2017 at 11:27, Yi-Hung Wei  wrote:
> decode_nx_packet_in2() may be used by the switch to parse NXT_RESUME messages,
> where we need exact match on the oxm header. Therefore, change
> oxm_decode_loose() to oxm_decode() that takes an extra argument to indicate 
> whether
> we want strict or loose match.
>
> Fixes: 7befb20d0f70 ("ofp-util: Ignore unknown fields in 
> ofputil_decode_packet_in2()")
> Signed-off-by: Yi-Hung Wei 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/7] Add support for 802.1ad (QinQ tunneling)

2017-03-15 Thread Andy Zhou
On Wed, Mar 1, 2017 at 2:47 PM, Eric Garver  wrote:
> Flow key handling changes:
>  - Add VLAN header array in struct flow, to record multiple 802.1q VLAN
>headers.
>  - Add dpif multi-VLAN capability probing. If datapath supports
>multi-VLAN, increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
>
> Refactor VLAN handling in dpif-xlate:
>  - Introduce 'xvlan' to track VLAN stack during flow processing.
>  - Input and output VLAN translation according to the xbundle type.
>
> Push VLAN action support:
>  - Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
>  - Support push_vlan on dot1q packets.
>
> Use other_config:vlan-limit in table Open_vSwitch to limit maximum VLANs
> that can be matched. This allows us to preserve backwards compatibility.
>
> Add test cases for VLAN depth limit, Multi-VLAN actions and QinQ VLAN
> handling
>
> Co-authored-by: Thomas F Herbert 
> Signed-off-by: Thomas F Herbert 
> Co-authored-by: Xiao Liang 
> Signed-off-by: Xiao Liang 
> Signed-off-by: Eric Garver 

Thanks for working on this. And sorry it took a while to get those reviewed.

The patch does not apply cleanly to master.  Do you mind rebase and repost?
I suppose you will need to this anyways for them to be pushed.

I have a few comments for this patch,  I will review this other
patches more carefully
after rebase.

git am complained about extra while space. I think it is about the
line below ofproto_set_vlan_limit()
>
>  v2.7.0 - xx xxx 
>  -
> diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h
> index df80dfe46199..0ed02e08fd2d 100644
> --- a/include/openvswitch/flow.h
> +++ b/include/openvswitch/flow.h
> @@ -23,7 +23,7 @@
>  /* This sequence number should be incremented whenever anything involving 
> flows
>   * or the wildcarding of flows changes.  This will cause build assertion
>   * failures in places which likely need to be updated. */
> -#define FLOW_WC_SEQ 36
> +#define FLOW_WC_SEQ 37
>
>  /* Number of Open vSwitch extension 32-bit registers. */
>  #define FLOW_N_REGS 16
> @@ -61,6 +61,12 @@ const char *flow_tun_flag_to_string(uint32_t flags);
>  /* Maximum number of supported MPLS labels. */
>  #define FLOW_MAX_MPLS_LABELS 3
>
> +/* Maximum number of supported VLAN headers. */
> +#define FLOW_MAX_VLAN_HEADERS 2
> +
> +/* Legacy maximum VLAN headers */
> +#define LEGACY_MAX_VLAN_HEADERS 1
> +
>  /*
>   * A flow in the network.
>   *
> @@ -103,7 +109,8 @@ struct flow {
>  struct eth_addr dl_dst; /* Ethernet destination address. */
>  struct eth_addr dl_src; /* Ethernet source address. */
>  ovs_be16 dl_type;   /* Ethernet frame type. */
> -ovs_be16 vlan_tci;  /* If 802.1Q, TCI | VLAN_CFI; otherwise 0. */
> +uint8_t pad2[2];/* Pad to 64 bits. */
Do we need 4 more bytes for 64 bit alignment?



> +void
> +flow_pop_vlan(struct flow *flow, struct flow_wildcards *wc)
> +{
> +int n = flow_count_vlan_headers(flow);
> +if (n == 0) {
> +return;
> +}
Can n be 0? or should this be an assert instead?
> +if (wc) {
> +memset(>masks.vlans[1], 0xff,
> +   sizeof(union flow_vlan_hdr) * (n - 1));
> +}

Should we also set the mask for wc->masks.vlans[0]?
> +memmove(>vlans[0], >vlans[1],
> +sizeof(union flow_vlan_hdr) * (n - 1));
> +memset(>vlans[n - 1], 0, sizeof(union flow_vlan_hdr));
> +}
> +
> +void
> +flow_push_vlan_uninit(struct flow *flow, struct flow_wildcards *wc)
> +{
> +int n = flow_count_vlan_headers(flow);

Do we always know there is room in the flow?
> +if (wc) {
> +memset(wc->masks.vlans, 0xff, sizeof(union flow_vlan_hdr) * n);
> +}
Should we set mask for the vlan fields that is about to be pushed?

> +memmove(>vlans[1], >vlans[0],
> +sizeof(union flow_vlan_hdr) * (FLOW_MAX_VLAN_HEADERS - 1));
> +memset(>vlans[0], 0, sizeof(union flow_vlan_hdr));
The function syas "_uninit",  I don't think the last memset() is needed.
>  }
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Document OVN support in ovs-sandbox.

2017-03-15 Thread Russell Bryant
On Wed, Mar 15, 2017 at 4:37 AM, Numan Siddique  wrote:
> On Tue, Mar 14, 2017 at 1:56 AM, Russell Bryant  wrote:
>>
>> A previous commit removed the original ovs-sandbox based OVN tutorial
>> because it became too outdated and difficult to maintain.  However,
>> the use of ovs-sandbox for basic OVN development and testing is incredibly
>> useful, so we should provide at least basic documentation on how to use
>> it.
>>
>> This commit introduces a new and shorter document that shows how to use
>> OVN
>> in ovs-sandbox.  It provides a single sample configuration, as well as a
>> sample ovn-trace command.
>>
>> Signed-off-by: Russell Bryant 
>
>
> Nice to have the basic documentation back.
>
> Acked-by: Numan Siddique 

Thanks, I applied this to master.

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC] lib/automake.mk: remove runtime directories

2017-03-15 Thread Aaron Conole
Markos Chandras  writes:

> Hi Aaron,
>
> On 03/09/2017 03:35 PM, Aaron Conole wrote:
>> The Open vSwitch run, log, and DB directories are installed as part of the
>> normal `make install` process.  However, this means they are created with
>> user and group ownership that may conflict with the desired user.  For
>> example, running `make install` as root will install those files as
>> root:root, whereas the runtime user desired may be openvswitch:openvswitch.
>> 
>> Since these directories are automatically created as part of the ovs-ctl
>> command, and with the correct user:group permissions, it makes sense to
>> delay creation until these directories are actually required.
>> 
>> Signed-off-by: Aaron Conole 
>
> It looks reasonable to me. Thanks!
>
> Reviewed-by: Markos Chandras 

Thanks Markos.  I'm planning on submitting this as a full PATCH once
I've double checked that it doesn't break any of my existing test
scripts and rpm builds.

-Aaron
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN: Compromised Chassis Mitigation

2017-03-15 Thread Mickey Spiegel
On Wed, Mar 15, 2017 at 7:18 AM, Lance Richardson 
wrote:

> > From: "Mickey Spiegel" 
> > To: "Lance Richardson" 
> > Cc: "Russell Bryant" , "devovs" 
> > Sent: Tuesday, March 14, 2017 3:06:53 PM
> > Subject: Re: [ovs-dev] OVN: Compromised Chassis Mitigation
> >
>
> Hi Mickey,
>
> Thanks for the excellent feedback.  Here's the latest pass:
>

Thanks for driving this.

All of this looks good to me :-)

I hope others can provide some feedback on the question at the bottom.

Mickey


> 1) Add a new column, "role", of type "string" to the remote connection
>table. If set, role-based access control is applied to transactions
>on these connections using "role" as the index to the "RBAC_Roles"
>table. If not set, role-based access control is not applied (e.g.
>local unix: remotes between northd and ovsdb will not require RBAC
>and will therefore not set the "role" column).
>
>For connections having role-based ACLs enabled, a reliable client ID
>is required. This will require the use of SSL and client certificates
>with CN field containing the client ID.
>
> 2) Add a new table, "RBAC_Roles", which is indexed by a role name
>and contains two columns:
>   "name":Name of role, type string. Corresponds to "role"
>  column in remote connection table.
>   "permissions": A map of string (table name) to UUID (row in the
>  "RBAC_Permissions" table).
>
>The purpose of this table is to select a row in the RBAC_Permissions
>table based on the transaction client's "role" and the name of a
>table to be modified by an operation within a transaction. Having
>this level of indirection allows new roles and access controls to
>be created and managed dynamically, without having to update code
>or schemas.
>
> 3) Add a new table, "RBAC_Permissions" which is initialized to contain one
>row for each table in the schema that can be modified by ovn-controller.
>Each row contains:
>
>   - An "authorization" column containing a set of "string" type, where
> each string is the name of a column (or column:key) whose contents
> are to be compared against the ID of client attempting the
> transaction
> (CN field from client certificate). If this set is empty, all IDs
> are
> considered to be authorized.  If this set contains more than one
> string,
> at least one must contain the client ID in order for the action to
> be
> considered authorized.
>
>   - An "insert_delete" column of type boolean. If true, insertions
> are allowed by any client and deletions are allowed for rows
> meeting the authorization requirement.
>
>   - An "update" column of type "set of strings". Each string is the
> name of a column (or column:key) for which modification is allowed
> in rows meeting the authorization requirement.
>
> For the current implementation of the OVN_Southbound schema, these tables
> would have the following contents:
>
> RBAC_Roles:
> name:"controller"
> permissions: "Chassis": ,
>  "Encap":   ,
>  "Port_Binding": RBAC_Permissions>,
>  "MAC_Binding":  RBAC_Permissions>
>
> RBAC_Permissions:
>Chassis row:
>   authorization: "chassis"
>   insert_delete: "true"
>   update:"nb_cfg", "external_ids", "encaps",
> "vtep_logical_switches"
>  Modification of these columns is allowed for rows
> which in
>  which the authorization check passes.
>
>Encap row:
>   authorization: "chassis"  New column containing CN ID of row creator.
>   insert_delete: "true"
>   update:"type", "options", "ip"
>
>Port_Binding row:
>   authorization: "" All chassis are authorized. In a recent
> live migration proposal, this column would contain
> "options:chassis" and "options:migration-
> destination".
>   insert_delete: "false"
>   update:"chassis"
>
>MAC_Binding row:
>   authorization: "" All chassis are allowed to update/delete any row.
> Long- term plan would be to eliminate this table.
>   insert_delete:  "true"
>   update: "logical_port", "ip", "mac", "datapath"
>
> 4) Implementation logic:
>- Extract CN ID from client certificate and store with session state
>  when remote connection is established.
>- Modify ovsdb_execute_{insert,delete,update,mutate} and related
> functions
>  to pass the client session state (need to know whether RBAC is enabled
>  for the session and client ID for the session).
>- Modify ovsdb_execute_{insert,delete,update,mutate} to apply access
>  controls.
>- Logging, statistics.
>- Unit tests.
>- Documentation.
>
> 

Re: [ovs-dev] [ dev-openvswitch ] sflow configuration

2017-03-15 Thread Ben Pfaff
On Wed, Mar 15, 2017 at 02:52:19PM +0100, nicolas prochazka wrote:
> Hello,
> How can i configure agentSubId in sflow configuration .

It doesn't look like OVS supports configuring the subid.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN: Compromised Chassis Mitigation

2017-03-15 Thread Lance Richardson
> From: "Mickey Spiegel" 
> To: "Lance Richardson" 
> Cc: "Russell Bryant" , "devovs" 
> Sent: Tuesday, March 14, 2017 3:06:53 PM
> Subject: Re: [ovs-dev] OVN: Compromised Chassis Mitigation
> 

Hi Mickey,

Thanks for the excellent feedback.  Here's the latest pass:

1) Add a new column, "role", of type "string" to the remote connection
   table. If set, role-based access control is applied to transactions
   on these connections using "role" as the index to the "RBAC_Roles"
   table. If not set, role-based access control is not applied (e.g.
   local unix: remotes between northd and ovsdb will not require RBAC
   and will therefore not set the "role" column).

   For connections having role-based ACLs enabled, a reliable client ID
   is required. This will require the use of SSL and client certificates
   with CN field containing the client ID.

2) Add a new table, "RBAC_Roles", which is indexed by a role name
   and contains two columns:
  "name":Name of role, type string. Corresponds to "role"
 column in remote connection table.
  "permissions": A map of string (table name) to UUID (row in the
 "RBAC_Permissions" table).

   The purpose of this table is to select a row in the RBAC_Permissions
   table based on the transaction client's "role" and the name of a
   table to be modified by an operation within a transaction. Having
   this level of indirection allows new roles and access controls to
   be created and managed dynamically, without having to update code
   or schemas.

3) Add a new table, "RBAC_Permissions" which is initialized to contain one
   row for each table in the schema that can be modified by ovn-controller.
   Each row contains:

  - An "authorization" column containing a set of "string" type, where
each string is the name of a column (or column:key) whose contents
are to be compared against the ID of client attempting the transaction
(CN field from client certificate). If this set is empty, all IDs are
considered to be authorized.  If this set contains more than one string,
at least one must contain the client ID in order for the action to be
considered authorized. 

  - An "insert_delete" column of type boolean. If true, insertions
are allowed by any client and deletions are allowed for rows
meeting the authorization requirement.

  - An "update" column of type "set of strings". Each string is the
name of a column (or column:key) for which modification is allowed
in rows meeting the authorization requirement.

For the current implementation of the OVN_Southbound schema, these tables
would have the following contents:

RBAC_Roles:
name:"controller"
permissions: "Chassis": ,
 "Encap":   ,
 "Port_Binding":,
 "MAC_Binding": 

RBAC_Permissions:
   Chassis row:
  authorization: "chassis"
  insert_delete: "true"
  update:"nb_cfg", "external_ids", "encaps", "vtep_logical_switches"
 Modification of these columns is allowed for rows which in
 which the authorization check passes.

   Encap row:
  authorization: "chassis"  New column containing CN ID of row creator.
  insert_delete: "true"
  update:"type", "options", "ip"

   Port_Binding row:
  authorization: "" All chassis are authorized. In a recent 
live migration proposal, this column would contain
"options:chassis" and "options:migration-destination".
  insert_delete: "false"
  update:"chassis"

   MAC_Binding row:
  authorization: "" All chassis are allowed to update/delete any row.
Long- term plan would be to eliminate this table.
  insert_delete:  "true"
  update: "logical_port", "ip", "mac", "datapath"

4) Implementation logic:
   - Extract CN ID from client certificate and store with session state
 when remote connection is established.
   - Modify ovsdb_execute_{insert,delete,update,mutate} and related functions
 to pass the client session state (need to know whether RBAC is enabled
 for the session and client ID for the session).
   - Modify ovsdb_execute_{insert,delete,update,mutate} to apply access
 controls.
   - Logging, statistics.
   - Unit tests.
   - Documentation.

Open questions:
   1) Is "list of columns/column:keys" with implied boolean OR sufficient,
 or do we need something more powerful (e.g. a mini-language supporting
 comparison/boolean/set/... operations).
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [ dev-openvswitch ] sflow configuration

2017-03-15 Thread nicolas prochazka
Hello,
How can i configure agentSubId in sflow configuration .

[nicolas-kabylake-8259511d-ffc3-3191-1428-f44d30684975]ovs-vsctl list sflow
_uuid   : 72794c6b-4b70-46b5-887f-31e622684c38
agent   : neoEtap
external_ids: {}
header  : 128
polling : 10
sampling: 100
targets : ["10.10.0.71:6343", "172.16.10.3:6343"]
[nicolas-kabylake-8259511d-ffc3-3191-1428-f44d30684975]cat /data/ovs.sh
#!/bin/bash
COLLECTOR_IP=[\"172.16.10.3:6343\",\"10.10.0.71:6343\"]
COLLECTOR_PORT=7001
AGENT=neoEtap
HEADER=128
SAMPLING=100
POLLING=10
BRIDGE_NAME=switch0

 ovs-vsctl -- --id=@sflow1 create sflow agent=${AGENT}  \
target=${COLLECTOR_IP} header=${HEADER} \
sampling=${SAMPLING} polling=${POLLING} \
-- set bridge ${BRIDGE_NAME} sflow=@sflow1



agentSubId does not seems in sflow table, however we can see it in sflow
code from ovs.

Regards,
Nicolas Prochazka
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Call DR JAMES ADAMS No + 27 748 044 206 IRREVOCABLE PAYMENT ORDER VIA ATM CARD

2017-03-15 Thread Call DR JAMES ADAMS No + 27 748 044 206 IRREVOCABLE PAYMENT ORDER VIA ATM CARD
 ___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] netdev-dpdk: add dpdk interface strip_vlan option

2017-03-15 Thread Zang MingJie
dpdk-strip-vlan option specifies whether strip vlan for the dpdk interface.

Signed-off-by: Zang MingJie 
---
 lib/netdev-dpdk.c| 23 ++-
 vswitchd/vswitch.xml |  7 +++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ddc651bf9..ea49adf3e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -373,6 +373,7 @@ struct netdev_dpdk {
 int requested_n_rxq;
 int requested_rxq_size;
 int requested_txq_size;
+bool requested_strip_vlan;
 
 /* Number of rx/tx descriptors for physical devices */
 int rxq_size;
@@ -395,6 +396,8 @@ struct netdev_dpdk {
 /* DPDK-ETH hardware offload features,
  * from the enum set 'dpdk_hw_ol_features' */
 uint32_t hw_ol_features;
+
+bool strip_vlan;
 };
 
 struct netdev_rxq_dpdk {
@@ -646,6 +649,7 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 conf.rxmode.jumbo_frame = 0;
 conf.rxmode.max_rx_pkt_len = 0;
 }
+conf.rxmode.hw_vlan_strip = dev->strip_vlan;
 conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
   NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
 /* A device may report more queues than it makes available (this has
@@ -1133,6 +1137,19 @@ netdev_dpdk_process_devargs(const char *devargs, char 
**errp)
 }
 
 static void
+dpdk_set_strip_vlan_config(struct netdev_dpdk *dev, const struct smap *args)
+OVS_REQUIRES(dev->mutex)
+{
+bool strip_vlan;
+
+strip_vlan = smap_get_bool(args, "dpdk-strip-vlan", false);
+if (strip_vlan != dev->requested_strip_vlan) {
+dev->requested_strip_vlan = strip_vlan;
+netdev_request_reconfigure(>up);
+}
+}
+
+static void
 dpdk_set_rxq_config(struct netdev_dpdk *dev, const struct smap *args)
 OVS_REQUIRES(dev->mutex)
 {
@@ -1182,6 +1199,7 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
struct smap *args,
 ovs_mutex_lock(>mutex);
 
 dpdk_set_rxq_config(dev, args);
+dpdk_set_strip_vlan_config(dev, args);
 
 dpdk_process_queue_size(netdev, args, "n_rxq_desc",
 NIC_PORT_DEFAULT_RXQ_SIZE,
@@ -3123,7 +3141,8 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
 && dev->mtu == dev->requested_mtu
 && dev->rxq_size == dev->requested_rxq_size
 && dev->txq_size == dev->requested_txq_size
-&& dev->socket_id == dev->requested_socket_id) {
+&& dev->socket_id == dev->requested_socket_id
+&& dev->strip_vlan == dev->requested_strip_vlan) {
 /* Reconfiguration is unnecessary */
 
 goto out;
@@ -3145,6 +3164,8 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
 dev->rxq_size = dev->requested_rxq_size;
 dev->txq_size = dev->requested_txq_size;
 
+dev->strip_vlan = dev->requested_strip_vlan;
+
 rte_free(dev->tx_q);
 err = dpdk_eth_dev_init(dev);
 dev->tx_q = netdev_dpdk_alloc_txq(netdev->n_txq);
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index a91be59b0..5c0083188 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -2360,6 +2360,13 @@
 
   
 
+  
+
+  Specifies whether strip vlan for the dpdk interface. It is useful
+  when using ixgbevf interface with a vlan filter.
+
+  
+
   
 
-- 
2.11.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Document OVN support in ovs-sandbox.

2017-03-15 Thread Numan Siddique
On Tue, Mar 14, 2017 at 1:56 AM, Russell Bryant  wrote:

> A previous commit removed the original ovs-sandbox based OVN tutorial
> because it became too outdated and difficult to maintain.  However,
> the use of ovs-sandbox for basic OVN development and testing is incredibly
> useful, so we should provide at least basic documentation on how to use it.
>
> This commit introduces a new and shorter document that shows how to use OVN
> in ovs-sandbox.  It provides a single sample configuration, as well as a
> sample ovn-trace command.
>
> Signed-off-by: Russell Bryant 
>

​Nice to have the basic documentation back.

Acked-by: Numan Siddique 


---
>  Documentation/automake.mk   |   1 +
>  Documentation/index.rst |   1 +
>  Documentation/tutorials/index.rst   |   1 +
>  Documentation/tutorials/ovn-sandbox.rst | 178
> 
>  tutorial/automake.mk|   3 +-
>  tutorial/ovn-setup.sh   |  27 +
>  6 files changed, 210 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/tutorials/ovn-sandbox.rst
>  create mode 100755 tutorial/ovn-setup.sh
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index 22a614b..c4076aa 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -24,6 +24,7 @@ EXTRA_DIST += \
> Documentation/intro/install/xenserver.rst \
> Documentation/tutorials/index.rst \
> Documentation/tutorials/ovs-advanced.rst \
> +   Documentation/tutorials/ovn-sandbox.rst \
> Documentation/topics/index.rst \
> Documentation/topics/bonding.rst \
> Documentation/topics/datapath.rst \
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index 9f692b1..6970dce 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -61,6 +61,7 @@ vSwitch? Start here.
>:doc:`intro/install/dpdk`
>
>  - **Tutorials:** :doc:`tutorials/ovs-advanced` |
> +  :doc:`tutorials/ovn-sandbox`
>
>  Deeper Dive
>  ---
> diff --git a/Documentation/tutorials/index.rst b/Documentation/tutorials/
> index.rst
> index 8a7e6ee..4ba99c0 100644
> --- a/Documentation/tutorials/index.rst
> +++ b/Documentation/tutorials/index.rst
> @@ -40,3 +40,4 @@ vSwitch.
> :maxdepth: 2
>
> ovs-advanced
> +   ovn-sandbox
> diff --git a/Documentation/tutorials/ovn-sandbox.rst
> b/Documentation/tutorials/ovn-sandbox.rst
> new file mode 100644
> index 000..d47fcf5
> --- /dev/null
> +++ b/Documentation/tutorials/ovn-sandbox.rst
> @@ -0,0 +1,178 @@
> +..
> +  Licensed under the Apache License, Version 2.0 (the "License"); you
> may
> +  not use this file except in compliance with the License. You may
> obtain
> +  a copy of the License at
> +
> +  http://www.apache.org/licenses/LICENSE-2.0
> +
> +  Unless required by applicable law or agreed to in writing, software
> +  distributed under the License is distributed on an "AS IS" BASIS,
> WITHOUT
> +  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> See the
> +  License for the specific language governing permissions and
> limitations
> +  under the License.
> +
> +  Convention for heading levels in Open vSwitch documentation:
> +
> +  ===  Heading 0 (reserved for the title in a document)
> +  ---  Heading 1
> +  ~~~  Heading 2
> +  +++  Heading 3
> +  '''  Heading 4
> +
> +  Avoid deeper levels because they do not render well.
> +
> +===
> +OVN Sandbox
> +===
> +
> +This tutorial shows you how to explore features using ``ovs-sandbox`` as a
> +simulated test environment.  It's assumed that you have an understanding
> of OVS
> +before going through this tutorial. Detail about OVN is covered in
> +ovn-architecture_, but this tutorial lets you quickly see it in action.
> +
> +Getting Started
> +---
> +
> +For some general information about ``ovs-sandbox``, see the "Getting
> Started"
> +section of the tutorial_.
> +
> +``ovs-sandbox`` does not include OVN support by default.  To enable OVN,
> you
> +must pass the ``--ovn`` flag.  For example, if running it straight from
> the ovs
> +git tree you would run::
> +
> +$ make sandbox SANDBOXFLAGS="--ovn"
> +
> +Running the sandbox with OVN enabled does the following additional steps
> to the
> +environment:
> +
> +1. Creates the ``OVN_Northbound`` and ``OVN_Southbound`` databases as
> described in
> +   `ovn-nb(5)`_ and `ovn-sb(5)`_.
> +
> +2. Creates a backup server for ``OVN_Southbond`` database. Sandbox launch
> +   screen provides the instructions on accessing the backup database.
> However
> +   access to the backup server is not required to go through the tutorial.
> +
> +3. Creates the ``hardware_vtep`` database as described in `vtep(5)`_.
> +
> +4. Runs the `ovn-northd(8)`_, `ovn-controller(8)`_, and
> +   `ovn-controller-vtep(8)`_ 

Re: [ovs-dev] [PATCH] ofproto: Add appctl command to show Datapath features

2017-03-15 Thread Andy Zhou
On Tue, Mar 14, 2017 at 6:19 PM, Joe Stringer  wrote:
> On 13 March 2017 at 14:21, Andy Zhou  wrote:
>> Exporting Datapath runtime detected features can be useful for
>> both debugging and for writing system unit testing easier.
>>
>> Signed-off-by: Andy Zhou 
>
> Can we perform some kind of build-time check how many fields are in
> 'support' and ensure that this function is extended when people add
> new probes?
>
That would be nice.  I don't have any concrete at the moment.

> Looks otherwise trivial and straightforward (and useful!), thanks.
> Couple of minor comments below.
>
> Acked-by: Joe Stringer 

Thanks for the review. Pushed.
>
>> ---
>>  ofproto/ofproto-dpif.c | 53 
>> +-
>>  1 file changed, 52 insertions(+), 1 deletion(-)
>>
>> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
>> index df779c2..af70ab3 100644
>> --- a/ofproto/ofproto-dpif.c
>> +++ b/ofproto/ofproto-dpif.c
>> @@ -4891,6 +4891,38 @@ ofproto_unixctl_dpif_dump_dps(struct unixctl_conn 
>> *conn, int argc OVS_UNUSED,
>>  }
>>
>>  static void
>> +show_dp_feature_b(struct ds *ds, const char *feature, bool b)
>> +{
>> +ds_put_format(ds, "%s: %s\n", feature, b ? "Yes" : "No");
>> +}
>> +
>> +static void
>> +show_dp_feature_s(struct ds *ds, const char *feature, size_t s)
>> +{
>> +ds_put_format(ds, "%s: %"PRIuSIZE"\n", feature, s);
>> +}
>> +
>> +static void
>> +dpif_show_support(const struct dpif_backer_support *support, struct ds *ds)
>> +{
>> +show_dp_feature_b(ds, "Variable length userdata",
>> +  support->variable_length_userdata);
>> +show_dp_feature_b(ds, "Masked set action", support->masked_set_action);
>> +show_dp_feature_b(ds, "Tunnel push pop",   support->tnl_push_pop);
>> +show_dp_feature_b(ds, "Ufid",  support->ufid);
>> +show_dp_feature_b(ds, "Trunc action",  support->trunc);
>> +show_dp_feature_b(ds, "Clone action",  support->clone);
>> +show_dp_feature_s(ds, "Max MPLS depth",support->odp.max_mpls_depth);
>> +show_dp_feature_b(ds, "Recirc",support->odp.recirc);
>> +show_dp_feature_b(ds, "CT state",  support->odp.ct_state);
>> +show_dp_feature_b(ds, "CT zone",   support->odp.ct_zone);
>> +show_dp_feature_b(ds, "CT mark",   support->odp.ct_mark);
>> +show_dp_feature_b(ds, "CT label",  support->odp.ct_label);
>> +show_dp_feature_b(ds, "CT State NAT",  support->odp.ct_state_nat);
>> +show_dp_feature_s(ds, "Max sample nesting",support->sample_nesting);
>
> Needs an extra whitespace on this last line, before support->sample_nesting
>
O.K. Fixed.

>> +}
>> +
>> +static void
>>  dpif_show_backer(const struct dpif_backer *backer, struct ds *ds)
>>  {
>>  const struct shash_node **ofprotos;
>> @@ -4899,7 +4931,6 @@ dpif_show_backer(const struct dpif_backer *backer, 
>> struct ds *ds)
>>  size_t i;
>>
>>  dpif_get_dp_stats(backer->dpif, _stats);
>> -
>>  ds_put_format(ds, "%s: hit:%"PRIu64" missed:%"PRIu64"\n",
>>dpif_name(backer->dpif), dp_stats.n_hit, 
>> dp_stats.n_missed);
>>
>> @@ -5119,6 +5150,24 @@ disable_datapath_clone(struct unixctl_conn *conn 
>> OVS_UNUSED,
>>  }
>>
>>  static void
>> +ofproto_unixctl_dpif_show_dp_features(struct unixctl_conn *conn,
>> +  int argc, const char *argv[],
>> +  void *aux OVS_UNUSED)
>> +{
>> +struct ds ds = DS_EMPTY_INITIALIZER;
>> +const char *br = argv[argc -1];
>> +struct ofproto_dpif *ofproto = ofproto_dpif_lookup(br);
>> +
>> +if (!ofproto) {
>> +unixctl_command_reply_error(conn, "no such bridge");
>> +return;
>> +}
>> +
>> +dpif_show_support(>backer->support, );
>> +unixctl_command_reply(conn, ds_cstr());
>> +}
>> +
>> +static void
>>  ofproto_unixctl_init(void)
>>  {
>>  static bool registered;
>> @@ -5139,6 +5188,8 @@ ofproto_unixctl_init(void)
>>   ofproto_unixctl_dpif_dump_dps, NULL);
>>  unixctl_command_register("dpif/show", "", 0, 0, 
>> ofproto_unixctl_dpif_show,
>>   NULL);
>> +unixctl_command_register("dpif/show-dp-features", "bridge", 1, 1,
>> + ofproto_unixctl_dpif_show_dp_features, NULL);
>>  unixctl_command_register("dpif/dump-flows", "[-m] bridge", 1, 2,
>>   ofproto_unixctl_dpif_dump_flows, NULL);
>>
>> --
>> 1.8.3.1
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-northd: Skip icmp4 packets destined for router ports from conntrack

2017-03-15 Thread Numan Siddique
Adding Joe and Jarno to CC.



On Tue, Mar 14, 2017 at 9:01 PM, Lance Richardson 
wrote:

>
>
> - Original Message -
> > From: "Numan Siddique" 
> > To: "Russell Bryant" 
> > Cc: "ovs dev" 
> > Sent: Tuesday, March 14, 2017 11:21:33 AM
> > Subject: Re: [ovs-dev] [PATCH] ovn-northd: Skip icmp4 packets destined
> for router ports from conntrack
> >
> > On Tue, Mar 14, 2017 at 12:57 AM, Russell Bryant 
> wrote:
> >
> > > On Fri, Mar 10, 2017 at 4:48 PM, Russell Bryant 
> wrote:
> > > > On Fri, Mar 10, 2017 at 2:35 PM, Russell Bryant 
> wrote:
> > > >> On Thu, Mar 9, 2017 at 11:52 PM, Numan Siddique <
> nusid...@redhat.com>
> > > wrote:
> > > >> I don't think it's a Neutron issue.
> > > >>
> > > >> I see the conntrack entry remaining in the UNREPLIED state, even in
> > > >> the working case where there's not an ACL dropping the reply.
> > > >>
> > > >> icmp 1 29 src=10.0.0.10 dst=10.0.0.1 type=8 code=0 id=25857
> > > >> [UNREPLIED] src=10.0.0.1 dst=10.0.0.10 type=0 code=0 id=25857 mark=0
> > > >> zone=8 use=1
> > > >>
> > > >> If I ping a different address (something past the logical router), I
> > > >> see a proper conntrack entry that's not in the UNREPLIED state.
> > > >>
> > > >> I wonder if there's something about how we are generating the ICMP
> > > >> response from the logical router that's making conntrack not
> properly
> > > >> associate it with the request?
> > > >
> > > > I checked into this and there's no meaningful difference in how we
> > > > form the ICMP reply.
> > > >
> > > > I'm guessing this has to do with the request and reply both going
> > > > through conntrack as a part of processing the same packet in the OVS
> > > > data path.  That's not behaving how we would expect.  I'll keep
> > > > looking next week to try to identify the root cause here, but I would
> > > > appreciate any insight others may have about the behavior we should
> > > > expect in this scenario.
> > >
> > > I'm able to reproduce this outside of OpenStack.
> > >
> > > https://gist.github.com/russellb/4ab0a9641f12f8ac66fdd6822ee7789e
> > >
> > > This creates an OVN setup with two logical switches connected by a
> > > logical router.
> > >
> > > switch d47412f9-e64a-4734-be26-80ee71ded805 (sw1)
> > > port sw1-port1
> > > addresses: ["50:54:00:00:00:03 11.0.0.2"]
> > > switch a1730d73-ccab-4f00-b748-3cafb683e9b8 (sw0)
> > > port sw0-port2
> > > addresses: ["50:54:00:00:00:02 192.168.0.3"]
> > > port sw0-port1
> > > addresses: ["50:54:00:00:00:01 192.168.0.2"]
> > > router 193f68cd-4a93-4a04-ad3b-3ddf7b5c66f3 (lr0)
> > >
> > > The ping commands at the end demonstrate the problem.  My expectation
> > > is that the very last ping command should be successful, but is not
> > > due to the issue we're seeing here.
> > >
> > >
> >
> > This issue seems to be fixed with the below code diff. From my
> observation,
> > I could see that when the router datapath generates the icmp response and
> > when the reply packet hits the ovs_ct_execute function in datapath
> >  - the "sw_flow_key - key" param has old conntrack values
> >  - in the function "__ovs_ct_lookup",  skb_nfct_cached(net, key, info,
> skb)
> > returns false.
> >  - and the state is not set with the flag- OVS_CS_F_ESTABLISHED.
> >
> >
> > I want to check, if this is the right fix and get your feedback before
> > sending the upstream patch to net-next.
> >
>
> Hi Numan,
>
> Hopefully folks more familiar with this code than me will weigh in, but
> your patch seems to make sense.
>
> Since this is a bug fix, if you do wind up submitting it upstream it
> should go to "net" instead of "net-next" (with a "Fixes:" tag).
>
> Regards,
>
>Lance
>
>
​Thanks Lance for your comments.

​


> >
> > ><
> > diff --git a/datapath/flow.c b/datapath/flow.c
> > index 2bc1ad0..2c59acd 100644
> > --- a/datapath/flow.c
> > +++ b/datapath/flow.c
> > @@ -845,6 +845,8 @@ int ovs_flow_key_extract_userspace(struct net *net,
> > const struct nlattr *attr,
> > if (err)
> > return err;
> >
> > +   ovs_ct_fill_key(skb, key);
> > +
> > /* Check that we have conntrack original direction tuple metadata
> > only
> >  * for packets for which it makes sense.  Otherwise the key may
> be
> >  * corrupted due to overlapping key fields.
> >
> > ><
> >
> > Thanks
> > Numan
> >
> >
> > ​
> >
> >
> > > --
> > > Russell Bryant
> > >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev