date:20171206

[ovs-dev] [PATCH] dpctl: Fix comment describing get_one_dp().

2017-12-06 Thread Justin Pettit

Signed-off-by: Justin Pettit 
---
 lib/dpctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index 06cfbc4abf21..5e064ec9c100 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -129,8 +129,8 @@ if_up(struct netdev *netdev)
 }
 
 /* Retrieve the name of the datapath if exactly one exists.  The caller
- * is responsible for freeing the returned string.  If there is not one
- * datapath, aborts with an error message. */
+ * is responsible for freeing the returned string.  If a single datapath
+ * name cannot be determined, returns NULL. */
 static char *
 get_one_dp(struct dpctl_params *dpctl_p)
 {
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 3/3] dpctl: Support flush conntrack by conntrack 5-tuple

2017-12-06 Thread Justin Pettit


> On Nov 21, 2017, at 5:00 PM, Yi-Hung Wei  wrote:
> diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
> index 5cd6b5cfd870..3cad5c928e8b 100644
> --- a/lib/ct-dpif.c
> +++ b/lib/ct-dpif.c
> 
> @@ -434,3 +435,110 @@ ct_dpif_format_tcp_stat(struct ds * ds, int tcp_state, 
> int conn_per_state)
> ds_put_cstr(ds, "]");
> ds_put_format(ds, "=%u", conn_per_state);
> }
> +
> +/* Parses a specification of a conntrack 5-tuple from 's' into 'tuple'.
> + * Returns true on success.  Otherwise, returns false and puts the error
> + * message in 'ds'. */
> +bool
> +ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char *s, struct ds 
> *ds)
> +{
> ...
> +if (ipv6_is_zero(>src.in6) || ipv6_is_zero(>dst.in6) ||
> +!tuple->ip_proto) {
> +/* icmp_type, icmp_code, and icmp_id can be 0. */
> +if (tuple->ip_proto != IPPROTO_ICMP &&
> +tuple->ip_proto != IPPROTO_ICMP) {

I think that second one should be IPPROTO_ICMPV6.

> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index 7fc0e3afab37..06cfbc4abf21 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -1331,30 +1331,73 @@ dpctl_flush_conntrack(int argc, const char *argv[],
>   struct dpctl_params *dpctl_p)
> {
> struct dpif *dpif;
> +struct ct_dpif_tuple tuple, *ptuple = NULL;
> +struct ds ds = DS_EMPTY_INITIALIZER;
> uint16_t zone, *pzone = NULL;
> char *name;
> -int error;
> +int error, i = 1;
> +bool get_dpif = false;

I think "got_dpif" is better, since it indicates that it was successful.  
Otherwise, it sounds like something that's planned to be done.

> +} else if (argc == 4) {
> +dpctl_error(dpctl_p, error, "Invalid datapath");
> +return error;
> +}
> ...
> +if (error) {
> +dpctl_error(dpctl_p, error, "opening datapath");
> +return error;
> +}

There's some inconsistencies in capitalizing the first letter of error 
messages.  I think we more commonly don't capitalize them.

> diff --git a/lib/dpctl.man b/lib/dpctl.man
> index 675fe5af4914..ce202418a0ee 100644
> --- a/lib/dpctl.man
> +++ b/lib/dpctl.man
> @@ -217,10 +217,22 @@ are included. With \fB\-\-statistics\fR timeouts and 
> timestamps are
> added to the output.
> .
> .TP
> -\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR]
> -Flushes all the connection entries in the tracker used by \fIdp\fR.
> -If \fBzone=\fIzone\fR is specified, only flushes the connections in
> -\fBzone\fR.
> +\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR] [\fIct-tuple\fR]
> +Flushes the connection entries in the tracker used by \fIdp\fR based on
> +\fIzone\fR and connection tracking tuple (\fIct-tuple\fR).

I don't think the parentheses are necessary around "ct-tuple".

I've attached an incremental patch to this message.  Can you take a look?  If 
you agree with the changes, I'll commit this patch and the first from this 
series.

Thanks,

--Justin


-=-=-=-=-=-=-=-=-

diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 0aed5bb6b43c..239c848d3735 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -498,7 +498,7 @@ ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char 
*s, struct ds *ds)
!strcmp(key, "icmp_id") ) {
 if (tuple->ip_proto != IPPROTO_ICMP &&
 tuple->ip_proto != IPPROTO_ICMPV6) {
-ds_put_cstr(ds, "Invalid L4 fields");
+ds_put_cstr(ds, "invalid L4 fields");
 goto error;
 }
 uint16_t icmp_id;
@@ -515,8 +515,8 @@ ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char 
*s, struct ds *ds)
 free(err);
 goto error_with_msg;
 }
-}else {
-ds_put_format(ds, "Invalid conntrack tuple field: %s", key);
+} else {
+ds_put_format(ds, "invalid conntrack tuple field: %s", key);
 goto error;
 }
 }
@@ -525,9 +525,9 @@ ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char 
*s, struct ds *ds)
 !tuple->ip_proto) {
 /* icmp_type, icmp_code, and icmp_id can be 0. */
 if (tuple->ip_proto != IPPROTO_ICMP &&
-tuple->ip_proto != IPPROTO_ICMP) {
+tuple->ip_proto != IPPROTO_ICMPV6) {
 if (!tuple->src_port || !tuple->dst_port) {
-ds_put_cstr(ds, "At least one of the conntrack 5-tuple fields "
+ds_put_cstr(ds, "at least one of the conntrack 5-tuple fields "
 "is missing.");
 goto error;
 }
@@ -538,7 +538,7 @@ ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char 
*s, struct ds *ds)
 return true;
 
 error_with_msg:
-ds_put_format(ds, "Failed to parse field %s", key);
+ds_put_format(ds, "failed to parse field %s", key);
 error:
 free(copy);
 return false;
diff --git a/lib/dpctl.c b/lib/dpctl.c
index 06cfbc4abf21..867b42105130 100644
---

Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check for ICMP.

2017-12-06 Thread Darrell Ball



On 12/6/17, 6:49 PM, "王志克"  wrote:

Hi Darrell,

In fact I indeed considerred whether to keep partial check as your patch. 
My idea is that it is better to do as little work as possible. So in my opnion 
such validation (and even checksum validation and so on) is not necessary at 
all (OVS is middle device, and such error validation can be done at end host 
stack).


[Darrell] The validations are necessary; under DOS, it is better to stop the 
attack as early as possible.
Also, making assumptions about what the ‘target’ host stack 
will or can do is “making assumptions”. 

I am open to the decision. So if you think your patch is more suitable, I 
can be the co-author.

[Darrell] I will keep you as co-author then.

Br,
Wang Zhike


-Original Message-
From: Darrell Ball [mailto:db...@vmware.com] 
Sent: Thursday, December 07, 2017 10:14 AM
To: 王志克; d...@openvswitch.org; Daniele Di Proietto
Subject: Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check 
for ICMP.

Hi Wang

To speed up the process, I sent an alternative patch here:

https://patchwork.ozlabs.org/patch/845407/

I agree the address sanity check is not correct but I think it should be 
partially retained
rather than removed. I also think a test was needed.

Pls let me know if it makes sense.
Also. if you prefer, I can make you the author.

Thanks Darrell




On 12/6/17, 11:22 AM, "Darrell Ball"  wrote:

Thanks for looking at this.

In the commit message, can you delineate.

1/ The forward direction packet in terms of src ip, dest ip, L4 
attributes
2/ The reverse direction error packet in terms of src ip, dest ip, icmp 
error payload

Darrell

On 12/4/17, 10:22 PM, "ovs-dev-boun...@openvswitch.org on behalf of 
wang zhike"  
wrote:

From: wangzhike 

ICMP response (Unreachable/fragmentationRequired/...) may be created
at devices in the middle, and such packets are tagged as invalid in
user space conntrack. In fact it does not make sense to validate the
src and dest address.

Signed-off-by: wang zhike 
---
 lib/conntrack.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index f5a3aa9..c44ad0f 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1702,11 +1702,6 @@ extract_l4_icmp(struct conn_key *key, const 
void *data, size_t size,
 return false;
 }
 
-if (inner_key.src.addr.ipv4_aligned != 
key->dst.addr.ipv4_aligned
-|| inner_key.dst.addr.ipv4_aligned != 
key->src.addr.ipv4_aligned) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
@@ -1789,14 +1784,6 @@ extract_l4_icmp6(struct conn_key *key, const 
void *data, size_t size,
 return false;
 }
 
-/* pf doesn't do this, but it seems a good idea */
-if (!ipv6_addr_equals(_key.src.addr.ipv6_aligned,
-  >dst.addr.ipv6_aligned)
-|| !ipv6_addr_equals(_key.dst.addr.ipv6_aligned,
- >src.addr.ipv6_aligned)) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=IqOQ4UcLnOpQUkvP7FHVH6IWeAK_4DDBRBqX2w3wl94=Xmvgnl8plChpjcr77sLIOxE0krKVuNVqvS3_eOoNeMg=






___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check for ICMP.

2017-12-06 Thread 王志克

Hi Darrell,

In fact I indeed considerred whether to keep partial check as your patch. My 
idea is that it is better to do as little work as possible. So in my opnion 
such validation (and even checksum validation and so on) is not necessary at 
all (OVS is middle device, and such error validation can be done at end host 
stack).

I am open to the decision. So if you think your patch is more suitable, I can 
be the co-author.

Br,
Wang Zhike


-Original Message-
From: Darrell Ball [mailto:db...@vmware.com] 
Sent: Thursday, December 07, 2017 10:14 AM
To: 王志克; d...@openvswitch.org; Daniele Di Proietto
Subject: Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check for 
ICMP.

Hi Wang

To speed up the process, I sent an alternative patch here:

https://patchwork.ozlabs.org/patch/845407/

I agree the address sanity check is not correct but I think it should be 
partially retained
rather than removed. I also think a test was needed.

Pls let me know if it makes sense.
Also. if you prefer, I can make you the author.

Thanks Darrell




On 12/6/17, 11:22 AM, "Darrell Ball"  wrote:

Thanks for looking at this.

In the commit message, can you delineate.

1/ The forward direction packet in terms of src ip, dest ip, L4 attributes
2/ The reverse direction error packet in terms of src ip, dest ip, icmp 
error payload

Darrell

On 12/4/17, 10:22 PM, "ovs-dev-boun...@openvswitch.org on behalf of wang 
zhike"  wrote:

From: wangzhike 

ICMP response (Unreachable/fragmentationRequired/...) may be created
at devices in the middle, and such packets are tagged as invalid in
user space conntrack. In fact it does not make sense to validate the
src and dest address.

Signed-off-by: wang zhike 
---
 lib/conntrack.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index f5a3aa9..c44ad0f 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1702,11 +1702,6 @@ extract_l4_icmp(struct conn_key *key, const void 
*data, size_t size,
 return false;
 }
 
-if (inner_key.src.addr.ipv4_aligned != 
key->dst.addr.ipv4_aligned
-|| inner_key.dst.addr.ipv4_aligned != 
key->src.addr.ipv4_aligned) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
@@ -1789,14 +1784,6 @@ extract_l4_icmp6(struct conn_key *key, const 
void *data, size_t size,
 return false;
 }
 
-/* pf doesn't do this, but it seems a good idea */
-if (!ipv6_addr_equals(_key.src.addr.ipv6_aligned,
-  >dst.addr.ipv6_aligned)
-|| !ipv6_addr_equals(_key.dst.addr.ipv6_aligned,
- >src.addr.ipv6_aligned)) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=IqOQ4UcLnOpQUkvP7FHVH6IWeAK_4DDBRBqX2w3wl94=Xmvgnl8plChpjcr77sLIOxE0krKVuNVqvS3_eOoNeMg=




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v3 6/6] netdev-linux: fail ops not supporting remote netns.

2017-12-06 Thread Flavio Leitner

When the netdev is in another namespace and the operation doesn't
support network namespaces, return the correct error.

Signed-off-by: Flavio Leitner 
---
 lib/netdev-linux.c | 134 +++--
 1 file changed, 120 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index c23e0f33c..74107d044 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1340,6 +1340,11 @@ netdev_linux_send(struct netdev *netdev_, int qid 
OVS_UNUSED,
 int sock = 0;
 
 if (!is_tap_netdev(netdev_)) {
+if (netdev_linux_netnsid_is_remote(netdev_linux_cast(netdev_))) {
+error = EOPNOTSUPP;
+goto free_batch;
+}
+
 sock = af_packet_sock();
 if (sock < 0) {
 error = -sock;
@@ -1399,6 +1404,10 @@ netdev_linux_set_etheraddr(struct netdev *netdev_, const 
struct eth_addr mac)
 int error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
 
 if (netdev->cache_valid & VALID_ETHERADDR) {
 error = netdev->ether_addr_error;
@@ -1512,6 +1521,11 @@ netdev_linux_set_mtu(struct netdev *netdev_, int mtu)
 int error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 if (netdev->cache_valid & VALID_MTU) {
 error = netdev->netdev_mtu_error;
 if (error || netdev->mtu == mtu) {
@@ -1541,9 +1555,14 @@ netdev_linux_get_ifindex(const struct netdev *netdev_)
 int ifindex, error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
 error = get_ifindex(netdev_, );
-ovs_mutex_unlock(>mutex);
 
+exit:
+ovs_mutex_unlock(>mutex);
 return error ? -error : ifindex;
 }
 
@@ -2084,6 +2103,11 @@ netdev_linux_get_features(const struct netdev *netdev_,
 int error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 netdev_linux_read_features(netdev);
 if (!netdev->get_features_error) {
 *current = netdev->current;
@@ -2092,8 +2116,9 @@ netdev_linux_get_features(const struct netdev *netdev_,
 *peer = 0;  /* XXX */
 }
 error = netdev->get_features_error;
-ovs_mutex_unlock(>mutex);
 
+exit:
+ovs_mutex_unlock(>mutex);
 return error;
 }
 
@@ -2109,6 +2134,12 @@ netdev_linux_set_advertisements(struct netdev *netdev_,
 ovs_mutex_lock(>mutex);
 
 COVERAGE_INC(netdev_get_ethtool);
+
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 memset(, 0, sizeof ecmd);
 error = netdev_linux_do_ethtool(netdev_get_name(netdev_), ,
 ETHTOOL_GSET, "ETHTOOL_GSET");
@@ -2186,6 +2217,11 @@ netdev_linux_set_policing(struct netdev *netdev_,
: kbits_burst);   /* Stick with user-specified value. */
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto out;
+}
+
 if (netdev->cache_valid & VALID_POLICING) {
 error = netdev->netdev_policing_error;
 if (error || (netdev->kbits_rate == kbits_rate &&
@@ -2322,6 +2358,11 @@ netdev_linux_get_qos(const struct netdev *netdev_,
 int error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 error = tc_query_qdisc(netdev_);
 if (!error) {
 *typep = netdev->tc->ops->ovs_name;
@@ -2329,8 +2370,9 @@ netdev_linux_get_qos(const struct netdev *netdev_,
  ? netdev->tc->ops->qdisc_get(netdev_, details)
  : 0);
 }
-ovs_mutex_unlock(>mutex);
 
+exit:
+ovs_mutex_unlock(>mutex);
 return error;
 }
 
@@ -2352,6 +2394,11 @@ netdev_linux_set_qos(struct netdev *netdev_,
 }
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 error = tc_query_qdisc(netdev_);
 if (error) {
 goto exit;
@@ -2385,6 +2432,11 @@ netdev_linux_get_queue(const struct netdev *netdev_,
 int error;
 
 ovs_mutex_lock(>mutex);
+if (netdev_linux_netnsid_is_remote(netdev)) {
+error = EOPNOTSUPP;
+goto exit;
+}
+
 error = tc_query_qdisc(netdev_);
 if (!error) {
 struct tc_queue *queue = tc_find_queue(netdev_, queue_id);
@@ -2392,8 +2444,9 @@ netdev_linux_get_queue(const struct netdev *netdev_,
 ? netdev->tc->ops->class_get(netdev_, queue, details)
 : ENOENT);
 }
-ovs_mutex_unlock(>mutex);
 
+exit:
+ovs_mutex_unlock(>mutex);
 return error;
 }
 
@@ -2405,6 +2458,11 @@ netdev_linux_set_queue(struct

[ovs-dev] [PATCH v3 5/6] nlmon: added netns support.

2017-12-06 Thread Flavio Leitner

Signed-off-by: Flavio Leitner 
---
 utilities/nlmon.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/utilities/nlmon.c b/utilities/nlmon.c
index d38a70b6f..448b5eb5d 100644
--- a/utilities/nlmon.c
+++ b/utilities/nlmon.c
@@ -25,6 +25,7 @@
 #include 
 #include "netlink.h"
 #include "netlink-socket.h"
+#include "netnsid.h"
 #include "openvswitch/ofpbuf.h"
 #include "openvswitch/poll-loop.h"
 #include "timeval.h"
@@ -41,6 +42,7 @@ main(int argc OVS_UNUSED, char *argv[])
 {
 uint64_t buf_stub[4096 / 64];
 struct nl_sock *sock;
+int nsid;
 struct ofpbuf buf;
 int error;
 
@@ -57,9 +59,10 @@ main(int argc OVS_UNUSED, char *argv[])
 ovs_fatal(error, "could not join RTNLGRP_LINK multicast group");
 }
 
+nl_sock_listen_all_nsid(sock, true);
 ofpbuf_use_stub(, buf_stub, sizeof buf_stub);
 for (;;) {
-error = nl_sock_recv(sock, , NULL, false);
+error = nl_sock_recv(sock, , , false);
 if (error == EAGAIN) {
 /* Nothing to do. */
 } else if (error == ENOBUFS) {
@@ -123,6 +126,11 @@ main(int argc OVS_UNUSED, char *argv[])
 }
 }
 printf("\n");
+if (netnsid_is_remote(nsid)) {
+printf("\tnetns id: %d\n", nsid);
+} else {
+printf("\tnetns id: local\n");
+}
 if (attrs[IFLA_MASTER]) {
 uint32_t idx = nl_attr_get_u32(attrs[IFLA_MASTER]);
 char ifname[IFNAMSIZ];
-- 
2.14.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v3 4/6] netlink linux: enable listening to all nsids

2017-12-06 Thread Flavio Leitner

Internal ports may be moved to another network namespace
and when that happens, the vswitch stops receiving netlink
notifications.

This patch enables the vswitch to listen to all network
namespaces that have a nsid assigned into the network
namespace where the socket has been opened.

It requires kernel 4.2 or newer.

Signed-off-by: Flavio Leitner 
---
 lib/daemon-unix.c|  3 ++-
 lib/daemon.man   |  6 +++---
 lib/daemon.xml   |  8 
 lib/netdev-linux.c   |  1 +
 lib/netlink-protocol.h   |  6 ++
 lib/netlink-socket.c | 27 +++
 lib/netlink-socket.h |  2 ++
 tests/ofproto-macros.at  |  1 +
 tests/ovn-controller-vtep.at |  1 +
 9 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/lib/daemon-unix.c b/lib/daemon-unix.c
index 967a28432..be6d29cbe 100644
--- a/lib/daemon-unix.c
+++ b/lib/daemon-unix.c
@@ -818,7 +818,8 @@ daemon_become_new_user_linux(bool access_datapath 
OVS_UNUSED)
 
 if (access_datapath && !ret) {
 ret = capng_update(CAPNG_ADD, cap_sets, CAP_NET_ADMIN)
-  || capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW);
+  || capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW)
+  || capng_update(CAPNG_ADD, cap_sets, CAP_NET_BROADCAST);
 }
 } else {
 ret = -1;
diff --git a/lib/daemon.man b/lib/daemon.man
index 820a09903..68c0a312d 100644
--- a/lib/daemon.man
+++ b/lib/daemon.man
@@ -76,9 +76,9 @@ started by the root user accepts this argument.
 .IP
 On Linux, daemons will be granted CAP_IPC_LOCK and CAP_NET_BIND_SERVICES
 before dropping root privileges. Daemons that interact with a datapath,
-such as \fBovs\-vswitchd\fR, will be granted two additional capabilities, 
namely
-CAP_NET_ADMIN and CAP_NET_RAW. The capability change will apply even if
-new user is "root".
+such as \fBovs\-vswitchd\fR, will be granted three additional capabilities,
+namely CAP_NET_ADMIN, CAP_NET_BROADCAST and CAP_NET_RAW.  The capability
+change will apply even if the new user is root.
 .IP
 On Windows, this option is not currently supported. For security reasons,
 specifying this option will cause the daemon process not to start.
diff --git a/lib/daemon.xml b/lib/daemon.xml
index 5cb447c49..1b5e8acae 100644
--- a/lib/daemon.xml
+++ b/lib/daemon.xml
@@ -107,10 +107,10 @@
   On Linux, daemons will be granted CAP_IPC_LOCK and
   CAP_NET_BIND_SERVICES before dropping root privileges.
   Daemons that interact with a datapath, such as
-  ovs-vswitchd, will be granted two additional
-  capabilities, namely CAP_NET_ADMIN and
-  CAP_NET_RAW.  The capability change will apply even
-  if the new user is root.
+  ovs-vswitchd, will be granted three additional
+  capabilities, namely CAP_NET_ADMIN,
+  CAP_NET_BROADCAST and CAP_NET_RAW.  The
+  capability change will apply even if the new user is root.
 
 
 
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index ac02deb62..c23e0f33c 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -653,6 +653,7 @@ netdev_linux_notify_sock(void)
 }
 }
 }
+nl_sock_listen_all_nsid(sock, true);
 ovsthread_once_done();
 }
 
diff --git a/lib/netlink-protocol.h b/lib/netlink-protocol.h
index a7b9a65fa..c0617dfad 100644
--- a/lib/netlink-protocol.h
+++ b/lib/netlink-protocol.h
@@ -158,6 +158,12 @@ enum {
 #define NETLINK_DROP_MEMBERSHIP 2
 #endif
 
+/* This was introduced in v4.2.  (We want our programs to support the newer
+ * kernel features even if compiled with older headers.) */
+#ifndef NETLINK_LISTEN_ALL_NSID
+#define NETLINK_LISTEN_ALL_NSID 8
+#endif
+
 /* These were introduced all together in 2.6.23.  (We want our programs to
  * support the newer kernel features even if compiled with older headers.) */
 #ifndef CTRL_ATTR_MCAST_GRP_MAX
diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c
index f68ca860d..f3cce9314 100644
--- a/lib/netlink-socket.c
+++ b/lib/netlink-socket.c
@@ -442,6 +442,33 @@ nl_sock_join_mcgroup(struct nl_sock *sock, unsigned int 
multicast_group)
 return 0;
 }
 
+/* When 'enable' is true, it tries to enable 'sock' to receive netlink
+ * notifications form all network namespaces that have an nsid assigned
+ * into the network namespace where the socket has been opened. The
+ * running kernel needs to provide support for that. When 'enable' is
+ * false, it will receive netlink notifications only from the network
+ * namespace where the socket has been opened.
+ *
+ * Returns 0 if successful, otherwise a positive errno.  */
+int
+nl_sock_listen_all_nsid(struct nl_sock *sock, bool enable)
+{
+int error;
+int val = enable ? 1 : 0;
+
+#ifndef _WIN32
+if (setsockopt(sock->fd, SOL_NETLINK, NETLINK_LISTEN_ALL_NSID, ,
+   sizeof val) < 0) {
+error = errno;
+VLOG_INFO("netlink: could not

[ovs-dev] [PATCH v3 3/6] netdev-linux: use netlink to update netdev.

2017-12-06 Thread Flavio Leitner

The ioctl interface doesn't support network namespaces, so
try updating the netdev using netlink message instead.

To provide backwards compatibility, fall back to the previous
method if netlink isn't supported or fails.

Signed-off-by: Flavio Leitner 
---
 lib/netdev-linux.c | 114 +++--
 1 file changed, 111 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index bf8aee5c7..ac02deb62 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -87,6 +87,9 @@ COVERAGE_DEFINE(netdev_get_ethtool);
 COVERAGE_DEFINE(netdev_set_ethtool);
 
 
+#ifndef IFLA_IF_NETNSID
+#define IFLA_IF_NETNSID 0x45
+#endif
 /* These were introduced in Linux 2.6.14, so they might be missing if we have
  * old headers. */
 #ifndef ADVERTISED_Pause
@@ -608,6 +611,14 @@ netdev_linux_netnsid_is_eq(struct netdev_linux *netdev, 
int nsid)
 return netnsid_eq(netdev->netnsid, nsid);
 }
 
+static bool
+netdev_linux_netnsid_is_remote(struct netdev_linux *netdev)
+{
+netdev_linux_netnsid_update(netdev);
+return netnsid_is_remote(netdev->netnsid);
+}
+
+static int netdev_linux_update_via_netlink(struct netdev_linux *);
 static void netdev_linux_update(struct netdev_linux *netdev, int,
 const struct rtnetlink_change *)
 OVS_REQUIRES(netdev->mutex);
@@ -1427,6 +1438,11 @@ netdev_linux_get_etheraddr(const struct netdev *netdev_, 
struct eth_addr *mac)
 
 ovs_mutex_lock(>mutex);
 if (!(netdev->cache_valid & VALID_ETHERADDR)) {
+netdev_linux_update_via_netlink(netdev);
+}
+
+if (!(netdev->cache_valid & VALID_ETHERADDR)) {
+/* Fall back to ioctl if netlink fails */
 netdev->ether_addr_error = get_etheraddr(netdev_get_name(netdev_),
  >etheraddr);
 netdev->cache_valid |= VALID_ETHERADDR;
@@ -1447,6 +1463,11 @@ netdev_linux_get_mtu__(struct netdev_linux *netdev, int 
*mtup)
 int error;
 
 if (!(netdev->cache_valid & VALID_MTU)) {
+netdev_linux_update_via_netlink(netdev);
+}
+
+if (!(netdev->cache_valid & VALID_MTU)) {
+/* Fall back to ioctl if netlink fails */
 struct ifreq ifr;
 
 netdev->netdev_mtu_error = af_inet_ifreq_ioctl(
@@ -2859,12 +2880,21 @@ netdev_linux_update_flags(struct netdev *netdev_, enum 
netdev_flags off,
   enum netdev_flags on, enum netdev_flags *old_flagsp)
 {
 struct netdev_linux *netdev = netdev_linux_cast(netdev_);
-int error;
+int error = 0;
 
 ovs_mutex_lock(>mutex);
-error = update_flags(netdev, off, on, old_flagsp);
+if (on || off) {
+/* Changing flags over netlink isn't support yet. */
+error = update_flags(netdev, off, on, old_flagsp);
+} else {
+/* Try reading flags over netlink, or fall back to ioctl. */
+if (!netdev_linux_update_via_netlink(netdev)) {
+*old_flagsp = iff_to_nd_flags(netdev->ifi_flags);
+} else {
+error = update_flags(netdev, off, on, old_flagsp);
+}
+}
 ovs_mutex_unlock(>mutex);
-
 return error;
 }
 
@@ -5506,6 +5536,11 @@ get_ifindex(const struct netdev *netdev_, int *ifindexp)
 struct netdev_linux *netdev = netdev_linux_cast(netdev_);
 
 if (!(netdev->cache_valid & VALID_IFINDEX)) {
+netdev_linux_update_via_netlink(netdev);
+}
+
+if (!(netdev->cache_valid & VALID_IFINDEX)) {
+/* Fall back to ioctl if netlink fails */
 int ifindex = linux_get_ifindex(netdev_get_name(netdev_));
 
 if (ifindex < 0) {
@@ -5522,6 +5557,79 @@ get_ifindex(const struct netdev *netdev_, int *ifindexp)
 return netdev->get_ifindex_error;
 }
 
+static int
+netdev_linux_update_via_netlink(struct netdev_linux *netdev)
+{
+struct ofpbuf request;
+struct ofpbuf *reply;
+struct rtnetlink_change chg;
+struct rtnetlink_change *change = 
+int error;
+
+ofpbuf_init(, 0);
+nl_msg_put_nlmsghdr(,
+sizeof(struct ifinfomsg) + NL_ATTR_SIZE(IFNAMSIZ),
+RTM_GETLINK, NLM_F_REQUEST);
+ofpbuf_put_zeros(, sizeof(struct ifinfomsg));
+
+/* The correct identifiers for a Linux device are netnsid and ifindex,
+ * but ifindex changes as the port is moved to another network namespace
+ * and the interface name statically stored in ovsdb. */
+nl_msg_put_string(, IFLA_IFNAME, netdev_get_name(>up));
+if (netdev_linux_netnsid_is_remote(netdev)) {
+nl_msg_push_u32(, IFLA_IF_NETNSID, netdev->netnsid);
+}
+error = nl_transact(NETLINK_ROUTE, , );
+ofpbuf_uninit();
+if (error) {
+ofpbuf_delete(reply);
+return error;
+}
+
+if (rtnetlink_parse(reply, change)
+&& change->nlmsg_type == RTM_NEWLINK) {
+bool changed = false;
+error = 0;
+
+/* Update netdev from rtnl msg and increment its seq if needed. */
+if

[ovs-dev] [PATCH v3 2/6] netnsid: update device only if netnsid matches.

2017-12-06 Thread Flavio Leitner

Recent kernels provide the network namespace ID of a port,
so use that to discover where the port currently is.

A network device in another network namespace could have the
same name, so once the socket starts listening to other network
namespaces, it is necessary to confirm the netnsid.

Signed-off-by: Flavio Leitner 
---
 datapath/linux/compat/include/linux/openvswitch.h |  2 +
 lib/dpif-netlink.c|  8 +++
 lib/dpif-netlink.h|  1 +
 lib/netdev-linux.c| 63 ---
 4 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 561f89502..f28d140ca 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -283,6 +283,8 @@ enum ovs_vport_attr {
/* receiving upcalls */
OVS_VPORT_ATTR_STATS,   /* struct ovs_vport_stats */
OVS_VPORT_ATTR_PAD,
+   OVS_VPORT_ATTR_IFINDEX,
+   OVS_VPORT_ATTR_NETNSID,
__OVS_VPORT_ATTR_MAX
 };
 
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index fd333094d..8f04ed311 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -46,6 +46,7 @@
 #include "netlink-notifier.h"
 #include "netlink-socket.h"
 #include "netlink.h"
+#include "netnsid.h"
 #include "odp-util.h"
 #include "openvswitch/ofpbuf.h"
 #include "packets.h"
@@ -3065,6 +3066,7 @@ dpif_netlink_vport_from_ofpbuf(struct dpif_netlink_vport 
*vport,
 [OVS_VPORT_ATTR_STATS] = { NL_POLICY_FOR(struct ovs_vport_stats),
.optional = true },
 [OVS_VPORT_ATTR_OPTIONS] = { .type = NL_A_NESTED, .optional = true },
+[OVS_VPORT_ATTR_NETNSID] = { .type = NL_A_U32, .optional = true },
 };
 
 dpif_netlink_vport_init(vport);
@@ -3100,6 +3102,12 @@ dpif_netlink_vport_from_ofpbuf(struct dpif_netlink_vport 
*vport,
 vport->options = nl_attr_get(a[OVS_VPORT_ATTR_OPTIONS]);
 vport->options_len = nl_attr_get_size(a[OVS_VPORT_ATTR_OPTIONS]);
 }
+if (a[OVS_VPORT_ATTR_NETNSID]) {
+netnsid_set(>netnsid,
+nl_attr_get_u32(a[OVS_VPORT_ATTR_NETNSID]));
+} else {
+netnsid_set_local(>netnsid);
+}
 return 0;
 }
 
diff --git a/lib/dpif-netlink.h b/lib/dpif-netlink.h
index 568b81441..0a9628088 100644
--- a/lib/dpif-netlink.h
+++ b/lib/dpif-netlink.h
@@ -32,6 +32,7 @@ struct dpif_netlink_vport {
 
 /* ovs_vport header. */
 int dp_ifindex;
+int netnsid;   /* Network Namespace ID. */
 odp_port_t port_no;/* ODPP_NONE if unknown. */
 enum ovs_vport_type type;
 
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index bebd34402..bf8aee5c7 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -60,6 +60,7 @@
 #include "netlink-notifier.h"
 #include "netlink-socket.h"
 #include "netlink.h"
+#include "netnsid.h"
 #include "openvswitch/ofpbuf.h"
 #include "openflow/openflow.h"
 #include "ovs-atomic.h"
@@ -476,6 +477,7 @@ struct netdev_linux {
 long long int miimon_interval;  /* Miimon Poll rate. Disabled if <= 0. */
 struct timer miimon_timer;
 
+int netnsid;/* Network namespace ID. */
 /* The following are figured out "on demand" only.  They are only valid
  * when the corresponding VALID_* bit in 'cache_valid' is set. */
 int ifindex;
@@ -571,7 +573,42 @@ netdev_rxq_linux_cast(const struct netdev_rxq *rx)
 return CONTAINER_OF(rx, struct netdev_rxq_linux, up);
 }
 
-static void netdev_linux_update(struct netdev_linux *netdev,
+static int
+netdev_linux_netnsid_update__(struct netdev_linux *netdev)
+{
+struct dpif_netlink_vport reply;
+struct ofpbuf *buf;
+int error;
+
+error = dpif_netlink_vport_get(netdev_get_name(>up), , );
+if (error) {
+netnsid_unset(>netnsid);
+return error;
+}
+
+netnsid_set(>netnsid, reply.netnsid);
+ofpbuf_delete(buf);
+return 0;
+}
+
+static int
+netdev_linux_netnsid_update(struct netdev_linux *netdev)
+{
+if (netnsid_is_unset(netdev->netnsid)) {
+return netdev_linux_netnsid_update__(netdev);
+}
+
+return 0;
+}
+
+static bool
+netdev_linux_netnsid_is_eq(struct netdev_linux *netdev, int nsid)
+{
+netdev_linux_netnsid_update(netdev);
+return netnsid_eq(netdev->netnsid, nsid);
+}
+
+static void netdev_linux_update(struct netdev_linux *netdev, int,
 const struct rtnetlink_change *)
 OVS_REQUIRES(netdev->mutex);
 static void netdev_linux_changed(struct netdev_linux *netdev,
@@ -635,10 +672,11 @@ netdev_linux_run(const struct netdev_class *netdev_class 
OVS_UNUSED)
 do {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 uint64_t buf_stub[4096 / 8];
+int nsid;

[ovs-dev] [PATCH v3 1/6] netlink: provide network namespace id from a msg.

2017-12-06 Thread Flavio Leitner

The netlink notification's ancillary data contains the network
namespace id (netnsid) needed to identify the device correctly.

Signed-off-by: Flavio Leitner 
---
 configure.ac   |   3 +-
 lib/automake.mk|   1 +
 lib/dpif-netlink.c |   6 +--
 lib/netdev-linux.c |   2 +-
 lib/netlink-notifier.c |   2 +-
 lib/netlink-socket.c   |  53 ---
 lib/netlink-socket.h   |   2 +-
 lib/netnsid.h  | 139 +
 utilities/nlmon.c  |   2 +-
 9 files changed, 196 insertions(+), 14 deletions(-)
 create mode 100644 lib/netnsid.h

diff --git a/configure.ac b/configure.ac
index 6a8113a5c..c88814198 100644
--- a/configure.ac
+++ b/configure.ac
@@ -107,7 +107,8 @@ AC_CHECK_MEMBERS([struct sockaddr_in6.sin6_scope_id], [], 
[],
   [[#include 
 #include ]])
 AC_CHECK_FUNCS([mlockall strnlen getloadavg statvfs getmntent_r sendmmsg 
clock_gettime])
-AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h 
stdatomic.h])
+AC_CHECK_HEADERS([mntent.h sys/statvfs.h linux/types.h linux/if_ether.h])
+AC_CHECK_HEADERS([linux/net_namespace.h stdatomic.h])
 AC_CHECK_HEADERS([net/if_mib.h], [], [], [[#include 
 #include ]])
 
diff --git a/lib/automake.mk b/lib/automake.mk
index effe5b5c2..8da00d597 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -140,6 +140,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/netflow.h \
lib/netlink.c \
lib/netlink.h \
+   lib/netnsid.h \
lib/nx-match.c \
lib/nx-match.h \
lib/object-collection.c \
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index c265909f8..fd333094d 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -1287,7 +1287,7 @@ dpif_netlink_port_poll(const struct dpif *dpif_, char 
**devnamep)
 int error;
 
 ofpbuf_use_stub(, buf_stub, sizeof buf_stub);
-error = nl_sock_recv(dpif->port_notifier, , false);
+error = nl_sock_recv(dpif->port_notifier, , NULL, false);
 if (!error) {
 struct dpif_netlink_vport vport;
 
@@ -2621,7 +2621,7 @@ dpif_netlink_recv_windows(struct dpif_netlink *dpif, 
uint32_t handler_id,
 return EAGAIN;
 }
 
-error = nl_sock_recv(sock_pool[i].nl_sock, buf, false);
+error = nl_sock_recv(sock_pool[i].nl_sock, buf, NULL, false);
 if (error == ENOBUFS) {
 /* ENOBUFS typically means that we've received so many
  * packets that the buffer overflowed.  Try again
@@ -2696,7 +2696,7 @@ dpif_netlink_recv__(struct dpif_netlink *dpif, uint32_t 
handler_id,
 return EAGAIN;
 }
 
-error = nl_sock_recv(ch->sock, buf, false);
+error = nl_sock_recv(ch->sock, buf, NULL, false);
 if (error == ENOBUFS) {
 /* ENOBUFS typically means that we've received so many
  * packets that the buffer overflowed.  Try again
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index e809b8877..bebd34402 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -638,7 +638,7 @@ netdev_linux_run(const struct netdev_class *netdev_class 
OVS_UNUSED)
 struct ofpbuf buf;
 
 ofpbuf_use_stub(, buf_stub, sizeof buf_stub);
-error = nl_sock_recv(sock, , false);
+error = nl_sock_recv(sock, , NULL, false);
 if (!error) {
 struct rtnetlink_change change;
 
diff --git a/lib/netlink-notifier.c b/lib/netlink-notifier.c
index 3acded418..d33904658 100644
--- a/lib/netlink-notifier.c
+++ b/lib/netlink-notifier.c
@@ -187,7 +187,7 @@ nln_run(struct nln *nln)
 int error;
 
 ofpbuf_use_stub(, buf_stub, sizeof buf_stub);
-error = nl_sock_recv(nln->notify_sock, , false);
+error = nl_sock_recv(nln->notify_sock, , NULL, false);
 if (!error) {
 int group = nln->parse(, nln->change);
 
diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c
index 317bf907f..f68ca860d 100644
--- a/lib/netlink-socket.c
+++ b/lib/netlink-socket.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -28,6 +29,7 @@
 #include "openvswitch/hmap.h"
 #include "netlink.h"
 #include "netlink-protocol.h"
+#include "netnsid.h"
 #include "odp-netlink.h"
 #include "openvswitch/ofpbuf.h"
 #include "ovs-thread.h"
@@ -607,7 +609,7 @@ nl_sock_send_seq(struct nl_sock *sock, const struct ofpbuf 
*msg,
 }
 
 static int
-nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, bool wait)
+nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, int *nsid, bool wait)
 {
 /* We can't accurately predict the size of the data to be received.  The
  * caller is supposed to have allocated enough space in 'buf' to handle the
@@ -618,7 +620,10 @@ nl_sock_recv__(struct nl_sock *sock, struct ofpbuf *buf, 
bool wait)
 uint8_t tail[65536];
 struct iovec iov[2];
 struct msghdr msg;
+uint8_t

[ovs-dev] [PATCH v3 0/6] Add minimum network namespace support.

2017-12-06 Thread Flavio Leitner

Today Open vSwitch doesn't know about network namespaces (netns), but
users are moving internal ports to other namespaces.  Although packets
are still flowing, the daemon fails to find out basic port information,
like if it is UP or DOWN, for instance.

This patchset rely on a new kernel vport API recently accepted to find
out the new network namespace ID of a bridge's port. This information
along with the port's name recorded in the database is used to match the
corresponding netlink messages.

This patchset also leverages another kernel API that allows the daemon
to listen to all netlink messages from all netns which has an ID assigned
into it.  This and the previous change allows the userspace to track ports
in other network namespaces.

If any of the APIs aren't available, it falls back to the older APIs to
not break backwards compatibility.


ChangeLog:

* V3:
  - Fixed long line (Greg)
  - Rewrote assuming that the kernel will not send negative
numbers as valid network namespace id. (Ben, Flavio, Jiri)

* V2:
  - report and close unexpected file descriptors (Ben)

Flavio Leitner (6):
  netlink: provide network namespace id from a msg.
  netnsid: update device only if netnsid matches.
  netdev-linux: use netlink to update netdev.
  netlink linux: enable listening to all nsids
  nlmon: added netns support.
  netdev-linux: fail ops not supporting remote netns.

 configure.ac  |   3 +-
 datapath/linux/compat/include/linux/openvswitch.h |   2 +
 lib/automake.mk   |   1 +
 lib/daemon-unix.c |   3 +-
 lib/daemon.man|   6 +-
 lib/daemon.xml|   8 +-
 lib/dpif-netlink.c|  14 +-
 lib/dpif-netlink.h|   1 +
 lib/netdev-linux.c| 312 --
 lib/netlink-notifier.c|   2 +-
 lib/netlink-protocol.h|   6 +
 lib/netlink-socket.c  |  80 +-
 lib/netlink-socket.h  |   4 +-
 lib/netnsid.h | 139 ++
 tests/ofproto-macros.at   |   1 +
 tests/ovn-controller-vtep.at  |   1 +
 utilities/nlmon.c |  10 +-
 17 files changed, 549 insertions(+), 44 deletions(-)
 create mode 100644 lib/netnsid.h

-- 
2.14.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check for ICMP.

2017-12-06 Thread Darrell Ball

Hi Wang

To speed up the process, I sent an alternative patch here:

https://patchwork.ozlabs.org/patch/845407/

I agree the address sanity check is not correct but I think it should be 
partially retained
rather than removed. I also think a test was needed.

Pls let me know if it makes sense.
Also. if you prefer, I can make you the author.

Thanks Darrell




On 12/6/17, 11:22 AM, "Darrell Ball"  wrote:

Thanks for looking at this.

In the commit message, can you delineate.

1/ The forward direction packet in terms of src ip, dest ip, L4 attributes
2/ The reverse direction error packet in terms of src ip, dest ip, icmp 
error payload

Darrell

On 12/4/17, 10:22 PM, "ovs-dev-boun...@openvswitch.org on behalf of wang 
zhike"  wrote:

From: wangzhike 

ICMP response (Unreachable/fragmentationRequired/...) may be created
at devices in the middle, and such packets are tagged as invalid in
user space conntrack. In fact it does not make sense to validate the
src and dest address.

Signed-off-by: wang zhike 
---
 lib/conntrack.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index f5a3aa9..c44ad0f 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1702,11 +1702,6 @@ extract_l4_icmp(struct conn_key *key, const void 
*data, size_t size,
 return false;
 }
 
-if (inner_key.src.addr.ipv4_aligned != 
key->dst.addr.ipv4_aligned
-|| inner_key.dst.addr.ipv4_aligned != 
key->src.addr.ipv4_aligned) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
@@ -1789,14 +1784,6 @@ extract_l4_icmp6(struct conn_key *key, const 
void *data, size_t size,
 return false;
 }
 
-/* pf doesn't do this, but it seems a good idea */
-if (!ipv6_addr_equals(_key.src.addr.ipv6_aligned,
-  >dst.addr.ipv6_aligned)
-|| !ipv6_addr_equals(_key.dst.addr.ipv6_aligned,
- >src.addr.ipv6_aligned)) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=IqOQ4UcLnOpQUkvP7FHVH6IWeAK_4DDBRBqX2w3wl94=Xmvgnl8plChpjcr77sLIOxE0krKVuNVqvS3_eOoNeMg=




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [patch v1] conntrack: Fix icmp error address sanity check.

2017-12-06 Thread Darrell Ball

An address sanity check is done on icmp error packets to
check that the icmp error payload makes sense w.r.t. the
packet itself.

The sanity check was partially incorrect since it tried
to verify the source address of the error packet against the
original destination, which does not makes since the error
can be generated by any intermediate node.

Reported-by: wangzhike 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341609.html
Fixes: a489b1685 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto 
Signed-off-by: Darrell Ball 
Signed-off-by: wangzhike 
Co-authored-by: wangzhike 
---
 lib/conntrack.c | 7 ++-
 tests/system-traffic.at | 6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index cd54ba7..6d078f5 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1782,8 +1782,7 @@ extract_l4_icmp(struct conn_key *key, const void *data, 
size_t size,
 return false;
 }
 
-if (inner_key.src.addr.ipv4_aligned != key->dst.addr.ipv4_aligned
-|| inner_key.dst.addr.ipv4_aligned != key->src.addr.ipv4_aligned) {
+if (inner_key.src.addr.ipv4_aligned != key->dst.addr.ipv4_aligned) {
 return false;
 }
 
@@ -1871,9 +1870,7 @@ extract_l4_icmp6(struct conn_key *key, const void *data, 
size_t size,
 
 /* pf doesn't do this, but it seems a good idea */
 if (!ipv6_addr_equals(_key.src.addr.ipv6_aligned,
-  >dst.addr.ipv6_aligned)
-|| !ipv6_addr_equals(_key.dst.addr.ipv6_aligned,
- >src.addr.ipv6_aligned)) {
+  >dst.addr.ipv6_aligned)) {
 return false;
 }
 
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 4551c5c..4e7a1cd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -1584,8 +1584,8 @@ AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 
resubmit\(,0\) 'f64c473528c9c
 dnl 2. Send and UDP packet to port 
 AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 1 resubmit\(,0\) 
'c6f94ecb72dbe64c473528c908004521317040004011b138ac11ac12a28e15b3000d20966369616f0a'])
 
-dnl 3. Send an ICMP port unreach reply for port , related to the first 
packet
-AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 resubmit\(,0\) 
'e64c473528c9c6f94ecb72db080045c0003d2e874001f355ac12ac110303553f4521317040004011b138ac11ac12a28e15b3000d20966369616f0a'])
+dnl 3. Send an ICMP port unreach reply from a path midpoint for port , 
related to the first packet
+AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 resubmit\(,0\) 
'e64c473528c9c6f94ecb72db080045c0003d2e874001f354ac13ac110303553f4521317040004011b138ac11ac12a28e15b3000d20966369616f0a'])
 
 dnl Check this output. We only see the latter two packets, not the first.
 AT_CHECK([cat ofctl_monitor.log], [0], [dnl
@@ -1594,7 +1594,7 @@ 
icmp,vlan_tci=0x,dl_src=c6:f5:4e:cb:72:db,dl_dst=f6:4c:47:35:28:c9,nw_src=17
 NXT_PACKET_IN2 (xid=0x0): table_id=1 cookie=0x0 total_len=47 
ct_state=new|trk,ct_nw_src=172.16.0.1,ct_nw_dst=172.16.0.2,ct_nw_proto=17,ct_tp_src=41614,ct_tp_dst=,ip,in_port=1
 (via action) data_len=47 (unbuffered)
 
udp,vlan_tci=0x,dl_src=e6:4c:47:35:28:c9,dl_dst=c6:f9:4e:cb:72:db,nw_src=172.16.0.1,nw_dst=172.16.0.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=41614,tp_dst=
 udp_csum:2096
 NXT_PACKET_IN2 (xid=0x0): table_id=1 cookie=0x0 total_len=75 
ct_state=rel|rpl|trk,ct_nw_src=172.16.0.1,ct_nw_dst=172.16.0.2,ct_nw_proto=17,ct_tp_src=41614,ct_tp_dst=,ip,in_port=2
 (via action) data_len=75 (unbuffered)
-icmp,vlan_tci=0x,dl_src=c6:f9:4e:cb:72:db,dl_dst=e6:4c:47:35:28:c9,nw_src=172.16.0.2,nw_dst=172.16.0.1,nw_tos=192,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=3
 icmp_csum:553f
+icmp,vlan_tci=0x,dl_src=c6:f9:4e:cb:72:db,dl_dst=e6:4c:47:35:28:c9,nw_src=172.16.0.3,nw_dst=172.16.0.1,nw_tos=192,nw_ecn=0,nw_ttl=64,icmp_type=3,icmp_code=3
 icmp_csum:553f
 ])
 
 AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(172.16.0.1)], [0], [dnl
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 3/6] datapath: Constify struct nf_conntrack_l4proto

2017-12-06 Thread Gregory Rose


On 12/5/2017 4:01 PM, Greg Rose wrote:

Linux 4.14 requires this parameter to be constified.

Signed-off-by: Greg Rose 
---
  datapath/conntrack.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index d517a87..4b181ad 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -614,7 +614,7 @@ ovs_ct_find_existing(struct net *net, const struct 
nf_conntrack_zone *zone,
 u8 l3num, struct sk_buff *skb, bool natted)
  {
struct nf_conntrack_l3proto *l3proto;
-   struct nf_conntrack_l4proto *l4proto;
+   const struct nf_conntrack_l4proto *l4proto;
struct nf_conntrack_tuple tuple;
struct nf_conntrack_tuple_hash *h;
struct nf_conn *ct;


This is actually addressed in a separate patch from upstream that I will 
be backporting as part of the effort to keep
our out of tree datapath synched with the upstream.  I'll send a V2 of 
the patch series with the correct fixup and

pointers to the upstream commit, etc.

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 2/3] conntrack: Support conntrack flush by ct 5-tuple

2017-12-06 Thread Justin Pettit



> On Nov 21, 2017, at 5:00 PM, Yi-Hung Wei  wrote:
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index f5a3aa9fab34..c9077c07042c 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -2335,6 +2335,18 @@ ct_endpoint_to_ct_dpif_inet_addr(const struct ct_addr 
> *a,
> }
> 
> static void
> +ct_dpif_inet_addr_to_ct_endpoint(const union ct_dpif_inet_addr *a,
> + struct ct_addr *b,
> + const uint16_t l3_type)
> +{
> +if (l3_type == AF_INET) {
> +b->ipv4_aligned = a->ip;
> +} else if (l3_type == AF_INET6) {
> +b->ipv6_aligned = a->in6;
> +}
> +}

This is the reverse of the function ct_endpoint_to_ct_dpif_inet_addr(), and 
they both take a 16-bit argument describing the IP version, but their form is 
different.  I'd suggest adding comments to both to make it clearer, since it 
could lead to subtle bugs for people using them later:

-=-=-=-=-=-=-=-=-
/* Convert a conntrack address 'a' into an IP address 'b' of type 'dl_type'.
 *
 * Note that 'dl_type' should be either "ETH_TYPE_IP" or "ETH_TYPE_IPv6"
 * in network-byte order. */
static void
ct_endpoint_to_ct_dpif_inet_addr(const struct ct_addr *a,
 union ct_dpif_inet_addr *b,
 ovs_be16 dl_type)



/* Convert an IP address 'a' of type 'l3_type' into a conntrack address 'b'.
 *
 * Note that 'l3_type' should be either "AF_INET" or "AF_INET6". */
static void
ct_dpif_inet_addr_to_ct_endpoint(const union ct_dpif_inet_addr *a,
 struct ct_addr *b,
 const uint16_t l3_type)
-=-=-=-=-=-=-=-=-

> +int
> +conntrack_flush_tuple(struct conntrack *ct, const struct ct_dpif_tuple 
> *tuple,
> +  uint16_t zone)
> +{
> +struct conn_lookup_ctx ctx;
> +int error = 0;
> +
> +memset(, 0, sizeof(ctx));
> +tuple_to_conn_key(tuple, zone, );
> +ctx.hash = conn_key_hash(, ct->hash_basis);
> +unsigned bucket = hash_to_bucket(ctx.hash);
> +
> +ct_lock_lock(>buckets[bucket].lock);
> +conn_key_lookup(>buckets[bucket], , time_msec());
> +if (ctx.conn) {
> +conn_clean(ct, ctx.conn, >buckets[bucket]);
> +} else {
> +error = ENOENT;
> +}
> +ct_lock_unlock(>buckets[bucket].lock);
> +return error;
> +}

I believe the kernel clears out expectations.  As we discussed off-line, I 
think it would make sense to follow similar behavior, since presumably if the 
control channel is being flushed, the related flows should be, too.

--Justin


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] lib/conntrack: remove unnecessary addr check for ICMP.

2017-12-06 Thread Darrell Ball

Thanks for looking at this.

In the commit message, can you delineate.

1/ The forward direction packet in terms of src ip, dest ip, L4 attributes
2/ The reverse direction error packet in terms of src ip, dest ip, icmp error 
payload

Darrell

On 12/4/17, 10:22 PM, "ovs-dev-boun...@openvswitch.org on behalf of wang zhike" 
 wrote:

From: wangzhike 

ICMP response (Unreachable/fragmentationRequired/...) may be created
at devices in the middle, and such packets are tagged as invalid in
user space conntrack. In fact it does not make sense to validate the
src and dest address.

Signed-off-by: wang zhike 
---
 lib/conntrack.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index f5a3aa9..c44ad0f 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1702,11 +1702,6 @@ extract_l4_icmp(struct conn_key *key, const void 
*data, size_t size,
 return false;
 }
 
-if (inner_key.src.addr.ipv4_aligned != key->dst.addr.ipv4_aligned
-|| inner_key.dst.addr.ipv4_aligned != 
key->src.addr.ipv4_aligned) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
@@ -1789,14 +1784,6 @@ extract_l4_icmp6(struct conn_key *key, const void 
*data, size_t size,
 return false;
 }
 
-/* pf doesn't do this, but it seems a good idea */
-if (!ipv6_addr_equals(_key.src.addr.ipv6_aligned,
-  >dst.addr.ipv6_aligned)
-|| !ipv6_addr_equals(_key.dst.addr.ipv6_aligned,
- >src.addr.ipv6_aligned)) {
-return false;
-}
-
 key->src = inner_key.src;
 key->dst = inner_key.dst;
 key->nw_proto = inner_key.nw_proto;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=IqOQ4UcLnOpQUkvP7FHVH6IWeAK_4DDBRBqX2w3wl94=Xmvgnl8plChpjcr77sLIOxE0krKVuNVqvS3_eOoNeMg=


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [patch v3 3/3] conntrack: Disable algs by default.

2017-12-06 Thread Darrell Ball

Thanks for the reviews Aaron.

Darrell



On 12/6/17, 11:04 AM, "ovs-dev-boun...@openvswitch.org on behalf of Aaron 
Conole"  wrote:

Darrell Ball  writes:

> Presently, alg processing is enabled by default to better exercise code.
> This is similar to kernels before 4.7 as well.  The recommended default
> behavior in the newer kernels is to only process algs if a helper is
> supplied in a conntrack rule.  The behavior is changed to match the
> later kernels.
>
> A test is extended to check that the control connection is still
> created in such a case.
>
> Signed-off-by: Darrell Ball 
> ---

LGTM.

Acked-by: Aaron Conole 
___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=D7-nKQ8low4ohzl_EtbVC_dQ1XBd0pg8gTmTzeYVte0=o95-s6j6eDyJTjLmhlMjK8t_qwOO439Wigl_cgMNv0Q=


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Topquality Irish Mackerels

2017-12-06 Thread Bonesca Mailing



Deze email nieuwsbrief werd in grafisch HTML formaat verzonden.
Als u deze tekstversie ziet, verkiest uw email programma "gewone tekst" emails.
U kan de originele nieuwsbrief online bekijken:
http://bonescamail-mail.nl/zcyLo8



Dear customer,

Last week we started to sell Topquality Irish Atlantic Mackerels,
fresh November production.

Please see photos and grading sheets attached (random from two
different vessels and 3 blocks from each)

Its all very nice fish, nice quality, good fat 22/24% and average size
is 400/600 with average 500g per piece so very nice large fish.

Packing is 20 kilo blockfrozen, each box has 0.5 - 1 kilo fish extra
for free!!

Prices :

Retail : 1 box € 1,75 - 10 box € 1,65 per kilo
Wholesale: 1 palet (48 box) € 1,55 - 3 palets (144 box) € 1,48 per
kilo
Container (23 x 48 box = 1104 box) € 1,38 per kilo

Interested ? Please contact us.

Kind regards, Met vriendelijke groeten, Mit freundlichen Grüßen,
Cordialement,

Bonesca Import en Export BV - Netherlands

Bertus Brouwer
Director/Purchase - Dutch / English / German
Tel.: 31 (0) 527 70 10 63
Mail: ber...@bonesca.nl

Nedal Muhtaseb
Sales - Dutch / Arab / English / German
Tel: 31 (0) 527 68 07 83
Mail: ne...@bonesca.nl

Marianne Kramer
Sales - Dutch / French / English / German
Tel: 31 (0) 527 68 07 82
Mail: maria...@bonesca.nl

Henry Bakker
Administration/Planning
Tel: 31 (0) 527 68 07 85
Mail: he...@bonesca.nl; i...@bonesca.nl;invo...@bonesca.nl

Thu Hanh
Sales Vietnamese / Thai / Laothian / German / English
Tel: 49 (0) 32221097676
Tel: 31 (0) 527 68 07 88
Mail: t...@bonesca.nl

Ramesh Raja
Sales Tamil / Dutch / English
Tel: 31 (0) 527 68 07 84
Mail: r...@bonesca.nl

Policarpo J. Olivas
Sales Spanish, French, Portugese, Italian, English
Tel Spain: 34 (0) 649 566 367
Tel Rep. Dominican: 1 (0) 809 778 4040
Mail: poli...@bonesca.nl

Marberta Korf
Office assistant Dutch / English
Tel: 31 (0) 527 68 07 86
Mail:marbe...@bonesca.nl
Warehouse - Albert Korf / Willem Bakker / Fokke Korf
Tel: 31 (0) 527 68 07 89
Mail: expedi...@bonesca.nl

_
Sign out / Change E-mailadress: 
http://bonescamail-mail.nl/ughbywyjgsgubbebbgjeqbgghewbs
Powered door YourMailingListProvider

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [patch v3 3/3] conntrack: Disable algs by default.

2017-12-06 Thread Aaron Conole

Darrell Ball  writes:

> Presently, alg processing is enabled by default to better exercise code.
> This is similar to kernels before 4.7 as well.  The recommended default
> behavior in the newer kernels is to only process algs if a helper is
> supplied in a conntrack rule.  The behavior is changed to match the
> later kernels.
>
> A test is extended to check that the control connection is still
> created in such a case.
>
> Signed-off-by: Darrell Ball 
> ---

LGTM.

Acked-by: Aaron Conole 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Un traje a la medida de sus necesidades y finanzas

2017-12-06 Thread Fideicomisos en México

Una útil herramienta jurídica para sus necesidades personales y comerciales vial

Fideicomisos en México 
11 de Diciembre - Dr. Eduardo Rocha Núñez 9am-6pm

Usted puede obtener numerosos beneficios económicos con la constitución e 
implementación de un fideicomiso, un instrumento legislativo y jurídico de 
calidad para aplicación práctica en los negocios. Aprenda en este seminario 
todas las generalidades de estos actos jurídicos, sus aspectos fiscales e 
internacionales, así como la normatividad que coadyuve a la comprensión del 
tratamiento impositivo de los fideicomisos, con el fin de distinguir entre las 
obligaciones de la fiduciaria y las que se generan de manera particular por los 
participantes en los fideicomisos.

BENEFICIOS DE ASISTIR: 

• Conocerá el concepto jurídico y las partes que componen el contrato de 
fideicomiso. 
• Ampliará los conocimientos de los negocios a través del fideicomiso.
• Diseñará de manera conveniente un negocio a través del fideicomiso. 
• Conocerá los derechos que tienen por la administración de la fiduciaria.
• Aprenderá sobre las obligaciones fiscales que se generan por la realización 
de las actividades que se realicen a través del fideicomiso.
• Podrá reconocer los impactos fiscales que pueden impedir el éxito del negocio 
a través del fideicomiso. 

¿Requiere la información a la Brevedad? responda este email con la palabra: 
Fideicomisos + nombre - teléfono - correo.


centro telefónico:018002120744


 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [patch v3 2/3] conntrack: Allow specified alg port numbers.

2017-12-06 Thread Aaron Conole

Darrell Ball  writes:

> Algs can use variable control port numbers for servers.
> The main use case is a kind of feeble security measure; the
> thinking being by some is that it obscures the alg traffic.
> It is really not very effective, but the kernel has this
> capability. This patch mimics the capability.
>
> Signed-off-by: Darrell Ball 
> ---

Looks like the only thing that changed was a bit of context from the 1/3
change.

Acked-by: Aaron Conole 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [patch v3 1/3] conntrack: Refactor algs.

2017-12-06 Thread Aaron Conole

Darrell Ball  writes:

> Upcoming requirements for new algs make it desirable to split out
> alg helpers more cleanly.
>
> Signed-off-by: Darrell Ball 
> ---

Acked-by: Aaron Conole 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] tests: Use $(MKDIR_P) instead of mkdir -p.

2017-12-06 Thread Ben Pfaff

On Wed, Dec 06, 2017 at 09:42:35AM -0800, Gregory Rose wrote:
> On 12/5/2017 4:27 PM, Ben Pfaff wrote:
> >It is more portable.
> >
> >Signed-off-by: Ben Pfaff 
> >---
> >  tests/automake.mk | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/tests/automake.mk b/tests/automake.mk
> >index c5cc95251207..ad5362ff1f5b 100644
> >--- a/tests/automake.mk
> >+++ b/tests/automake.mk
> >@@ -167,7 +167,7 @@ check-lcov: all $(check_DATA) clean-lcov
> > find . -name '*.gcda' | xargs -n1 rm -f
> > -set $(SHELL) '$(TESTSUITE)' -C tests AUTOTEST_PATH=$(AUTOTEST_PATH) 
> > $(TESTSUITEFLAGS); \
> > "$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
> >-mkdir -p tests/lcov
> >+$(MKDIR_P) tests/lcov
> > lcov $(LCOV_OPTS) -o tests/lcov/coverage.info
> > genhtml $(GENHTML_OPTS) -o tests/lcov tests/lcov/coverage.info
> > @echo "coverage report generated at tests/lcov/index.html"
> 
> Reviewed-by: Greg Rose 

Thanks for the reviews.  I applied these to master and backported as far
as branch-2.5.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] Adding configuration option to whitelist DPDK physical ports.

2017-12-06 Thread Chandran, Sugesh

Thank you Billy for the review. 
Please find below my reply.

Regards
_Sugesh


> -Original Message-
> From: O Mahony, Billy
> Sent: Wednesday, December 6, 2017 5:31 PM
> To: Chandran, Sugesh ; d...@openvswitch.org;
> b...@ovn.org
> Subject: RE: [ovs-dev] [PATCH 2/2] Adding configuration option to whitelist
> DPDK physical ports.
> 
> Hi Sugesh,
> 
> This is definitely a very useful feature. I'm looking forward to running trex 
> on the
> same DUT as my ovs-dpdk.
> 
> However I'd suggest adding an sscanf or some such to verify that the domain is
> also specified for each whitelist member. And either add the default of 
> '' or
> complain loudly if the domain is absent.
[Sugesh] Will throw an error in that case then .

> 
> Currently (without this patch) you must specify the domain when adding ports:
>Vsctl add-port ... options:dpdk-devargs=:05:00.0 Or else an error such 
> as
> 'Cannot find unplugged device (05:00.0)'  is reported.
> 
> And with the patch if you include the domain in the other_config (e.g.
> other_config:dpdk-whitelist-pci-ids=":05:00.0") everything works just as
> before.
> 
> However with the patch if you add the whitelist *without* a domain e.g.
>   ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-whitelist-pci-
> ids="05:00.0"
> 
> There is no immediate error. However later when doing add-port if you include
> the domain (current required practice) you will get an error. If you omit the
> domain all is well.
[Sugesh] It looks to me, the dpdk-devargs need the PCI id with the ''.
But to bind and PCI scan its not necessary. 
So to keep it consistent, I would add check for PCI-ID in whitelist config too, 
and throw error incase pci-id are mentioned wrong(means without ''.
Does it looks OK to you?
> 
> It's a little bit strange as regardless of domain or no domain in the 
> other_config
> the PCI probe always reports the NIC as expected:
> 2017-12-06T16:55:27Z|00013|dpdk|INFO|EAL: PCI device :05:00.0 on
> NUMA socket -1
> 2017-12-06T16:55:27Z|00014|dpdk|WARN|EAL:   Invalid NUMA socket,
> default to 0
> 2017-12-06T16:55:27Z|00015|dpdk|INFO|EAL:   probe driver: 8086:1572
> net_i40e
> 
> I'll be using the other patch in this series "isolate rte-mempool allocation" 
> over
> the next few days so I'll review that in due course.
> 
> Thanks,
> Billy.
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Sugesh Chandran
> > Sent: Friday, November 10, 2017 1:29 AM
> > To: d...@openvswitch.org; b...@ovn.org
> > Subject: [ovs-dev] [PATCH 2/2] Adding configuration option to
> > whitelist DPDK physical ports.
> >
> > Adding a OVS configuration option to whitelist DPDK physical ports. By
> > default running multiple instances of DPDK on a single platform cannot
> > use physical ports at the same time even though they are distinct.
> >
> > The eal init scans all the ports that are bound to DPDK and initialize
> > the drivers accordingly. This happens for every DPDK process init.
> > On a multi instance deployment usecase, it causes issues for using
> > physical NIC ports.
> > Consider a two DPDK process that are running on a single platform, the
> > second DPDK primary process will try to initialize the drivers for all
> > the physical ports even though it may be used in first DPDK process.
> >
> > To avoid this situation user can whitelist the ports for each DPDK 
> > application.
> > Whitelisting of ports/PCI-ID in a DPDK process will limit the eal-init
> > only on those ports.
> >
> > To whitelist two physical ports ":06:00.0" and ":06:00.1", the
> > configuration option in OVS would be
> >   ovs-vsctl  set Open_vSwitch . other_config:dpdk-whitelist-pci-
> > ids=":06:00.0,:06:00.1"
> >
> > To update the whitelist ports, OVS daemon has to be restarted.
> >
> > Signed-off-by: Sugesh Chandran 
> > ---
> >  lib/dpdk.c   | 29 +
> >  vswitchd/vswitch.xml | 21 +
> >  2 files changed, 50 insertions(+)
> >
> > diff --git a/lib/dpdk.c b/lib/dpdk.c
> > index 9d187c7..0f11977 100644
> > --- a/lib/dpdk.c
> > +++ b/lib/dpdk.c
> > @@ -323,6 +323,34 @@ dpdk_isolate_rte_mem_config(const struct smap
> > *ovs_other_config,  }
> >
> >  static void
> > +dpdk_whitelist_pci_ids(const struct smap *ovs_other_config, char ***argv,
> > +   int *argc)
> > +{
> > +const char *pci_ids;
> > +char *pci_dev;
> > +int len;
> > +int i;
> > +pci_ids = smap_get(ovs_other_config, "dpdk-whitelist-pci-ids");
> > +if (!pci_ids) {
> > +return;
> > +}
> > +len = strlen(pci_ids);
> > +do {
> > +i = strcspn(pci_ids, ",");
> > +pci_dev = xmemdup0(pci_ids, i);
> > +if (!strlen(pci_dev)) {
> > + break;
> > +}
> > +*argv = grow_argv(argv, *argc, 2);
> > +

[ovs-dev] TCP checksum errors from conntrack on padded packets

2017-12-06 Thread Ed Swierk via dev

On a system running a 3.10-derived kernel along with out-of-tree OVS
from around May 2017, we're getting "nf_ct_tcp: bad TCP checksum"
errors when handling network traffic from certain applications on
another system.

One thing all the error packets have in common is that they are short.
With a zero-length TCP payload and no IP options, they are shorter
than the minimum Ethernet frame of 64 bytes (including FCS). So on the
system running OVS, they arrive with some trailing padding (0x00
bytes).

In the normal receive path, ip_rcv() trims the packet to iph->tot_len
before invoking NF_INET_PRE_ROUTING hooks (including conntrack). Then
any subsequent L3+ processing steps, like nf_checksum(), can simply
use skb->len as the length of the packet, rather than referring back
to iph->tot_len.

This trimming does not seem to occur in the OVS conntrack path, so the
checksum verification in tcp_header() fails. (The extra 0x00 bytes
themselves don't affect the checksum, but the length in the IP
pseudoheader does. That length is based on skb->len, and without
trimming, it doesn't match the length the sender used when computing
the checksum.)

This wasn't an issue until OVS conntrack started passing
hooknum==NF_INET_PRE_ROUTING to nf_conntrack_in(), specifically to
validate L4 checksums
(https://github.com/openvswitch/ovs/commit/4a777f56ca).

It's possible that conntrack in newer kernels isn't confused by
padding, or removes it some other way, but unfortunately we're stuck
with 3.10.

Does this diagnosis make sense? Any suggestions on the right place to
trim padding in the OVS conntrack path?

--Ed
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] tests: Use $(MKDIR_P) instead of mkdir -p.

2017-12-06 Thread Gregory Rose


On 12/5/2017 4:27 PM, Ben Pfaff wrote:

It is more portable.

Signed-off-by: Ben Pfaff 
---
  tests/automake.mk | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/automake.mk b/tests/automake.mk
index c5cc95251207..ad5362ff1f5b 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -167,7 +167,7 @@ check-lcov: all $(check_DATA) clean-lcov
find . -name '*.gcda' | xargs -n1 rm -f
-set $(SHELL) '$(TESTSUITE)' -C tests AUTOTEST_PATH=$(AUTOTEST_PATH) 
$(TESTSUITEFLAGS); \
"$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
-   mkdir -p tests/lcov
+   $(MKDIR_P) tests/lcov
lcov $(LCOV_OPTS) -o tests/lcov/coverage.info
genhtml $(GENHTML_OPTS) -o tests/lcov tests/lcov/coverage.info
@echo "coverage report generated at tests/lcov/index.html"


Reviewed-by: Greg Rose 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/2] tests: Use $(MKDIR_P) to avoid races.

2017-12-06 Thread Gregory Rose


On 12/5/2017 4:27 PM, Ben Pfaff wrote:

"test -d x || mkdir x" has a race when invoked in parallel: it is possible
for two processes to both see that 'x' does not exist and both try to
create it, and if that happens then one of them will fail.  This avoids
the problem.

Signed-off-by: Ben Pfaff 
---
  tests/automake.mk | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/automake.mk b/tests/automake.mk
index 7eed1064e82b..c5cc95251207 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -193,7 +193,7 @@ valgrind_wrappers = \
tests/valgrind/test-type-props
  
  $(valgrind_wrappers): tests/valgrind-wrapper.in

-   @test -d tests/valgrind || mkdir tests/valgrind
+   @$(MKDIR_P) tests/valgrind
$(AM_V_GEN) sed -e 's,[@]wrap_program[@],$@,' \
$(top_srcdir)/tests/valgrind-wrapper.in > $@.tmp && \
chmod +x $@.tmp && \


Reviewed-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] datapath-windows: Correct endianness for deleting zone.

2017-12-06 Thread aserdean

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Justin Pettit
> Sent: Tuesday, December 5, 2017 9:30 AM
> To: d...@openvswitch.org; Alin Gabriel Serdean ;
> Sairam Venugopal 
> Subject: [ovs-dev] [PATCH] datapath-windows: Correct endianness for
> deleting zone.
> 
> The zone Netlink attribute is supposed to be in network-byte order, but
the
> Windows code for deleting conntrack entries was treating it as host-byte
> order.
> 
> Found by inspection.
> 
> Signed-off-by: Justin Pettit 

Thanks for the patch, Justin.  I tested the patch, and everything looks
good.
I want to hold on the acknowledge until all of us are on the same page.

This will make the windows datapath consistent with future patches from the
userspace.
For consistency I would like to backport the patch all the way to
branch-2.6.

What do you think Sai?

Thanks,
Alin.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

2017-12-06 Thread Ben Pfaff

On Wed, Dec 06, 2017 at 11:57:48AM +, Vishal Deep Ajmera wrote:
> Hi Ben,> From: Ben Pfaff [mailto:b...@ovn.org] 
> Sent: Tuesday, December 05, 2017 12:52 AM
> To: Vishal Deep Ajmera 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Incorrect handling of 
> errors in group action processing
> 
> On Mon, Dec 04, 2017 at 08:47:38AM +, Vishal Deep Ajmera wrote:
> > As per OpenFlow v1.3 specification, when an action list contains a 
> > group action a copy of the packet is passed to the group for 
> > processing by the group. This means that if there is an error 
> > encountered during group processing, only the copy of packet should be 
> > dropped, but subsequent actions in the action list should be executed on 
> > the original packet.
> > 
> > Additionally, if the group type is "ALL", each action bucket of the 
> > group should process a copy of the packet. If there is an error while 
> > processing one bucket other buckets should still be processed.
> > 
> > Example 1:
> > table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2
> > 
> > Even if any error is encountered while processing the group action, 
> > the packet should still be forwarded to ports tap1 and tap2.
> > 
> > Example 2:
> > group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(et
> > h)
> > 
> > Even if processing the action in the second bucket fails because the 
> > packet already has an Ethernet header, the other copy of the packet 
> > should still be processed by the first bucket and output to port tap1.
> > 
> > Currently the error handling in OVS does not comply with those rules. 
> > When any group bucket execution fails the error is recorded in the 
> > so-called "translation context" which is global for the processing of 
> > the original packet. Once an error is recorded, OVS skips processing 
> > subsequent buckets and installs a drop action in the datapath even if 
> > parts of the action list were previously processed successfully.
> > 
> > This patch clears the error flag after any bucket of a group is executed.
> > This way the error encountered in processing any bucket of the group 
> > will not impact other actions of action-list or other buckets of the group.
> > 
> > Signed-off-by: Vishal Deep Ajmera 
> > Signed-off-by: Keshav Gupta 
> 
> Thank you for thinking about this issue.  Clearly there is something to be 
> done here.
> 
> When I look at the uses of translation errors, I see now that there are a few 
> categories:
> 
>1. Translation can't start at all: XLATE_NO_RECIRCULATION_CONTEXT,
>   XLATE_BRIDGE_NOT_FOUND, XLATE_RECIRCULATION_CONFLICT,
>   XLATE_INVALID_TUNNEL_METADATA.
> 
>2. Translation has run too long or used too much space and must end
>   to bound memory or CPU time.  XLATE_RECURSION_TOO_DEEP and
>   XLATE_TOO_MANY_RESUBMITS are in this category for time,
>   XLATE_STACK_TOO_DEEP for space.
> 
>3. Some step can't be executed: XLATE_TOO_MANY_MPLS_LABELS,
>   XLATE_UNSUPPORTED_PACKET_TYPE.
> 
> I believe that the appropriate handling varies among category.  Category
> 1 is fundamentally unrecoverable.  For category 2, we must not try to recover 
> because that would extend CPU time (perhaps exponentially) or it would cause 
> malfunctions (because a later "pop" would miss out on some data that wasn't 
> "push"ed).  For category 3, we could recover at an appropriate point in the 
> way you suggest.
> ===
> [Vishal/Keshav] Any packet modification done in group bucket processing is 
> local to bucket only and original packet is retained (as it was before 
> entering the group processing) after bucket is executed. OVS takes care of 
> this by restoring the flow back again in ctx->flow in xlate_group_bucket ().  
> Now if there are errors in any action in the bucket then the bucket 
> processing will halt for the packet. Also any changes to the ctx->flow done 
> in the bucket will not be committed in ctx->odp_actions. So, if we reset the 
> ctx->error back to old value (which must be 0) we still will be able to 
> process further actions in the action-list. 
> 
> Is our understanding correct ? Or are we missing some case which can 
> potentially break because of this change ? We think that 
> XLATE_RECURSION_TOO_DEEP and XLATE_TOO_MANY_RESUBMITS should be treated local 
> for group processing. Ideally we should also be restoring the ctx->resubmits 
> to old value after bucket processing is complete to make sure one bucket does 
> not disrupt processing (exhaust number of resubmits) of other buckets or 
> other actions of the action-list. ctx->depth parameter is already adjusted to 
> original value after bucket processing is done in xlate_group_bucket ().
> 
> We agree that for cases where errors are global in nature and can cause data 
> path issues should not be masked for e.g. XLATE_STACK_TOO_DEEP. Is there any 
>

[ovs-dev] Sale Fresh Water Shrimps HLSO

2017-12-06 Thread Bonesca - Jona

If you are not able to see this mail, click 
http://r.newsletter.bonescamail.nl/7xa28kqjioatrf.html 
[
]( http://r.newsletter.bonescamail.nl/track/click/vp48yay5paoatrd ) 
 
SALE OFF :
 
Fresh water Shrimps block HLSO 6 x 1,4 kg 50%
Origin: Bangladesh
Expiry: December 2018
 
Prices :
 
U5 = € 74,76 per box (code 4395.1)
6/8 = € 63-, per box (code 4395.2)    












   [ Click here for our public archive for offers ]( 
http://r.newsletter.bonescamail.nl/track/click/vp48yay6hqoatrd )     
This email was sent to d...@openvswitch.org
You received this email because you are registered with Bonesca Import en 
Export BV
 
[ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/7xa28kqjioatrg.html )  

Sent by
[  ]( http://r.newsletter.bonescamail.nl/track/click/vp48yay7a6oatrd )     
© 2017 Bonesca Import en Export BV  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] Adding configuration option to whitelist DPDK physical ports.

2017-12-06 Thread O Mahony, Billy

Hi Sugesh,

This is definitely a very useful feature. I'm looking forward to running trex 
on the same DUT as my ovs-dpdk.

However I'd suggest adding an sscanf or some such to verify that the domain is 
also specified for each whitelist member. And either add the default of '' 
or complain loudly if the domain is absent.

Currently (without this patch) you must specify the domain when adding ports:
   Vsctl add-port ... options:dpdk-devargs=:05:00.0
Or else an error such as 'Cannot find unplugged device (05:00.0)'  is reported.

And with the patch if you include the domain in the other_config (e.g. 
other_config:dpdk-whitelist-pci-ids=":05:00.0") everything works just as 
before.

However with the patch if you add the whitelist *without* a domain e.g.
ovs-vsctl --no-wait set Open_vSwitch . 
other_config:dpdk-whitelist-pci-ids="05:00.0"

There is no immediate error. However later when doing add-port if you include 
the domain (current required practice) you will get an error. If you omit the 
domain all is well.

It's a little bit strange as regardless of domain or no domain in the 
other_config the PCI probe always reports the NIC as expected:
2017-12-06T16:55:27Z|00013|dpdk|INFO|EAL: PCI device :05:00.0 on NUMA 
socket -1
2017-12-06T16:55:27Z|00014|dpdk|WARN|EAL:   Invalid NUMA socket, default to 0
2017-12-06T16:55:27Z|00015|dpdk|INFO|EAL:   probe driver: 8086:1572 net_i40e

I'll be using the other patch in this series "isolate rte-mempool allocation" 
over the next few days so I'll review that in due course.

Thanks,
Billy.

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Sugesh Chandran
> Sent: Friday, November 10, 2017 1:29 AM
> To: d...@openvswitch.org; b...@ovn.org
> Subject: [ovs-dev] [PATCH 2/2] Adding configuration option to whitelist DPDK
> physical ports.
> 
> Adding a OVS configuration option to whitelist DPDK physical ports. By default
> running multiple instances of DPDK on a single platform cannot use physical
> ports at the same time even though they are distinct.
> 
> The eal init scans all the ports that are bound to DPDK and initialize the 
> drivers
> accordingly. This happens for every DPDK process init.
> On a multi instance deployment usecase, it causes issues for using physical 
> NIC
> ports.
> Consider a two DPDK process that are running on a single platform, the second
> DPDK primary process will try to initialize the drivers for all the physical 
> ports
> even though it may be used in first DPDK process.
> 
> To avoid this situation user can whitelist the ports for each DPDK 
> application.
> Whitelisting of ports/PCI-ID in a DPDK process will limit the eal-init only 
> on those
> ports.
> 
> To whitelist two physical ports ":06:00.0" and ":06:00.1", the
> configuration option in OVS would be
>   ovs-vsctl  set Open_vSwitch . other_config:dpdk-whitelist-pci-
> ids=":06:00.0,:06:00.1"
> 
> To update the whitelist ports, OVS daemon has to be restarted.
> 
> Signed-off-by: Sugesh Chandran 
> ---
>  lib/dpdk.c   | 29 +
>  vswitchd/vswitch.xml | 21 +
>  2 files changed, 50 insertions(+)
> 
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 9d187c7..0f11977 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -323,6 +323,34 @@ dpdk_isolate_rte_mem_config(const struct smap
> *ovs_other_config,  }
> 
>  static void
> +dpdk_whitelist_pci_ids(const struct smap *ovs_other_config, char ***argv,
> +   int *argc)
> +{
> +const char *pci_ids;
> +char *pci_dev;
> +int len;
> +int i;
> +pci_ids = smap_get(ovs_other_config, "dpdk-whitelist-pci-ids");
> +if (!pci_ids) {
> +return;
> +}
> +len = strlen(pci_ids);
> +do {
> +i = strcspn(pci_ids, ",");
> +pci_dev = xmemdup0(pci_ids, i);
> +if (!strlen(pci_dev)) {
> + break;
> +}
> +*argv = grow_argv(argv, *argc, 2);
> +(*argv)[(*argc)++] = xstrdup("-w");
> +(*argv)[(*argc)++] = pci_dev;
> +i++;
> +pci_ids += i;
> +len -= i;
> +} while (pci_ids && len > 0);
> +}
> +
> +static void
>  dpdk_init__(const struct smap *ovs_other_config)  {
>  char **argv = NULL, **argv_to_release = NULL; @@ -409,6 +437,7 @@
> dpdk_init__(const struct smap *ovs_other_config)
>  }
> 
>  dpdk_isolate_rte_mem_config(ovs_other_config, , );
> +dpdk_whitelist_pci_ids(ovs_other_config, , );
>  argv = grow_argv(, argc, 1);
>  argv[argc] = NULL;
> 
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index
> 7462b30..0b64b25 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -442,6 +442,27 @@
>  
>
> 
> +  
> +
> +  Specifies list of pci-ids separated by , for whitelisting available
> +  physical NIC ports in OVS. The option valid

Re: [ovs-dev] [PATCH V4 1/2] netdev-dpdk: DPDK v17.11 upgrade

2017-12-06 Thread Jan Scheurich

I tested this patch series together with my PMD Performance Metrics and 
Supervision patch 
(https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341099.html) and 
it worked out of the box. 

The new rte_vhost API call introduced in DPDK 17.08 my patch uses to read the 
vhostuser queue length compiles and works with DPDK 17.11.

Within measurement accuracy the performance in our standard PVP L3-VPN over 
VXLAN benchmark is the same for DPDK 17.05 and 17.11 (not using IOMMU of 
course).

Regards, Jan

Acked-by: Jan Scheurich 
Tested-by: Jan Scheurich 

> -Original Message-
> From: Mark Kavanagh [mailto:mark.b.kavan...@intel.com]
> Sent: Monday, 04 December, 2017 12:16
> To: d...@openvswitch.org
> Cc: ktray...@redhat.com; maxime.coque...@redhat.com; i.maxim...@samsung.com; 
> Jan Scheurich ;
> sean.k.moo...@intel.com; ian.sto...@intel.com; Mark Kavanagh 
> 
> Subject: [ovs-dev][PATCH V4 1/2] netdev-dpdk: DPDK v17.11 upgrade
> 
> This commit adds support for DPDK v17.11:
> - minor updates to accomodate DPDK API changes
> - update references to DPDK version in Documentation
> - update DPDK version in travis' linux-build script
> 
> Signed-off-by: Mark Kavanagh 
> Acked-by: Maxime Coquelin 
> Acked-by: Ciara Loftus 
> ---
>  .travis/linux-build.sh   |  2 +-
>  Documentation/faq/releases.rst   |  1 +
>  Documentation/intro/install/dpdk.rst | 10 +-
>  Documentation/topics/dpdk/ring.rst   |  2 +-
>  Documentation/topics/dpdk/vhost-user.rst |  8 
>  NEWS |  2 ++
>  lib/netdev-dpdk.c|  5 +++--
>  7 files changed, 17 insertions(+), 13 deletions(-)
> 
> diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
> index 4d6459f..ed28ee4 100755
> --- a/.travis/linux-build.sh
> +++ b/.travis/linux-build.sh
> @@ -81,7 +81,7 @@ fi
> 
>  if [ "$DPDK" ]; then
>  if [ -z "$DPDK_VER" ]; then
> -DPDK_VER="17.05.2"
> +DPDK_VER="17.11"
>  fi
>  install_dpdk $DPDK_VER
>  if [ "$CC" = "clang" ]; then
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index d903b06..62a1957 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -164,6 +164,7 @@ Q: What DPDK version does each Open vSwitch release work 
> with?
>  2.6.x16.07.2
>  2.7.x16.11.3
>  2.8.x17.05.2
> +2.9.x17.11
>   ===
> 
>  Q: I get an error like this when I configure Open vSwitch:
> diff --git a/Documentation/intro/install/dpdk.rst 
> b/Documentation/intro/install/dpdk.rst
> index bb69ae5..3fecb5c 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -40,7 +40,7 @@ Build requirements
>  In addition to the requirements described in :doc:`general`, building Open
>  vSwitch with DPDK will require the following:
> 
> -- DPDK 17.05.2
> +- DPDK 17.11
> 
>  - A `DPDK supported NIC`_
> 
> @@ -69,9 +69,9 @@ Install DPDK
>  #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
> 
> $ cd /usr/src/
> -   $ wget http://fast.dpdk.org/rel/dpdk-17.05.2.tar.xz
> -   $ tar xf dpdk-17.05.2.tar.xz
> -   $ export DPDK_DIR=/usr/src/dpdk-stable-17.05.2
> +   $ wget http://fast.dpdk.org/rel/dpdk-17.11.tar.xz
> +   $ tar xf dpdk-17.11.tar.xz
> +   $ export DPDK_DIR=/usr/src/dpdk-17.11
> $ cd $DPDK_DIR
> 
>  #. (Optional) Configure DPDK as a shared library
> @@ -583,7 +583,7 @@ Limitations
>The latest list of validated firmware versions can be found in the `DPDK
>release notes`_.
> 
> -.. _DPDK release notes: 
> http://dpdk.org/doc/guides/rel_notes/release_17_05.html
> +.. _DPDK release notes: 
> http://dpdk.org/doc/guides/rel_notes/release_17_11.html
> 
>  Reporting Bugs
>  --
> diff --git a/Documentation/topics/dpdk/ring.rst 
> b/Documentation/topics/dpdk/ring.rst
> index ad9d7a5..8d0ede8 100644
> --- a/Documentation/topics/dpdk/ring.rst
> +++ b/Documentation/topics/dpdk/ring.rst
> @@ -77,4 +77,4 @@ DPDK. However, this functionality was removed because:
>  - :doc:`vhost-user interfaces ` are the defacto DPDK-based path 
> to
>guests
> 
> -.. _DPDK documentation: 
> https://dpdk.readthedocs.io/en/v17.05/prog_guide/ring_lib.html
> +.. _DPDK documentation: 
> https://dpdk.readthedocs.io/en/v17.11/prog_guide/ring_lib.html
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index 74ac06e..5347995 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -292,9 +292,9 @@ To begin, instantiate a guest as described in 
> :ref:`dpdk-vhost-user` or
>  DPDK sources to VM and build DPDK::
> 
>  $ cd

Re: [ovs-dev] [PATCH v2] packets: Prefetch the packet metadata in cacheline1.

2017-12-06 Thread Chandran, Sugesh


Tested on Intel Haswell platform and Acked!
(Provided there are no performance regression are reported so far on other 
platforms for various tests too)


Regards
_Sugesh

> -Original Message-
> From: Bodireddy, Bhanuprakash
> Sent: Monday, December 4, 2017 8:10 PM
> To: d...@openvswitch.org
> Cc: Chandran, Sugesh ; Bodireddy, Bhanuprakash
> ; Ben Pfaff 
> Subject: [PATCH v2] packets: Prefetch the packet metadata in cacheline1.
> 
> pkt_metadata_prefetch_init() is used to prefetch the packet metadata before
> initializing the metadata in pkt_metadata_init(). This is done for every 
> packet in
> userspace datapath and is performance critical.
> 
> Commit 99fc16c0 prefetches only cachline0 and cacheline2 as the metadata
> part of respective cachelines will be initialized by pkt_metadata_init().
> 
> However in VXLAN case when popping the vxlan header,
> netdev_vxlan_pop_header() invokes pkt_metadata_init_tnl() which zeroes out
> metadata part of
> cacheline1 that wasn't prefetched earlier and causes performance degradation.
> 
> By prefetching cacheline1, 9% performance improvement is observed with vxlan
> decapsulation test case for packet sizes of 118 bytes. Performance variation 
> is
> observed based on CFLAGS.
> 
>CFLAGS="-O2"CFLAGS="-O2 -msse4.2"
>   Master  4.667 Mpps Master   4.710 Mpps
>   With Patch  5.045 Mpps With Patch   5.097 Mpps
> 
>   CFLAGS="-O2 -march=native" CFLAGS="-Ofast -march=native"
>   Master  5.072 Mpps Master   5.349 Mpps
>   With Patch  5.193 Mpps With Patch   5.378 Mpps
> 
> CC: Ben Pfaff 
> Fixes: 99fc16c0 ("Reorganize the pkt_metadata structure.")
> Signed-off-by: Bhanuprakash Bodireddy 
> ---
> v2->v1
>  * Include the throughput stats with different CFLAG options.
> 
>  lib/packets.h | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/packets.h b/lib/packets.h index 13ea46d..74bec5d 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -159,7 +159,8 @@ pkt_metadata_init(struct pkt_metadata *md,
> odp_port_t port)  }
> 
>  /* This function prefetches the cachelines touched by pkt_metadata_init()
> - * For performance reasons the two functions should be kept in sync. */
> + * and pkt_metadata_init_tnl().  For performance reasons the two
> + functions
> + * should be kep in sync. */
>  static inline void
>  pkt_metadata_prefetch_init(struct pkt_metadata *md)  { @@ -167,6 +168,10
> @@ pkt_metadata_prefetch_init(struct pkt_metadata *md)
>   * be initialized later in pkt_metadata_init(). */
>  OVS_PREFETCH(md->cacheline0);
> 
> +/* Prefetch cacheline1 as members of this cacheline will be zeroed out
> + * in pkt_metadata_init_tnl(). */
> +OVS_PREFETCH(md->cacheline1);
> +
>  /* Prefetch cachline2 as ip_dst & ipv6_dst fields will be initialized. */
>  OVS_PREFETCH(md->cacheline2);
>  }
> --
> 2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

2017-12-06 Thread Vishal Deep Ajmera

Hi Ben,

Thank you for reviewing the patch. Please see inline for comments.

-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org] 
Sent: Tuesday, December 05, 2017 12:52 AM
To: Vishal Deep Ajmera 
Cc: d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Incorrect handling of errors 
in group action processing

On Mon, Dec 04, 2017 at 08:47:38AM +, Vishal Deep Ajmera wrote:
> As per OpenFlow v1.3 specification, when an action list contains a 
> group action a copy of the packet is passed to the group for 
> processing by the group. This means that if there is an error 
> encountered during group processing, only the copy of packet should be 
> dropped, but subsequent actions in the action list should be executed on the 
> original packet.
> 
> Additionally, if the group type is "ALL", each action bucket of the 
> group should process a copy of the packet. If there is an error while 
> processing one bucket other buckets should still be processed.
> 
> Example 1:
> table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2
> 
> Even if any error is encountered while processing the group action, 
> the packet should still be forwarded to ports tap1 and tap2.
> 
> Example 2:
> group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(et
> h)
> 
> Even if processing the action in the second bucket fails because the 
> packet already has an Ethernet header, the other copy of the packet 
> should still be processed by the first bucket and output to port tap1.
> 
> Currently the error handling in OVS does not comply with those rules. 
> When any group bucket execution fails the error is recorded in the 
> so-called "translation context" which is global for the processing of 
> the original packet. Once an error is recorded, OVS skips processing 
> subsequent buckets and installs a drop action in the datapath even if 
> parts of the action list were previously processed successfully.
> 
> This patch clears the error flag after any bucket of a group is executed.
> This way the error encountered in processing any bucket of the group 
> will not impact other actions of action-list or other buckets of the group.
> 
> Signed-off-by: Vishal Deep Ajmera 
> Signed-off-by: Keshav Gupta 

Thank you for thinking about this issue.  Clearly there is something to be done 
here.

When I look at the uses of translation errors, I see now that there are a few 
categories:

   1. Translation can't start at all: XLATE_NO_RECIRCULATION_CONTEXT,
  XLATE_BRIDGE_NOT_FOUND, XLATE_RECIRCULATION_CONFLICT,
  XLATE_INVALID_TUNNEL_METADATA.

   2. Translation has run too long or used too much space and must end
  to bound memory or CPU time.  XLATE_RECURSION_TOO_DEEP and
  XLATE_TOO_MANY_RESUBMITS are in this category for time,
  XLATE_STACK_TOO_DEEP for space.

   3. Some step can't be executed: XLATE_TOO_MANY_MPLS_LABELS,
  XLATE_UNSUPPORTED_PACKET_TYPE.

I believe that the appropriate handling varies among category.  Category
1 is fundamentally unrecoverable.  For category 2, we must not try to recover 
because that would extend CPU time (perhaps exponentially) or it would cause 
malfunctions (because a later "pop" would miss out on some data that wasn't 
"push"ed).  For category 3, we could recover at an appropriate point in the way 
you suggest.
===
[Vishal/Keshav] Any packet modification done in group bucket processing is 
local to bucket only and original packet is retained (as it was before entering 
the group processing) after bucket is executed. OVS takes care of this by 
restoring the flow back again in ctx->flow in xlate_group_bucket ().  Now if 
there are errors in any action in the bucket then the bucket processing will 
halt for the packet. Also any changes to the ctx->flow done in the bucket will 
not be committed in ctx->odp_actions. So, if we reset the ctx->error back to 
old value (which must be 0) we still will be able to process further actions in 
the action-list. 

Is our understanding correct ? Or are we missing some case which can 
potentially break because of this change ? We think that 
XLATE_RECURSION_TOO_DEEP and XLATE_TOO_MANY_RESUBMITS should be treated local 
for group processing. Ideally we should also be restoring the ctx->resubmits to 
old value after bucket processing is complete to make sure one bucket does not 
disrupt processing (exhaust number of resubmits) of other buckets or other 
actions of the action-list. ctx->depth parameter is already adjusted to 
original value after bucket processing is done in xlate_group_bucket ().

We agree that for cases where errors are global in nature and can cause data 
path issues should not be masked for e.g. XLATE_STACK_TOO_DEEP. Is there any 
more such errors we should not mask ?
===

I think that the patch should be rewritten to make these more fine grained 
distinctions.  I don't think that

Re: [ovs-dev] [PATCH] Configurable Link State Change (LSC) detection mode

2017-12-06 Thread Róbert Mulik

Hi Ilya,

Thank you for the comments, I try to refactor the code according to them.

Regards,
Robert

-Original Message-
From: Ilya Maximets [mailto:i.maxim...@samsung.com] 
Sent: Wednesday, December 06, 2017 12:40 PM
To: ovs-dev@openvswitch.org; Róbert Mulik 
Subject: Re: [ovs-dev] [PATCH] Configurable Link State Change (LSC) detection 
mode

Hi Robert.

Few general comments:

If you'll move per-interface config from 'other_config' to 'options' column, 
you'll be able to localize code changes to only dpdk.c and netdev-dpdk.c.
The changes in all other code files are redundant especially because the option 
could only affect dpdk ports.
The second vote for 'options' is that all other interface configuration options 
like number of RX/TX queues, descriptor ring sizes are located in 'options' not 
in 'other_config'.
The third is that you'll be able to add proper 'area' prefix to you patch.
It's really hard to recognize that this patch is about DPDK while looking at 
its name.

Also, please, check you patch with ./utilities/checkpatch.py script.
I see at least few too long lines. Check out coding-style.rst for details.

Best regards, Ilya Maximets.

> It is possible to set LSC detection mode to polling or interrupt mode 
> for DPDK interfaces. The default is polling mode. To set interrupt 
> mode, option dpdk-lsc-interrupt has to be set to true.
> 
> 
> Global settings
> Service restart is necessary for the global settings to take effect.
> 
> Command to set interrupt mode for all interfaces:
> ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true
> 
> Command to set polling mode for all interfaces:
> ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false
> or:
> ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt
> 
> 
> Interface specific settings (override global settings) Service restart 
> is not necessary to take effect.
> 
> Command to set interrupt mode for a specific interface:
> ovs-vsctl set interface  
> other_config:dpdk-lsc-interrupt=true
> 
> Command to set polling mode for a specific interface:
> ovs-vsctl set interface  
> other_config:dpdk-lsc-interrupt=false
> 
> Command to reset to globally defined mode for a specific interface:
> ovs-vsctl remove interface  other_config 
> dpdk-lsc-interrupt
> 
> Signed-off-by: Robert Mulik 
> ---
>  lib/dpdk.c|  6 ++
>  lib/netdev-dpdk.c | 60 
> ++-
>  lib/netdev-dpdk.h |  8 +++
>  lib/netdev-dummy.c|  2 ++
>  lib/netdev-linux.c|  2 ++
>  lib/netdev-provider.h |  3 +++
>  lib/netdev-vport.c|  2 ++
>  lib/netdev.c  | 32 +++
>  lib/netdev.h  |  2 ++
>  vswitchd/bridge.c | 34 +
>  vswitchd/vswitch.xml  | 42 
>  11 files changed, 192 insertions(+), 1 deletion(-)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Ademe / Okra / Pondu / Attieke / Placali

2017-12-06 Thread Bonesca - Jona

If you are not able to see this mail, click 
http://r.newsletter.bonescamail.nl/7xa28koymoatrf.html 

 
 
New arrivals :
 
Egypt:
8666 Frozen Okra Zero/Gombo 20 x 400g € 0,99 per pack
8674 Frozen Molokhia / Ewedu / Krain / Ademe 20 x 400g € 0,89 per pack
 
Ivory Coast:
8722 Attieke / Cassava Couscous Ivory Coast 20 x 500 gr € 1,09 per pack
8765 Placali Manioc/Cassavedough/Watafufu 20 x 500 gr € 1,09 per pack
 
Vietnam:
8770 Pondu/Saka Saka Congostyle/Cassavaleaves 20x500gr € 0,69 per pack
 
Scroll down for the pictures:    



















 
This email was sent to d...@openvswitch.org
You received this email because you are registered with Bonesca Import en 
Export BV
 
[ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/7xa28koymoatrg.html )  

Sent by
[  ]( http://r.newsletter.bonescamail.nl/track/click/vp48yaru5aoatrd )     
© 2017 Bonesca Import en Export BV  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] Configurable Link State Change (LSC) detection mode

2017-12-06 Thread Ilya Maximets

Hi Robert.

Few general comments:

If you'll move per-interface config from 'other_config' to 'options' column,
you'll be able to localize code changes to only dpdk.c and netdev-dpdk.c.
The changes in all other code files are redundant especially because
the option could only affect dpdk ports.
The second vote for 'options' is that all other interface configuration
options like number of RX/TX queues, descriptor ring sizes are located
in 'options' not in 'other_config'.
The third is that you'll be able to add proper 'area' prefix to you patch.
It's really hard to recognize that this patch is about DPDK while looking
at its name.

Also, please, check you patch with ./utilities/checkpatch.py script.
I see at least few too long lines. Check out coding-style.rst for details.

Best regards, Ilya Maximets.

> It is possible to set LSC detection mode to polling or interrupt mode
> for DPDK interfaces. The default is polling mode. To set interrupt mode,
> option dpdk-lsc-interrupt has to be set to true.
> 
> 
> Global settings
> Service restart is necessary for the global settings to take effect.
> 
> Command to set interrupt mode for all interfaces:
> ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true
> 
> Command to set polling mode for all interfaces:
> ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false
> or:
> ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt
> 
> 
> Interface specific settings (override global settings)
> Service restart is not necessary to take effect.
> 
> Command to set interrupt mode for a specific interface:
> ovs-vsctl set interface  other_config:dpdk-lsc-interrupt=true
> 
> Command to set polling mode for a specific interface:
> ovs-vsctl set interface  other_config:dpdk-lsc-interrupt=false
> 
> Command to reset to globally defined mode for a specific interface:
> ovs-vsctl remove interface  other_config dpdk-lsc-interrupt
> 
> Signed-off-by: Robert Mulik 
> ---
>  lib/dpdk.c|  6 ++
>  lib/netdev-dpdk.c | 60 
> ++-
>  lib/netdev-dpdk.h |  8 +++
>  lib/netdev-dummy.c|  2 ++
>  lib/netdev-linux.c|  2 ++
>  lib/netdev-provider.h |  3 +++
>  lib/netdev-vport.c|  2 ++
>  lib/netdev.c  | 32 +++
>  lib/netdev.h  |  2 ++
>  vswitchd/bridge.c | 34 +
>  vswitchd/vswitch.xml  | 42 
>  11 files changed, 192 insertions(+), 1 deletion(-)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] xlate: normalize the actions after translation

2017-12-06 Thread Zoltán Balogh

Hello Ben,

I added 'const' to the ctx members to be copied during backup in order to see
how big the code impact would be in case of using copy-on-write like method
as you proposed when we were talking about this problem in a conference break.
Actually there were more than 80 places indicated by the compiler in the code.
These were mostly in function headers, so I would say several hundreds of code
lines would be involved.

Instead of this approach I reduced the size of backup data structure by two
times the size of 'struct flow' and used bitwise operations instead of memcpy
for storing backup data. I moved backup data into 'struct ctx', its
initialization is performed when 'ctx' is created in xlate_actions().

I sent this second version of the series to the mailing list. Could you review
it, please? Do you think, copy-on-write is still needed?

https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341655.html
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341657.html
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341658.html

https://patchwork.ozlabs.org/patch/845118/
https://patchwork.ozlabs.org/patch/845127/

Best regards,
Zoltan

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Monday, November 13, 2017 8:02 PM
> To: Zoltán Balogh 
> Cc: Chandran, Sugesh ; d...@openvswitch.org; Jan 
> Scheurich 
> Subject: Re: [ovs-dev] [PATCH 2/2] xlate: normalize the actions after 
> translation
> 
> Here is an idea that just occurred to me.  What if we used a kind of
> copy-on-write approach?  That is, whenever the flow is modified, we
> first check whether we need to do a "backup" of it?  This should reduce
> the number of copies greatly, because flow modifications are relatively
> rare compared to, say, "resubmit" actions.  It will require a little
> work to make the code in ofproto-dpif-xlate call the right function to
> do the copying, but I think that it would be easier to implement and
> maintain than the approaches you suggest (assuming that I understand
> them correctly).
> 
> On Mon, Nov 13, 2017 at 05:28:24PM +, Zoltán Balogh wrote:
> > Hello Ben,
> >
> > Thank you for the review. I have been thinking about possible solutions.
> >
> > 1) We could use non-temporary (non-cached) copy when backing up data. That 
> > would avoid using write cache and make
> writing memory faster. In turn reading data back right after the write would 
> be slower but that should not be the
> regular case. Furthermore it should not be available for all hardware 
> platforms.
> >
> > 2) Sugesh came up with the idea to do conditional backup and restore of 
> > data. We should keep track if xlate flow,
> wildcard and base_flow are modified in translation and only copy needed 
> fields. I think this would be the best
> choice but has large code impact and would make code maintenance more 
> complicated.
> >
> > 3) Another option would be to spilt up the flow structure into partitions. 
> > Then instead of copying the whole
> structure, partitions could be checked for change by using memcmp(), then do 
> the copy in case of change. We could
> use something like this:
> >
> > #define FLOW_PART_LEN(start, end) \
> > ((char *)&((struct flow *)0)->end) - ((char *)&((struct flow 
> > *)0)->start)
> >
> > #define COMPARE_FLOW_PART(dst, src, start, end) \
> > memcmp(>start, >start, FLOW_PART_LEN(start, end))
> >
> > #define COPY_FLOW_PART(dst, src, start, end) \
> > memcpy(>start, >start, FLOW_PART_LEN(start, end))
> >
> > static void
> > copy_altered_flow_data(struct flow *dst, const struct flow *src)
> > {
> > BUILD_ASSERT_DECL(FLOW_WC_SEQ == 40);
> >
> > /* Metadata 1 */
> > if (!COMPARE_FLOW_PART(dst, src, tunnel, metadata)) {
> > COPY_FLOW_PART(dst, src, tunnel, metadata);
> > }
> >
> > /* Metadata 2 */
> > if (!COMPARE_FLOW_PART(dst, src, metadata, dl_dst)) {
> > COPY_FLOW_PART(dst, src, metadata, dl_dst);
> > }
> >
> > /* L2 */
> > if (!COMPARE_FLOW_PART(dst, src, dl_dst, nw_src)) {
> > COPY_FLOW_PART(dst, src, dl_dst, nw_src);
> > }
> >
> > /* L3 */
> > if (!COMPARE_FLOW_PART(dst, src, nw_src, tp_src)) {
> > COPY_FLOW_PART(dst, src, nw_src, tp_src);
> > }
> >
> > /* L4 */
> > if (!COMPARE_FLOW_PART(dst, src, tp_src, pad3)) {
> > COPY_FLOW_PART(dst, src, tp_src, pad3);
> > }
> > }
> >
> > static void
> > store_ctx_xlate_data(struct xlate_backup_data *data,
> >  const struct xlate_ctx *ctx, bool force)
> > {
> > data->odp_actions_size = ctx->odp_actions->size;
> > if (force) {
> > data->wc_masks = ctx->wc->masks;
> > data->flow = ctx->xin->flow;
> > data->base_flow = ctx->base_flow;
> > } else {
> > copy_altered_flow_data(>wc_masks, >wc->masks);
> > copy_altered_flow_data(>flow, >xin->flow);
> >

Re: [ovs-dev] [PATCH 2/2] xlate: normalize the actions after translation

2017-12-06 Thread Zoltán Balogh

Subject is incorrect, missing v2. Please use this instead:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341658.html


> -Original Message-
> From: Zoltán Balogh
> Sent: Wednesday, December 06, 2017 11:55 AM
> To: d...@openvswitch.org
> Cc: Zoltán Balogh ; Sugesh Chandran 
> ; Ben Pfaff 
> Subject: [PATCH 2/2] xlate: normalize the actions after translation
> 
> When all OF actions have been translated, there could be actions at
> the end of list of odp actions which are not needed to be executed.
> So, the list can be normalized at the end of xlate_actions().
> 
> Signed-off-by: Zoltan Balogh 
> Signed-off-by: Sugesh Chandran 
> Co-authored-by: Sugesh Chandran 
> CC: Ben Pfaff 
> ---
>  ofproto/ofproto-dpif-xlate.c | 67 
> 
>  tests/ofproto-dpif.at| 48 +++
>  2 files changed, 115 insertions(+)
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 36d0a0e1f..2af1ec1e8 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -6954,6 +6954,70 @@ xlate_wc_finish(struct xlate_ctx *ctx)
>  }
>  }
> 
> +/* Returns true if the action stored in 'nla' can be a valid last action of a
> + * datapath flow. */
> +static bool
> +is_valid_last_action(const struct nlattr *nla)
> +{
> +enum ovs_action_attr action_type = nl_attr_type(nla);
> +
> +switch (action_type) {
> +case OVS_ACTION_ATTR_USERSPACE:
> +case OVS_ACTION_ATTR_SAMPLE:
> +case OVS_ACTION_ATTR_TRUNC:
> +case OVS_ACTION_ATTR_RECIRC:
> +case OVS_ACTION_ATTR_TUNNEL_PUSH:
> +case OVS_ACTION_ATTR_TUNNEL_POP:
> +case OVS_ACTION_ATTR_OUTPUT:
> +case OVS_ACTION_ATTR_CLONE:
> +case OVS_ACTION_ATTR_CT:
> +return true;
> +case OVS_ACTION_ATTR_UNSPEC:
> +case OVS_ACTION_ATTR_SET:
> +case OVS_ACTION_ATTR_PUSH_VLAN:
> +case OVS_ACTION_ATTR_POP_VLAN:
> +case OVS_ACTION_ATTR_HASH:
> +case OVS_ACTION_ATTR_PUSH_MPLS:
> +case OVS_ACTION_ATTR_POP_MPLS:
> +case OVS_ACTION_ATTR_SET_MASKED:
> +case OVS_ACTION_ATTR_PUSH_ETH:
> +case OVS_ACTION_ATTR_POP_ETH:
> +case OVS_ACTION_ATTR_METER:
> +case OVS_ACTION_ATTR_ENCAP_NSH:
> +case OVS_ACTION_ATTR_DECAP_NSH:
> +case __OVS_ACTION_ATTR_MAX:
> +default:
> +return false;
> +}
> +}
> +
> +/* Returns offset of last netlink attribute storing valid action in array
> + * 'data'. Execution of actions beyond this last attribute does not make 
> sense.
> + */
> +static size_t
> +last_action_offset(const struct nlattr *data, const size_t data_len)
> +{
> +const struct nlattr *last = data;
> +
> +uint16_t left;
> +const struct nlattr *a;
> +NL_ATTR_FOR_EACH (a, left, data, data_len) {
> +if (is_valid_last_action(a)) {
> +last = nl_attr_next(a);
> +}
> +}
> +
> +return (char *) last - (char *) data;
> +}
> +
> +/* Get rid of any unneeded actions at the tail end. */
> +static void
> +normalize_odp_actions(struct xlate_ctx *ctx)
> +{
> +struct ofpbuf *oa = ctx->odp_actions;
> +oa->size = last_action_offset(oa->data, oa->size);
> +}
> +
>  /* Translates the flow, actions, or rule in 'xin' into datapath actions in
>   * 'xout'.
>   * The caller must take responsibility for eventually freeing 'xout', with
> @@ -7364,7 +7428,10 @@ exit:
>  if (xin->odp_actions) {
>  ofpbuf_clear(xin->odp_actions);
>  }
> +} else {
> +normalize_odp_actions();
>  }
> +
>  return ctx.error;
>  }
> 
> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> index e7df1504a..5bcd8bb46 100644
> --- a/tests/ofproto-dpif.at
> +++ b/tests/ofproto-dpif.at
> @@ -10133,3 +10133,51 @@ AT_CHECK([grep "Datapath actions" stdout], [0],
> 
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([ofproto-dpif - xlate error - normalize actions])
> +
> +#  ->-+
> +# | p1
> +#   +-o---+
> +#   |   br0   |
> +#   +-o-o-+
> +#   patch0| |patch1
> +# +-->--+
> +
> +OVS_VSWITCHD_START([dnl
> +-- add-port br0 patch0 \
> +-- set interface patch0 type=patch options:peer=patch1 ofport_request=10 
> \
> +-- add-port br0 patch1 \
> +-- set interface patch1 type=patch options:peer=patch0 ofport_request=20
> +])
> +
> +AT_CHECK([
> +ovs-vsctl add-port br0 p1 -- set interface p1 type=dummy ofport_request=1
> +])
> +
> +AT_CHECK([
> +ovs-appctl dpif/show | grep -v hit | sed "s/$(printf \\t)//g" | sed 
> 's./[[0-9]]\{1,\}..'
> +], [0], [dnl
> +br0:
> +br0 65534: (dummy-internal)
> +p1 1: (dummy)
> +patch0 10/none: (patch: peer=patch1)
> +patch1 20/none: (patch: peer=patch0)
> +])
> +
> +# Error due to pushing too many MPLS labels.
> +AT_CHECK([
> +

[ovs-dev] [PATCH v2 2/2] xlate: normalize the actions after translation

2017-12-06 Thread Zoltan Balogh

When all OF actions have been translated, there could be actions at
the end of list of odp actions which are not needed to be executed.
So, the list can be normalized at the end of xlate_actions().

Signed-off-by: Zoltan Balogh 
Signed-off-by: Sugesh Chandran 
Co-authored-by: Sugesh Chandran 
CC: Ben Pfaff 
---
 ofproto/ofproto-dpif-xlate.c | 67 
 tests/ofproto-dpif.at| 48 +++
 2 files changed, 115 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 36d0a0e1f..2af1ec1e8 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -6954,6 +6954,70 @@ xlate_wc_finish(struct xlate_ctx *ctx)
 }
 }
 
+/* Returns true if the action stored in 'nla' can be a valid last action of a
+ * datapath flow. */
+static bool
+is_valid_last_action(const struct nlattr *nla)
+{
+enum ovs_action_attr action_type = nl_attr_type(nla);
+
+switch (action_type) {
+case OVS_ACTION_ATTR_USERSPACE:
+case OVS_ACTION_ATTR_SAMPLE:
+case OVS_ACTION_ATTR_TRUNC:
+case OVS_ACTION_ATTR_RECIRC:
+case OVS_ACTION_ATTR_TUNNEL_PUSH:
+case OVS_ACTION_ATTR_TUNNEL_POP:
+case OVS_ACTION_ATTR_OUTPUT:
+case OVS_ACTION_ATTR_CLONE:
+case OVS_ACTION_ATTR_CT:
+return true;
+case OVS_ACTION_ATTR_UNSPEC:
+case OVS_ACTION_ATTR_SET:
+case OVS_ACTION_ATTR_PUSH_VLAN:
+case OVS_ACTION_ATTR_POP_VLAN:
+case OVS_ACTION_ATTR_HASH:
+case OVS_ACTION_ATTR_PUSH_MPLS:
+case OVS_ACTION_ATTR_POP_MPLS:
+case OVS_ACTION_ATTR_SET_MASKED:
+case OVS_ACTION_ATTR_PUSH_ETH:
+case OVS_ACTION_ATTR_POP_ETH:
+case OVS_ACTION_ATTR_METER:
+case OVS_ACTION_ATTR_ENCAP_NSH:
+case OVS_ACTION_ATTR_DECAP_NSH:
+case __OVS_ACTION_ATTR_MAX:
+default:
+return false;
+}
+}
+
+/* Returns offset of last netlink attribute storing valid action in array
+ * 'data'. Execution of actions beyond this last attribute does not make sense.
+ */
+static size_t
+last_action_offset(const struct nlattr *data, const size_t data_len)
+{
+const struct nlattr *last = data;
+
+uint16_t left;
+const struct nlattr *a;
+NL_ATTR_FOR_EACH (a, left, data, data_len) {
+if (is_valid_last_action(a)) {
+last = nl_attr_next(a);
+}
+}
+
+return (char *) last - (char *) data;
+}
+
+/* Get rid of any unneeded actions at the tail end. */
+static void
+normalize_odp_actions(struct xlate_ctx *ctx)
+{
+struct ofpbuf *oa = ctx->odp_actions;
+oa->size = last_action_offset(oa->data, oa->size);
+}
+
 /* Translates the flow, actions, or rule in 'xin' into datapath actions in
  * 'xout'.
  * The caller must take responsibility for eventually freeing 'xout', with
@@ -7364,7 +7428,10 @@ exit:
 if (xin->odp_actions) {
 ofpbuf_clear(xin->odp_actions);
 }
+} else {
+normalize_odp_actions();
 }
+
 return ctx.error;
 }
 
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index e7df1504a..5bcd8bb46 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -10133,3 +10133,51 @@ AT_CHECK([grep "Datapath actions" stdout], [0],
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - xlate error - normalize actions])
+
+#  ->-+
+# | p1
+#   +-o---+
+#   |   br0   |
+#   +-o-o-+
+#   patch0| |patch1
+# +-->--+
+
+OVS_VSWITCHD_START([dnl
+-- add-port br0 patch0 \
+-- set interface patch0 type=patch options:peer=patch1 ofport_request=10 \
+-- add-port br0 patch1 \
+-- set interface patch1 type=patch options:peer=patch0 ofport_request=20
+])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 -- set interface p1 type=dummy ofport_request=1
+])
+
+AT_CHECK([
+ovs-appctl dpif/show | grep -v hit | sed "s/$(printf \\t)//g" | sed 
's./[[0-9]]\{1,\}..'
+], [0], [dnl
+br0:
+br0 65534: (dummy-internal)
+p1 1: (dummy)
+patch0 10/none: (patch: peer=patch1)
+patch1 20/none: (patch: peer=patch0)
+])
+
+# Error due to pushing too many MPLS labels.
+AT_CHECK([
+ovs-ofctl del-flows br0
+ovs-ofctl add-flow br0 
"table=0,in_port=1,actions=mod_dl_src:aa:aa:aa:bb:bb:00,goto_table:1" 
-OOpenFlow13
+ovs-ofctl add-flow br0 "table=0,in_port=patch1,actions=goto_table:2" 
-OOpenFlow13
+ovs-ofctl add-flow br0 
"table=1,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:patch0" 
-OOpenFlow13
+ovs-ofctl add-flow br0 "table=2,actions=push_mpls:0x8847,output:patch0"
+], [0])
+
+AT_CHECK([ovs-appctl ofproto/trace br0 'in_port=1,ip'], [0], [stdout])
+AT_CHECK([tail -1 stdout], [0],
+  [Datapath actions: drop
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
-- 
2.14.1

___
dev mailing list
d...@openvswitch.org

[ovs-dev] [PATCH v2 1/2] xlate: rollback to valid known state in case of ctx->error on translation

2017-12-06 Thread Zoltan Balogh

If several bridges are connected via patch ports or the packet is sent
via tunnel and an error occurs during translation in a subsequent bridge,
e.g. Ethernet header cannot be pushed to the packet due to lack of
prerequisites in the second bridge and there were flow actions already
translated in first bridge, it can happen that the installed datapath
flow contains actions which are going to be performed for all received
packets, however the packets are going to be dropped in the end. This
is waste of processing resource.

Let's have a config like this and inject L2 Ethernet packets into br0
via LOCAL port:

   ->-+
  | LOCAL  LOCAL   LOCAL
+-o---+   +---o-+   +-+   +---o-+
|   br0   |   |   br1   |   |   br2   |   |   br3   |
+---o-+   +-o-o-+   +-o-o-+   +-o---+
p01 |   p10 | | p12   p21 | | p23   p32 |
+---+ +---+ +---+

netdev@ovs-netdev: hit:17754528 missed:57
br0:
br0 65534/1: (tap)
p01 10/none: (patch: peer=p10)
br1:
br1 65534/2: (tap)
p10 20/none: (patch: peer=p01)
p12 30/none: (patch: peer=p21)
br2:
br2 65534/3: (tap)
p21 40/none: (patch: peer=p12)
p23 50/none: (patch: peer=p32)
br3:
br3 65534/4: (tap)
p32 60/none: (patch: peer=p23)

- OpenFlow rules:

"br0"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.601s, table=0, n_packets=20739, n_bytes=2032422,
reset_counts in_port=LOCAL actions=dec_ttl,output:10

"br1"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.568s, table=0, n_packets=20739, n_bytes=2032422,
reset_counts ip,in_port=20
actions=dec_ttl,dec_ttl,dec_ttl,dec_ttl,LOCAL,
set_field:ee:ee:ee:ff:ff:01->eth_dst,output:30

"br2"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.536s, table=0, n_packets=20739, n_bytes=2032422,
ip,in_port=40
actions=dec_ttl,dec_ttl,dec_ttl,dec_ttl,dec_ttl,
set_field:aa:aa:aa:bb:bb:ff->eth_src,encap(ethernet),dec_ttl,output:50

"br3"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.502s, table=0, n_packets=0, n_bytes=0,
reset_counts in_port=60 actions=LOCAL

- Installed datapath flow:

flow-dump from non-dpdk interfaces:
recirc_id(0),in_port(1),packet_type(ns=0,id=0),
eth(src=aa:dd:dd:ee:ee:f9,dst=aa:aa:aa:bb:bb:00),eth_type(0x0800),
ipv4(ttl=255,frag=no), packets:20738, bytes:2032324, used:5.176s,
actions:set(ipv4(ttl=250)),2,
set(eth(src=aa:aa:aa:bb:bb:ff,dst=ee:ee:ee:ff:ff:01)),set(ipv4(ttl=245))

In this case, encap(ethernet) action cannot be performed, because of the
packet is L2 Ethernet packet. By looking at the datapath flow, you can
see, that setting MAC addresses and TTL at the end is actually not needed,
since the packet is not transmitted anywhere after these actions.

There are two points in the code where we can fix such datapath flows.
1) During translation of OF actions on a bridge, we can store the last
valid state of translated actions while iterating over the OF actions
and revert to it in case of error. This can be performed in the
do_xlate_actions() funtion.
2) When all OF actions have been translated, there could be actions at
the end of list of odp actions which are not needed to be executed.
So, the list can be normalized at the end of xlate_actions().

This commit implements 1). An upcomming commit is going to implement 2).

Signed-off-by: Zoltan Balogh 
Signed-off-by: Sugesh Chandran 
Co-authored-by: Sugesh Chandran 
CC: Ben Pfaff 
CC: Jan Scheurich 
---
 ofproto/ofproto-dpif-xlate.c | 137 +
 tests/ofproto-dpif.at| 179 +++
 2 files changed, 316 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index fcced344e..36d0a0e1f 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -173,6 +173,8 @@ struct xport {
 struct lldp *lldp;   /* LLDP handle or null. */
 };
 
+struct xlate_backup_data;
+
 struct xlate_ctx {
 struct xlate_in *xin;
 struct xlate_out *xout;
@@ -390,6 +392,10 @@ struct xlate_ctx {
 struct ofpbuf action_set;   /* Action set. */
 
 enum xlate_error error; /* Translation failed. */
+
+struct xlate_backup_data *xlate_backup_data;   /* To restore flow data if
+  something goes wrong
+  during xlate. */
 };
 
 /* Structure to track VLAN manipulation */
@@ -6150,6 +6156,118 @@ xlate_ofpact_unroll_xlate(struct xlate_ctx *ctx,
  "cookie=%#"PRIx64, a->rule_table_id, a->rule_cookie);
 }

[ovs-dev] [PATCH 2/2] xlate: normalize the actions after translation

2017-12-06 Thread Zoltan Balogh

When all OF actions have been translated, there could be actions at
the end of list of odp actions which are not needed to be executed.
So, the list can be normalized at the end of xlate_actions().

Signed-off-by: Zoltan Balogh 
Signed-off-by: Sugesh Chandran 
Co-authored-by: Sugesh Chandran 
CC: Ben Pfaff 
---
 ofproto/ofproto-dpif-xlate.c | 67 
 tests/ofproto-dpif.at| 48 +++
 2 files changed, 115 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 36d0a0e1f..2af1ec1e8 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -6954,6 +6954,70 @@ xlate_wc_finish(struct xlate_ctx *ctx)
 }
 }
 
+/* Returns true if the action stored in 'nla' can be a valid last action of a
+ * datapath flow. */
+static bool
+is_valid_last_action(const struct nlattr *nla)
+{
+enum ovs_action_attr action_type = nl_attr_type(nla);
+
+switch (action_type) {
+case OVS_ACTION_ATTR_USERSPACE:
+case OVS_ACTION_ATTR_SAMPLE:
+case OVS_ACTION_ATTR_TRUNC:
+case OVS_ACTION_ATTR_RECIRC:
+case OVS_ACTION_ATTR_TUNNEL_PUSH:
+case OVS_ACTION_ATTR_TUNNEL_POP:
+case OVS_ACTION_ATTR_OUTPUT:
+case OVS_ACTION_ATTR_CLONE:
+case OVS_ACTION_ATTR_CT:
+return true;
+case OVS_ACTION_ATTR_UNSPEC:
+case OVS_ACTION_ATTR_SET:
+case OVS_ACTION_ATTR_PUSH_VLAN:
+case OVS_ACTION_ATTR_POP_VLAN:
+case OVS_ACTION_ATTR_HASH:
+case OVS_ACTION_ATTR_PUSH_MPLS:
+case OVS_ACTION_ATTR_POP_MPLS:
+case OVS_ACTION_ATTR_SET_MASKED:
+case OVS_ACTION_ATTR_PUSH_ETH:
+case OVS_ACTION_ATTR_POP_ETH:
+case OVS_ACTION_ATTR_METER:
+case OVS_ACTION_ATTR_ENCAP_NSH:
+case OVS_ACTION_ATTR_DECAP_NSH:
+case __OVS_ACTION_ATTR_MAX:
+default:
+return false;
+}
+}
+
+/* Returns offset of last netlink attribute storing valid action in array
+ * 'data'. Execution of actions beyond this last attribute does not make sense.
+ */
+static size_t
+last_action_offset(const struct nlattr *data, const size_t data_len)
+{
+const struct nlattr *last = data;
+
+uint16_t left;
+const struct nlattr *a;
+NL_ATTR_FOR_EACH (a, left, data, data_len) {
+if (is_valid_last_action(a)) {
+last = nl_attr_next(a);
+}
+}
+
+return (char *) last - (char *) data;
+}
+
+/* Get rid of any unneeded actions at the tail end. */
+static void
+normalize_odp_actions(struct xlate_ctx *ctx)
+{
+struct ofpbuf *oa = ctx->odp_actions;
+oa->size = last_action_offset(oa->data, oa->size);
+}
+
 /* Translates the flow, actions, or rule in 'xin' into datapath actions in
  * 'xout'.
  * The caller must take responsibility for eventually freeing 'xout', with
@@ -7364,7 +7428,10 @@ exit:
 if (xin->odp_actions) {
 ofpbuf_clear(xin->odp_actions);
 }
+} else {
+normalize_odp_actions();
 }
+
 return ctx.error;
 }
 
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index e7df1504a..5bcd8bb46 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -10133,3 +10133,51 @@ AT_CHECK([grep "Datapath actions" stdout], [0],
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - xlate error - normalize actions])
+
+#  ->-+
+# | p1
+#   +-o---+
+#   |   br0   |
+#   +-o-o-+
+#   patch0| |patch1
+# +-->--+
+
+OVS_VSWITCHD_START([dnl
+-- add-port br0 patch0 \
+-- set interface patch0 type=patch options:peer=patch1 ofport_request=10 \
+-- add-port br0 patch1 \
+-- set interface patch1 type=patch options:peer=patch0 ofport_request=20
+])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 -- set interface p1 type=dummy ofport_request=1
+])
+
+AT_CHECK([
+ovs-appctl dpif/show | grep -v hit | sed "s/$(printf \\t)//g" | sed 
's./[[0-9]]\{1,\}..'
+], [0], [dnl
+br0:
+br0 65534: (dummy-internal)
+p1 1: (dummy)
+patch0 10/none: (patch: peer=patch1)
+patch1 20/none: (patch: peer=patch0)
+])
+
+# Error due to pushing too many MPLS labels.
+AT_CHECK([
+ovs-ofctl del-flows br0
+ovs-ofctl add-flow br0 
"table=0,in_port=1,actions=mod_dl_src:aa:aa:aa:bb:bb:00,goto_table:1" 
-OOpenFlow13
+ovs-ofctl add-flow br0 "table=0,in_port=patch1,actions=goto_table:2" 
-OOpenFlow13
+ovs-ofctl add-flow br0 
"table=1,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:patch0" 
-OOpenFlow13
+ovs-ofctl add-flow br0 "table=2,actions=push_mpls:0x8847,output:patch0"
+], [0])
+
+AT_CHECK([ovs-appctl ofproto/trace br0 'in_port=1,ip'], [0], [stdout])
+AT_CHECK([tail -1 stdout], [0],
+  [Datapath actions: drop
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
-- 
2.14.1

___
dev mailing list
d...@openvswitch.org

[ovs-dev] [PATCH v2 0/2] xlate: optimize dp flow action in case of error in multi-bridge setup

2017-12-06 Thread Zoltan Balogh

If several bridges are connected via patch ports or the packet is sent
via tunnel and an error occurs during translation in a subsequent bridge,
e.g. Ethernet header cannot be pushed to the packet due to lack of
prerequisites in the second bridge and there were flow actions already
translated in first bridge, it can happen that the installed datapath
flow contains actions which are going to be performed for all received
packets, however the packets are going to be dropped in the end. This
is waste of processing resource.

There are two points in the code where we can fix such datapath flows.
1) During translation of OF actions on a bridge, we can store the last
valid state of translated actions while iterating over the OF actions
and revert to it in case of error. This can be performed in the
do_xlate_actions() funtion.
2) When all OF actions have been translated, there could be actions at
the end of list of odp actions which are not needed to be executed.
So, the list can be normalized at the end of xlate_actions().

Let's have a config like this and inject L2 Ethernet packets into br0
via LOCAL port:

   ->-+
  | LOCAL  LOCAL   LOCAL
+-o---+   +---o-+   +-+   +---o-+
|   br0   |   |   br1   |   |   br2   |   |   br3   |
+---o-+   +-o-o-+   +-o-o-+   +-o---+
p01 |   p10 | | p12   p21 | | p23   p32 |
+---+ +---+ +---+

netdev@ovs-netdev: hit:17754528 missed:57
br0:
br0 65534/1: (tap)
p01 10/none: (patch: peer=p10)
br1:
br1 65534/2: (tap)
p10 20/none: (patch: peer=p01)
p12 30/none: (patch: peer=p21)
br2:
br2 65534/3: (tap)
p21 40/none: (patch: peer=p12)
p23 50/none: (patch: peer=p32)
br3:
br3 65534/4: (tap)
p32 60/none: (patch: peer=p23)

- OpenFlow rules:

"br0"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.601s, table=0, n_packets=20739, n_bytes=2032422,
reset_counts in_port=LOCAL actions=dec_ttl,output:10

"br1"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.568s, table=0, n_packets=20739, n_bytes=2032422,
reset_counts ip,in_port=20
actions=dec_ttl,dec_ttl,dec_ttl,dec_ttl,LOCAL,
set_field:ee:ee:ee:ff:ff:01->eth_dst,output:30

"br2"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.536s, table=0, n_packets=20739, n_bytes=2032422,
ip,in_port=40
actions=dec_ttl,dec_ttl,dec_ttl,dec_ttl,dec_ttl,
set_field:aa:aa:aa:bb:bb:ff->eth_src,encap(ethernet),dec_ttl,output:50

"br3"
OFPST_FLOW reply (OF1.3) (xid=0x2):
cookie=0x0, duration=15.502s, table=0, n_packets=0, n_bytes=0,
reset_counts in_port=60 actions=LOCAL

- Installed datapath flow:

flow-dump from non-dpdk interfaces:
recirc_id(0),in_port(1),packet_type(ns=0,id=0),
eth(src=aa:dd:dd:ee:ee:f9,dst=aa:aa:aa:bb:bb:00),eth_type(0x0800),
ipv4(ttl=255,frag=no), packets:20738, bytes:2032324, used:5.176s,
actions:set(ipv4(ttl=250)),2,
set(eth(src=aa:aa:aa:bb:bb:ff,dst=ee:ee:ee:ff:ff:01)),set(ipv4(ttl=245))

In this case, encap(ethernet) action cannot be performed, because of the
packet is L2 Ethernet packet. By looking at the datapath flow, you can
see, that setting MAC addresses and TTL at the end is actually not needed,
since the packet is not transmitted anywhere after these actions.
With this series, these actions are omitted.

Zoltan Balogh (2):
  xlate: rollback to valid known state in case of ctx->error on
translation
  xlate: normalize the actions after translation

 ofproto/ofproto-dpif-xlate.c | 204 ++
 tests/ofproto-dpif.at| 227 +++
 2 files changed, 431 insertions(+)

-- 
2.14.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] LACP: Check active partner sys id

2017-12-06 Thread Róbert Mulik

A reboot of one switch in an MC-LAG bond makes all bond links
to go down, causing a total connectivity loss for 3 seconds.

Packet capture shows that spurious LACP PDUs are sent to OVS with
a different MAC address (partner system id) during the final
stages of the MC-LAG switch reboot. The current implementation
doesn't care about the partner sys_id (MAC address).

The code change based on the following:
- If an interface (lead interface) on a bond has an "attached"
  LACP connection, then any other slaves on that bond is allowed
  to become active only when its partner's sys_id is the same as
  the partner's sys_id of the lead interface.
- So, when a slave interface of a bond becomes "current" (it gets
  valid LACP information), first checks if there is already an
  active interface on the bond.
- If there is a lead, the slave checks for the partner sys_ids,
  and becomes active only when they are the same, otherwise it
  remains in "current" state, but "detached".
- If there is no lead, it follows the old way, and accepts any
  partner sys_id.

Signed-off-by: Robert Mulik 
---
 lib/lacp.c | 38 --
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/lib/lacp.c b/lib/lacp.c
index e6ad7b9..eaac2a5 100644
--- a/lib/lacp.c
+++ b/lib/lacp.c
@@ -595,7 +595,7 @@ lacp_wait(struct lacp *lacp) OVS_EXCLUDED(mutex)
 static void
 lacp_update_attached(struct lacp *lacp) OVS_REQUIRES(mutex)
 {
-struct slave *lead, *slave;
+struct slave *lead, *lead_current, *slave;
 struct lacp_info lead_pri;
 bool lead_enable;
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 10);
@@ -603,7 +603,25 @@ lacp_update_attached(struct lacp *lacp) OVS_REQUIRES(mutex)
 lacp->update = false;

 lead = NULL;
+lead_current = NULL;
 lead_enable = false;
+
+/* Check if there is a working interface.
+ * Store as lead_current, if there is one. */
+HMAP_FOR_EACH (slave, node, >slaves) {
+if (slave->status == LACP_CURRENT && slave->attached) {
+struct lacp_info pri;
+slave_get_priority(slave, );
+if (!lead_current || memcmp(, _pri, sizeof pri) < 0) {
+lead_current = slave;
+lead = lead_current;
+lead_pri = pri;
+lead_enable = true;
+}
+}
+}
+
+/* Find interface with highest priority. */
 HMAP_FOR_EACH (slave, node, >slaves) {
 struct lacp_info pri;

@@ -624,14 +642,22 @@ lacp_update_attached(struct lacp *lacp) 
OVS_REQUIRES(mutex)
 continue;
 }

-slave->attached = true;
 slave_get_priority(slave, );
 bool enable = slave_may_enable__(slave);

-if (!lead
-|| enable > lead_enable
-|| (enable == lead_enable
-&& memcmp(, _pri, sizeof pri) < 0)) {
+/* Check if partner MAC address is the same as on the working
+ * interface. Activate slave only if the MAC is the same, or
+ * there is no working interface. */
+if (!lead_current || (lead_current
+&& eth_addr_equals(slave->partner.sys_id,
+   lead_current->partner.sys_id))) {
+slave->attached = true;
+}
+if (slave->attached &&
+(!lead
+ || enable > lead_enable
+ || (enable == lead_enable
+ && memcmp(, _pri, sizeof pri) < 0))) {
 lead = slave;
 lead_enable = enable;
 lead_pri = pri;
--
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] Configurable Link State Change (LSC) detection mode

2017-12-06 Thread Róbert Mulik

It is possible to set LSC detection mode to polling or interrupt mode
for DPDK interfaces. The default is polling mode. To set interrupt mode,
option dpdk-lsc-interrupt has to be set to true.


Global settings
Service restart is necessary for the global settings to take effect.

Command to set interrupt mode for all interfaces:
ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true

Command to set polling mode for all interfaces:
ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false
or:
ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt


Interface specific settings (override global settings)
Service restart is not necessary to take effect.

Command to set interrupt mode for a specific interface:
ovs-vsctl set interface  other_config:dpdk-lsc-interrupt=true

Command to set polling mode for a specific interface:
ovs-vsctl set interface  other_config:dpdk-lsc-interrupt=false

Command to reset to globally defined mode for a specific interface:
ovs-vsctl remove interface  other_config dpdk-lsc-interrupt

Signed-off-by: Robert Mulik 
---
 lib/dpdk.c|  6 ++
 lib/netdev-dpdk.c | 60 ++-
 lib/netdev-dpdk.h |  8 +++
 lib/netdev-dummy.c|  2 ++
 lib/netdev-linux.c|  2 ++
 lib/netdev-provider.h |  3 +++
 lib/netdev-vport.c|  2 ++
 lib/netdev.c  | 32 +++
 lib/netdev.h  |  2 ++
 vswitchd/bridge.c | 34 +
 vswitchd/vswitch.xml  | 42 
 11 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/lib/dpdk.c b/lib/dpdk.c
index 8da6c32..09c4f9b 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -345,6 +345,12 @@ dpdk_init__(const struct smap *ovs_other_config)
 vhost_sock_dir = sock_dir_subcomponent;
 }

+if (smap_get_bool(ovs_other_config, "dpdk-lsc-interrupt", false)) {
+
netdev_dpdk_set_default_lsc_detect_mode(NETDEV_DPDK_LSC_DETECT_INTERRUPT_MODE);
+} else {
+
netdev_dpdk_set_default_lsc_detect_mode(NETDEV_DPDK_LSC_DETECT_POLL_MODE);
+}
+
 argv = grow_argv(, 0, 1);
 argc = 1;
 argv[0] = xstrdup(ovs_get_program_name());
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index faff842..ac722cc 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -146,7 +146,7 @@ typedef uint8_t dpdk_port_t;
 #define VHOST_ENQ_RETRY_NUM 8
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)

-static const struct rte_eth_conf port_conf = {
+static struct rte_eth_conf port_conf = {
 .rxmode = {
 .mq_mode = ETH_MQ_RX_RSS,
 .split_hdr_size = 0,
@@ -165,6 +165,9 @@ static const struct rte_eth_conf port_conf = {
 .txmode = {
 .mq_mode = ETH_MQ_TX_NONE,
 },
+.intr_conf = {
+.lsc = (uint16_t)NETDEV_DPDK_LSC_DETECT_POLL_MODE, /* LSC interrupt 
mode disabled, polling mode used. */
+},
 };

 /*
@@ -397,6 +400,9 @@ struct netdev_dpdk {
 int requested_n_rxq;
 int requested_rxq_size;
 int requested_txq_size;
+enum netdev_dpdk_lsc_detect_mode requested_lsc_detect_mode;
+
+enum netdev_dpdk_lsc_detect_mode lsc_detect_mode;

 /* Number of rx/tx descriptors for physical devices */
 int rxq_size;
@@ -430,6 +436,18 @@ int netdev_dpdk_get_vid(const struct netdev_dpdk *dev);
 struct ingress_policer *
 netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);

+void
+netdev_dpdk_set_default_lsc_detect_mode(enum netdev_dpdk_lsc_detect_mode 
default_lsc)
+{
+port_conf.intr_conf.lsc = (uint16_t)default_lsc;
+}
+
+enum netdev_dpdk_lsc_detect_mode
+netdev_dpdk_get_default_lsc_detect_mode(void)
+{
+return port_conf.intr_conf.lsc;
+}
+
 static bool
 is_dpdk_class(const struct netdev_class *class)
 {
@@ -663,6 +681,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 int i;
 struct rte_eth_conf conf = port_conf;

+conf.intr_conf.lsc = dev->lsc_detect_mode;
+
 /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
  * enabled. */
 if (dev->mtu > ETHER_MTU) {
@@ -868,6 +888,7 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no,
 dev->flags = 0;
 dev->requested_mtu = ETHER_MTU;
 dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
+dev->requested_lsc_detect_mode = netdev_dpdk_get_default_lsc_detect_mode();
 ovsrcu_index_init(>vid, -1);
 dev->vhost_reconfigured = false;
 dev->attached = false;
@@ -2005,6 +2026,38 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 }

 static int
+netdev_dpdk_get_lsc_detect_mode(const struct netdev *netdev, uint16_t *lscp)
+{
+struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+
+ovs_mutex_lock(>mutex);
+*lscp = (uint16_t)dev->lsc_detect_mode;
+ovs_mutex_unlock(>mutex);
+
+return 0;
+}
+
+static int
+netdev_dpdk_set_lsc_detect_mode(struct netdev *netdev, uint16_t 
lsc_detect_mode)

[ovs-dev] [PATCH branch-2.8 v1] OVN: Add external_ids to NAT and Logical_Router_Static_Route tables.

2017-12-06 Thread lmartins

From: Lucas Alvares Gomes 

The external_ids column is missing from the NAT and
Logical_Router_Static_Route tables.

As discussed at [0] the change to the schema for this backport should
leave the version number unmodified.

[0]
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341630.html

Signed-off-by: Lucas Alvares Gomes 
---
 ovn/ovn-nb.ovsschema | 12 +---
 ovn/ovn-nb.xml   | 14 ++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index a077bfb81..30fe610ff 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@
 {
 "name": "OVN_Northbound",
 "version": "5.8.0",
-"cksum": "2812300190 16766",
+"cksum": "3279121944 17086",
 "tables": {
 "NB_Global": {
 "columns": {
@@ -235,7 +235,10 @@
  "dst-ip"]]},
 "min": 0, "max": 1}},
 "nexthop": {"type": "string"},
-"output_port": {"type": {"key": "string", "min": 0, "max": 
1}}},
+"output_port": {"type": {"key": "string", "min": 0, "max": 1}},
+"external_ids": {
+"type": {"key": "string", "value": "string",
+ "min": 0, "max": "unlimited"}}},
 "isRoot": false},
 "NAT": {
 "columns": {
@@ -249,7 +252,10 @@
"enum": ["set", ["dnat",
  "snat",
  "dnat_and_snat"
-   ]],
+   ]]}}},
+"external_ids": {
+"type": {"key": "string", "value": "string",
+ "min": 0, "max": "unlimited"}}},
 "isRoot": false},
 "DHCP_Options": {
 "columns": {
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 5b022697e..763846a43 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1497,6 +1497,13 @@
 address as the one via which the  is reachable.
   
 
+
+
+  
+See External IDs at the beginning of this document.
+  
+
+
   
 
   
@@ -1575,6 +1582,13 @@
 port instance on the redirect-chassis.
   
 
+
+
+  
+See External IDs at the beginning of this document.
+  
+
+
   
 
   
-- 
2.15.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v1] OVN: Add external_ids to NAT and Logical_Router_Static_Route tables.

2017-12-06 Thread Lucas Alvares Gomes Martins

Hi,

Thanks a lot Russell and Ben for this email. Having this change in the
2.8 will make certain things *a lot* easier in networking-ovn.

I will go ahead and propose a patch for the 2.8 branch then.

Cheers,
Lucas

On Tue, Dec 5, 2017 at 7:20 PM, Ben Pfaff  wrote:
> I don't object to #2.
>
> On Tue, Dec 05, 2017 at 01:44:25PM -0500, Russell Bryant wrote:
>> Lucas asked me about backporting this one, as OpenStack would start
>> making use of it with an OVS 2.8 update if available.
>>
>> The schema change seems pretty harmless.  The catch is that this also
>> updated the schema version number from "5.8.1" to "5.8.2", while
>> branch-2.8 has "5.8.0".  master includes a change that introduced a
>> new feature, along with the "5.8.1" update, that we would not
>> backport.
>>
>> The main choices seem to be ...
>>
>> 1) Don't backport.
>>
>> 2) Backport, but leave the schema version number unchanged in branch-2.8.
>>
>> Does anyone see a problem with option #2?  It's easy enough to
>> determine if the new columns are present, even without the version
>> number bump.
>>
>> On Mon, Dec 4, 2017 at 2:11 PM, Ben Pfaff  wrote:
>> > Applied, thanks.
>> >
>> > On Mon, Dec 04, 2017 at 03:06:39PM +0100, Daniel Alvarez Sanchez wrote:
>> >> Acked-by: Daniel Alvarez 
>> >>
>> >> From [0] one can expect this column to be present in all tables.
>> >> [0] https://github.com/openvswitch/ovs/blob/v2.8.1/ovn/ovn-nb.xml#L19
>> >>
>> >> On Mon, Dec 4, 2017 at 2:16 PM,  wrote:
>> >>
>> >> > From: Lucas Alvares Gomes 
>> >> >
>> >> > The external_ids column is missing from the NAT and
>> >> > Logical_Router_Static_Route tables.
>> >> >
>> >> > Signed-off-by: Lucas Alvares Gomes 
>> >> > ---
>> >> >  ovn/ovn-nb.ovsschema | 14 ++
>> >> >  ovn/ovn-nb.xml   | 14 ++
>> >> >  2 files changed, 24 insertions(+), 4 deletions(-)
>> >> >
>> >> > diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>> >> > index fcd878cf2..081ddb54c 100644
>> >> > --- a/ovn/ovn-nb.ovsschema
>> >> > +++ b/ovn/ovn-nb.ovsschema
>> >> > @@ -1,7 +1,7 @@
>> >> >  {
>> >> >  "name": "OVN_Northbound",
>> >> > -"version": "5.8.1",
>> >> > -"cksum": "607160660 16929",
>> >> > +"version": "5.9.0",
>> >> > +"cksum": "1120419033 17249",
>> >> >  "tables": {
>> >> >  "NB_Global": {
>> >> >  "columns": {
>> >> > @@ -238,7 +238,10 @@
>> >> >   
>> >> > "dst-ip"]]},
>> >> >  "min": 0, "max": 1}},
>> >> >  "nexthop": {"type": "string"},
>> >> > -"output_port": {"type": {"key": "string", "min": 0,
>> >> > "max": 1}}},
>> >> > +"output_port": {"type": {"key": "string", "min": 0,
>> >> > "max": 1}},
>> >> > +"external_ids": {
>> >> > +"type": {"key": "string", "value": "string",
>> >> > + "min": 0, "max": "unlimited"}}},
>> >> >  "isRoot": false},
>> >> >  "NAT": {
>> >> >  "columns": {
>> >> > @@ -252,7 +255,10 @@
>> >> > "enum": ["set", ["dnat",
>> >> >   "snat",
>> >> >
>> >> > "dnat_and_snat"
>> >> > -   ]],
>> >> > +   ]]}}},
>> >> > +"external_ids": {
>> >> > +"type": {"key": "string", "value": "string",
>> >> > + "min": 0, "max": "unlimited"}}},
>> >> >  "isRoot": false},
>> >> >  "DHCP_Options": {
>> >> >  "columns": {
>> >> > diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>> >> > index 1091c05ce..4e3899f28 100644
>> >> > --- a/ovn/ovn-nb.xml
>> >> > +++ b/ovn/ovn-nb.xml
>> >> > @@ -1540,6 +1540,13 @@
>> >> >  address as the one via which the  is
>> >> > reachable.
>> >> >
>> >> >  
>> >> > +
>> >> > +
>> >> > +  
>> >> > +See External IDs at the beginning of this document.
>> >> > +  
>> >> > +
>> >> > +
>> >> >
>> >> >
>> >> >
>> >> > @@ -1618,6 +1625,13 @@
>> >> >  port instance on the redirect-chassis.
>> >> >
>> >> >  
>> >> > +
>> >> > +
>> >> > +  
>> >> > +See External IDs at the beginning of this document.
>> >> > +  
>> >> > +
>> >> > +
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > 2.15.1
>> >> >
>> >> > ___
>> >> > dev mailing list
>> >> > d...@openvswitch.org
>> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> >> >
>> >> ___
>> >> dev mailing list
>> >> d...@openvswitch.org
>> >>

47 matches

Mail list logo