date:20180105

Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-05 Thread Ed Swierk via dev

On Jan 5, 2018 22:17, "Pravin Shelar"  wrote:

On Fri, Jan 5, 2018 at 3:20 PM, Ed Swierk 
wrote:
> On Fri, Jan 5, 2018 at 10:14 AM, Ed Swierk 
> wrote:
>> On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
>>> OVS already pull all required headers in skb linear data, so no need
>>> to redo all of it. only check required is the ip-checksum validation.
>>> I think we could avoid it in most of cases by checking skb length to
>>> ipheader length before verifying the ip header-checksum.
>>
>> Shouldn't the IP header checksum be verified even earlier, like in
>> key_extract(), before actually using any of the fields in the IP
>> header?
>
> Something like this for verifying the IP header checksum (not tested):
>
AFAIU openflow does not need this verification, so it is not required
in flow extract.


Okay. How about my proposed trimming implementation, caching the pad length
in the ovs cb?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-05 Thread Pravin Shelar

On Fri, Jan 5, 2018 at 3:20 PM, Ed Swierk  wrote:
> On Fri, Jan 5, 2018 at 10:14 AM, Ed Swierk 
> wrote:
>> On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
>>> OVS already pull all required headers in skb linear data, so no need
>>> to redo all of it. only check required is the ip-checksum validation.
>>> I think we could avoid it in most of cases by checking skb length to
>>> ipheader length before verifying the ip header-checksum.
>>
>> Shouldn't the IP header checksum be verified even earlier, like in
>> key_extract(), before actually using any of the fields in the IP
>> header?
>
> Something like this for verifying the IP header checksum (not tested):
>
AFAIU openflow does not need this verification, so it is not required
in flow extract.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v6 1/4] nsh: rework NSH netlink keys and actions

2018-01-05 Thread Yang, Yi

On Fri, Jan 05, 2018 at 04:42:00AM +0800, Ben Pfaff wrote:
> On Fri, Dec 08, 2017 at 10:04:22PM +0800, Yi Yang wrote:
> > 
> > Signed-off-by: Yi Yang 
> 
> 
> In odp_execute_actions(), this looks bogus: there is nothing to
> guarantee that 'buffer' is properly aligned for struct nsh_hdr.
> +uint8_t buffer[NSH_HDR_MAX_LEN];
> +struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *, buffer);
> 
> Similarly in format_odp_action().
> 

Ben, do you mean buffer isn't 4 bytes aligned? I redefine it as
"uint32_t buffer[NSH_HDR_MAX_LEN / 4];", it is enough for struct nsh_hdr to
align to 4 bytes boundary.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v6 1/4] nsh: rework NSH netlink keys and actions

2018-01-05 Thread Yang, Yi

On Fri, Jan 05, 2018 at 04:42:00AM +0800, Ben Pfaff wrote:
> On Fri, Dec 08, 2017 at 10:04:22PM +0800, Yi Yang wrote:
> > Signed-off-by: Yi Yang 
> 
> This fails to build with Clang (and, I would guess, MSVC):
> ../lib/odp-execute.c:497:21: error: fields must have a constant size: 
> 'variable length array in structure' extension will never be supported
> 
> "sparse" issues some warnings.  This one is probably sensible:
> ../lib/odp-util.c:1884:32: warning: incorrect type in argument 1 
> (different base types)
> ../lib/odp-util.c:1884:32:expected unsigned int [unsigned] [usertype] 
> x
> ../lib/odp-util.c:1884:32:got restricted ovs_be32 [assigned] 
> [usertype] spi
> 
> These I don't understand.
> https://marc.info/?l=linux-sparse=110218288411207 suggests that they
> might be a sparse bug:
> 
> ../lib/odp-util.c:7123:32: warning: crazy programmer
> ../lib/odp-util.c:7123:43: warning: crazy programmer
> ../lib/odp-util.c:7123:22: warning: crazy programmer
> 
> In odp_execute_actions(), this looks bogus: there is nothing to
> guarantee that 'buffer' is properly aligned for struct nsh_hdr.
> +uint8_t buffer[NSH_HDR_MAX_LEN];
> +struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *, buffer);
> 
> Similarly in format_odp_action().
> 
> In nsh_key_to_attr(), can this:
> +for (int i = 0; i < 4; i++) {
> +md1.context[i] = nsh->context[i];
> +}
> +nl_msg_put_unspec(buf, OVS_NSH_KEY_ATTR_MD1, , sizeof md1);
> be written as just this?
> +nl_msg_put_unspec(buf, OVS_NSH_KEY_ATTR_MD1, nsh->context,
> +  sizeof nsh->context);
> and similarly in a second place.
> 
> In parse_odp_push_nsh_action(), the idea of xmalloc()'ing a stub is
> weird.  A stub should be a local array.  There are many examples in the
> tree.
> 
> Please don't check a pointer for null before calling free():
> +if (metadata != NULL) {
> +free(metadata);
>  }

Thanks, Ben, I have posted v7
https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/342698.html
to fix the above issues.

By the way, what command do you run to do static code analysis by sparse?
"make clang-analyze" will have too much warnings. I can't get sparse
warnings, it will be better if I can run it locally.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v7 4/4] nsh: add dec_nsh_ttl action

2018-01-05 Thread Yi Yang

NSH ttl is a 6-bit field ranged from 0 to 63, it should be
decremented by 1 every hop, if it is 0 or it is so after
decremented, the packet should be dropped and a packet-in
message is sent to main controller.

Signed-off-by: Yi Yang 
---
 include/openvswitch/ofp-actions.h |  1 +
 lib/ofp-actions.c | 49 +++
 ofproto/ofproto-dpif-xlate.c  | 31 +
 tests/nsh.at  |  6 ++---
 utilities/ovs-ofctl.8.in  | 13 ++-
 5 files changed, 96 insertions(+), 4 deletions(-)

diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index 3d97755..c82dae0 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -93,6 +93,7 @@ struct vl_mff_map;
 OFPACT(DEC_MPLS_TTL,ofpact_null,ofpact, "dec_mpls_ttl") \
 OFPACT(PUSH_MPLS,   ofpact_push_mpls,   ofpact, "push_mpls")\
 OFPACT(POP_MPLS,ofpact_pop_mpls,ofpact, "pop_mpls") \
+OFPACT(DEC_NSH_TTL, ofpact_null,ofpact, "dec_nsh_ttl")  \
 \
 /* Generic encap & decap */ \
 OFPACT(ENCAP,   ofpact_encap,   props, "encap") \
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 4918498..7acf05f 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -350,6 +350,9 @@ enum ofp_raw_action_type {
 /* NX1.3+(47): struct nx_action_decap, ... */
 NXAST_RAW_DECAP,
 
+/* NX1.3+(48): void. */
+NXAST_RAW_DEC_NSH_TTL,
+
 /* ## -- ## */
 /* ## Debugging actions. ## */
 /* ## -- ## */
@@ -482,6 +485,7 @@ ofpact_next_flattened(const struct ofpact *ofpact)
 case OFPACT_NAT:
 case OFPACT_ENCAP:
 case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:
 return ofpact_next(ofpact);
 
 case OFPACT_CLONE:
@@ -4332,6 +4336,39 @@ format_DECAP(const struct ofpact_decap *a,
 ds_put_format(s, "%s)%s", colors.paren, colors.end);
 }
 
+/* Action dec_nsh_ttl */
+
+static enum ofperr
+decode_NXAST_RAW_DEC_NSH_TTL(struct ofpbuf *out)
+{
+ofpact_put_DEC_NSH_TTL(out);
+return 0;
+}
+
+static void
+encode_DEC_NSH_TTL(const struct ofpact_null *null OVS_UNUSED,
+enum ofp_version ofp_version OVS_UNUSED, struct ofpbuf *out)
+{
+put_NXAST_DEC_NSH_TTL(out);
+}
+
+static char * OVS_WARN_UNUSED_RESULT
+parse_DEC_NSH_TTL(char *arg OVS_UNUSED,
+   const struct ofputil_port_map *port_map OVS_UNUSED,
+   struct ofpbuf *ofpacts,
+   enum ofputil_protocol *usable_protocols OVS_UNUSED)
+{
+ofpact_put_DEC_NSH_TTL(ofpacts);
+return NULL;
+}
+
+static void
+format_DEC_NSH_TTL(const struct ofpact_null *a OVS_UNUSED,
+const struct ofputil_port_map *port_map OVS_UNUSED, struct ds *s)
+{
+ds_put_format(s, "%sdec_nsh_ttl%s", colors.special, colors.end);
+}
+
 
 /* Action structures for NXAST_RESUBMIT, NXAST_RESUBMIT_TABLE, and
  * NXAST_RESUBMIT_TABLE_CT.
@@ -7115,6 +7152,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a)
 case OFPACT_SET_VLAN_VID:
 case OFPACT_ENCAP:
 case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:
 return true;
 case OFPACT_BUNDLE:
 case OFPACT_CLEAR_ACTIONS:
@@ -7192,6 +7230,7 @@ ofpact_is_allowed_in_actions_set(const struct ofpact *a)
 case OFPACT_STRIP_VLAN:
 case OFPACT_ENCAP:
 case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:
 return true;
 
 /* In general these actions are excluded because they are not part of
@@ -7305,6 +7344,7 @@ ofpacts_execute_action_set(struct ofpbuf *action_list,
 ofpacts_copy_last(action_list, action_set, OFPACT_PUSH_VLAN);
 ofpacts_copy_last(action_list, action_set, OFPACT_DEC_TTL);
 ofpacts_copy_last(action_list, action_set, OFPACT_DEC_MPLS_TTL);
+ofpacts_copy_last(action_list, action_set, OFPACT_DEC_NSH_TTL);
 ofpacts_copy_all(action_list, action_set, ofpact_is_set_or_move_action);
 ofpacts_copy_last(action_list, action_set, OFPACT_SET_QUEUE);
 
@@ -7446,6 +7486,7 @@ ovs_instruction_type_from_ofpact_type(enum ofpact_type 
type)
 case OFPACT_NAT:
 case OFPACT_ENCAP:
 case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:
 default:
 return OVSINST_OFPIT11_APPLY_ACTIONS;
 }
@@ -8132,6 +8173,13 @@ ofpact_check__(enum ofputil_protocol *usable_protocols, 
struct ofpact *a,
 }
 return 0;
 
+case OFPACT_DEC_NSH_TTL:
+if ((flow->packet_type != htonl(PT_NSH)) &&
+(flow->dl_type != htons(ETH_TYPE_NSH))) {
+inconsistent_match(usable_protocols);
+}
+return 0;
+
 default:
 OVS_NOT_REACHED();
 }
@@ -8627,6 +8675,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, 
ofp_port_t port)
 case OFPACT_NAT:
 case OFPACT_ENCAP:
 case OFPACT_DECAP:
+case OFPACT_DEC_NSH_TTL:

[ovs-dev] [PATCH v7 3/4] nsh: fix nested mask for OVS_KEY_ATTR_NSH

2018-01-05 Thread Yi Yang

NSH kernel implementation used nested mask for OVS_KEY_ATTR_NSH,
so NSH userspace must adapt to it, OVS hasn't used nested mask for
any key attribute so far, OVS_KEY_ATTR_NSH is the first use case.

Signed-off-by: Yi Yang 
---
 lib/odp-execute.c |  56 +--
 lib/odp-util.c| 131 --
 lib/odp-util.h|   3 +-
 3 files changed, 122 insertions(+), 68 deletions(-)

diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index c680364..ebaea31 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -274,20 +274,26 @@ odp_set_nd(struct dp_packet *packet, const struct 
ovs_key_nd *key,
 /* Set the NSH header. Assumes the NSH header is present and matches the
  * MD format of the key. The slow path must take case of that. */
 static void
-odp_set_nsh(struct dp_packet *packet, const struct ovs_key_nsh *key,
-const struct ovs_key_nsh *mask)
+odp_set_nsh(struct dp_packet *packet, const struct nlattr *a, bool has_mask)
 {
+struct ovs_key_nsh key, mask;
 struct nsh_hdr *nsh = dp_packet_l3(packet);
 uint8_t mdtype = nsh_md_type(nsh);
 ovs_be32 path_hdr;
 
-if (!mask) {
-nsh_set_flags_and_ttl(nsh, key->flags, key->ttl);
-put_16aligned_be32(>path_hdr, key->path_hdr);
+if (has_mask) {
+odp_nsh_key_from_attr(a, , );
+} else {
+odp_nsh_key_from_attr(a, , NULL);
+}
+
+if (!has_mask) {
+nsh_set_flags_and_ttl(nsh, key.flags, key.ttl);
+put_16aligned_be32(>path_hdr, key.path_hdr);
 switch (mdtype) {
 case NSH_M_TYPE1:
 for (int i = 0; i < 4; i++) {
-put_16aligned_be32(>md1.context[i], key->context[i]);
+put_16aligned_be32(>md1.context[i], key.context[i]);
 }
 break;
 case NSH_M_TYPE2:
@@ -299,19 +305,19 @@ odp_set_nsh(struct dp_packet *packet, const struct 
ovs_key_nsh *key,
 uint8_t flags = nsh_get_flags(nsh);
 uint8_t ttl = nsh_get_ttl(nsh);
 
-flags = key->flags | (flags & ~mask->flags);
-ttl = key->ttl | (ttl & ~mask->ttl);
+flags = key.flags | (flags & ~mask.flags);
+ttl = key.ttl | (ttl & ~mask.ttl);
 nsh_set_flags_and_ttl(nsh, flags, ttl);
 
 uint32_t spi = ntohl(nsh_get_spi(nsh));
 uint8_t si = nsh_get_si(nsh);
-uint32_t spi_mask = nsh_path_hdr_to_spi_uint32(mask->path_hdr);
-uint8_t si_mask = nsh_path_hdr_to_si(mask->path_hdr);
+uint32_t spi_mask = nsh_path_hdr_to_spi_uint32(mask.path_hdr);
+uint8_t si_mask = nsh_path_hdr_to_si(mask.path_hdr);
 if (spi_mask == 0x00ff) {
 spi_mask = UINT32_MAX;
 }
-spi = nsh_path_hdr_to_spi_uint32(key->path_hdr) | (spi & ~spi_mask);
-si = nsh_path_hdr_to_si(key->path_hdr) | (si & ~si_mask);
+spi = nsh_path_hdr_to_spi_uint32(key.path_hdr) | (spi & ~spi_mask);
+si = nsh_path_hdr_to_si(key.path_hdr) | (si & ~si_mask);
 path_hdr = nsh_get_path_hdr(nsh);
 nsh_path_hdr_set_spi(_hdr, htonl(spi));
 nsh_path_hdr_set_si(_hdr, si);
@@ -320,8 +326,8 @@ odp_set_nsh(struct dp_packet *packet, const struct 
ovs_key_nsh *key,
 case NSH_M_TYPE1:
 for (int i = 0; i < 4; i++) {
 ovs_be32 p = get_16aligned_be32(>md1.context[i]);
-ovs_be32 k = key->context[i];
-ovs_be32 m = mask->context[i];
+ovs_be32 k = key.context[i];
+ovs_be32 m = mask.context[i];
 put_16aligned_be32(>md1.context[i], k | (p & ~m));
 }
 break;
@@ -359,9 +365,7 @@ odp_execute_set_action(struct dp_packet *packet, const 
struct nlattr *a)
 break;
 
 case OVS_KEY_ATTR_NSH: {
-struct ovs_key_nsh nsh;
-odp_nsh_key_from_attr(a, );
-odp_set_nsh(packet, , NULL);
+odp_set_nsh(packet, a, false);
 break;
 }
 
@@ -490,23 +494,7 @@ odp_execute_masked_set_action(struct dp_packet *packet,
 break;
 
 case OVS_KEY_ATTR_NSH: {
-struct ovs_key_nsh nsh, nsh_mask;
-struct {
-struct nlattr nla;
-uint8_t data[sizeof(struct ovs_nsh_key_base) + NSH_CTX_HDRS_MAX_LEN
- + 2 * NLA_HDRLEN];
-} attr, mask;
-size_t size = nl_attr_get_size(a) / 2;
-
-mask.nla.nla_type = attr.nla.nla_type = nl_attr_type(a);
-mask.nla.nla_len = attr.nla.nla_len = NLA_HDRLEN + size;
-memcpy(attr.data, (char *)(a + 1), size);
-memcpy(mask.data, (char *)(a + 1) + size, size);
-
-odp_nsh_key_from_attr(, );
-odp_nsh_key_from_attr(, _mask);
-odp_set_nsh(packet, , _mask);
-
+odp_set_nsh(packet, a, true);
 break;
 }
 
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 95fef7e..5d548dc 100644
--- a/lib/odp-util.c

[ovs-dev] [PATCH v7 2/4] nsh: add new flow key 'ttl'

2018-01-05 Thread Yi Yang

IETF NSH draft added a new filed ttl in NSH header, this patch
is to add new nsh key 'ttl' for it.

Signed-off-by: Yi Yang 
---
 datapath/linux/compat/include/linux/openvswitch.h |   2 +-
 include/openvswitch/flow.h|   6 +-
 include/openvswitch/meta-flow.h   |  31 +++--
 include/openvswitch/nsh.h |  96 ++
 include/openvswitch/packets.h |  12 +-
 lib/flow.c|  23 ++--
 lib/flow.h|   2 +-
 lib/match.c   |  12 +-
 lib/meta-flow.c   |  56 ++--
 lib/meta-flow.xml |   6 +-
 lib/nx-match.c|  16 ++-
 lib/odp-execute.c |  40 +++---
 lib/odp-util.c| 151 --
 lib/odp-util.h|   2 +-
 lib/packets.c |   1 +
 ofproto/ofproto-dpif-xlate.c  |   7 +-
 tests/nsh.at  |  41 +++---
 17 files changed, 308 insertions(+), 196 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 3ddb1c5..c7142b6 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -504,9 +504,9 @@ enum ovs_nsh_key_attr {
 
 struct ovs_nsh_key_base {
__u8 flags;
+   __u8 ttl;
__u8 mdtype;
__u8 np;
-   __u8 pad;
__be32 path_hdr;
 };
 
diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h
index a658a58..cd61fff 100644
--- a/include/openvswitch/flow.h
+++ b/include/openvswitch/flow.h
@@ -146,7 +146,7 @@ struct flow {
 struct eth_addr arp_tha;/* ARP/ND target hardware address. */
 ovs_be16 tcp_flags; /* TCP flags. With L3 to avoid matching L4. */
 ovs_be16 pad2;  /* Pad to 64 bits. */
-struct flow_nsh nsh;/* Network Service Header keys */
+struct ovs_key_nsh nsh; /* Network Service Header keys */
 
 /* L4 (64-bit aligned) */
 ovs_be16 tp_src;/* TCP/UDP/SCTP source port/ICMP type. */
@@ -159,13 +159,13 @@ struct flow {
 };
 BUILD_ASSERT_DECL(sizeof(struct flow) % sizeof(uint64_t) == 0);
 BUILD_ASSERT_DECL(sizeof(struct flow_tnl) % sizeof(uint64_t) == 0);
-BUILD_ASSERT_DECL(sizeof(struct flow_nsh) % sizeof(uint64_t) == 0);
+BUILD_ASSERT_DECL(sizeof(struct ovs_key_nsh) % sizeof(uint64_t) == 0);
 
 #define FLOW_U64S (sizeof(struct flow) / sizeof(uint64_t))
 
 /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */
 BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t)
-  == sizeof(struct flow_tnl) + sizeof(struct flow_nsh) + 300
+  == sizeof(struct flow_tnl) + sizeof(struct ovs_key_nsh) + 300
   && FLOW_WC_SEQ == 40);
 
 /* Incremental points at which flow classification may be performed in
diff --git a/include/openvswitch/meta-flow.h b/include/openvswitch/meta-flow.h
index 436501f..14e6b59 100644
--- a/include/openvswitch/meta-flow.h
+++ b/include/openvswitch/meta-flow.h
@@ -1757,6 +1757,21 @@ enum OVS_PACKED_ENUM mf_field_id {
  */
 MFF_NSH_FLAGS,
 
+/* "nsh_ttl".
+ *
+ * TTL field in NSH base header.
+ *
+ * Type: u8.
+ * Maskable: no.
+ * Formatting: decimal.
+ * Prerequisites: NSH.
+ * Access: read/write.
+ * NXM: none.
+ * OXM: NXOXM_NSH_TTL(2) since OF1.3 and v2.8.
+ */
+MFF_NSH_TTL,
+
+
 /* "nsh_mdtype".
  *
  * mdtype field in NSH base header.
@@ -1767,7 +1782,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read-only.
  * NXM: none.
- * OXM: NXOXM_NSH_MDTYPE(2) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_MDTYPE(3) since OF1.3 and v2.8.
  */
 MFF_NSH_MDTYPE,
 
@@ -1781,7 +1796,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read-only.
  * NXM: none.
- * OXM: NXOXM_NSH_NP(3) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_NP(4) since OF1.3 and v2.8.
  */
 MFF_NSH_NP,
 
@@ -1795,7 +1810,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read/write.
  * NXM: none.
- * OXM: NXOXM_NSH_SPI(4) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_SPI(5) since OF1.3 and v2.8.
  */
 MFF_NSH_SPI,
 
@@ -1809,7 +1824,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read/write.
  * NXM: none.
- * OXM: NXOXM_NSH_SI(5) since OF1.3 and v2.8.
+ * OXM: NXOXM_NSH_SI(6) since OF1.3 and v2.8.
  */
 MFF_NSH_SI,
 
@@ -1823,10 +1838,10 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Prerequisites: NSH.
  * Access: read/write.

[ovs-dev] [PATCH v7 1/4] nsh: rework NSH netlink keys and actions

2018-01-05 Thread Yi Yang

This patch changes OVS_KEY_ATTR_NSH
to nested attribute and adds three new NSH sub attribute keys:

OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata

Its intention is to align to NSH kernel implementation.

NSH match fields, set and PUSH_NSH action all use the below
nested attribute format:

OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD1
OVS_KEY_ATTR_NSH end

or

OVS_KEY_ATTR_NSH begin
OVS_NSH_KEY_ATTR_BASE
OVS_NSH_KEY_ATTR_MD2
OVS_KEY_ATTR_NSH end

In addition, NSH encap and decap actions are renamed as push_nsh
and pop_nsh to meet action naming convention.

Signed-off-by: Yi Yang 
---
 datapath/linux/compat/include/linux/openvswitch.h |  58 +-
 include/openvswitch/nsh.h |  26 +-
 include/openvswitch/packets.h |  11 +-
 lib/dpif-netdev.c |   4 +-
 lib/dpif.c|   4 +-
 lib/flow.c|  40 +-
 lib/match.c   |  12 +-
 lib/meta-flow.c   |  13 +-
 lib/nx-match.c|   4 +-
 lib/odp-execute.c |  77 ++-
 lib/odp-util.c| 726 ++
 lib/odp-util.h|   4 +
 lib/packets.c |  26 +-
 lib/packets.h |   5 +-
 ofproto/ofproto-dpif-ipfix.c  |   4 +-
 ofproto/ofproto-dpif-sflow.c  |   4 +-
 ofproto/ofproto-dpif-xlate.c  |  24 +-
 tests/nsh.at  |  30 +-
 18 files changed, 786 insertions(+), 286 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 561f895..3ddb1c5 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -369,7 +369,7 @@ enum ovs_key_attr {
 #ifndef __KERNEL__
/* Only used within userspace data path. */
OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
-   OVS_KEY_ATTR_NSH,  /* struct ovs_key_nsh */
+   OVS_KEY_ATTR_NSH,  /* Nested set of ovs_nsh_key_* */
 #endif
 
__OVS_KEY_ATTR_MAX
@@ -492,13 +492,28 @@ struct ovs_key_ct_labels {
};
 };
 
-struct ovs_key_nsh {
-__u8 flags;
-__u8 mdtype;
-__u8 np;
-__u8 pad;
-__be32 path_hdr;
-__be32 c[4];
+enum ovs_nsh_key_attr {
+   OVS_NSH_KEY_ATTR_UNSPEC,
+   OVS_NSH_KEY_ATTR_BASE,  /* struct ovs_nsh_key_base. */
+   OVS_NSH_KEY_ATTR_MD1,   /* struct ovs_nsh_key_md1. */
+   OVS_NSH_KEY_ATTR_MD2,   /* variable-length octets. */
+   __OVS_NSH_KEY_ATTR_MAX
+};
+
+#define OVS_NSH_KEY_ATTR_MAX (__OVS_NSH_KEY_ATTR_MAX - 1)
+
+struct ovs_nsh_key_base {
+   __u8 flags;
+   __u8 mdtype;
+   __u8 np;
+   __u8 pad;
+   __be32 path_hdr;
+};
+
+#define NSH_MD1_CONTEXT_SIZE 4
+
+struct ovs_nsh_key_md1 {
+   __be32 context[NSH_MD1_CONTEXT_SIZE];
 };
 
 /* OVS_KEY_ATTR_CT_STATE flags */
@@ -793,25 +808,6 @@ struct ovs_action_push_eth {
struct ovs_key_ethernet addresses;
 };
 
-#define OVS_ENCAP_NSH_MAX_MD_LEN 248
-/*
- * struct ovs_action_encap_nsh - %OVS_ACTION_ATTR_ENCAP_NSH
- * @flags: NSH header flags.
- * @mdtype: NSH metadata type.
- * @mdlen: Length of NSH metadata in bytes, including padding.
- * @np: NSH next_protocol: Inner packet type.
- * @path_hdr: NSH service path id and service index.
- * @metadata: NSH context metadata, padded to 4-bytes
- */
-struct ovs_action_encap_nsh {
-uint8_t flags;
-uint8_t mdtype;
-uint8_t mdlen;
-uint8_t np;
-__be32 path_hdr;
-uint8_t metadata[OVS_ENCAP_NSH_MAX_MD_LEN];
-};
-
 /**
  * enum ovs_nat_attr - Attributes for %OVS_CT_ATTR_NAT.
  *
@@ -887,8 +883,8 @@ enum ovs_nat_attr {
  * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the
  * packet.
  * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the packet.
- * @OVS_ACTION_ATTR_ENCAP_NSH: encap NSH action to push NSH header.
- * @OVS_ACTION_ATTR_DECAP_NSH: decap NSH action to remove NSH header.
+ * @OVS_ACTION_ATTR_PUSH_NSH: push NSH header to the packet.
+ * @OVS_ACTION_ATTR_POP_NSH: pop the outermost NSH header off the packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -930,8 +926,8 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
OVS_ACTION_ATTR_CLONE, /* Nested OVS_CLONE_ATTR_*.  */
OVS_ACTION_ATTR_METER, /* u32 meter

[ovs-dev] [PATCH v7 0/4] nsh: add new nsh key ttl and action dec_nsh_ttl

2018-01-05 Thread Yi Yang

v6->v7
  - Fix comments for v6 from Ben Pfaff
  - Fix checkpatch warnings
  - Fix static code analysis warnings

v5->v6
  - Rebase v5 to master
  - Refactor netlink message format to align to NSH kernel
implementation
  - Add dec_nsh_ttl unit test into tests/nsh.at
  - Fix unit test unstable issue

v4->v5
  - Remove fix patch 1 in v4 because it is merged
  - Fix several comments by Jan Scheurich

v3->v4
  - Add new action dec_nsh_ttl
  - Remove encap_nsh and decap_nsh changes
  - Remove netlink rework to adapt to OVS 2.8
  - Dynamically allocate struct ovs_action_encap_nsh and put
appropriate size for ENCAP_NSH netlink message.

v2->v3
  - Fix several comments Jan Scheurich

v1->v2
  - Rework per kernel datapath review comments
  - Add new NSH key ttl
  - Add many helpers in nsh.h and replace much code
with these helpers
  - nsh.h includes the lasted NSH spec
  - bits of flags and mdtype have a change

This patch refactored NSH netlink keys, push_nsh action
 and netlink message format to align to NSH kernel
implementation. It also adds new NSH key 'ttl' and a new
action dec_nsh_ttl to follow the lasted IETF NSH draft:

https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/

Yi Yang (4):
  nsh: rework NSH netlink keys and actions
  nsh: add new flow key 'ttl'
  nsh: fix nested mask for OVS_KEY_ATTR_NSH
  nsh: add dec_nsh_ttl action

 datapath/linux/compat/include/linux/openvswitch.h |  58 +-
 include/openvswitch/flow.h|   6 +-
 include/openvswitch/meta-flow.h   |  31 +-
 include/openvswitch/nsh.h | 122 +++-
 include/openvswitch/ofp-actions.h |   1 +
 include/openvswitch/packets.h |   9 +-
 lib/dpif-netdev.c |   4 +-
 lib/dpif.c|   4 +-
 lib/flow.c|  61 +-
 lib/flow.h|   2 +-
 lib/match.c   |  24 +-
 lib/meta-flow.c   |  69 +-
 lib/meta-flow.xml |   6 +-
 lib/nx-match.c|  20 +-
 lib/odp-execute.c |  83 ++-
 lib/odp-util.c| 828 +-
 lib/odp-util.h|   5 +
 lib/ofp-actions.c |  49 ++
 lib/packets.c |  27 +-
 lib/packets.h |   5 +-
 ofproto/ofproto-dpif-ipfix.c  |   4 +-
 ofproto/ofproto-dpif-sflow.c  |   4 +-
 ofproto/ofproto-dpif-xlate.c  |  62 +-
 tests/nsh.at  |  63 +-
 utilities/ovs-ofctl.8.in  |  13 +-
 25 files changed, 1159 insertions(+), 401 deletions(-)

-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH V2] Documentation: Update Faucet tutorial.

2018-01-05 Thread Brad Cowie

Updates Faucet tutorial to work with newer versions than 1.6.7:

 * Tutorial now shows how to check out latest tag from Faucet's git.

 * Set minimum_ip_size_check flag to False so that the
   payloadless packets generated by ofproto/trace aren't dropped by Faucet.

 * Update Faucet ACL syntax

 * Update output from commands/log files to reflect changes in the
   Faucet pipeline and to use OpenFlow 1.3 format.

Signed-off-by: Brad Cowie 
---
 Documentation/tutorials/faucet.rst | 163 +++--
 1 file changed, 82 insertions(+), 81 deletions(-)

diff --git a/Documentation/tutorials/faucet.rst 
b/Documentation/tutorials/faucet.rst
index 2777924..2f7ac62 100644
--- a/Documentation/tutorials/faucet.rst
+++ b/Documentation/tutorials/faucet.rst
@@ -28,9 +28,9 @@ OVS Faucet Tutorial
 This tutorial demonstrates how Open vSwitch works with a general-purpose
 OpenFlow controller, using the Faucet controller as a simple way to get
 started.  It was tested with the "master" branch of Open vSwitch and version
-1.6.7 of Faucet in October 2017.  It does not use advanced or recently added
-features in OVS or Faucet, so other versions of both pieces of software are
-likely to work equally well.
+1.6.15 of Faucet.  It does not use advanced or recently added features in OVS
+or Faucet, so other versions of both pieces of software are likely to work
+equally well.
 
 The goal of the tutorial is to demonstrate Open vSwitch and Faucet in an
 end-to-end way, that is, to show how it works from the Faucet controller
@@ -125,7 +125,8 @@ between one and the other.
 
At this point I checked out the latest tag::
 
- $ git checkout v1.6.7
+ $ latest_tag=$(git describe --tags $(git rev-list --tags --max-count=1))
+ $ git checkout $latest_tag
 
 2. Build a docker container image::
 
@@ -269,7 +270,7 @@ Now restart Faucet so that the configuration takes effect, 
e.g.::
 Assuming that the configuration update is successful, you should now
 see a new line at the end of ``inst/faucet.log``::
 
-  Oct 14 22:36:42 faucet INFO Add new datapath DPID 1 (0x1)
+  Jan 06 15:14:35 faucet INFO Add new datapath DPID 1 (0x1)
 
 Faucet is now waiting for a switch with datapath ID 0x1 to connect to
 it over OpenFlow, so our next step is to create a switch with OVS and
@@ -289,7 +290,7 @@ output::
 
   Exit the shell to kill the running daemons.
   blp@sigabrt:~/nicira/ovs/tutorial(0)$
-
+
 Inside the sandbox, create a switch ("bridge") named ``br0``, set its
 datapath ID to 0x1, add simulated ports to it named ``p1`` through
 ``p5``, and tell it to connect to the Faucet controller.  To make it
@@ -321,14 +322,15 @@ information, run ``man ovs-vswitchd.conf.db`` and search 
for
 Now, if you look at ``inst/faucet.log`` again, you should see that
 Faucet recognized and configured the new switch and its ports::
 
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Cold start configuring DP
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 100 
vid:100 ports:Port 1,Port 2,Port 3
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 200 
vid:200 ports:Port 4,Port 5
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Port Port 1 up, 
configuring
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Port Port 2 up, 
configuring
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Port Port 3 up, 
configuring
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Port Port 4 up, 
configuring
-  Oct 14 22:50:08 faucet.valve INFO DPID 1 (0x1) Port Port 5 up, 
configuring
+  Jan 06 15:17:10 faucet   INFO DPID 1 (0x1) connected
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Cold start configuring DP
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 100 
vid:100 ports:Port 1,Port 2,Port 3
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 200 
vid:200 ports:Port 4,Port 5
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 1 up, configuring
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 2 up, configuring
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 3 up, configuring
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 4 up, configuring
+  Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 5 up, configuring
 
 Over on the Open vSwitch side, you can see a lot of related activity
 if you take a look in ``sandbox/ovs-vswitchd.log``.  For example, here
@@ -419,7 +421,7 @@ Table 7
 
 Table 8
   Flooding
-  
+
 With that in mind, let's dump the flow tables.  The simplest way is to
 just run plain ``ovs-ofctl dump-flows``::
 
@@ -466,12 +468,8 @@ will do more if we configured port-based ACLs::
   priority=0 actions=drop
 
 Table 1, for ingress VLAN processing, has a bunch of flows that drop
-inappropriate packets, like those that claim to be from a broadcast
-source address (why not from all multicast source addresses,
-though?)::
+inappropriate packets, such

[ovs-dev] [PATCH 1/3] ofp-actions: Make formatting and parsing functions take a struct argument.

2018-01-05 Thread Ben Pfaff

An upcoming commit will add another parameter for parsing and formatting
actions.  It is much easier to add these parameters if they are
encapsulated in a struct, so this commit first makes that change.

Signed-off-by: Ben Pfaff 
---
 include/openvswitch/ofp-actions.h |   32 +-
 lib/ofp-actions.c | 1114 +++--
 lib/ofp-parse.c   |   24 +-
 lib/ofp-print.c   |   40 +-
 ofproto/ofproto-dpif-trace.c  |8 +-
 ofproto/ofproto-dpif-xlate.c  |   12 +-
 ofproto/ofproto.c |3 +-
 ovn/controller/ofctrl.c   |3 +-
 ovn/utilities/ovn-sbctl.c |3 +-
 tests/test-ovn.c  |3 +-
 utilities/ovs-ofctl.c |   19 +-
 11 files changed, 557 insertions(+), 704 deletions(-)

diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index 3d9775582f6d..7e5bc448fd83 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -1062,18 +1062,32 @@ bool ofpacts_equal_stringwise(const struct ofpact a[], 
size_t a_len,
 const struct mf_field *ofpact_get_mf_dst(const struct ofpact *ofpact);
 uint32_t ofpacts_get_meter(const struct ofpact[], size_t ofpacts_len);
 
-/* Formatting and parsing ofpacts. */
+/* Formatting ofpacts. */
+struct ofpact_format_params {
+/* Input. */
+const struct ofputil_port_map *port_map;
+
+/* Output. */
+struct ds *s;
+};
 void ofpacts_format(const struct ofpact[], size_t ofpacts_len,
-const struct ofputil_port_map *, struct ds *);
-char *ofpacts_parse_actions(const char *, const struct ofputil_port_map *,
-struct ofpbuf *ofpacts,
-enum ofputil_protocol *usable_protocols)
+const struct ofpact_format_params *);
+const char *ofpact_name(enum ofpact_type);
+
+/* Parsing ofpacts. */
+struct ofpact_parse_params {
+/* Input. */
+const struct ofputil_port_map *port_map;
+
+/* Output. */
+struct ofpbuf *ofpacts;
+enum ofputil_protocol *usable_protocols;
+};
+char *ofpacts_parse_actions(const char *, const struct ofpact_parse_params *)
 OVS_WARN_UNUSED_RESULT;
-char *ofpacts_parse_instructions(const char *, const struct ofputil_port_map *,
- struct ofpbuf *ofpacts,
- enum ofputil_protocol *usable_protocols)
+char *ofpacts_parse_instructions(const char *,
+ const struct ofpact_parse_params *)
 OVS_WARN_UNUSED_RESULT;
-const char *ofpact_name(enum ofpact_type);
 
 /* Internal use by the helpers below. */
 void ofpact_init(struct ofpact *, enum ofpact_type, size_t len);
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 4918498efb30..ed5cc75c2338 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -411,8 +411,7 @@ static void *ofpact_put_raw(struct ofpbuf *, enum 
ofp_version,
 enum ofp_raw_action_type, uint64_t arg);
 
 static char *OVS_WARN_UNUSED_RESULT ofpacts_parse(
-char *str, const struct ofputil_port_map *,
-struct ofpbuf *ofpacts, enum ofputil_protocol *usable_protocols,
+char *str, const struct ofpact_parse_params *pp,
 bool allow_instructions, enum ofpact_type outer_action);
 static enum ofperr ofpacts_pull_openflow_actions__(
 struct ofpbuf *openflow, unsigned int actions_len,
@@ -420,8 +419,7 @@ static enum ofperr ofpacts_pull_openflow_actions__(
 struct ofpbuf *ofpacts, enum ofpact_type outer_action,
 const struct vl_mff_map *vl_mff_map, uint64_t *ofpacts_tlv_bitmap);
 static char * OVS_WARN_UNUSED_RESULT ofpacts_parse_copy(
-const char *s_, const struct ofputil_port_map *, struct ofpbuf *ofpacts,
-enum ofputil_protocol *usable_protocols,
+const char *s_, const struct ofpact_parse_params *pp,
 bool allow_instructions, enum ofpact_type outer_action);
 
 /* Returns the ofpact following 'ofpact', except that if 'ofpact' contains
@@ -599,16 +597,16 @@ encode_OUTPUT(const struct ofpact_output *output,
 }
 
 static char * OVS_WARN_UNUSED_RESULT
-parse_truncate_subfield(struct ofpact_output_trunc *output_trunc,
-const char *arg_,
-const struct ofputil_port_map *port_map)
+parse_truncate_subfield(const char *arg_,
+const struct ofpact_parse_params *pp,
+struct ofpact_output_trunc *output_trunc)
 {
 char *key, *value;
 char *arg = CONST_CAST(char *, arg_);
 
 while (ofputil_parse_key_value(, , )) {
 if (!strcmp(key, "port")) {
-if (!ofputil_port_from_string(value, port_map,
+if (!ofputil_port_from_string(value, pp->port_map,
   _trunc->port)) {
 return xasprintf("output to unknown truncate port: %s",
   value);
@@ -635,21 +633,18 @@

[ovs-dev] [PATCH 2/3] ofp-util: New data structure for mapping between table names and numbers.

2018-01-05 Thread Ben Pfaff

This shares the infrastructure for mapping port names and numbers.  It will
be used in an upcoming commit.

Signed-off-by: Ben Pfaff 
---
 include/openvswitch/ofp-util.h |  33 +++--
 lib/ofp-util.c | 150 ++---
 2 files changed, 138 insertions(+), 45 deletions(-)

diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index a9e57ed5808d..84b1dbd2756b 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -43,15 +43,24 @@ union ofp_action;
 struct ofpact_set_field;
 struct vl_mff_map;
 
-/* Mapping between port numbers and names. */
-struct ofputil_port_map {
+/* Name-number mapping.
+ *
+ * This is not exported directly but only through specializations for port
+ * name-number and table name-number mappings. */
+struct ofputil_name_map {
 struct hmap by_name;
 struct hmap by_number;
 };
-
-#define OFPUTIL_PORT_MAP_INITIALIZER(MAP)  \
+#define OFPUTIL_NAME_MAP_INITIALIZER(MAP)  \
 { HMAP_INITIALIZER(&(MAP)->by_name), HMAP_INITIALIZER(&(MAP)->by_number) }
 
+/* Mapping between port numbers and names. */
+struct ofputil_port_map {
+struct ofputil_name_map map;
+};
+#define OFPUTIL_PORT_MAP_INITIALIZER(MAP) \
+{ OFPUTIL_NAME_MAP_INITIALIZER(&(MAP)->map) }
+
 void ofputil_port_map_init(struct ofputil_port_map *);
 const char *ofputil_port_map_get_name(const struct ofputil_port_map *,
   ofp_port_t);
@@ -791,6 +800,22 @@ struct ofputil_table_mod_prop_vacancy {
 uint8_t vacancy; /* Current vacancy (%). */
 };
 
+/* Mapping between table numbers and names. */
+struct ofputil_table_map {
+struct ofputil_name_map map;
+};
+#define OFPUTIL_TABLE_MAP_INITIALIZER(MAP) \
+{ OFPUTIL_NAME_MAP_INITIALIZER((MAP).map) }
+
+void ofputil_table_map_init(struct ofputil_table_map *);
+const char *ofputil_table_map_get_name(const struct ofputil_table_map *,
+   uint8_t);
+uint8_t ofputil_table_map_get_number(const struct ofputil_table_map *,
+ const char *name);
+void ofputil_table_map_put(struct ofputil_table_map *,
+   uint8_t, const char *name);
+void ofputil_table_map_destroy(struct ofputil_table_map *);
+
 /* Abstract ofp_table_mod. */
 struct ofputil_table_mod {
 uint8_t table_id; /* ID of the table, 0xff indicates all tables. */
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 47f30c7716f3..43122ef09e5d 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -7454,12 +7454,14 @@ ofputil_port_to_string(ofp_port_t port,
 snprintf(namebuf, bufsize, "%"PRIu32, port);
 }
 
-/* ofputil_port_map.  */
-struct ofputil_port_map_node {
+/* ofputil_name_map.  */
+
+struct ofputil_name_map_node {
 struct hmap_node name_node;
 struct hmap_node number_node;
-ofp_port_t ofp_port;/* Port number. */
-char *name; /* Port name. */
+
+uint32_t number;
+char *name;
 
 /* OpenFlow doesn't require port names to be unique, although that's the
  * only sensible way.  However, even in Open vSwitch it's possible for two
@@ -7469,22 +7471,25 @@ struct ofputil_port_map_node {
  * corner case.
  *
  * OpenFlow does require port numbers to be unique.  We check for duplicate
- * ports numbers just in case a switch has a bug. */
+ * ports numbers just in case a switch has a bug.
+ *
+ * OpenFlow doesn't require table names to be unique and Open vSwitch
+ * doesn't try to make them unique. */
 bool duplicate;
 };
 
-void
-ofputil_port_map_init(struct ofputil_port_map *map)
+static void
+ofputil_name_map_init(struct ofputil_name_map *map)
 {
 hmap_init(>by_name);
 hmap_init(>by_number);
 }
 
-static struct ofputil_port_map_node *
-ofputil_port_map_find_by_name(const struct ofputil_port_map *map,
+static struct ofputil_name_map_node *
+ofputil_name_map_find_by_name(const struct ofputil_name_map *map,
   const char *name)
 {
-struct ofputil_port_map_node *node;
+struct ofputil_name_map_node *node;
 
 HMAP_FOR_EACH_WITH_HASH (node, name_node, hash_string(name, 0),
  >by_name) {
@@ -7495,38 +7500,38 @@ ofputil_port_map_find_by_name(const struct 
ofputil_port_map *map,
 return NULL;
 }
 
-static struct ofputil_port_map_node *
-ofputil_port_map_find_by_number(const struct ofputil_port_map *map,
-ofp_port_t ofp_port)
+static struct ofputil_name_map_node *
+ofputil_name_map_find_by_number(const struct ofputil_name_map *map,
+uint32_t number)
 {
-struct ofputil_port_map_node *node;
+struct ofputil_name_map_node *node;
 
-HMAP_FOR_EACH_IN_BUCKET (node, number_node, hash_ofp_port(ofp_port),
+HMAP_FOR_EACH_IN_BUCKET (node, number_node, hash_int(number, 0),
  >by_number) {
-if

[ovs-dev] [PATCH 0/3] support table names in CLI

2018-01-05 Thread Ben Pfaff

These patches are not intended to be applied until we've branched for 2.9,
since I didn't finish or send them out in time for 2.9.

Ben Pfaff (3):
  ofp-actions: Make formatting and parsing functions take a struct
argument.
  ofp-util: New data structure for mapping between table names and
numbers.
  Support accepting and displaying table names in OVS tools.

 NEWS  |   11 +
 include/openvswitch/ofp-actions.h |   34 +-
 include/openvswitch/ofp-parse.h   |   20 +-
 include/openvswitch/ofp-print.h   |   12 +-
 include/openvswitch/ofp-util.h|   41 +-
 lib/learn.c   |   20 +-
 lib/learn.h   |6 +-
 lib/learning-switch.c |2 +-
 lib/ofp-actions.c | 1147 +++--
 lib/ofp-parse.c   |  104 ++--
 lib/ofp-print.c   |  242 +---
 lib/ofp-util.c|  255 +++--
 lib/vconn.c   |   10 +-
 ofproto/ofproto-dpif-trace.c  |8 +-
 ofproto/ofproto-dpif-xlate.c  |   12 +-
 ofproto/ofproto-dpif.c|4 +-
 ofproto/ofproto.c |5 +-
 ovn/controller/ofctrl.c   |   15 +-
 ovn/controller/pinctrl.c  |2 +-
 ovn/utilities/ovn-sbctl.c |5 +-
 ovn/utilities/ovn-trace.c |2 +-
 tests/ofproto.at  |   89 ++-
 tests/test-ovn.c  |3 +-
 utilities/ovs-ofctl.8.in  |   90 +--
 utilities/ovs-ofctl.c |  313 --
 utilities/ovs-testcontroller.c|2 +-
 26 files changed, 1426 insertions(+), 1028 deletions(-)

-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] Userspace space conntrack tcp issue

2018-01-05 Thread Darrell Ball

Hi Wang

Pls see inline

Thanks Darrell

On 1/1/18, 10:57 PM, "ovs-dev-boun...@openvswitch.org on behalf of 王志克" 
 wrote:

Hi,



I am testing below scenario, and I think there is some issue on TCP 
conntrack sequence number filter.



Scenario:



VM1->Host1-Host2-->VM2



There is SCP file copy below VM1 and VM2, and we configured conntrack. 
During the scp, I restart the openvswitch service (process stop and start), 
then after the restart, I saw the consequence TCP packets are tagged as invalid 
by conntrack and traffic can not be recovered.

I did some debug and found it fails on below check “(ackskew >= 
-MAXACKWINDOW)”. I am wondering should it be “(ackskew >= 
-(MAXACKWINDOW<seqlo - ack;

#define MAXACKWINDOW (0x + 1500)/* 1500 is an arbitrary fudge 
factor */

if (SEQ_GEQ(src->seqhi, end)

/* Last octet inside other's window space */

&& SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws))

/* Retrans: not more than one window back */



&& (ackskew >= -MAXACKWINDOW)

/* Acking not more than one reassembled fragment backwards */

[Darrell] The above specific check looks ok to me; also, I am not convinced 
widening the valid skew is a good idea in the general case.

   Btw, what is the sws value in your case ?


&& (ackskew <= (MAXACKWINDOW << sws))

/* Acking not more than one window forward */

&& ((tcp_flags & TCP_RST) == 0 || orig_seq == src->seqlo

|| (orig_seq == src->seqlo + 1) || (orig_seq + 1 == 
src->seqlo))) {



Details:



   TCP Client Seq   TCP Client ACKTCP Server 
Seq TCP Server ACK

Before the restart:0x69f1536e 0xa3c81999   0xa3ca2d49   
0x69f15302

After the restart(5s later): 0x69f15302 0xa3c81999   0xa3c561e1 
  0x69f15302



As we can see the new seq 0xa3c561e1 (server steped back since previous 
segments are not acked.) is much less than 0xa3c81999 (client keeps sending 
last acked packet), which leads to failed check on conntrack.


[Darrell] Did you see any eventual resets ?; followed by conn timeout ?



I am using OVS2.7.0+dpdk16.11.3



Any thought?



Br,

Wang Zhike

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwIGaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=y-qxAvIT_2KCfecDj8-3VCoLgJ-0NF-mDO_50tx2rcs=asIHHvNQJ8KbNdwwo78o7tz9FsckFUAdWziWIsltDxQ=


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-05 Thread Ed Swierk via dev

On Fri, Jan 5, 2018 at 10:14 AM, Ed Swierk 
wrote:
> On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
>> OVS already pull all required headers in skb linear data, so no need
>> to redo all of it. only check required is the ip-checksum validation.
>> I think we could avoid it in most of cases by checking skb length to
>> ipheader length before verifying the ip header-checksum.
>
> Shouldn't the IP header checksum be verified even earlier, like in
> key_extract(), before actually using any of the fields in the IP
> header?

Something like this for verifying the IP header checksum (not tested):

--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -203,6 +203,7 @@ static int check_iphdr(struct sk_buff *skb)
 {
  unsigned int nh_ofs = skb_network_offset(skb);
  unsigned int ip_len;
+ const struct iphdr *nh;
  int err;

  err = check_header(skb, nh_ofs + sizeof(struct iphdr));
@@ -214,6 +215,13 @@ static int check_iphdr(struct sk_buff *skb)
   skb->len < nh_ofs + ip_len))
  return -EINVAL;

+ if (unlikely(!pskb_may_pull(skb, nh_ofs + ip_len)))
+ return -ENOMEM;
+
+ nh = ip_hdr(skb);
+ if (unlikely(ip_fast_csum((u8 *)nh, nh->ihl)))
+ return -EINVAL;
+
  skb_set_transport_header(skb, nh_ofs + ip_len);
  return 0;
 }

> And since key_extract() is already inspecting the IP/IPv6 header, it
> would be a convenient spot to check whether the skb->len matches. If
> there's a difference, it could record the number of bytes to trim in
> an ovs_skb_cb field. Then ovs_ct_execute() would look at this field
> and trim the skb only if necessary.

And something like this for trimming the padding (also not tested):

--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1112,6 +1112,14 @@ int ovs_ct_execute(struct net *net, struct sk_buff
*skb,
  nh_ofs = skb_network_offset(skb);
  skb_pull_rcsum(skb, nh_ofs);

+ if (unlikely(OVS_CB(skb)->padlen)) {
+ err = pskb_trim_rcsum(skb, skb->len - OVS_CB(skb)->padlen);
+ if (err) {
+ kfree_skb(skb);
+ return err;
+ }
+ }
+
  if (key->ip.frag != OVS_FRAG_TYPE_NONE) {
  err = handle_fragments(net, key, info->zone.id, skb);
  if (err)
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 523d655..eab29fa 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -106,12 +106,14 @@ struct datapath {
  * fragmented.
  * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
+ * @padlen: The number of padding bytes to be ignored in L3+ processing.
  */
 struct ovs_skb_cb {
  struct vport *input_vport;
  u16 mru;
  u16 acts_origlen;
  u32 cutlen;
+ u32 padlen;
 };
 #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 987a97f..c749dfd 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -223,6 +223,7 @@ static int check_iphdr(struct sk_buff *skb)
  return -EINVAL;

  skb_set_transport_header(skb, nh_ofs + ip_len);
+ OVS_CB(skb)->padlen = skb->len - nh_ofs - ntohs(nh->tot_len);
  return 0;
 }

@@ -305,6 +306,7 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct
sw_flow_key *key)

  nh_len = payload_ofs - nh_ofs;
  skb_set_transport_header(skb, nh_ofs + nh_len);
+ OVS_CB(skb)->padlen = skb->len - payload_ofs - ntohs(nh->payload_len);
  key->ip.proto = nexthdr;
  return nh_len;
 }
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] Documentation: Update Faucet tutorial.

2018-01-05 Thread Brad Cowie

Updates Faucet tutorial to work with newer versions than 1.6.7.

Tutorial now shows how to check out latest tag from Faucet's git.

Tutorial now sets minimum_ip_size_check flag to False so that the
payloadless packets generated by ofproto/trace aren't dropped by Faucet.

Signed-off-by: Brad Cowie 
---
 Documentation/tutorials/faucet.rst | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/Documentation/tutorials/faucet.rst 
b/Documentation/tutorials/faucet.rst
index 2777924..5d229eb 100644
--- a/Documentation/tutorials/faucet.rst
+++ b/Documentation/tutorials/faucet.rst
@@ -28,9 +28,9 @@ OVS Faucet Tutorial
 This tutorial demonstrates how Open vSwitch works with a general-purpose
 OpenFlow controller, using the Faucet controller as a simple way to get
 started.  It was tested with the "master" branch of Open vSwitch and version
-1.6.7 of Faucet in October 2017.  It does not use advanced or recently added
-features in OVS or Faucet, so other versions of both pieces of software are
-likely to work equally well.
+1.6.15 of Faucet.  It does not use advanced or recently added features in OVS
+or Faucet, so other versions of both pieces of software are likely to work
+equally well.
 
 The goal of the tutorial is to demonstrate Open vSwitch and Faucet in an
 end-to-end way, that is, to show how it works from the Faucet controller
@@ -125,7 +125,8 @@ between one and the other.
 
At this point I checked out the latest tag::
 
- $ git checkout v1.6.7
+ $ latest_tag=$(git describe --tags $(git rev-list --tags --max-count=1))
+ $ git checkout $latest_tag
 
 2. Build a docker container image::
 
@@ -920,8 +921,11 @@ Now let's start over, adding L3 routing into the picture.
 
 It's remarkably easy to enable routing.  We just change our ``vlans``
 section in ``inst/faucet.yaml`` to specify a router IP address for
-each VLAN and define a router between them.  The ``dps`` section is
-unchanged::
+each VLAN and define a router between them. For our example we need to
+set ``minimum_ip_size_check`` to ``False``, this will disable a sanity
+check in Faucet that will allow it to accept packets generated by
+``ofproto/trace``, normally in production you would not turn this off.
+The ``dps`` section is unchanged::
 
   dps:
   switch-1:
@@ -942,8 +946,10 @@ unchanged::
   vlans:
   100:
   faucet_vips: ["10.100.0.254/24"]
+  minimum_ip_size_check: False
   200:
   faucet_vips: ["10.200.0.254/24"]
+  minimum_ip_size_check: False
   routers:
   router-1:
   vlans: [100, 200]
@@ -1336,8 +1342,10 @@ the ways that OVS tries to optimize megaflows.  Update
   vlans:
   100:
  faucet_vips: ["10.100.0.254/24"]
+ minimum_ip_size_check: False
   200:
  faucet_vips: ["10.200.0.254/24"]
+ minimum_ip_size_check: False
   routers:
   router-1:
  vlans: [100, 200]
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofp-print: Include full hex dump of messages that fail decode.

2018-01-05 Thread Ben Pfaff

On Fri, Jan 05, 2018 at 10:58:45AM -0800, Guru Shetty wrote:
> On 4 January 2018 at 22:40, Ben Pfaff  wrote:
> 
> > In debugging issues with a controller that appears to send invalid
> > OpenFlow messages, it can be difficult to actually see the important
> > details of those messages, because OpenFlow message logging (in the vconn
> > log module) will generally just print a generic decode error that doesn't
> > allow a developer to see what failed to decode.  This commit enhances the
> > ofp-print code so that, if a message fails to decode, the full hex dump of
> > the message is included in the output.
> >
> > Reported-by: Su Wang 
> > Signed-off-by: Ben Pfaff 
> >
> Acked-by: Gurucharan Shetty 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Conozca qué es, su aplicación práctica y beneficios

2018-01-05 Thread Business Analytics

Conocimientos de actualidad al servicio de su empresa 

Business Analytics: Información financiera clave para la toma de decisiones 
empresariales 
24 de enero- Dr. Mauricio Carrera Abarca - 9am-6pm

La minería de datos es el proceso de detectar la información procesable de los 
conjuntos grandes de datos. Es un método que tiene la finalidad de descubrir 
patrones en grandes volúmenes de datos, el cual utiliza el análisis matemático 
para deducir las tendencias que existen, la cuales no pueden detectarse 
mediante la exploración superficial, pues las relaciones son demasiado 
complejas o bien, existen demasiados datos. 

BENEFICIOS DE ASISTIR: 

• Sabrá qué es la minería de datos, específicamente la minería de datos 
financieros.
 • Conocerá las principales metodologías de minería de datos.
 • Aplicará los potenciales beneficios en la esfera empresarial de la minería 
de datos financieros.
 • Aplicará de forma práctica la metodología estadística de “Regresión Lineal” 
y “Error Típico” para pronosticar valores económicos y financieros. 

¿Requiere la información a la Brevedad? responda este email con la palabra: 
Empresarial + nombre - teléfono - correo.


centro telefónico:018002120744


 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 3/3] rhel: add "force-reload-kmod" support in "ovs-systemd-reload"

2018-01-05 Thread Gregory Rose


On 12/22/2017 7:00 AM, Timothy Redaelli wrote:

Since you can't use "ovs-ctl force-reload-kmod" on Fedora/RHEL, due to
systemd dependencies, this commit adds the "force-reload-kmod" feature on
ovs-systemd-reload.

Signed-off-by: Timothy Redaelli 
---
  rhel/usr_share_openvswitch_scripts_ovs-systemd-reload | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload 
b/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
index 5d2efc621..894df0427 100755
--- a/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
+++ b/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
@@ -40,6 +40,10 @@ add_managers() {
  :
  }
  
-restart

+if [ "$1" = "force-reload-kmod" ]; then
+force_reload_kmod
+else
+restart
+fi
  
  exit 0


Works as advertised...

Tested-by: Greg Rose 
Reviewed-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/3] rhel: use the functions in ovs-lib.in in ovs-systemd-reload

2018-01-05 Thread Gregory Rose


On 12/22/2017 7:00 AM, Timothy Redaelli wrote:

To avoid code duplication use the functions from ovs-lib.in

Signed-off-by: Timothy Redaelli 
---
  ...sr_share_openvswitch_scripts_ovs-systemd-reload | 37 ++
  1 file changed, 23 insertions(+), 14 deletions(-)


Tested-by: Greg Rose 
Reviewed-by: Greg Rose 



diff --git a/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload 
b/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
index 3ac1a46c6..5d2efc621 100755
--- a/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
+++ b/rhel/usr_share_openvswitch_scripts_ovs-systemd-reload
@@ -14,23 +14,32 @@
  # See the License for the specific language governing permissions and
  # limitations under the License.
  
-# Save flows

-bridges=$(ovs-vsctl -- --real list-br)
-flows=$(/usr/share/openvswitch/scripts/ovs-save save-flows $bridges)
+case $0 in
+*/*) dir0=`echo "$0" | sed 's,/[^/]*$,,'` ;;
+*) dir0=./ ;;
+esac
+. "$dir0/ovs-lib" || exit 1
  
-# Restart the database first, since a large database may take a

-# while to load, and we want to minimize forwarding disruption.
-systemctl --job-mode=ignore-dependencies restart ovsdb-server
+stop_ovsdb() {
+systemctl --job-mode=ignore-dependencies stop ovsdb-server
+}
  
-# Stop ovs-vswitchd.

-systemctl --job-mode=ignore-dependencies stop ovs-vswitchd
+start_ovsdb() {
+systemctl --job-mode=ignore-dependencies start ovsdb-server
+}
  
-# Start vswitchd by asking it to wait till flow restore is finished.

-ovs-vsctl --no-wait set open_vswitch . other_config:flow-restore-wait=true
-systemctl --job-mode=ignore-dependencies start ovs-vswitchd
+stop_forwarding() {
+systemctl --job-mode=ignore-dependencies stop ovs-vswitchd
+}
  
-# Restore saved flows and inform vswitchd that we are done.

-eval "$flows"
-ovs-vsctl --if-exists remove open_vswitch . other_config flow-restore-wait=true
+start_forwarding() {
+systemctl --job-mode=ignore-dependencies start ovs-vswitchd
+}
+
+add_managers() {
+:
+}
+
+restart
  
  exit 0


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/3] utilities: move some functions from ovs-ctl.in to ovs-lib.in

2018-01-05 Thread Gregory Rose


On 12/22/2017 7:00 AM, Timothy Redaelli wrote:

Move the functions related to "force-reload-kmod" and "restart" from
ovs-ctl.in to ovs-lib.in in order to permit other scripts to use them.

Signed-off-by: Timothy Redaelli 
---
  utilities/ovs-ctl.in | 173 ---
  utilities/ovs-lib.in | 173 +++
  2 files changed, 173 insertions(+), 173 deletions(-)


A few minor checkpatch errors:

== Checking 
"0001-utilities-move-some-functions-from-ovs-ctl.in-to-ovs.patch" ==

WARNING: Line has non-spaces leading whitespace
#231 FILE: utilities/ovs-lib.in:465:
    -- list-br) | sort -u`

WARNING: Line has non-spaces leading whitespace
#234 FILE: utilities/ovs-lib.in:468:
    printf "%s " "$d"

WARNING: Line has non-spaces leading whitespace
#235 FILE: utilities/ovs-lib.in:469:
    fi

Lines checked: 385, Warnings: 3, Errors: 0

Otherwise looks good to me.

Tested-by: Greg Rose 
Reviewed-by: Greg Rose 




diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
index f1b01d1d3..1df56c4a5 100755
--- a/utilities/ovs-ctl.in
+++ b/utilities/ovs-ctl.in
@@ -63,10 +63,6 @@ insert_mod_if_required () {
  insert_mods
  }
  
-ovs_vsctl () {

-ovs-vsctl --no-wait "$@"
-}
-
  set_hostname () {
  # 'hostname -f' needs network connectivity to work.  So we should
  # call this only after ovs-vswitchd is running.
@@ -270,175 +266,6 @@ stop_forwarding () {
  fi
  }
  
-## - ##

-## force-reload-kmod ##
-## - ##
-
-internal_interfaces () {
-# Outputs a list of internal interfaces:
-#
-#   - There is an internal interface for every bridge, whether it
-# has an Interface record or not and whether the Interface
-# record's 'type' is properly set or not.
-#
-#   - There is an internal interface for each Interface record whose
-# 'type' is 'internal'.
-#
-# But ignore interfaces that don't really exist.
-for d in `(ovs_vsctl --bare \
--- --columns=name find Interface type=internal \
-   -- list-br) | sort -u`
-do
-if test -e "/sys/class/net/$d"; then
-   printf "%s " "$d"
-   fi
-done
-}
-
-ovs_save () {
-bridges=`ovs_vsctl -- --real list-br`
-if [ -n "${bridges}" ] && \
-"$datadir/scripts/ovs-save" "$1" ${bridges} > "$2"; then
-chmod +x "$2"
-return 0
-fi
-[ -z "${bridges}" ] && return 0
-}
-
-save_flows_if_required () {
-if test X"$DELETE_BRIDGES" != Xyes; then
-action "Saving flows" ovs_save save-flows "${script_flows}"
-fi
-}
-
-save_interfaces () {
-"$datadir/scripts/ovs-save" save-interfaces ${ifaces} \
-> "${script_interfaces}"
-}
-
-flow_restore_wait () {
-if test X"$OVS_VSWITCHD" = Xyes; then
-ovs_vsctl set open_vswitch . other_config:flow-restore-wait="true"
-fi
-}
-
-flow_restore_complete () {
-if test X"$OVS_VSWITCHD" = Xyes; then
-ovs_vsctl --if-exists remove open_vswitch . other_config \
-  flow-restore-wait="true"
-fi
-}
-
-restore_flows () {
-[ -x "${script_flows}" ] && \
-action "Restoring saved flows" "${script_flows}"
-}
-
-restore_interfaces () {
-[ ! -x "${script_interfaces}" ] && return 0
-action "Restoring interface configuration" "${script_interfaces}"
-rc=$?
-if test $rc = 0; then
-level=debug
-else
-level=err
-fi
-log="logger -p daemon.$level -t ovs-save"
-$log "interface restore script exited with status $rc:"
-$log -f "$script_interfaces"
-}
-
-init_restore_scripts () {
-script_interfaces=`mktemp`
-script_flows=`mktemp`
-trap 'rm -f "${script_interfaces}" "${script_flows}"' 0
-}
-
-force_reload_kmod () {
-
-if test X"$OVS_VSWITCHD" != Xyes; then
-log_failure_msg "Reloading of kmod without ovs-vswitchd is an error"
-exit 1
-fi
-
-ifaces=`internal_interfaces`
-action "Detected internal interfaces: $ifaces" true
-
-init_restore_scripts
-save_flows_if_required
-
-# Restart the database first, since a large database may take a
-# while to load, and we want to minimize forwarding disruption.
-stop_ovsdb
-start_ovsdb || return 1
-
-stop_forwarding
-
-if action "Saving interface configuration" save_interfaces; then
-:
-else
-log_warning_msg "Failed to save configuration, not replacing kernel 
module"
-start_forwarding
-add_managers
-exit 1
-fi
-chmod +x "$script_interfaces"
-
-for dp in `ovs-dpctl dump-dps`; do
-action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
-done
-
-for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
-action "Removing $vport module" rmmod $vport
-done
-
-if test -e

[ovs-dev] [PATCH V2] compat:inet_frag.h: Check for frag_percpu_counter_batch

2018-01-05 Thread Greg Rose

Fix up the compat layer to check for frag_percpu_counter_batch and
if not present then use atomic_sub and atomic_add as per the
backport in the 3.16.50 LTS kernel.  Fixes compile errors on
3.16 series kernels from 3.16.50 on.

Signed-off-by: Greg Rose 

---

V2 - Fix typo
---
 datapath/linux/compat/include/net/inet_frag.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/datapath/linux/compat/include/net/inet_frag.h 
b/datapath/linux/compat/include/net/inet_frag.h
index 34078c8..c98c3a4 100644
--- a/datapath/linux/compat/include/net/inet_frag.h
+++ b/datapath/linux/compat/include/net/inet_frag.h
@@ -30,6 +30,7 @@ static inline bool inet_frag_evicting(struct inet_frag_queue 
*q)
 #endif
 
 #ifndef HAVE_SUB_FRAG_MEM_LIMIT_ARG_STRUCT_NETNS_FRAGS
+#ifdef frag_percpu_counter_batch
 static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
 {
__percpu_counter_add(>mem, -i, frag_percpu_counter_batch);
@@ -41,6 +42,19 @@ static inline void rpl_add_frag_mem_limit(struct netns_frags 
*nf, int i)
__percpu_counter_add(>mem, i, frag_percpu_counter_batch);
 }
 #define add_frag_mem_limit rpl_add_frag_mem_limit
+#else /* frag_percpu_counter_batch */
+static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_sub(i, >mem);
+}
+#define sub_frag_mem_limit rpl_sub_frag_mem_limit
+
+static inline void rpl_add_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_add(i, >mem);
+}
+#define add_frag_mem_limit rpl_add_frag_mem_limit
+#endif /* frag_percpu_counter_batch */
 #endif
 
 #ifdef HAVE_VOID_INET_FRAGS_INIT
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [no-slow 2/6] ofproto-dpif: Reorganize upcall handling.

2018-01-05 Thread Gregory Rose


On 1/5/2018 10:46 AM, Justin Pettit wrote:

On Jan 5, 2018, at 8:05 AM, Gregory Rose  wrote:

For now that's about as far as I can take my investigation since I have a few 
other things I need to work
on.  If you can think of another test I should run or something for me to check 
into let me know.

Ben noticed that this seemed to have been removed in error in my patch:


@@ -813,7 +816,6 @@ recv_upcalls(struct handler *handler)

 upcall->key = dupcall->key;
 upcall->key_len = dupcall->key_len;
-upcall->ufid = >ufid;

 upcall->out_tun_key = dupcall->out_tun_key;
 upcall->actions = dupcall->actions;


Can you try adding that line back and see if the problem goes away on your end?

Thanks,

--Justin


Justin,

That didn't seem to help.  The cookie->header.type is still equal to 
type USER_ACTION_COOKIE_UNSPEC

in the classify_upcall() function and that causes this message:

2018-01-05T19:12:09.957Z|1|ofproto_dpif_upcall(handler1)|WARN|invalid 
user cookie of type 0 and size 4


- Greg

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofp-print: Include full hex dump of messages that fail decode.

2018-01-05 Thread Guru Shetty

On 4 January 2018 at 22:40, Ben Pfaff  wrote:

> In debugging issues with a controller that appears to send invalid
> OpenFlow messages, it can be difficult to actually see the important
> details of those messages, because OpenFlow message logging (in the vconn
> log module) will generally just print a generic decode error that doesn't
> allow a developer to see what failed to decode.  This commit enhances the
> ofp-print code so that, if a message fails to decode, the full hex dump of
> the message is included in the output.
>
> Reported-by: Su Wang 
> Signed-off-by: Ben Pfaff 
>
Acked-by: Gurucharan Shetty 

> ---
>  lib/ofp-print.c| 599 +-
> ---
>  tests/ofp-print.at |  26 +++
>  2 files changed, 306 insertions(+), 319 deletions(-)
>
> diff --git a/lib/ofp-print.c b/lib/ofp-print.c
> index 151d618b59e8..1376ef687fe2 100644
> --- a/lib/ofp-print.c
> +++ b/lib/ofp-print.c
> @@ -117,7 +117,7 @@ format_hex_arg(struct ds *s, const uint8_t *data,
> size_t len)
>  }
>  }
>
> -static void
> +static enum ofperr
>  ofp_print_packet_in(struct ds *string, const struct ofp_header *oh,
>  const struct ofputil_port_map *port_map, int
> verbosity)
>  {
> @@ -131,8 +131,7 @@ ofp_print_packet_in(struct ds *string, const struct
> ofp_header *oh,
>  error = ofputil_decode_packet_in_private(oh, true, NULL, NULL,
>   , _len,
> _id);
>  if (error) {
> -ofp_print_error(string, error);
> -return;
> +return error;
>  }
>
>  if (public->table_id) {
> @@ -230,9 +229,11 @@ ofp_print_packet_in(struct ds *string, const struct
> ofp_header *oh,
>  }
>
>  ofputil_packet_in_private_destroy();
> +
> +return 0;
>  }
>
> -static void
> +static enum ofperr
>  ofp_print_packet_out(struct ds *string, const struct ofp_header *oh,
>   const struct ofputil_port_map *port_map, int
> verbosity)
>  {
> @@ -244,8 +245,7 @@ ofp_print_packet_out(struct ds *string, const struct
> ofp_header *oh,
>  error = ofputil_decode_packet_out(, oh, NULL, );
>  if (error) {
>  ofpbuf_uninit();
> -ofp_print_error(string, error);
> -return;
> +return error;
>  }
>
>  ds_put_char(string, ' ');
> @@ -272,6 +272,7 @@ ofp_print_packet_out(struct ds *string, const struct
> ofp_header *oh,
>  }
>
>  ofpbuf_uninit();
> +return 0;
>  }
>
>  /* qsort comparison function. */
> @@ -486,7 +487,7 @@ ofp_print_phy_port(struct ds *string, const struct
> ofputil_phy_port *port)
>  /* Given a buffer 'b' that contains an array of OpenFlow ports of type
>   * 'ofp_version', writes a detailed description of each port into
>   * 'string'. */
> -static void
> +static enum ofperr
>  ofp_print_phy_ports(struct ds *string, uint8_t ofp_version,
>  struct ofpbuf *b)
>  {
> @@ -514,9 +515,7 @@ ofp_print_phy_ports(struct ds *string, uint8_t
> ofp_version,
>  }
>  free(ports);
>
> -if (retval != EOF) {
> -ofp_print_error(string, retval);
> -}
> +return retval != EOF ? retval : 0;
>  }
>
>  static const char *
> @@ -541,15 +540,14 @@ ofputil_capabilities_to_name(uint32_t bit)
>  return NULL;
>  }
>
> -static void
> +static enum ofperr
>  ofp_print_switch_features(struct ds *string, const struct ofp_header *oh)
>  {
>  struct ofputil_switch_features features;
>  struct ofpbuf b = ofpbuf_const_initializer(oh, ntohs(oh->length));
>  enum ofperr error = ofputil_pull_switch_features(, );
>  if (error) {
> -ofp_print_error(string, error);
> -return;
> +return error;
>  }
>
>  ds_put_format(string, " dpid:%016"PRIx64"\n", features.datapath_id);
> @@ -579,12 +577,12 @@ ofp_print_switch_features(struct ds *string, const
> struct ofp_header *oh)
>  case OFP14_VERSION:
>  case OFP15_VERSION:
>  case OFP16_VERSION:
> -return; /* no ports in ofp13_switch_features */
> +return 0; /* no ports in ofp13_switch_features */
>  default:
>  OVS_NOT_REACHED();
>  }
>
> -ofp_print_phy_ports(string, oh->version, );
> +return ofp_print_phy_ports(string, oh->version, );
>  }
>
>  static void
> @@ -601,7 +599,7 @@ ofp_print_switch_config(struct ds *string,
>  ds_put_format(string, " miss_send_len=%"PRIu16"\n",
> config->miss_send_len);
>  }
>
> -static void
> +static enum ofperr
>  ofp_print_set_config(struct ds *string, const struct ofp_header *oh)
>  {
>  struct ofputil_switch_config config;
> @@ -609,18 +607,19 @@ ofp_print_set_config(struct ds *string, const struct
> ofp_header *oh)
>
>  error = ofputil_decode_set_config(oh, );
>  if (error) {
> -ofp_print_error(string, error);
> -return;
> +return error;
>  }
>  ofp_print_switch_config(string, );
> +return 0;
>  }
>
> -static void
> +static enum ofperr
>

Re: [ovs-dev] [PATCH] compat:inet_frag.h: Check for frag_percpu_counter_batch

2018-01-05 Thread Gregory Rose


On 1/5/2018 10:37 AM, Greg Rose wrote:

Fix up the compat layer to check for frag_percpu_counter_batch and
if not present then use atomic_sub and atomic_add as per the
backport in the 3.16.50 LTS kernel.  Fixes compile errors on
3.16 series kernels from 3.16.50 on.

Signed-off-by: Greg Rose 
---
  datapath/linux/compat/include/net/inet_frag.h | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/datapath/linux/compat/include/net/inet_frag.h 
b/datapath/linux/compat/include/net/inet_frag.h
index 34078c8..e060d3f 100644
--- a/datapath/linux/compat/include/net/inet_frag.h
+++ b/datapath/linux/compat/include/net/inet_frag.h
@@ -30,6 +30,7 @@ static inline bool inet_frag_evicting(struct inet_frag_queue 
*q)
  #endif
  
  #ifndef HAVE_SUB_FRAG_MEM_LIMIT_ARG_STRUCT_NETNS_FRAGS

+#ifdef frag_percpu_counter_batch
  static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
  {
__percpu_counter_add(>mem, -i, frag_percpu_counter_batch);
@@ -41,6 +42,19 @@ static inline void rpl_add_frag_mem_limit(struct netns_frags 
*nf, int i)
__percpu_counter_add(>mem, i, frag_percpu_counter_batch);
  }
  #define add_frag_mem_limit rpl_add_frag_mem_limit
+#else /* frag_percpu_counter_batch */
+static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_sub(i, >mem);
+}
+#define sub_frag_mem_limit rpl_sub_frag_mem_limit
+
+static inline void rpl_add_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_sub(i, >mem);


Typo - quite obviously should be atomic_add().  I need to get my 
eyesight checked...


V2 upcoming.


+}
+#define add_frag_mem_limit rpl_add_frag_mem_limit
+#endif /* frag_percpu_counter_batch */
  #endif
  
  #ifdef HAVE_VOID_INET_FRAGS_INIT


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] xlate: fix xport lookup for recirc

2018-01-05 Thread Zoltán Balogh

Hello Jan,

> As far as I can see the frozen state already contains the ofproto UUID and in 
> frozen_metadata the OF port ID.
> Together these should be sufficient to look up the xport in 
> xlate_lookup_ofproto_():

I know, that ofproto_uuid (bridge to resume from) and in_port (incoming port) 
are 
stored in frozen_state and frozen_metadata. As I mentioned these in a private 
message yesterday. I also wrote to you, that the stored ofproto_uuid refers to 
the 
bridge to resume from, and this can differ from the one xlate_lookup_ofproto_() 
could provide. For instance, in the case of using patch ports and recirculate 
in 
the second bridge. Please, double check your mail-box.

> 
> First look up the xbridge through xbridge_lookup_by_uuid() and afterwards 
> scan the xports hmap of the xbridge for
> the OF port ID as done in static function get_ofp_port(xbridge, ofp_port).
> 
> Thus, there should be no need to add a UUID to the xport and store that in 
> frozen_state.
> 

Actually, because of xlate_lookup_ofproto_() could provide a different ofproto, 
than stored in frozen_state, we cannot use ofproto_uuid from frozen_state to 
retrieve the correct xport. Am I wrong?

> Please also consider and deal with the other bug we found in xlate_actions:
> 

I hope, Ben (or the competent maintainer) will answer my question in my previous
e-mail and clarify if the code you mentioned below needs to be modified or not.

Best regards,
Zoltan

> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index d94e9dc..bfa3acd 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -7037,11 +7037,28 @@ xlate_actions(struct xlate_in *xin, struct xlate_out 
> *xout)
>  }
>  ctx.wc->masks.tunnel.metadata.tab = flow->tunnel.metadata.tab;
> 
> +/* FIXME: The line below looks up the ingress xport from the in_port
> + * in base_flow, which is populated with the initial OF port
> + * determined early in the upcall from the ODP port in the dp_packet's
> + * flow. In the case of recirculation in a subsequent bridge (think patch
> + * port or tunnel port) the ODP port is a port in the initial bridge,
> + * and its OF port number has no meaning in the current bridge. In the
> + * best case there is a miss, in the worst case the base_flow.in_port
> + * matches a bogus port in the current bridge. */
> +
>  /* Get the proximate input port of the packet.  (If xin->frozen_state,
>   * flow->in_port is the ultimate input port of the packet.) */
>  struct xport *in_port = get_ofp_port(xbridge,
>   ctx.base_flow.in_port.ofp_port);
> 
> +/* It would be more correct to look up the xport from
> + * ctx.in.flow.in_port, which after recirculation has already been set
> + * with the thawed in_port in frozen_metadata_to_flow() above.
> + * Only in the case of bond recirculation there will be no valid in_port
> + * in the static recirculation context. But perhaps this is not a real
> + * problem as the output to bond action is typically the last action
> + * and the in_port won't matter anymore. */
> +
>  if (flow->packet_type != htonl(PT_ETH) && in_port &&
>  in_port->pt_mode == NETDEV_PT_LEGACY_L3 && ctx.table_id == 0) {
>  /* Add dummy Ethernet header to non-L2 packet if it's coming from a
> 
> BR, Jan
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org 
> > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Zoltán Balogh
> > Sent: Friday, 05 January, 2018 12:27
> > To: Ben Pfaff 
> > Cc: 'd...@openvswitch.org' 
> > Subject: Re: [ovs-dev] [PATCH] xlate: fix xport lookup for recirc
> >
> > > On Thu, Dec 21, 2017 at 02:22:43PM +, Zoltán Balogh wrote:
> > > > Xlate_lookup and xlate_lookup_ofproto_() provides in_port and ofproto
> > > > based on xport determined using flow, which is extracted from packet.
> > > > The lookup can happen due to recirculation as well. It can happen, that
> > > > packet_type has been modified during xlate before recirculation is
> > > > triggered, so the lookup fails or delivers wrong xport.
> > > > This can be worked around by propagating xport to ctx->xin after the 
> > > > very
> > > > first lookup and store it in frozen state of the recirculation.
> > > > So, when lookup is performed due to recirculation, the xport can be
> > > > retrieved from the frozen state.
> > > >
> > > > Signed-off-by: Zoltan Balogh 
> > > > CC: Jan Scheurich 
> > >
> > > Thanks for working on finding and fixing bugs.
> > >
> > > Storing a pointer to an xport, then checking later that it points to a
> > > valid xport, is risky because it opens up the possibility that the
> > > pointer points to a different xport that just happens to have the same
> > > address.  It's hard to guess how likely that coincidence is, but it
> > > would be

Re: [ovs-dev] [no-slow 2/6] ofproto-dpif: Reorganize upcall handling.

2018-01-05 Thread Justin Pettit


> On Jan 5, 2018, at 8:05 AM, Gregory Rose  wrote:
> 
> For now that's about as far as I can take my investigation since I have a few 
> other things I need to work
> on.  If you can think of another test I should run or something for me to 
> check into let me know.

Ben noticed that this seemed to have been removed in error in my patch:

> @@ -813,7 +816,6 @@ recv_upcalls(struct handler *handler)
> 
> upcall->key = dupcall->key;
> upcall->key_len = dupcall->key_len;
> -upcall->ufid = >ufid;
> 
> upcall->out_tun_key = dupcall->out_tun_key;
> upcall->actions = dupcall->actions;


Can you try adding that line back and see if the problem goes away on your end?

Thanks,

--Justin


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] compat:inet_frag.h: Check for frag_percpu_counter_batch

2018-01-05 Thread Greg Rose

Fix up the compat layer to check for frag_percpu_counter_batch and
if not present then use atomic_sub and atomic_add as per the
backport in the 3.16.50 LTS kernel.  Fixes compile errors on
3.16 series kernels from 3.16.50 on.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/include/net/inet_frag.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/datapath/linux/compat/include/net/inet_frag.h 
b/datapath/linux/compat/include/net/inet_frag.h
index 34078c8..e060d3f 100644
--- a/datapath/linux/compat/include/net/inet_frag.h
+++ b/datapath/linux/compat/include/net/inet_frag.h
@@ -30,6 +30,7 @@ static inline bool inet_frag_evicting(struct inet_frag_queue 
*q)
 #endif
 
 #ifndef HAVE_SUB_FRAG_MEM_LIMIT_ARG_STRUCT_NETNS_FRAGS
+#ifdef frag_percpu_counter_batch
 static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
 {
__percpu_counter_add(>mem, -i, frag_percpu_counter_batch);
@@ -41,6 +42,19 @@ static inline void rpl_add_frag_mem_limit(struct netns_frags 
*nf, int i)
__percpu_counter_add(>mem, i, frag_percpu_counter_batch);
 }
 #define add_frag_mem_limit rpl_add_frag_mem_limit
+#else /* frag_percpu_counter_batch */
+static inline void rpl_sub_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_sub(i, >mem);
+}
+#define sub_frag_mem_limit rpl_sub_frag_mem_limit
+
+static inline void rpl_add_frag_mem_limit(struct netns_frags *nf, int i)
+{
+   atomic_sub(i, >mem);
+}
+#define add_frag_mem_limit rpl_add_frag_mem_limit
+#endif /* frag_percpu_counter_batch */
 #endif
 
 #ifdef HAVE_VOID_INET_FRAGS_INIT
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] gre: strip gre-tso offload flags

2018-01-05 Thread Gregory Rose


On 12/28/2017 8:45 PM, we...@ucloud.cn wrote:

From: wenxu 

if the gro enable, ipgre receive a gre-tso package. After pop
the gre-tunnel the encapsulation and GSO_ENCAP flags should be
striped. or the packet encap again and will be dropped in
ovs_iptunnel_handle_offloads

Signed-off-by: wenxu 
---
  datapath/linux/compat/ip_gre.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/datapath/linux/compat/ip_gre.c b/datapath/linux/compat/ip_gre.c
index 03c5435..94fdaa9 100644
--- a/datapath/linux/compat/ip_gre.c
+++ b/datapath/linux/compat/ip_gre.c
@@ -140,6 +140,8 @@ static int ipgre_rcv(struct sk_buff *skb, const struct 
tnl_ptk_info *tpi)
__be64 tun_id;
int err;
  
+		if (iptunnel_pull_offloads(skb))

+   return PACKET_REJECT;
  
  		skb_pop_mac_header(skb);

flags = tpi->flags & (TUNNEL_CSUM | TUNNEL_KEY);


Wenxu,

I see to patches in the dev list that appear to be identical but this 
one has the later timestamp so I'll

respond to this one.

The patch looks simple enough itself but can you explain how to trigger 
the bug so that I can test

this fix?

Thanks,

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] lex: Fix parsing of long tokens.

2018-01-05 Thread Gregory Rose


On 1/2/2018 11:15 AM, Ben Pfaff wrote:

When a token is longer than the built-in 256-byte buffer, a buffer is
malloc()'d but it was not properly null-terminated.

Found by afl-fuzz.

Reported-by: Bhargava Shastry 
Signed-off-by: Ben Pfaff 
---
  ovn/lib/lex.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovn/lib/lex.c b/ovn/lib/lex.c
index 6f2b570f5c65..2f49af0e91e2 100644
--- a/ovn/lib/lex.c
+++ b/ovn/lib/lex.c
@@ -89,7 +89,7 @@ lex_token_strcpy(struct lex_token *token, const char *s, 
size_t length)
  ? token->buffer
  : xmalloc(length + 1));
  memcpy(token->s, s, length);
-token->buffer[length] = '\0';
+token->s[length] = '\0';
  }
  
  void


Reviewed-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-05 Thread Ed Swierk via dev

On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
> On Wed, Jan 3, 2018 at 7:49 PM, Ed Swierk  wrote:
>> On Fri, Dec 22, 2017 at 3:31 PM, Pravin Shelar  wrote:
>>> On Thu, Dec 21, 2017 at 7:17 AM, Ed Swierk  
>>> wrote:
 IPv4 and IPv6 packets may arrive with lower-layer padding that is not
 included in the L3 length. For example, a short IPv4 packet may have
 up to 6 bytes of padding following the IP payload when received on an
 Ethernet device. In the normal IPv4 receive path, ip_rcv() trims the
 packet to ip_hdr->tot_len before invoking netfilter hooks (including
 conntrack and nat).

 In the IPv6 receive path, ip6_rcv() does the same using
 ipv6_hdr->payload_len. Similarly in the br_netfilter receive path,
 br_validate_ipv4() and br_validate_ipv6() trim the packet to the L3
 length before invoking NF_INET_PRE_ROUTING hooks.

 In the OVS conntrack receive path, ovs_ct_execute() pulls the skb to
 the L3 header but does not trim it to the L3 length before calling
 nf_conntrack_in(NF_INET_PRE_ROUTING). When nf_conntrack_proto_tcp
 encounters a packet with lower-layer padding, nf_checksum() fails and
 logs "nf_ct_tcp: bad TCP checksum". While extra zero bytes don't
 affect the checksum, the length in the IP pseudoheader does. That
 length is based on skb->len, and without trimming, it doesn't match
 the length the sender used when computing the checksum.

 The assumption throughout nf_conntrack and nf_nat is that skb->len
 reflects the length of the L3 header and payload, so there is no need
 to refer back to ip_hdr->tot_len or ipv6_hdr->payload_len.

 This change brings OVS into line with other netfilter users, trimming
 IPv4 and IPv6 packets prior to L3+ netfilter processing.

 Signed-off-by: Ed Swierk 
 ---
 v2:
 - Trim packet in nat receive path as well as conntrack
 - Free skb on error
 ---
  net/openvswitch/conntrack.c | 34 ++
  1 file changed, 34 insertions(+)

 diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
 index b27c5c6..1bdc78f 100644
 --- a/net/openvswitch/conntrack.c
 +++ b/net/openvswitch/conntrack.c
 @@ -703,6 +703,33 @@ static bool skb_nfct_cached(struct net *net,
 return ct_executed;
  }

 +/* Trim the skb to the L3 length. Assumes the skb is already pulled to
 + * the L3 header. The skb is freed on error.
 + */
 +static int skb_trim_l3(struct sk_buff *skb)
 +{
 +   unsigned int nh_len;
 +   int err;
 +
 +   switch (skb->protocol) {
 +   case htons(ETH_P_IP):
 +   nh_len = ntohs(ip_hdr(skb)->tot_len);
 +   break;
 +   case htons(ETH_P_IPV6):
 +   nh_len = ntohs(ipv6_hdr(skb)->payload_len)
 +   + sizeof(struct ipv6hdr);
 +   break;
 +   default:
 +   nh_len = skb->len;
 +   }
 +
 +   err = pskb_trim_rcsum(skb, nh_len);
 +   if (err)
>>> This should is unlikely.
 +   kfree_skb(skb);
 +
 +   return err;
 +}
 +
>>> This looks like a generic function, it probably does not belong to OVS
>>> code base.
>>
>> It occurs to me that skb_trim_l3() can't just reach into ip_hdr(skb)
>> before calling pskb_may_pull(skb, sizeof(struct iphdr)) to make sure
>> the IP header is actually there; and for IPv4 it should validate the
>> IP header checksum, including options. Once we add all these steps,
>> skb_trim_l3() starts to look an awful lot like br_validate_ipv4() and
>> br_validate_ipv6(). And those in turn are eerily similar to ip_rcv()
>> and ip6_rcv(). It would be nice to avoid duplicating this logic yet
>> again.
>>
> OVS already pull all required headers in skb linear data, so no need
> to redo all of it. only check required is the ip-checksum validation.
> I think we could avoid it in most of cases by checking skb length to
> ipheader length before verifying the ip header-checksum.

Shouldn't the IP header checksum be verified even earlier, like in
key_extract(), before actually using any of the fields in the IP
header?

And since key_extract() is already inspecting the IP/IPv6 header, it
would be a convenient spot to check whether the skb->len matches. If
there's a difference, it could record the number of bytes to trim in
an ovs_skb_cb field. Then ovs_ct_execute() would look at this field
and trim the skb only if necessary.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] OVN: remove useless ds_clear() on actions ds

2018-01-05 Thread Lorenzo Bianconi

Remove ds_clear() on actions dynamic string in build_acls()
since they have just been initialized to DS_EMPTY_INITIALIZER

Signed-off-by: Lorenzo Bianconi 
---
 ovn/northd/ovn-northd.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index e3ddc1fd9..63ed97efb 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -3272,7 +3272,6 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
   "(!ct.est || (ct.est && ct_label.blocked == 1)) "
   "&& (%s)",
   acl->match);
-ds_clear();
 build_acl_log(, acl);
 ds_put_cstr(, "/* drop */");
 ovn_lflow_add_with_hint(lflows, od, stage,
@@ -3307,7 +3306,6 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 /* There are no stateful ACLs in use on this datapath,
  * so a "drop" ACL is simply the "drop" logical flow action
  * in all cases. */
-ds_clear();
 build_acl_log(, acl);
 ds_put_cstr(, "/* drop */");
 ovn_lflow_add_with_hint(lflows, od, stage,
-- 
2.13.6

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v6 2/2] OVN: Add support for periodic router advertisements.

2018-01-05 Thread Ben Pfaff

On Wed, Nov 29, 2017 at 03:59:48PM -0600, Mark Michelson wrote:
> This change adds three new options to the Northbound
> Logical_Router_Port's ipv6_ra_configs option:
> 
> * send_periodic: If set to "true", then OVN will send periodic router
> advertisements out of this router port.
> * max_interval: The maximum amount of time to wait between sending
> periodic router advertisements.
> * min_interval: The minimum amount of time to wait between sending
> periodic router advertisements.
> 
> When send_periodic is true, then IPv6 RA configs, as well as some layer
> 2 and layer 3 information about the router port, are copied to the
> southbound database. From there, ovn-controller can use this information
> to know when to send periodic RAs and what to send in them.
> 
> Because periodic RAs originate from each ovn-controller, the new
> keep-local flag is set on the packet so that ports don't receive an
> overabundance of RAs.
> 
> Signed-off-by: Mark Michelson 

Thanks a lot for the revised series.

I folded in the following changes and applied this series to master.

diff --git a/lib/packets.h b/lib/packets.h
index 8819f829970e..395599f08c92 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -1020,7 +1020,12 @@ BUILD_ASSERT_DECL(RA_MSG_LEN == sizeof(struct 
ovs_ra_msg));
  * 6.2.1
  */
 #define ND_RA_MAX_INTERVAL_DEFAULT 600
-#define ND_RA_MIN_INTERVAL_DEFAULT(max) ((max) >= 9 ? (max) / 3 : (max) * 3 / 
4)
+
+static inline int
+nd_ra_min_interval_default(int max)
+{
+return max >= 9 ? max / 3 : max * 3 / 4;
+}
 
 /*
  * Use the same struct for MLD and MLD2, naming members as the defined fields 
in
@@ -1420,7 +1425,7 @@ void compose_nd_ra(struct dp_packet *,
const struct in6_addr *ipv6_dst,
uint8_t cur_hop_limit, uint8_t mo_flags,
ovs_be16 router_lt, ovs_be32 reachable_time,
-   ovs_be32 retrans_timer, ovs_be32 mtu);
+   ovs_be32 retrans_timer, uint32_t mtu);
 void packet_put_ra_prefix_opt(struct dp_packet *,
   uint8_t plen, uint8_t la_flags,
   ovs_be32 valid_lifetime,
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index cf414b8f229b..7542db3f4854 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -1167,7 +1167,7 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb)
 config->max_interval = smap_get_int(>options, "ipv6_ra_max_interval",
 ND_RA_MAX_INTERVAL_DEFAULT);
 config->min_interval = smap_get_int(>options, "ipv6_ra_min_interval",
-ND_RA_MIN_INTERVAL_DEFAULT(config->max_interval));
+nd_ra_min_interval_default(config->max_interval));
 config->mtu = smap_get_int(>options, "ipv6_ra_mtu", ND_MTU_DEFAULT);
 config->la_flags = ND_PREFIX_ON_LINK;
 
@@ -1194,7 +1194,7 @@ ipv6_ra_update_config(const struct sbrec_port_binding *pb)
 }
 
 /* All nodes multicast addresses */
-config->eth_dst = ETH_ADDR_C(33,33,00,00,00,01);
+config->eth_dst = (struct eth_addr) ETH_ADDR_C(33,33,00,00,00,01);
 ipv6_parse("ff02::1", >ipv6_dst);
 
 const char *eth_addr = smap_get(>options, "ipv6_ra_src_eth");
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index fc14dc8c38eb..e3ddc1fd9bc1 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -4486,7 +4486,7 @@ copy_ra_to_sb(struct ovn_port *op, const char 
*address_mode)
 smap_add_format(, "ipv6_ra_max_interval", "%d", max_interval);
 
 int min_interval = smap_get_int(>nbrp->ipv6_ra_configs,
-"min_interval", ND_RA_MIN_INTERVAL_DEFAULT(max_interval));
+"min_interval", nd_ra_min_interval_default(max_interval));
 if (min_interval > ND_RA_MIN_INTERVAL_MAX(max_interval)) {
 min_interval = ND_RA_MIN_INTERVAL_MAX(max_interval);
 }
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] dpif-netdev: Refactor datapath flow cache

2018-01-05 Thread William Tu

On Thu, Jan 4, 2018 at 5:41 PM, Wang, Yipeng1  wrote:
> I agree with you guys on the issue with RSS hash.
>
> RSS is not originally designed for hash table so it is not flexible enough to 
> be configured
> in the NICs. Smart NICs will be more capable to offload hash calculation etc, 
> I think.
>
> Another thing is I think it will not be very costly if we combine RSS with 
> the software hash.
> We just need to use the RSS together with the Vlan field to calculate a new 
> hash (using CRC hash
> which is fast). It is like what recirculation does. We do not need to hash 
> 5-tuple again in software.
> User can choose which field they want to include into the hash for the hash 
> table structures in OvS.
>
> What do you think?
>

I like the idea of incremental hash using CRC. As for adding new
field, can we do it on-demand without relying on user to choose which
fields?

Assuming that in the beginning, 5-tuple covers unions of all fields
used in all subtables, so RSS works fine. As subtables increase, there
might be new subtables using new fields, such as vlan, or tunnel
metadata. At the moment (or periodically) when we detect new subtable
using new field, then we consider adding new field into the hash
function. So for simply traffic using 5-tuple, RSS remains effective
and unchanged, for tunneled traffic or traffic matching more fields,
this on-demand hashing approach can alleviate the collision. Of
course, changing the hashing function requires rehashing all existing
elements, so we should do it less frequent.

Regards,
William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v8] netdev-dpdk: Add support for vHost dequeue zero copy (experimental)

2018-01-05 Thread Ciara Loftus

Zero copy is disabled by default. To enable it, set the 'dq-zero-copy'
option to 'true' when configuring the Interface:

ovs-vsctl set Interface dpdkvhostuserclient0
options:vhost-server-path=/tmp/dpdkvhostuserclient0
options:dq-zero-copy=true

When packets from a vHost device with zero copy enabled are destined for
a single 'dpdk' port, the number of tx descriptors on that 'dpdk' port
must be set to a smaller value. 128 is recommended. This can be achieved
like so:

ovs-vsctl set Interface dpdkport options:n_txq_desc=128

Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send
to should not exceed 128. Due to this requirement, the feature is
considered 'experimental'.

Testing of the patch showed a 15% improvement when switching 512B
packets between vHost devices on different VMs on the same host when
zero copy was enabled on the transmitting device.

Signed-off-by: Ciara Loftus 
---
v8:
* Disallow configurability after vHost device has been registered &
update docs accordingly.
* Give performance datapoint in commit message.

 Documentation/intro/install/dpdk.rst |  2 +
 Documentation/topics/dpdk/vhost-user.rst | 72 
 NEWS |  1 +
 lib/netdev-dpdk.c|  9 +++-
 vswitchd/vswitch.xml | 11 +
 5 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 3fecb5c..087eb88 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -518,6 +518,8 @@ The above command sets the number of rx queues for DPDK 
physical interface.
 The rx queues are assigned to pmd threads on the same NUMA node in a
 round-robin fashion.
 
+.. _dpdk-queues-sizes:
+
 DPDK Physical Port Queue Sizes
 ~~~
 
diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index 8447e2d..1a6c6d0 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -458,3 +458,75 @@ Sample XML
 
 
 .. _QEMU documentation: 
http://git.qemu-project.org/?p=qemu.git;a=blob;f=docs/specs/vhost-user.txt;h=7890d7169;hb=HEAD
+
+vhost-user Dequeue Zero Copy (experimental)
+---
+
+Normally when dequeuing a packet from a vHost User device, a memcpy operation
+must be used to copy that packet from guest address space to host address
+space. This memcpy can be removed by enabling dequeue zero-copy like so::
+
+$ ovs-vsctl add-port br0 dpdkvhostuserclient0 -- set Interface \
+dpdkvhostuserclient0 type=dpdkvhostuserclient \
+options:vhost-server-path=/tmp/dpdkvhostclient0 \
+options:dq-zero-copy=true
+
+With this feature enabled, a reference (pointer) to the packet is passed to
+the host, instead of a copy of the packet. Removing this memcpy can give a
+performance improvement for some use cases, for example switching large packets
+between different VMs. However additional packet loss may be observed.
+
+Note that the feature is disabled by default and must be explicitly enabled
+by setting the 'dq-zero-copy' option to 'true' while specifying the
+'vhost-server-path' option as above. If you wish to split out the command into
+multiple commands as below, ensure 'dq-zero-copy' is set before
+'vhost-server-path'::
+
+$ ovs-vsctl set Interface dpdkvhostuserclient0 options:dq-zero-copy=true
+$ ovs-vsctl set Interface dpdkvhostuserclient0 \
+options:vhost-server-path=/tmp/dpdkvhostclient0
+
+The feature is only available to dpdkvhostuserclient port types.
+
+A limitation exists whereby if packets from a vHost port with dq-zero-copy=true
+are destined for a 'dpdk' type port, the number of tx descriptors (n_txq_desc)
+for that port must be reduced to a smaller number, 128 being the recommended
+value. This can be achieved by issuing the following command::
+
+$ ovs-vsctl set Interface dpdkport options:n_txq_desc=128
+
+Note: The sum of the tx descriptors of all 'dpdk' ports the VM will send to
+should not exceed 128. For example, in case of a bond over two physical ports
+in balance-tcp mode, one must divide 128 by the number of links in the bond.
+
+Refer to :ref:`` for more information.
+
+The reason for this limitation is due to how the zero copy functionality is
+implemented. The vHost device's 'tx used vring', a virtio structure used for
+tracking used ie. sent descriptors, will only be updated when the NIC frees
+the corresponding mbuf. If we don't free the mbufs frequently enough, that
+vring will be starved and packets will no longer be processed. One way to
+ensure we don't encounter this scenario, is to configure n_txq_desc to a small
+enough number such that the 'mbuf free threshold' for the NIC will be hit more
+often and thus free mbufs more frequently. The value of 128 is suggested, but

Re: [ovs-dev] [no-slow 2/6] ofproto-dpif: Reorganize upcall handling.

2018-01-05 Thread Gregory Rose


On 1/3/2018 9:37 AM, Gregory Rose wrote:

On 1/2/2018 11:42 AM, Justin Pettit wrote:

On Dec 28, 2017, at 3:22 PM, Gregory Rose  wrote:

SFAICT it emulates exactly what the system-traffic.at test 001 
does.  And it works fine... /shrug.


What distribution, kernel, etc are you using for your check-kmod 
testing?  I'll try to emulate that

exactly and then see if I can get similar results.
I'm using Ubuntu 16.04 with kernel 4.4.0-104-generic.  I sent you a 
link on our Slack channel to the internal tester that runs different 
OSs.  It fails a few of tests, but they're the same ones that fail on 
master.  (We need to address those, but they shouldn't be related to 
my patches.)


--Justin






I've created a script that runs a test outside of the m4sh autotest 
framework that exactly emulates

what test 001 of the 'make check-kmod' test performs.

When running the test I've created with this patch applied I 
consistently see the following error

message:

2018-01-05T15:53:14.440Z|1|ofproto_dpif_upcall(handler1)|WARN|invalid 
user cookie of type 0 and size 4


But the ping test does succeed.  Without this patch applied the error 
messsage does not occur.


I've attached the script.  I'm not sure why the test on the internal 
server you pointed me to does not have
this error in it but as mentioned before I can reproduce it reliably on 
several different VMs with both Ubuntu

and RHEL based distributions and various kernels.

For now that's about as far as I can take my investigation since I have 
a few other things I need to work
on.  If you can think of another test I should run or something for me 
to check into let me know.


Thanks,

- Greg


Justin,

I have done more testing last night and this morning and have a couple 
of findings.


First, the tests themselves *all* succeed.  However, they are marked 
as failed because of warnings that
occur during OVS_TRAFFIC_VSWITCHD_STOP in system-traffic.at.  If I 
comment out

OVS_TRAFFIC_VSWITCHD_STOP then the test runs successfully.

AT_SETUP([datapath - ping between two ports])
OVS_TRAFFIC_VSWITCHD_START()

AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])

ADD_NAMESPACES(at_ns0, at_ns1)

ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")

NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | 
FORMAT_PING], [0], [dnl

3 packets transmitted, 3 received, 0% packet loss, time 0ms
])
NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.2 | 
FORMAT_PING], [0], [dnl

3 packets transmitted, 3 received, 0% packet loss, time 0ms
])
NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.2 | 
FORMAT_PING], [0], [dnl

3 packets transmitted, 3 received, 0% packet loss, time 0ms
])

dnl OVS_TRAFFIC_VSWITCHD_STOP
AT_CLEANUP

## -- ##
## openvswitch 2.8.90 test suite. ##
## -- ##
  1: datapath - ping between two ports   ok

## - ##
## Test results. ##
## - ##

1 test was successful.

I'm now debugging the OVS_TRAFFIC_VSWITCHD_STOP macro and trying to 
determine what

is causing the problem.  Here are the log messages:

2018-01-03T17:30:52.340Z|00039|netdev_linux|WARN|ovs-p1: removing 
policing failed: No such device
2018-01-03T17:30:52.341Z|00040|ofproto|WARN|br0: cannot get STP status 
on nonexistent port 2
2018-01-03T17:30:52.341Z|00041|ofproto|WARN|br0: cannot get RSTP 
status on nonexistent port 2
2018-01-03T17:30:52.343Z|00042|bridge|INFO|bridge br0: deleted 
interface ovs-p1 on port 2
2018-01-03T17:30:52.346Z|00043|bridge|WARN|could not open network 
device ovs-p1 (No such device)
2018-01-03T17:30:52.360Z|00044|bridge|INFO|bridge br0: deleted 
interface ovs-p0 on port 1
2018-01-03T17:30:52.364Z|00045|bridge|WARN|could not open network 
device ovs-p0 (No such device)
2018-01-03T17:30:52.367Z|00046|bridge|WARN|could not open network 
device ovs-p1 (No such device)


It is the WARNS from the OVS_TRAFFIC_VSWITCHD_STOP part of the test 
that are causing all tests to fail.


Again, I see this on multiple systems.  They are all VMs though so I'm 
wondering if the internal test that

you are referring to was run on bare metal?

Thanks,

- Greg



#!/bin/bash
if [ ! -f vswitch.ovsschema ]; then
echo "No schema file found - please copy vswitch.ovsschema to this 
directory"
exit 1
fi
rm -f logfile
modprobe openvswitch
touch .conf.db.~lock~
ovsdb-tool create conf.db vswitch.ovsschema
ovsdb-server conf.db --detach --no-chdir --pidfile --log-file 
--remote=punix:/usr/local/var/run/openvswitch/db.sock
ovs-vsctl --no-wait init
ovs-vswitchd --detach --no-chdir --pidfile --log-file=logfile -vvconn 
-vofproto_dpif -vunixctl
ovs-vsctl -- add-br br0 -- set Bridge br0 
protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15 
fail-mode=secure
ovs-ofctl add-flow br0 "actions=normal"
ip netns add at_ns0
ip netns exec at_ns0 sysctl -w net.netfilter.nf_conntrack_helper=0
ip netns add at_ns1

Re: [ovs-dev] master now "frozen" for forking

2018-01-05 Thread Stokes, Ian

> Hello everyone.  We are at the point in the release cycle where
> traditionally we would fork a branch from master for release.  We have
> tried a slightly different approach a few times and I'd like to propose
> that we do it again.  Instead of forking immediately, I propose that we
> "freeze" master so that only bug fixes and previously posted feature
> enhancements go in, until approximately January 15, and then fork
> branch-2.9 from master.  Unless anyone has a serious objection, let's do
> this.

Hi Ben,

Am I right in think that cut off for feature enhancements previously discussed 
on the ML is now January 15th if the above proposal goes ahead?

I guess there was an assumption on the OVS DPDK side that we would branch on 
January 1st with bug fixes and previously discussed feature enhancements 
possible until end of January for the 2.9 release, official release of 2.9 then 
being February 15th?

Thanks
Ian
> 
> Thanks,
> 
> Ben.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofp-print: Include full hex dump of messages that fail decode.

2018-01-05 Thread Su Wang

Hi Guru,


can u please help us to pick up Ben's change to nsx-dgo brach? Suneel will 
rerun the test after that.


Thx a lot,


Su


From: Ben Pfaff 
Sent: Thursday, January 4, 2018 10:40:01 PM
To: d...@openvswitch.org
Cc: Ben Pfaff; Su Wang
Subject: [PATCH] ofp-print: Include full hex dump of messages that fail decode.

In debugging issues with a controller that appears to send invalid
OpenFlow messages, it can be difficult to actually see the important
details of those messages, because OpenFlow message logging (in the vconn
log module) will generally just print a generic decode error that doesn't
allow a developer to see what failed to decode.  This commit enhances the
ofp-print code so that, if a message fails to decode, the full hex dump of
the message is included in the output.

Reported-by: Su Wang 
Signed-off-by: Ben Pfaff 
---
 lib/ofp-print.c| 599 +
 tests/ofp-print.at |  26 +++
 2 files changed, 306 insertions(+), 319 deletions(-)

diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 151d618b59e8..1376ef687fe2 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -117,7 +117,7 @@ format_hex_arg(struct ds *s, const uint8_t *data, size_t 
len)
 }
 }

-static void
+static enum ofperr
 ofp_print_packet_in(struct ds *string, const struct ofp_header *oh,
 const struct ofputil_port_map *port_map, int verbosity)
 {
@@ -131,8 +131,7 @@ ofp_print_packet_in(struct ds *string, const struct 
ofp_header *oh,
 error = ofputil_decode_packet_in_private(oh, true, NULL, NULL,
  , _len, _id);
 if (error) {
-ofp_print_error(string, error);
-return;
+return error;
 }

 if (public->table_id) {
@@ -230,9 +229,11 @@ ofp_print_packet_in(struct ds *string, const struct 
ofp_header *oh,
 }

 ofputil_packet_in_private_destroy();
+
+return 0;
 }

-static void
+static enum ofperr
 ofp_print_packet_out(struct ds *string, const struct ofp_header *oh,
  const struct ofputil_port_map *port_map, int verbosity)
 {
@@ -244,8 +245,7 @@ ofp_print_packet_out(struct ds *string, const struct 
ofp_header *oh,
 error = ofputil_decode_packet_out(, oh, NULL, );
 if (error) {
 ofpbuf_uninit();
-ofp_print_error(string, error);
-return;
+return error;
 }

 ds_put_char(string, ' ');
@@ -272,6 +272,7 @@ ofp_print_packet_out(struct ds *string, const struct 
ofp_header *oh,
 }

 ofpbuf_uninit();
+return 0;
 }

 /* qsort comparison function. */
@@ -486,7 +487,7 @@ ofp_print_phy_port(struct ds *string, const struct 
ofputil_phy_port *port)
 /* Given a buffer 'b' that contains an array of OpenFlow ports of type
  * 'ofp_version', writes a detailed description of each port into
  * 'string'. */
-static void
+static enum ofperr
 ofp_print_phy_ports(struct ds *string, uint8_t ofp_version,
 struct ofpbuf *b)
 {
@@ -514,9 +515,7 @@ ofp_print_phy_ports(struct ds *string, uint8_t ofp_version,
 }
 free(ports);

-if (retval != EOF) {
-ofp_print_error(string, retval);
-}
+return retval != EOF ? retval : 0;
 }

 static const char *
@@ -541,15 +540,14 @@ ofputil_capabilities_to_name(uint32_t bit)
 return NULL;
 }

-static void
+static enum ofperr
 ofp_print_switch_features(struct ds *string, const struct ofp_header *oh)
 {
 struct ofputil_switch_features features;
 struct ofpbuf b = ofpbuf_const_initializer(oh, ntohs(oh->length));
 enum ofperr error = ofputil_pull_switch_features(, );
 if (error) {
-ofp_print_error(string, error);
-return;
+return error;
 }

 ds_put_format(string, " dpid:%016"PRIx64"\n", features.datapath_id);
@@ -579,12 +577,12 @@ ofp_print_switch_features(struct ds *string, const struct 
ofp_header *oh)
 case OFP14_VERSION:
 case OFP15_VERSION:
 case OFP16_VERSION:
-return; /* no ports in ofp13_switch_features */
+return 0; /* no ports in ofp13_switch_features */
 default:
 OVS_NOT_REACHED();
 }

-ofp_print_phy_ports(string, oh->version, );
+return ofp_print_phy_ports(string, oh->version, );
 }

 static void
@@ -601,7 +599,7 @@ ofp_print_switch_config(struct ds *string,
 ds_put_format(string, " miss_send_len=%"PRIu16"\n", config->miss_send_len);
 }

-static void
+static enum ofperr
 ofp_print_set_config(struct ds *string, const struct ofp_header *oh)
 {
 struct ofputil_switch_config config;
@@ -609,18 +607,19 @@ ofp_print_set_config(struct ds *string, const struct 
ofp_header *oh)

 error = ofputil_decode_set_config(oh, );
 if (error) {
-ofp_print_error(string, error);
-return;
+return error;
 }
 ofp_print_switch_config(string, );
+return 0;
 }

-static void
+static enum ofperr
 ofp_print_get_config_reply(struct

Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and VS2017

2018-01-05 Thread Ben Pfaff

OK.

Acked-by: Ben Pfaff 


On Fri, Jan 05, 2018 at 03:59:02PM +0200, aserd...@ovn.org wrote:
> Sure. Do you mind if I add your acked-by on each one?
> 
> Thanks,
> Alin.
> 
> > -Original Message-
> > From: Ben Pfaff [mailto:b...@ovn.org]
> > Sent: Thursday, January 4, 2018 11:37 PM
> > To: Alin Gabriel Serdean 
> > Cc: d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and
> > VS2017
> > 
> > On Mon, Nov 06, 2017 at 02:07:15PM -0800, Ben Pfaff wrote:
> > > On Mon, Nov 06, 2017 at 05:51:54PM +0200, Alin Gabriel Serdean wrote:
> > > > This series includes fixes that were found while compiling using WDK
> > > > 1709 and Visual Studio 2017.
> > > >
> > > > Alin Gabriel Serdean (3):
> > > >   datapath-windows: Add directory to .gitignore
> > > >   datapath-windows: Change include type in Iphelper.h
> > > >   datapath-windows: Add include directory to ovsext project
> > >
> > > All of these seem reasonable to me.
> > 
> > Do you want to apply these?
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] dpif-netdev: Refactor datapath flow cache

2018-01-05 Thread Jan Scheurich

Hi Yipeng,

Thank you very much for your comments. Please find my answers inline.

Regards, Jan

> -Original Message-
> From: Wang, Yipeng1 [mailto:yipeng1.w...@intel.com]
> Sent: Wednesday, 20 December, 2017 00:09
> To: Jan Scheurich ; d...@openvswitch.org
> Cc: Gobriel, Sameh ; Tai, Charlie 
> 
> Subject: RE: [PATCH] dpif-netdev: Refactor datapath flow cache
> 
> Hi, Jan,
> 
> Please see my comments inlined.
> 
> Thanks
> Yipeng
> 
> >-Original Message-
> >From: Jan Scheurich [mailto:jan.scheur...@ericsson.com]
> >Sent: Monday, December 18, 2017 9:01 AM
> >To: Wang, Yipeng1 ; d...@openvswitch.org
> >Cc: Gobriel, Sameh ; Tai, Charlie
> >
> >Subject: RE: [PATCH] dpif-netdev: Refactor datapath flow cache
> >
> >Hi Yipeng,
> >
> >Thanks a lot for your feedback. Please find some responses below.
> >
> >Regards, Jan
> >
> >
> >From: Wang, Yipeng1 [mailto:yipeng1.w...@intel.com]
> >Sent: Sunday, 17 December, 2017 19:49
> >To: Jan Scheurich ; d...@openvswitch.org
> >Cc: Gobriel, Sameh ; Tai, Charlie
> >
> >Subject: RE: [PATCH] dpif-netdev: Refactor datapath flow cache
> >
> >Hi, Jan
> >
> >We went through the code and did some performance comparisons. We
> >notice that the patch contains two parts of optimizations: EMC
> >simplifying/resizing, and another layer of cache added before megaflow cache.
> >The new cache idea has the same direction with our cuckoo distributor(CD)
> >proposal we posted back in April
> >(https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330570.html
> > )
> >and presented in OvS 2017 conference. Comparing to our patch, we saw pros
> >and cons for both CD and DFC, and currently seeking a way to combine the
> >benefits of both patches. We are also seeking other ways to further simplify
> >current datapath, to have a scalable while simple solution. Below are some
> >detailed comments:
> >
> >For EMC part, we wonder if you enabled transparent huge page (THP) during
> >the test. For our test case, the new EMC only gives a little speedup if THP
> >enabled, since with huge page, reducing EMC entry does not benefit much.
> >Also, reducing 2-hash to 1-hash actually harms certain traffic patterns we
> >tested. I guess the optimization will largely depend on traffic patterns.
> >Another question is that it seems when EMC lookup does
> >"netdev_flow_key_equal_mf", the key length is not initialized yet. Thus, the
> >key comparison actually does not do correctly. Could you please double check?
> >
> >[Jan] Yes, THP is enabled on my test setup, but I have doubts that it
> >significantly boosts the performance of the DFC/EMC by transparently
> >allocating that memory on a hugepage. Do you have a means to check that on
> >a running OVS?
> 
> [Wang, Yipeng] In my test, I compared the proposed EMC with current EMC with 
> same 16k entries.
> If I turned off THP, the current EMC will cause many TLB misses because of 
> its larger entry size, which I profiled with vTunes.
> Once I turned on THP with no other changes, the current EMC's throughput 
> increases a lot and is comparable with the newly
> proposed EMC. From vTunes, the EMC lookup TLB misses decreases from 100 
> million to 0 during the 30sec profiling time.
> So if THP is enabled, reducing EMC entry size may not give too much benefit 
> comparing to the current EMC.
> It is worth to mention that they both use similar amount of CPU cache since 
> only the miniflow struct is accessed by CPU,
> thus the TLB should be the major concern.

I understand your point. But I can't seem to reproduce the effect of THP on my 
system. 
I don't have vTunes available, but I guess "perf stat" should also provide TLB 
miss 
statistics. 

How can you check if ovs-vswitchd is using transparent huge pages for backing 
e.g. the EMC memory?

> 
> >My primary goal when I chose to change the EMC implementation from 2-way
> >to 1-way associativity was to simplify the code. In my tests I have not seen 
> >any
> >benefit of having two possible locations for an EMC entry. As far as I can 
> >see
> >there is no theoretical reason why we should expect systematic collisions of
> >pairs of flows that would justify such a design. There may well be specific
> >traffic patterns that benefit from 2-way EMC, but then there are obviously
> >others for which 1-way performs better. In doubt I believe we should choose
> >the simpler design.
> >
> [Wang, Yipeng] Yes that there is no systematic collisions. However, in 
> general,
> 1-hash table tends to cause many more misses than 2-hash. For code simplicity,
> I agree that 1-hash is simpler and much easier to understand. For performance,
> if the flows can fit in 1-hash table, they should also stay in the primary 
> location of

Re: [ovs-dev] [PATCH v2 3/3] xlate: call tnl_neigh_snoop() from terminate_native_tunnel()

2018-01-05 Thread Jan Scheurich

Hi Ben,

Let me try to explain the motivation of this patch. It actually fixes a severe 
design flaw in the native tunnel handling of OVS in the context of 
SDN-controlled cloud systems, which we found and had to fix downstream in our 
NFVI solution.

To implement tunnels (e.g. VXLAN, GRE) in the userspace datapath, OVS needs to 
have the MAC address of the IP nexthop on the underlay network (either the 
remote tunnel end-point in the case of an L2 underlay or a gateway in case of a 
L3 underlay). To this end OVS snoops passing ARP reply messages to build up an 
internal ARP cache. (I don't mention IPv6 ND in this mail, but the equivalent 
holds there as well).

The current ARP snooping logic is broken because OVS simply snoops any ARP 
packet in any bridge after each flow/group table lookup:

static void
do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len,
 struct xlate_ctx *ctx, bool is_last_action)
{
struct flow_wildcards *wc = ctx->wc;
struct flow *flow = >xin->flow;
const struct ofpact *a;

/* FIXME: This unconditional snooping in all bridges and every flow 
translation is bad. */
if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
tnl_neigh_snoop(flow, wc, ctx->xbridge->name);
}

This happens even in overlay bridges carrying the tenant network traffic. What 
is worse is that OVS stores all IP-MAC address pairs in a single global per 
bridge, not considering VLANs or any other SDN-controlled network separation. 
As IP addresses are generally not unique across tenant networks in NFVI clouds, 
there are frequent collisions and these useless ARP cache entries are 
constantly overwritten.

Every update in the ARP table is considered a switch configuration change and 
triggers the revalidation of all existing datapath flow cache entries, an 
expensive operation. The result is constant high ovs-vswitchd CPU load and even 
data plane disturbances. In the worst case the correct ARP entry for the tunnel 
nexthop can be overwritten, resulting in lost tunnel packets.

This patch addresses the problem by: 

* Moving the tnl_neigh_snoop() call to the only place where ARP snooping makes 
sense: at output to the bridge's LOCAL port, as this is the only supported port 
for tunnel endpoints today.

* Checking that the ARP reply is destined to one of the known IP addresses 
assigned to the bridge's LOCAL port before snooping.

* As the actual tunnel endpoint is typically a LOCAL port of a NORMAL (non-SDN 
controlled) underlay bridge (e.g. br-prv), the VLAN filtering is implicitly 
done by the NORMAL action, so that only packets in the LOCAL port's VLAN are 
seen by the output function.

* On the SDN-controlled bridges that handle tenant networking ("br-int"), the 
LOCAL port does not participate in traffic. No tenant packets will be seen by 
the tunnel neighbor snooping, so that frequent collisions are totally avoided.

I hope that clarifies the big picture. I leave the more technical aspects of 
the data structures to Zoltan.

@Zoltan: Can you include this explanation into the commit message make sure the 
code is commented as well in the critical places?

Regards, Jan

> I don't understand yet the motivation here.  What does it mean "to keep
> ARP neighbor cache clean"?  Is this a bug fix, a performance fix, a
> cleanup, or something else?
> 
> What is the goal of the filtering you mention?  Is that a second change,
> that can and should be a separate patch, or is it closely coupled to the
> first change you mention?
> 
> This gives me the following Clang errors:
> 
> ../ofproto/ofproto-dpif-xlate.c:3749:13: error: cast from 'const uint8_t 
> *' (aka 'const unsigned char *') to 'const struct in6_addr *'
> increases required alignment from 1 to 4 [-Werror,-Wcast-align]
> /usr/include/netinet/in.h:454:36: note: expanded from macro 
> 'IN6_ARE_ADDR_EQUAL'
> ../ofproto/ofproto-dpif-xlate.c:3749:13: error: cast from 'const uint8_t 
> *' (aka 'const unsigned char *') to 'const struct in6_addr *'
> increases required alignment from 1 to 4 [-Werror,-Wcast-align]
> /usr/include/netinet/in.h:455:36: note: expanded from macro 
> 'IN6_ARE_ADDR_EQUAL'
> 
> The change to tnl-neigh-cache.c that just adds an #include directive
> seems odd to me.
> 
> The treatment of the new xbridge_addr is different from the treatment of
> everything already handled in xlate_ofproto_set().  Why does it need
> special treatment?  Is it RCU safe?  Why does it need a refcount (none
> of the other structs do)?
> 
> Thanks,
> 
> Ben.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and VS2017

2018-01-05 Thread aserdean

Sure. Do you mind if I add your acked-by on each one?

Thanks,
Alin.

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Thursday, January 4, 2018 11:37 PM
> To: Alin Gabriel Serdean 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and
> VS2017
> 
> On Mon, Nov 06, 2017 at 02:07:15PM -0800, Ben Pfaff wrote:
> > On Mon, Nov 06, 2017 at 05:51:54PM +0200, Alin Gabriel Serdean wrote:
> > > This series includes fixes that were found while compiling using WDK
> > > 1709 and Visual Studio 2017.
> > >
> > > Alin Gabriel Serdean (3):
> > >   datapath-windows: Add directory to .gitignore
> > >   datapath-windows: Change include type in Iphelper.h
> > >   datapath-windows: Add include directory to ovsext project
> >
> > All of these seem reasonable to me.
> 
> Do you want to apply these?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] MSI: Use platform specific netcfg location

2018-01-05 Thread aserdean

Thanks for the review. Applied on master.

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Thursday, January 4, 2018 11:36 PM
> To: Alin Gabriel Serdean 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] MSI: Use platform specific netcfg location
> 
> On Mon, Dec 11, 2017 at 08:22:27PM +0200, Alin Gabriel Serdean wrote:
> > We use the command `netcfg` to install the Windows datapath.
> >
> > Since we have both 32 and 64 bit installers available point it to the
> > platform specific binary.
> >
> > Found while testing.
> >
> > Signed-off-by: Alin Gabriel Serdean 
> 
> Acked-by: Ben Pfaff 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] xlate: fix xport lookup for recirc

2018-01-05 Thread Jan Scheurich

Hi Zoltan,

As far as I can see the frozen state already contains the ofproto UUID and in 
frozen_metadata the OF port ID. Together these should be sufficient to look up 
the xport in xlate_lookup_ofproto_():

First look up the xbridge through xbridge_lookup_by_uuid() and afterwards scan 
the xports hmap of the xbridge for the OF port ID as done in static function 
get_ofp_port(xbridge, ofp_port).

Thus, there should be no need to add a UUID to the xport and store that in 
frozen_state.

Please also consider and deal with the other bug we found in xlate_actions:

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index d94e9dc..bfa3acd 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -7037,11 +7037,28 @@ xlate_actions(struct xlate_in *xin, struct xlate_out 
*xout)
 }
 ctx.wc->masks.tunnel.metadata.tab = flow->tunnel.metadata.tab;

+/* FIXME: The line below looks up the ingress xport from the in_port
+ * in base_flow, which is populated with the initial OF port
+ * determined early in the upcall from the ODP port in the dp_packet's
+ * flow. In the case of recirculation in a subsequent bridge (think patch
+ * port or tunnel port) the ODP port is a port in the initial bridge,
+ * and its OF port number has no meaning in the current bridge. In the
+ * best case there is a miss, in the worst case the base_flow.in_port
+ * matches a bogus port in the current bridge. */
+
 /* Get the proximate input port of the packet.  (If xin->frozen_state,
  * flow->in_port is the ultimate input port of the packet.) */
 struct xport *in_port = get_ofp_port(xbridge,
  ctx.base_flow.in_port.ofp_port);

+/* It would be more correct to look up the xport from
+ * ctx.in.flow.in_port, which after recirculation has already been set
+ * with the thawed in_port in frozen_metadata_to_flow() above.
+ * Only in the case of bond recirculation there will be no valid in_port
+ * in the static recirculation context. But perhaps this is not a real
+ * problem as the output to bond action is typically the last action
+ * and the in_port won't matter anymore. */
+
 if (flow->packet_type != htonl(PT_ETH) && in_port &&
 in_port->pt_mode == NETDEV_PT_LEGACY_L3 && ctx.table_id == 0) {
 /* Add dummy Ethernet header to non-L2 packet if it's coming from a

BR, Jan


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Zoltán Balogh
> Sent: Friday, 05 January, 2018 12:27
> To: Ben Pfaff 
> Cc: 'd...@openvswitch.org' 
> Subject: Re: [ovs-dev] [PATCH] xlate: fix xport lookup for recirc
> 
> > On Thu, Dec 21, 2017 at 02:22:43PM +, Zoltán Balogh wrote:
> > > Xlate_lookup and xlate_lookup_ofproto_() provides in_port and ofproto
> > > based on xport determined using flow, which is extracted from packet.
> > > The lookup can happen due to recirculation as well. It can happen, that
> > > packet_type has been modified during xlate before recirculation is
> > > triggered, so the lookup fails or delivers wrong xport.
> > > This can be worked around by propagating xport to ctx->xin after the very
> > > first lookup and store it in frozen state of the recirculation.
> > > So, when lookup is performed due to recirculation, the xport can be
> > > retrieved from the frozen state.
> > >
> > > Signed-off-by: Zoltan Balogh 
> > > CC: Jan Scheurich 
> >
> > Thanks for working on finding and fixing bugs.
> >
> > Storing a pointer to an xport, then checking later that it points to a
> > valid xport, is risky because it opens up the possibility that the
> > pointer points to a different xport that just happens to have the same
> > address.  It's hard to guess how likely that coincidence is, but it
> > would be better to avoid it.  This is the reason that frozen_state uses
> > a random UUID to locate "ofprotos".  Probably, that approach would be
> > better for xports too.
> 
> I see, thank you for the explanation. I'm going to create a new patch using
> xport UUIDs.
> 
> >
> > When xport_in is nonnull but invalid, wouldn't it be better to abort
> > than to search for an xport the fallback way?
> 
> Yes, that would be better. Especially if UUID is used to get the xport.
> 
> >
> > What is the reason for the special case for patch ports?
> 
> I've performed some tests before created the patch. I had a setup with two
> bridges connected with patch ports:
> 
>   p1 +---+ +---+ p2
>   ->-o   | |   o->-
>  |  br1  o-o  br2  |
>  +---+ +---+
> 
> Recirculation can occur in the 2nd bridge as well, for instance when pop_mpls
> action is followed by resubmit. In this case a new upcall is performed and
> xlate_lookup_ofproto_() and xport_lookup() are invoked again. As I observed,
>

Re: [ovs-dev] [PATCH v3 0/6] Add minimum network namespace support.

2018-01-05 Thread Flavio Leitner


Hi,

I know this change is not easy to review, but at the same time
we are getting close to the release date.  I am afraid that
delaying the acceptance will reduce its exposure and testing.

Please let me know the questions you might have to help you
reviewing this patchset.

Thanks,
fbl



On Thu, Dec 07, 2017 at 12:22:18AM -0200, Flavio Leitner wrote:
> Today Open vSwitch doesn't know about network namespaces (netns), but
> users are moving internal ports to other namespaces.  Although packets
> are still flowing, the daemon fails to find out basic port information,
> like if it is UP or DOWN, for instance.
> 
> This patchset rely on a new kernel vport API recently accepted to find
> out the new network namespace ID of a bridge's port. This information
> along with the port's name recorded in the database is used to match the
> corresponding netlink messages.
> 
> This patchset also leverages another kernel API that allows the daemon
> to listen to all netlink messages from all netns which has an ID assigned
> into it.  This and the previous change allows the userspace to track ports
> in other network namespaces.
> 
> If any of the APIs aren't available, it falls back to the older APIs to
> not break backwards compatibility.
> 
> 
> ChangeLog:
> 
> * V3:
>   - Fixed long line (Greg)
>   - Rewrote assuming that the kernel will not send negative
> numbers as valid network namespace id. (Ben, Flavio, Jiri)
> 
> * V2:
>   - report and close unexpected file descriptors (Ben)
> 
> Flavio Leitner (6):
>   netlink: provide network namespace id from a msg.
>   netnsid: update device only if netnsid matches.
>   netdev-linux: use netlink to update netdev.
>   netlink linux: enable listening to all nsids
>   nlmon: added netns support.
>   netdev-linux: fail ops not supporting remote netns.
> 
>  configure.ac  |   3 +-
>  datapath/linux/compat/include/linux/openvswitch.h |   2 +
>  lib/automake.mk   |   1 +
>  lib/daemon-unix.c |   3 +-
>  lib/daemon.man|   6 +-
>  lib/daemon.xml|   8 +-
>  lib/dpif-netlink.c|  14 +-
>  lib/dpif-netlink.h|   1 +
>  lib/netdev-linux.c| 312 
> --
>  lib/netlink-notifier.c|   2 +-
>  lib/netlink-protocol.h|   6 +
>  lib/netlink-socket.c  |  80 +-
>  lib/netlink-socket.h  |   4 +-
>  lib/netnsid.h | 139 ++
>  tests/ofproto-macros.at   |   1 +
>  tests/ovn-controller-vtep.at  |   1 +
>  utilities/nlmon.c |  10 +-
>  17 files changed, 549 insertions(+), 44 deletions(-)
>  create mode 100644 lib/netnsid.h
> 
> -- 
> 2.14.3
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

-- 
Flavio

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 1/3] ofproto-dpif-sflow: propagate actions within clone

2018-01-05 Thread Zoltán Balogh

> On Thu, Dec 14, 2017 at 09:56:19AM +0100, Zoltan Balogh wrote:
> > By avoiding Tx recirculation and embracing tnl_push action within clone,
> > the tunnel metadata is not propagated. Unless clone action is handled in
> > the dpif_sflow_read_actions() function as well. This commit resolves
> > this issue.
> > In addition, some sflow data needs to be stored and restored in
> > ofproto-dpif-xlate when native_tunnel_output() is invoked. Otherwise the
> > output action of underlay bridge is getting counted as well if sFlow is
> > set on the overlay bridge.
> >
> > Signed-off-by: Zoltan Balogh 
> 
> Thanks for working on bug fixes.
> 
> I don't yet understand the first problem that you mention.  Could you
> give an example of something that does not work properly without the
> patch, and how the patch fixes it?

Hi Ben,

As a performance improvement, an "avoid Tx recirculaion" patch was introduced 
in OVS 2.8:

commit 7c12dfc527a5f6e35eb47494d6284d5a7dbc352c
tunneling: Avoid datapath-recirc by combining recirc actions at xlate.

With this patch, instead of having two datapath flows (one for over- and one
for underlay bridge) there is only a single combined flow in datapath when
sending packet to native tunnel.
This single flow has a clone() action, which holds the actions from the 
underlay bridge.

Now, when configuring sflow in the overlay bridge, tunnel metadata is not
processed by OVS when reading sflow, because push tunnel action is embedded 
within the clone action. This is fixed by the patch.

> 
> It sounds like this patch addresses two independent bugs.  Could it be
> broken into two patches?

The second bug is connected to the "avoid Tx recirculation" patch as well.
Without fixing this second bug, the number of output ports provided by sflow
is not correct. 

Both are side-effects of the performance improvement patch mentioned above and
both affect sflow handling. That's why I put them into a single patch. 
Furthermore, fixing the one would introduce modifications which should be
removed when fixing the other one later.
By looking at the patch again, I see there these leftover modifications... 
I need to fix it.

Best regards,
Zoltan

> 
> Thanks,
> 
> Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] xlate: fix xport lookup for recirc

2018-01-05 Thread Zoltán Balogh

> On Thu, Dec 21, 2017 at 02:22:43PM +, Zoltán Balogh wrote:
> > Xlate_lookup and xlate_lookup_ofproto_() provides in_port and ofproto
> > based on xport determined using flow, which is extracted from packet.
> > The lookup can happen due to recirculation as well. It can happen, that
> > packet_type has been modified during xlate before recirculation is
> > triggered, so the lookup fails or delivers wrong xport.
> > This can be worked around by propagating xport to ctx->xin after the very
> > first lookup and store it in frozen state of the recirculation.
> > So, when lookup is performed due to recirculation, the xport can be
> > retrieved from the frozen state.
> >
> > Signed-off-by: Zoltan Balogh 
> > CC: Jan Scheurich 
> 
> Thanks for working on finding and fixing bugs.
> 
> Storing a pointer to an xport, then checking later that it points to a
> valid xport, is risky because it opens up the possibility that the
> pointer points to a different xport that just happens to have the same
> address.  It's hard to guess how likely that coincidence is, but it
> would be better to avoid it.  This is the reason that frozen_state uses
> a random UUID to locate "ofprotos".  Probably, that approach would be
> better for xports too.

I see, thank you for the explanation. I'm going to create a new patch using
xport UUIDs.

> 
> When xport_in is nonnull but invalid, wouldn't it be better to abort
> than to search for an xport the fallback way? 

Yes, that would be better. Especially if UUID is used to get the xport.

> 
> What is the reason for the special case for patch ports?

I've performed some tests before created the patch. I had a setup with two
bridges connected with patch ports:

  p1 +---+ +---+ p2
  ->-o   | |   o->-
 |  br1  o-o  br2  |
 +---+ +---+

Recirculation can occur in the 2nd bridge as well, for instance when pop_mpls
action is followed by resubmit. In this case a new upcall is performed and
xlate_lookup_ofproto_() and xport_lookup() are invoked again. As I observed,
xport_lookup() returns xport referring to p1 in br1, i.e. the very first port
where the packet was received on. I assume, this is not a bug in OVS, is it?

So, I would like to store this very first port in the frozen state. That's the
reason why I do store xport in ctx.xin->xport_in only if xport is not a patch
port. If it's not a patch port, it can be only the very first input port.

> 
> Can you provide a good example of where this is important?

The main goal of this patch is to avoid invocation of xlate_lookup() in case of
recirculation (except recirc due to bond), because it can return a wrong xport. 
For instance, if L3 packet with MPLS label is received on a L3 tunnel port and
pop_mpls + resubmit actions are performed, then first packet_type is changed
due to pushing a dummy ethernet header, MPLS label is removed, then resubmit
action is processed. This triggers recirculation, where xport_lookup() fails
due to former change of packet_type as I noted in the commit message.

> 
> Thanks,
> 
> Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH V2] OF1.5/EXT-334 OXS/Extensible Flow Entry Statistics Support

2018-01-05 Thread Satyavalli Rama

Hi Jan and Ben,

Please find the inline responses.


-Ben Pfaff  wrote: -
To: Jan Scheurich 
From: Ben Pfaff 
Date: 01/05/2018 02:35AM
Cc: SatyaValli , "d...@openvswitch.org" 
, Manasa Cherukupally , 
Pavani Panthagada , Lavanya Harivelam 
, Surya Muttamsetty , 
SatyaValli 
Subject: Re: [ovs-dev] [PATCH V2] OF1.5/EXT-334 OXS/Extensible Flow Entry 
Statistics Support

On Wed, Jan 03, 2018 at 04:24:06PM +, Jan Scheurich wrote:
> > > >
> > > > This Patch provides implementation Existing flow entry statistics are
> > > > redefined as standard OXS(OpenFlow Extensible Statistics) fields for
> > > > displaying the arbitrary flow stats.The existing Flow Stats were renamed
> > > > as Flow Description.
> > > >
> > > > To support this implementation below messages are newly added
> > > >
> > > > OFPRAW_OFPT15_FLOW_REMOVED,
> > > > OFPRAW_OFPST15_FLOW_REQUEST,
> > > > OFPRAW_OFPST15_FLOW_DESC_REQUEST,
> > > > OFPRAW_OFPST15_AGGREGATE_REQUEST,
> > > > OFPRAW_OFPST15_FLOW_REPLY,
> > > > OFPRAW_OFPST15_FLOW_DESC_REPLY,
> > > > OFPRAW_OFPST15_AGGREGATE_REPLY,
> > > >
> > > > The current commit adds support for the new feature in flow statistics
> > > > multipart messages,aggregate multipart messages and OXS support for flow
> > > > removal message, individual flow description messages.
> > > >
> > > > "ovs-ofctl dump-flows" needs to be provided with the arbitrary OXS 
> > > > fields
> > > > for displaying the desired flow stats.
> > > >
> > > > Below are Commands to display OXS stats field wise
> > > >
> > > > Flow Statistics Multipart
> > > > ovs-ofctl dump-flows -O OpenFlow15  idle_time
> > > > ovs-ofctl dump-flows -O OpenFlow15  packet_count
> > > > ovs-ofctl dump-flows -O OpenFlow15  byte_count
> > >
> > > This would break backward compatibility for one of the most frequently 
> > > used OVS CLI commands. Why don't you introduce a new
> > command such as "ovs-ofctl dump-flow-stats" for the new OXS stats?
> > 
> > I think you might be misinterpreting the meaning here.  It doesn't
> > appear to break compatibility, at least not in a major way, since it
> > doesn't do a lot of updates to the tests that would otherwise be
> > required.
> 
> Perhaps I am missing the point of some of these changes. I understand that 
> OVS needs to support the new extensible OXS flow stats syntax in OpenFlow 1.5 
> and the differentiated MP request/reply pairs OFPMP_FLOW_DESC (replacing the 
> former OFPMP_FLOW) and OFPMP_FLOW_STATS (just fetching flow stats per flow 
> w/o the rest of the flow data).
> 
> But I don't understand why this should have any impact on the existing CLI 
> command "ovs-ofctl dump-flows" and its output. This tool expressly fetches 
> and displays the complete flow dump from OVS, including match, 
> instructions/actions and statistics. When using OF 1.5 it should 
> transparently apply OFPMP_FLOW_DESC MP request/reply to fetch the data, up to 
> OF 1.4 it should use the original OFPMP_FLOW.
> 
> I can't see any ovs-ofctl use case that would justify the use of the new 
> OFPMP_FLOW_STATS request/reply. The removed data in the reply compared to the 
> full flow description are mainly the instructions, the full match is still 
> there to identify each flow. So cutting down the transferred data volume can 
> hardly be the reason (Note, this may still be different for real OF 1.5 
> controllers).
> 
> If you believe we should have an ovs-ofctl command anyhow, e.g. for testing 
> purposes, I suggest to introduce a new command or add an option to dump-flows 
> to force use of this particular MP message. The output would be limited to 
> flow match and stats in that case.
> 

As per our understanding and from previous review comments we treated OF1.5+ 
has two different ways to request and get replies for Flow Stats: FLOW_DESC and 
FLOW_STATS (which will be even used for Flow Stats Trigger). And we've 
supported this with the help of two commands

OFPMP_FLOW_DESC - ovs-ofctl dump-flow-desc -O OpenFlow15
OFPMP_FLOW_STATS - ovs-ofctl dump-flows -O OpenFlow15

> The new command to dump aggregate flow stats is of course a welcome addition. 
> It should work irrespectively of the used OpenFlow version with the 
> OFPMP_AGGREGATE_STATS primitive using classic flow stats prior to OF 1.5 and 
> OXS flow stats in OF 1.5.
> 
> All in all I am NACK-ing this patch as it stands.
> 
> Regards, Jan
> 
> BTW: I have played a bit with the patch. The existing ovs-ofctl test cases 
> appear to not break because the changes described in the commit message and 
> the documentation are not effective. The legacy command format "ovs-ofctl 
> dump-flows -O OpenFlow15  []" still produces the complete flow 
> dump including instructions/actions:
> 

We cross checked and tested the patch again on the latest OVS

Re: [ovs-dev] [PATCH V2] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

2018-01-05 Thread Vishal Deep Ajmera

Thanks Ben for reviewing the patch. I have sent V3 patch after rebasing with 
master.

Warm Regards,
Vishal Ajmera

-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org] 
Sent: Friday, January 05, 2018 2:29 AM
To: Vishal Deep Ajmera 
Cc: d...@openvswitch.org; Keshav Gupta 
Subject: Re: [PATCH V2] ofproto-dpif-xlate: Incorrect handling of errors in 
group action processing

On Mon, Dec 11, 2017 at 11:54:33AM +, Vishal Deep Ajmera wrote:
> As per OpenFlow v1.3 specification, when an action list contains a 
> group action a copy of the packet is passed to the group for 
> processing by the group. This means that if there is an error 
> encountered during group processing, only the copy of packet should be 
> dropped, but subsequent actions in the action list should be executed on the 
> original packet.
> 
> Additionally, if the group type is "ALL", each action bucket of the 
> group should process a copy of the packet. If there is an error while 
> processing one bucket other buckets should still be processed.
> 
> Example 1:
> table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2
> 
> Even if any error is encountered while processing the group action, 
> the packet should still be forwarded to ports tap1 and tap2.
> 
> Example 2:
> group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(et
> h)
> 
> Even if processing the action in the second bucket fails because the 
> packet already has an Ethernet header, the other copy of the packet 
> should still be processed by the first bucket and output to port tap1.
> 
> Currently the error handling in OVS does not comply with those rules. 
> When any group bucket execution fails the error is recorded in the 
> so-called "translation context" which is global for the processing of 
> the original packet. Once an error is recorded, OVS skips processing 
> subsequent buckets and installs a drop action in the datapath even if 
> parts of the action list were previously processed successfully.
> 
> This patch clears the error flag after any bucket of a group is executed.
> This way the error encountered in processing any bucket of the group 
> will not impact other actions of action-list or other buckets of the group.
> 
> Errors which are system limits to protect translation from taking too 
> long time or too much space are not cleared. Instead drop action is 
> installed for them.
> 
> Signed-off-by: Vishal Deep Ajmera 
> Signed-off-by: Keshav Gupta 

Seems reasonable to me, but it fails to apply due to white space damage.  Can 
you resubmit a clean copy?

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] ofproto: Delete all groups and meters when (un)configuring a controller.

2018-01-05 Thread Jan Scheurich

Just one small observation below. Otherwise LGTM.

I have tested the patch and it worked for all cases. I couldn't test the case 
that the switch loses connection to a controller in stand-alone fail mode.

Acked-by: Jan Scheurich   
Tested-by: Jan Scheurich 

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Wednesday, 08 November, 2017 04:04
> To: d...@openvswitch.org
> Cc: Ben Pfaff ; Periyasamy Palanisamy 
> ; Jan Scheurich
> 
> Subject: [PATCH v2] ofproto: Delete all groups and meters when 
> (un)configuring a controller.
> 
> Open vSwitch has always deleted all flows from the flow table whenever a
> controller is configured or whenever all the controllers are unconfigured.
> After this commit, OVS additionally deletes all OpenFlow groups and meters.
> 
> Suggested-by: Periyasamy Palanisamy 
> Suggested-by: Jan Scheurich 
> Signed-off-by: Ben Pfaff 
> ---
> v1->v2: Clear the meter table too (thanks Jan).
> 
>  AUTHORS.rst  |  1 +
>  NEWS |  3 +++
>  ofproto/ofproto.c| 26 +--
>  tests/ofproto.at | 58 
> 
>  vswitchd/vswitch.xml | 15 +++---
>  5 files changed, 90 insertions(+), 13 deletions(-)
> 
> diff --git a/AUTHORS.rst b/AUTHORS.rst
> index 139e99b330d8..26f81508d3d5 100644
> --- a/AUTHORS.rst
> +++ b/AUTHORS.rst
> @@ -519,6 +519,7 @@ Pasi Kärkkäinen pa...@iki.fi
>  Patrik Andersson R  patrik.r.anders...@ericsson.com
>  Paulo Cravero   pcrav...@as2594.net
>  Pawan Shuklashuk...@vmware.com
> +Periyasamy Palanisamy   periyasamy.palanis...@ericsson.com
>  Peter Amidonpe...@picnicpark.org
>  Peter Balland   pe...@nicira.com
>  Peter Phaal peter.ph...@inmon.com
> diff --git a/NEWS b/NEWS
> index 047f34b9f402..7ef4d6728d28 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,5 +1,8 @@
>  Post-v2.8.0
>  
> +   - ovs-vswitchd:
> + * Configuring a controller, or unconfiguring all controllers, now 
> deletes
> +   all groups and meters (as well as all flows).
> - OVN:
>   * The "requested-chassis" option for a logical switch port now accepts a
> chassis "hostname" in addition to a chassis "name".
> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> index 82c2bb27d348..e762888b746e 100644
> --- a/ofproto/ofproto.c
> +++ b/ofproto/ofproto.c
> @@ -251,6 +251,8 @@ static void delete_flows__(struct rule_collection *,
> const struct openflow_mod_requester *)
>  OVS_REQUIRES(ofproto_mutex);
> 
> +static void ofproto_group_delete_all__(struct ofproto *)
> +OVS_REQUIRES(ofproto_mutex);
>  static bool ofproto_group_exists(const struct ofproto *, uint32_t group_id);
>  static void handle_openflow(struct ofconn *, const struct ofpbuf *);
>  static enum ofperr ofproto_flow_mod_init(struct ofproto *,
> @@ -1566,6 +1568,8 @@ ofproto_flush__(struct ofproto *ofproto)
>  }
>  delete_flows__(, OFPRR_DELETE, NULL);
>  }
> +ofproto_group_delete_all__(ofproto);
> +meter_delete_all(ofproto);

When the ofproto instance is destroyed meter_delete_all() appears to be now 
invoked twice:
first from ofproto_destroy() directly and then a second time from here. It 
doesn't seem to do any harm,
but I guess it would be cleaner to remove the meter_delete_all() call from 
ofproto_destroy().

It seems that the hmap_destroy(>meters) call in ofproto_destroy() should 
also better be moved to the rcu-deferred ofproto_destroy__() function.

>  /* XXX: Concurrent handler threads may insert new learned flows based on
>   * learn actions of the now deleted flows right after we release
>   * 'ofproto_mutex'. */
> @@ -7348,20 +7352,30 @@ ofproto_group_mod_finish(struct ofproto *ofproto,
>   *
>   * This is intended for use within an ofproto provider's 'destruct'
>   * function. */
> -void
> -ofproto_group_delete_all(struct ofproto *ofproto)
> -OVS_EXCLUDED(ofproto_mutex)
> +static void
> +ofproto_group_delete_all__(struct ofproto *ofproto)
> +OVS_REQUIRES(ofproto_mutex)
>  {
>  struct ofproto_group_mod ogm;
> -
>  ogm.gm.command = OFPGC11_DELETE;
>  ogm.gm.group_id = OFPG_ALL;
> -
> -ovs_mutex_lock(_mutex);
>  ogm.version = ofproto->tables_version + 1;
> +
>  ofproto_group_mod_start(ofproto, );
>  ofproto_bump_tables_version(ofproto);
>  ofproto_group_mod_finish(ofproto, , NULL);
> +}
> +
> +/* Delete all groups from 'ofproto'.
> + *
> + * This is intended for use within an ofproto provider's 'destruct'
> + * function. */
> +void
> +ofproto_group_delete_all(struct ofproto *ofproto)
> +OVS_EXCLUDED(ofproto_mutex)
> +{
> +

[ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-05 Thread Daniel Alvarez

When there are two ACLs in a Logical Switch with same direction,
priority, match and action fields, ovn-northd will generate the
exact same logical flow for them into SB database. This will make
ovn-controller log messages (INFO) saying that the duplicate flow
is going to be dropped.

This patch avoids adding duplicate lflows into SB database so that
ovn-controller doesn't have to process them.

Signed-off-by: Daniel Alvarez 
---

This patch is needed as part of the consistency work we're doing in the
OpenStack integration [0]. In our effort to ensure consistency across
objects in Neutron and OVN databases we find some special cases like
security group rules which match OVN ACLs but not in 1:1 relationship.
Until now, two identical security group rules beloning each to a
different security group would generate a single ACL in NB database.
With this behavior, there's no way to map the ACL in OVN to the
corresponding Neutron object.

By implementing [0] we're trying to ensure this mapping so we make use
of the external_ids column of every table for this purpose. It may happen
that we'll have two identical ACLs but each referencing a different
Neutron object in their external_ids field. However, this will make
ovn-northd to generate two duplicate lflows into SB database which will
make ovn-controller drop them when installing the actual flows. With this
patch we'll avoid duplicate flows to be inserted in SB database in such
cases.

[0] 
https://docs.openstack.org/networking-ovn/latest/contributor/design/database_consistency.html

 ovn/northd/ovn-northd.c | 11 +++
 tests/ovn-northd.at | 24 
 2 files changed, 35 insertions(+)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 7e6b1d9..cc64861 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -428,6 +428,13 @@ struct macam_node {
 struct eth_addr mac_addr; /* Allocated MAC address. */
 };
 
+static struct ovn_lflow *ovn_lflow_find(struct hmap *lflows,
+struct ovn_datapath *od,
+enum ovn_stage stage,
+uint16_t priority,
+const char *match,
+const char *actions);
+
 static void
 cleanup_macam(struct hmap *macam)
 {
@@ -2298,6 +2305,10 @@ ovn_lflow_add_at(struct hmap *lflow_map, struct 
ovn_datapath *od,
  const char *stage_hint, const char *where)
 {
 ovs_assert(ovn_stage_to_datapath_type(stage) == ovn_datapath_get_type(od));
+   
+if (ovn_lflow_find(lflow_map, od, stage, priority, match, actions)) {
+return;
+}
 
 struct ovn_lflow *lflow = xmalloc(sizeof *lflow);
 ovn_lflow_init(lflow, od, stage, priority,
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 954e259..ba96c81 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -152,3 +152,27 @@ ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
 AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
 
 AT_CLEANUP
+
+AT_SETUP([ovn -- check that duplicate acls don't generate duplicate lflows])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+ovn-nbctl ls-add S1
+
+# Insert a duplicate ACL into NB database.
+ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
+match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
+ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
+match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
+
+# Check that there are two entries in ACL table in NB database.
+AT_CHECK([ovn-nbctl find ACL match='"tcp.dst == 22"' | \
+grep _uuid | wc -l], [0], [2
+])
+
+# Now make sure that only one logical flow is added to SB database.
+AT_CHECK([ovn-sbctl find Logical_Flow match='"tcp.dst == 22"' | \
+grep _uuid | wc -l], [0], [1
+])
+
+AT_CLEANUP
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH V3] ofproto-dpif-xlate: Incorrect handling of errors in group action processing

2018-01-05 Thread Vishal Deep Ajmera

As per OpenFlow v1.3 specification, when an action list contains a group
action a copy of the packet is passed to the group for processing by the
group. This means that if there is an error encountered during group
processing, only the copy of packet should be dropped, but subsequent
actions in the action list should be executed on the original packet.

Additionally, if the group type is "ALL", each action bucket of the group
should process a copy of the packet. If there is an error while processing
one bucket other buckets should still be processed.

Example 1:
table=0,in_port=tap0 actions=output:tap1,group:10,output:tap2

Even if any error is encountered while processing the group action, the
packet should still be forwarded to ports tap1 and tap2.

Example 2:
group_id=1,type=all,bucket=actions=output:tap1,bucket=actions=encap(eth)

Even if processing the action in the second bucket fails because the
packet already has an Ethernet header, the other copy of the packet should
still be processed by the first bucket and output to port tap1.

Currently the error handling in OVS does not comply with those rules. When
any group bucket execution fails the error is recorded in the so-called
"translation context" which is global for the processing of the original
packet. Once an error is recorded, OVS skips processing subsequent buckets
and installs a drop action in the datapath even if parts of the action list
were previously processed successfully.

This patch clears the error flag after any bucket of a group is executed.
This way the error encountered in processing any bucket of the group will
not impact other actions of action-list or other buckets of the group.

Errors which are system limits to protect translation from taking too long
time or too much space are not cleared. Instead drop action is installed
for them.

Signed-off-by: Vishal Deep Ajmera 
vishal.deep.ajm...@ericsson.com
Signed-off-by: Keshav Gupta 
keshav.gu...@ericsson.com
---
ofproto/ofproto-dpif-xlate.c | 16 
1 file changed, 16 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index d94e9dc..eb8021a 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4066,6 +4066,22 @@ xlate_group_bucket(struct xlate_ctx *ctx, struct 
ofputil_bucket *bucket,
  * the group bucket freezes translation, the actions after the group action
  * must continue processing with the original, not the frozen packet! */
ctx->exit = false;
+
+/* Context error in a bucket should not impact processing of other buckets
+ * or actions. This is similar to cloning a packet for group buckets.
+ * There is no need to restore the error back to old value due to the fact
+ * that we actually processed group action which can happen only when there
+ * is no previous context error.
+ *
+ * Exception to above is errors which are system limits to protect
+ * translation from running too long or occupy too much space. These errors
+ * should not be masked. XLATE_RECURSION_TOO_DEEP, XLATE_TOO_MANY_RESUBMITS
+ * and XLATE_STACK_TOO_DEEP fall in this category. */
+if (ctx->error == XLATE_TOO_MANY_MPLS_LABELS ||
+ctx->error == XLATE_UNSUPPORTED_PACKET_TYPE) {
+/* reset the error and continue processing other buckets */
+ctx->error = XLATE_OK;
+}
}

static void
--
1.9.1
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] is there any performance consideration for max emc cache numbers and megaflow cache numbers?

2018-01-05 Thread ychen

Hi:
in ovs code,
MAX_FLOWS = 65536  // for megaflow
#define EM_FLOW_HASH_SHIFT 13
#define EM_FLOW_HASH_ENTRIES (1u << EM_FLOW_HASH_SHIFT)   // for emc cache


so why choose 65536 and 8192? is there any performance consideration? can I 
just larger these numbers to make packet only lookup emc cache and megaflow 
cache?


another question:
is there any document/data for packet thuoughput in netdev dpdk mode with only 
emc cache/megaflow cache or with only userspace flow lookup?


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

53 matches

Mail list logo