Re: [ovs-dev] Request to tag 2.7 branch with v2.7.1

2017-06-23 Thread Justin Pettit

> On Jun 24, 2017, at 2:12 AM, Joe Stringer  wrote:
> 
>> On 23 June 2017 at 05:38, Russell Bryant  wrote:
>>> On Thu, Jun 8, 2017 at 12:12 PM, Ben Pfaff  wrote:
 On Thu, Jun 08, 2017 at 04:36:21AM +0530, Numan Siddique wrote:
 Is it possible to tag the 2.7 branch with v2.7.1. Recently there were many
 back ports to 2.7 branch. OVS 2.7 is required for RDO [1] and the RDO
 community is expecting a new tag to build the latest OVS 2.7 branch.
>>> 
>>> It's fine with me if we release 2.7.1.  Justin?
>> 
>> Hi all, just wanted to check in on 2.7.1.  Last week Justin said there
>> may be some more fixes from Joe to get in first.  Is anything still
>> pending?  I also know some folks have been traveling this week.
> 
> I had been thinking about this patch:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/334446.html
> 
> However, it is in a delicate area of the code and no-one other than me
> has had a chance to look it over yet. Therefore, I don't think it
> should block the 2.7.1 release.

Ben and I are in China right now, and I owe Ben a review on one patch. I will 
get that done on Monday, and then I'll get the release process kicked off. 

--Justin


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch_v2 1/4] Userspace Datapath: Add ALG infra and FTP.

2017-06-23 Thread Darrell Ball


On 6/23/17, 4:08 PM, "ovs-dev-boun...@openvswitch.org on behalf of Joe 
Stringer"  wrote:

On 17 June 2017 at 15:53, Darrell Ball  wrote:
> ALG infra and FTP (both V4 and V6) support is added to the userspace
> datapath.  Also, NAT support is included.
>
> Signed-off-by: Darrell Ball 
> ---

Hi Darrell, thanks for the patch.

I wasn't able to test this out, because my build and test environments
depend on sparse correctly running and it wouldn't compile. Please
consider setting it up, or using Travis-CI to validate builds before
v3.

I see Travis found a sparse warning; strange, I always look for these
and still see none, I even changed the line you saw flagged
> +return v4_addr >> (index * 8) & 0xff;
and still no warning :-(. I usually even get extra spurious warnings.
However, from simple inspection right now, I would expect a “restricted type” 
warning; it is not a real problem, but, of course I’ll fix it up



I didn't do a thorough review, but I scanned through and pointed out a
couple of parts I have questions about.

>  lib/conntrack-private.h |   17 +
>  lib/conntrack.c | 1016 
+++
>  lib/conntrack.h |4 +-
>  3 files changed, 957 insertions(+), 80 deletions(-)
>
> diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> index 55084d3..993af33 100644
> --- a/lib/conntrack-private.h
> +++ b/lib/conntrack-private.h
> @@ -62,17 +62,34 @@ struct nat_conn_key_node {
>  struct conn_key value;
>  };
>
> +struct alg_exp_node {
> +struct hmap_node node;
> +struct ovs_list exp_node;
> +long long expiration;
> +struct conn_key key;
> +struct conn_key master_key;
> +struct ct_addr alg_nat_repl_addr;
> +ovs_u128 master_label;
> +uint32_t master_mark;
> +};
> +
>  struct conn {
>  struct conn_key key;
>  struct conn_key rev_key;
> +/* Only used for orig_tuple support. */
> +struct conn_key master_key;

I guess no-one's looked at this structure from a cacheline
perspective, but if we think it's worth optimizing then perhaps a
followup patch should rearrange these. You might consider starting
here by just placing all of the ALG-related fields together at the
end.

I intend some optimizations later, but it is outside the scope of this patch.  
I placed the fields presently based on similar function for one field - 
master_conn
and the other fields based on padding considerations.
I leave as is for the time being and revisit later.

>  long long expiration;
>  struct ovs_list exp_node;
>  struct hmap_node node;
>  ovs_u128 label;
>  /* XXX: consider flattening. */
>  struct nat_action_info_t *nat_info;
> +char *alg;
> +int seq_skew;
>  uint32_t mark;
>  uint8_t conn_type;
> +uint8_t seq_skew_dir;
> +uint8_t alg_related;
>  };
>
>  enum ct_update_res {
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 90b154a..1f54fe3 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2015, 2016 Nicira, Inc.
> + * Copyright (c) 2015, 2016, 2017 Nicira, Inc.

All three files could get an update like this.

Yep, was on my todo list.

>   *
>   * Licensed under the Apache License, Version 2.0 (the "License");
>   * you may not use this file except in compliance with the License.
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Alphabetic order in each section of #includes?

Yep, I had read that rule.

>
>  #include "bitmap.h"
>  #include "conntrack-private.h"
> @@ -39,7 +40,6 @@
>  #include "random.h"
>  #include "timeval.h"
>
> -
>  VLOG_DEFINE_THIS_MODULE(conntrack);
>
>  COVERAGE_DEFINE(conntrack_full);
> @@ -50,9 +50,21 @@ struct conn_lookup_ctx {
>  struct conn *conn;
>  uint32_t hash;
>  bool reply;
> +/* XXX: Only used by ICMP; change name. */

Please consider sending a separate patch to change the name.

I put the reminder here to do later, but it is relevant to this series.
Maybe, I’ll just fold in.



> @@ -179,14 +241,20 @@ conntrack_destroy(struct conntrack *ct)
>  ct_lock_unlock(&ctb->lock);
>  ct_lock_destroy(&ctb->lock);
>  }
> -ct_rwlock_wrlock(&ct->nat_resources_lock);
> +ct_rwlock_wrlock(&ct->resources_lock);
>  struct nat_conn_key_node *nat_conn_key_node;
>  HMAP_FOR_EACH_POP (nat_conn_key_node, node, &ct->nat_conn_keys) {
>  free(nat_conn_key_node);
>  }
>  hmap_destroy(&ct->nat_conn_keys);
> -ct_rwlock_unloc

Re: [ovs-dev] [patch_v2 1/4] Userspace Datapath: Add ALG infra and FTP.

2017-06-23 Thread Joe Stringer
On 17 June 2017 at 15:53, Darrell Ball  wrote:
> ALG infra and FTP (both V4 and V6) support is added to the userspace
> datapath.  Also, NAT support is included.
>
> Signed-off-by: Darrell Ball 
> ---

Hi Darrell, thanks for the patch.

I wasn't able to test this out, because my build and test environments
depend on sparse correctly running and it wouldn't compile. Please
consider setting it up, or using Travis-CI to validate builds before
v3.

I didn't do a thorough review, but I scanned through and pointed out a
couple of parts I have questions about.

>  lib/conntrack-private.h |   17 +
>  lib/conntrack.c | 1016 
> +++
>  lib/conntrack.h |4 +-
>  3 files changed, 957 insertions(+), 80 deletions(-)
>
> diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> index 55084d3..993af33 100644
> --- a/lib/conntrack-private.h
> +++ b/lib/conntrack-private.h
> @@ -62,17 +62,34 @@ struct nat_conn_key_node {
>  struct conn_key value;
>  };
>
> +struct alg_exp_node {
> +struct hmap_node node;
> +struct ovs_list exp_node;
> +long long expiration;
> +struct conn_key key;
> +struct conn_key master_key;
> +struct ct_addr alg_nat_repl_addr;
> +ovs_u128 master_label;
> +uint32_t master_mark;
> +};
> +
>  struct conn {
>  struct conn_key key;
>  struct conn_key rev_key;
> +/* Only used for orig_tuple support. */
> +struct conn_key master_key;

I guess no-one's looked at this structure from a cacheline
perspective, but if we think it's worth optimizing then perhaps a
followup patch should rearrange these. You might consider starting
here by just placing all of the ALG-related fields together at the
end.

>  long long expiration;
>  struct ovs_list exp_node;
>  struct hmap_node node;
>  ovs_u128 label;
>  /* XXX: consider flattening. */
>  struct nat_action_info_t *nat_info;
> +char *alg;
> +int seq_skew;
>  uint32_t mark;
>  uint8_t conn_type;
> +uint8_t seq_skew_dir;
> +uint8_t alg_related;
>  };
>
>  enum ct_update_res {
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 90b154a..1f54fe3 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2015, 2016 Nicira, Inc.
> + * Copyright (c) 2015, 2016, 2017 Nicira, Inc.

All three files could get an update like this.

>   *
>   * Licensed under the Apache License, Version 2.0 (the "License");
>   * you may not use this file except in compliance with the License.
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Alphabetic order in each section of #includes?

>
>  #include "bitmap.h"
>  #include "conntrack-private.h"
> @@ -39,7 +40,6 @@
>  #include "random.h"
>  #include "timeval.h"
>
> -
>  VLOG_DEFINE_THIS_MODULE(conntrack);
>
>  COVERAGE_DEFINE(conntrack_full);
> @@ -50,9 +50,21 @@ struct conn_lookup_ctx {
>  struct conn *conn;
>  uint32_t hash;
>  bool reply;
> +/* XXX: Only used by ICMP; change name. */

Please consider sending a separate patch to change the name.



> @@ -179,14 +241,20 @@ conntrack_destroy(struct conntrack *ct)
>  ct_lock_unlock(&ctb->lock);
>  ct_lock_destroy(&ctb->lock);
>  }
> -ct_rwlock_wrlock(&ct->nat_resources_lock);
> +ct_rwlock_wrlock(&ct->resources_lock);
>  struct nat_conn_key_node *nat_conn_key_node;
>  HMAP_FOR_EACH_POP (nat_conn_key_node, node, &ct->nat_conn_keys) {
>  free(nat_conn_key_node);
>  }
>  hmap_destroy(&ct->nat_conn_keys);
> -ct_rwlock_unlock(&ct->nat_resources_lock);
> -ct_rwlock_destroy(&ct->nat_resources_lock);
> +
> +struct alg_exp_node *alg_exp_node;
> +HMAP_FOR_EACH_POP (alg_exp_node, node, &ct->alg_expectations) {
> +free(alg_exp_node);
> +}
> +hmap_destroy(&ct->alg_expectations);

You might also consider poisoning ct->alg_exp_list.



> @@ -363,8 +469,8 @@ reverse_nat_packet(struct dp_packet *pkt, const struct 
> conn *conn)
>  struct ip_header *nh = dp_packet_l3(pkt);
>  struct icmp_header *icmp = dp_packet_l4(pkt);
>  struct ip_header *inner_l3 = (struct ip_header *) (icmp + 1);
> -extract_l3_ipv4(&inner_key, inner_l3, tail - ((char *)inner_l3)
> --pad, &inner_l4, false);
> +extract_l3_ipv4(&inner_key, inner_l3, tail - ((char *)inner_l3) -pad,
> +&inner_l4, false);

I guess this is just some automatic argument formatting that you
performed incidentally, but if it's getting changed, then usually we
put a space around operators such as '-'.



> @@ -554,34 +681,50 @@ conn_not_found(struct conntrack *ct, struct dp_packet 
> *pkt,
>  nc->rev_key = nc->key;
>  conn_key_reverse(&nc->rev_key);
>
> +if (helper) {
> +nc->alg = xstrdup(helper);
> +}
> +
> +if (alg_exp) {
> +nc->alg_related = true;
> +nc->mark = alg_exp->master_mar

Re: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet fields.

2017-06-23 Thread Fischetti, Antonio
Hi Billy, thanks for your review.
Replies inline.

/Antonio

> -Original Message-
> From: O Mahony, Billy
> Sent: Friday, June 23, 2017 2:27 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> fields.
> 
> Hi Antonio,
> 
> I'm not sure that this approach will work. Mainly the memcpy will not take
> account of any padding that the compiler may insert and the struct.  Maybe
> if the struct was defined as packed it could fix this objection.

[AF] This patch is somewhat similar to the patch for initializing 
the pkt_metadata struct in pkt_metadata_init() at
http://patchwork.ozlabs.org/patch/779696/


> 
> Also anyone editing the structure in future would have to be aware that
> these elements need to be kept contiguous in order for packet_clone to
> keep working.

[AF] Agree, I should add a comment locally to the struct definition. 

> 
> Or if the relevant fields here were all placed in the their own nested
> struct then sizeof that nested struct could be used in the memcpy call as
> the sizeof nested_struct would account for whatever padding the compiler
> inserted.
> 
> Does this change give much of a performance increase?

I tested this while working on connection tracker. I was using this particular 
set
of flows - see 4th line - with 
"action=ct(table=1),NORMAL";
to trigger a call to dp_packet_clone_with_headroom():

ovs-ofctl del-flows br0;
ovs-ofctl add-flow br0 table=0,priority=1,action=drop;
ovs-ofctl add-flow br0 table=0,priority=10,arp,action=normal;
ovs-ofctl add-flow br0 
table=0,priority=100,ip,ct_state=-trk,"action=ct(table=1),NORMAL";
ovs-ofctl add-flow br0 
table=1,in_port=1,ip,ct_state=+trk+new,"action=ct(commit),2";
ovs-ofctl add-flow br0 table=1,in_port=1,ip,ct_state=+trk+est,"action=2";
ovs-ofctl add-flow br0 table=1,in_port=2,ip,ct_state=+trk+new,"action=drop";
ovs-ofctl add-flow br0 table=1,in_port=2,ip,ct_state=+trk+est,"action=1";

After running a Hotspot analysis with VTune for 60 secs I had in the original 
that dp_packet_clone_with_headroom was ranked at the 2nd place:
Function   CPU Time
-+---
__libc_malloc5.880s
dp_packet_clone_with_headroom4.530s
emc_lookup   4.050s
free 3.500s
pthread_mutex_unlock 2.890s
...

Instead after this change the same fn was consuming less cpu cycles:
Function   CPU Time
-+---
__libc_malloc5.900s
emc_lookup   4.070s
free 4.010s
dp_packet_clone_with_headroom3.920s
pthread_mutex_unlock 3.060s




> 
> Also I commented on 1/4 of this patchset about a cover letter - but if the
> patchset members are independent of each other then maybe they should just
> be separate patches.

[AF] I grouped these patches together because they all would be some 
optimizations on performance with a focus mainly on conntracker usecase. 
Maybe a better choice was to split them in separate patches.

> 
> Regards,
> Billy.
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> > Sent: Monday, June 19, 2017 11:12 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> > fields.
> >
> > From: Antonio Fischetti 
> >
> > memcpy replaces the single copies inside
> > dp_packet_clone_with_headroom().
> >
> > Signed-off-by: Antonio Fischetti 
> > ---
> >  lib/dp-packet.c | 17 +
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 67aa406..5b1d416
> 100644
> > --- a/lib/dp-packet.c
> > +++ b/lib/dp-packet.c
> > @@ -157,7 +157,7 @@ dp_packet_clone(const struct dp_packet *buffer)
> >  return dp_packet_clone_with_headroom(buffer, 0);  }
> >
> > -/* Creates and returns a new dp_packet whose data are copied from
> > 'buffer'.   The
> > +/* Creates and returns a new dp_packet whose data are copied from
> > +'buffer'. The
> >   * returned dp_packet will additionally have 'headroom' bytes of
> headroom.
> > */  struct dp_packet *  dp_packet_clone_with_headroom(const struct
> > dp_packet *buffer, size_t headroom) @@ -167,13 +167,14 @@
> > dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t
> > headroom)
> >  new_buffer =
> > dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
> >
> dp_packet_size(buffer),
> >   headroom);
> > -new_buffer->l2_pad_size = buffer->l2_pad_size;
> > -new_buffer->l2_5_ofs = buffer->l2_5_ofs;
> > -new_buffer->l3_ofs = buffer->l3_ofs;
> > -new_buffer->l4_ofs = buffer->l4_ofs;
> > -new_buffer->md = buffer->md;
> > -new_buffer->cutlen = buffer->cutlen;
> > -new_buffer->pack

Re: [ovs-dev] [PATCH v2] datapath-windows: Include ICMP type and code fields to find a matching ct entry

2017-06-23 Thread Sairam Venugopal
Acked-by: Sairam Venugopal 





On 6/23/17, 2:05 PM, "ovs-dev-boun...@openvswitch.org on behalf of Anand Kumar" 
 wrote:

>In conntrack lookup, ICMP type and code fields were not being used to
>determine a matching entry. As a result, ICMP4_ECHO_REQUEST packet could
>be tracked as ICMP4_ECHO_REPLY packet and vice versa, which is invalid.
>
>To fix this, add ICMP type and code fields for matching a conntrack entry.
>
>Signed-off-by: Anand Kumar 
>Acked-by: Sairam Venugopal 
>---
> datapath-windows/ovsext/Conntrack.c | 16 +++-
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
>diff --git a/datapath-windows/ovsext/Conntrack.c 
>b/datapath-windows/ovsext/Conntrack.c
>index 07a9583..e97d6ce 100644
>--- a/datapath-windows/ovsext/Conntrack.c
>+++ b/datapath-windows/ovsext/Conntrack.c
>@@ -383,15 +383,13 @@ OvsDetectCtPacket(OvsForwardingContext *fwdCtx,
> BOOLEAN
> OvsCtKeyAreSame(OVS_CT_KEY ctxKey, OVS_CT_KEY entryKey)
> {
>-return ((ctxKey.src.addr.ipv4 == entryKey.src.addr.ipv4) &&
>-(ctxKey.src.addr.ipv4_aligned == entryKey.src.addr.ipv4_aligned) &&
>-(ctxKey.src.port == entryKey.src.port) &&
>-(ctxKey.dst.addr.ipv4 == entryKey.dst.addr.ipv4) &&
>-(ctxKey.dst.addr.ipv4_aligned == entryKey.dst.addr.ipv4_aligned) &&
>-(ctxKey.dst.port == entryKey.dst.port) &&
>-(ctxKey.dl_type == entryKey.dl_type) &&
>-(ctxKey.nw_proto == entryKey.nw_proto) &&
>-(ctxKey.zone == entryKey.zone));
>+return ((NdisEqualMemory(&ctxKey.src, &entryKey.src,
>+ sizeof(struct ct_endpoint))) &&
>+(NdisEqualMemory(&ctxKey.dst, &entryKey.dst,
>+ sizeof(struct ct_endpoint))) &&
>+(ctxKey.dl_type == entryKey.dl_type) &&
>+(ctxKey.nw_proto == entryKey.nw_proto) &&
>+(ctxKey.zone == entryKey.zone));
> }
> 
> static __inline VOID
>-- 
>2.9.3.windows.1
>
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=dZbQHOKOvzZMDRL3iLHrW462Arv61PvgHu7MBbIgGfE&s=J6o8L1w_jhv5yW37xELhRdaqZnN-WcG8UAHw4xlXG7I&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash when EMC is disabled.

2017-06-23 Thread Fischetti, Antonio
Hi Billy, thanks for your suggestion, it makes the code more clean 
and readable. 
Once I get back from vacation I'll give it a try and check if this 
still gives a performance benefit.

/Antonio

> -Original Message-
> From: O Mahony, Billy
> Sent: Friday, June 23, 2017 5:23 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash
> when EMC is disabled.
> 
> Hi Antonio,
> 
> > -Original Message-
> > From: Fischetti, Antonio
> > Sent: Friday, June 23, 2017 3:10 PM
> > To: O Mahony, Billy ; d...@openvswitch.org
> > Subject: RE: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash
> > when EMC is disabled.
> >
> > Hi Billy,
> > thanks a lot for you suggestions. Those would really help re-factoring
> the
> > code by avoiding duplications.
> > The thing is that this patch 1/4 is mainly a preparation for the next
> patch 2/4.
> > So I did these changes with the next patch 2/4 in mind.
> >
> > The final result I meant to achieve in patch 2/4 is the following.
> > EMC lookup is skipped - not only when EMC is disabled - but also when
> > (we're processing recirculated packets) && (the EMC is 'enough' full).
> > The purpose is to avoid EMC thrashing.
> >
> > Below is how the code looks like after applying patches 1/4 and 2/4.
> > Please let me know if you can find some similar optimizations to avoid
> code
> > duplications, that would be great.
> > 
> > /*
> >  * EMC lookup is skipped when one or both of the following
> >  * two cases occurs:
> >  *
> >  *   - EMC is disabled.  This is detected from cur_min.
> >  *
> >  *   - The EMC occupancy exceeds EMC_FULL_THRESHOLD and the
> >  * packet to be classified is being recirculated.  When this
> >  * happens also EMC insertions are skipped for recirculated
> >  * packets.  So that EMC is used just to store entries which
> >  * are hit from the 'original' packets.  This way the EMC
> >  * thrashing is mitigated with a benefit on performance.
> >  */
> > if (!md_is_valid) {
> > pkt_metadata_init(&packet->md, port_no);
> > miniflow_extract(packet, &key->mf);  <== this fn must be
> called after
> > pkt_metadta_init
> > /* This is not a recirculated packet. */
> > if (OVS_LIKELY(cur_min)) {
> > /* EMC is enabled.  We can retrieve the 5-tuple hash
> >  * without considering the recirc id. */
> > if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> > key->hash = dp_packet_get_rss_hash(packet);
> > } else {
> > key->hash = miniflow_hash_5tuple(&key->mf, 0);
> > dp_packet_set_rss_hash(packet, key->hash);
> > }
> > flow = emc_lookup(flow_cache, key);
> > } else {
> > /* EMC is disabled, skip emc_lookup. */
> > flow = NULL;
> > }
> > } else {
> > /* Recirculated packets. */
> > miniflow_extract(packet, &key->mf);
> > if (flow_cache->n_entries & EMC_FULL_THRESHOLD) {
> > /* EMC occupancy is over the threshold.  We skip EMC
> >  * lookup for recirculated packets. */
> > flow = NULL;
> > } else {
> > if (OVS_LIKELY(cur_min)) {
> > key->hash = dpif_netdev_packet_get_rss_hash(packet,
> > &key->mf);
> > flow = emc_lookup(flow_cache, key);
> > } else {
> > flow = NULL;
> > }
> > }
> > }
> > 
> >
> > Basically patch 1/4 is mostly a preliminary change for 2/4.
> >
> > Yes, patch 1/4 also allows to avoid reading hash when EMC is disabled.
> > Or - for packets that are not recirculated - avoids calling
> > recirc_depth_get_unsafe() when reading the hash.
> >
> > Also, as these functions are critical for performance, I tend to avoid
> adding
> > new Booleans that require new if statements.
> [[BO'M]]
> 
> Can you investigate refactoring this patch with something like below.  I
> think this is equivalent.  The current patch duplicates miniflow_extract,
> emc_lookup across the md_is_valid and !md_is_valid branches. It also
> duplicates some of the internals of get_rss_hash out into the !md_is_valid
> case and is difficult to follow.
> 
> If the following suggestion works  the change in emc_processing from patch
> 2/4 can easily be grafted on to that.
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 4e29085..a7e854d 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4442,7 +4442,8 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd,
> struct dp_packet *packet_,
> 
>  static inl

Re: [ovs-dev] [PATCH RFC 2/4] dpif-netdev: Skip EMC lookup/insert for recirculated packets.

2017-06-23 Thread Fischetti, Antonio
Thanks a lot Billy, really appreciate your feedback.
My replies inline.

/Antonio

> -Original Message-
> From: O Mahony, Billy
> Sent: Friday, June 23, 2017 6:39 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH RFC 2/4] dpif-netdev: Skip EMC lookup/insert
> for recirculated packets.
> 
> Hi Antonio,
> 
> This is a really interesting patch. Comments inline below.
> 
> Thanks,
> /Billy.
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> > Sent: Monday, June 19, 2017 11:12 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH RFC 2/4] dpif-netdev: Skip EMC lookup/insert
> for
> > recirculated packets.
> >
> > From: Antonio Fischetti 
> >
> > When OVS is configured as a firewall, with thousands of active
> concurrent
> > connections, the EMC gets quicly saturated and may come under heavy
> > thrashing for the reason that original and recirculated packets keep
> overwrite
> > existing active EMC entries due to its limited size(8k).
> >
> > This thrashing causes the EMC to be less efficient than the dcpls in
> terms of
> > lookups and insertions.
> >
> > This patch allows to use the EMC efficiently by allowing only the
> 'original'
> > packets to hit EMC. All recirculated packets are sent to classifier
> directly.
> > An empirical threshold (EMC_FULL_THRESHOLD - of 50%) for EMC occupancy
> > is set to trigger this logic. By doing so when EMC utilization exceeds
> > EMC_FULL_THRESHOLD.
> >  - EMC Insertions are allowed just for original packets. EMC insertion
> >and look up is skipped for recirculated packets.
> >  - Recirculated packets are sent to classifier.
> >
> > This patch depends on the previous one in this series. It's based on
> patch
> > "dpif-netdev: add EMC entry count and %full figure to pmd-stats-show"
> at:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327570.html
> >
> > Signed-off-by: Antonio Fischetti 
> > Signed-off-by: Bhanuprakash Bodireddy
> > 
> > Co-authored-by: Bhanuprakash Bodireddy
> > 
> > ---
> > In our Connection Tracker testbench set up with
> >
> >  table=0, priority=1 actions=drop
> >  table=0, priority=10,arp actions=NORMAL  table=0,
> priority=100,ct_state=-
> > trk,ip actions=ct(table=1)  table=1, ct_state=+new+trk,ip,in_port=1
> > actions=ct(commit),output:2  table=1, ct_state=+est+trk,ip,in_port=1
> > actions=output:2  table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
> > table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
> >
> > we saw the following performance improvement.
> >
> > Measured packet Rx rate (regardless of packet loss). Bidirectional test
> with
> > 64B UDP packets.
> > Each row is a test with a different number of traffic streams. The
> traffic
> > generator is set so that each stream establishes one UDP connection.
> > Mpps columns reports the Rx rates on the 2 sides.
> >
> >  Traffic |Orig| Orig  |  +changes  |   +changes
> >  Streams |   [Mpps]   | [EMC entries] |   [Mpps]   | [EMC entries]
> > -++---++---
> >  10  |  3.4, 3.4  |  20   |  3.4, 3.4  |  20
> > 100  |  2.6, 2.7  | 200   |  2.6, 2.7  | 201
> >   1,000  |  2.4, 2.4  |2009   |  2.4, 2.4  |1994
> >   2,000  |  2.2, 2.2  |3903   |  2.2, 2.2  |3900
> >   3,000  |  2.1, 2.1  |5473   |  2.2, 2.2  |4798
> >   4,000  |  2.0, 2.0  |6478   |  2.2, 2.2  |5663
> >  10,000  |  1.8, 1.9  |8070   |  2.0, 2.0  |7347
> > 100,000  |  1.7, 1.7  |8192   |  1.8, 1.8  |8192
> >
> 
> [[BO'M]]
> A few questions on the test:
> Are all the pkts rxd being recirculated?

[AF] Yes, I sent UDP packets with the firewall rules above. Every packets 
goes through the CT module, so after that it is recirculated.

> Are there any flows present where the pkts do not require recirculation?

[AF] No. The flow
table=0,priority=100,ct_state=-trk,ip actions=ct(table=1)
implies that any received packet goes through CT and then it is recirculated.

> Was the rxd rss hash calculation offloaded to the NIC?

[AF] Yes.

> For the cases with larger numbers of flows (10K , 100K) did you
> investigate the results when the EMC is simply switched off?

[AF] No, I just focused on this criteria to check if I could really get some
performance improvement. It's likely that with - say 100k or more flows - it's
better to switch off EMC? 

> 
> Say we have 3000 flows (the lowest figure at which we see a positive
> effect) that means 6000 flows are contending for places in the emc.
> Is the effect we see here to do with disabling recirculated packets in
> particular or just reducing contention on the emc in general. 

[AF] I guess the benefit comes from reducing the continuous recursive 
overwrites in EMC.
With this approach
  => less overhead for insertions
  => it's more likely that orig

Re: [ovs-dev] [PATCH] datapath-windows: Include ICMP type and code fields to find a matching NAT entry

2017-06-23 Thread Yin Lin
Hi Anand,

Please do not change NAT code to include other fields in ct_endpoint. NAT
doesn't care about those fields. As long as the address and port are
identical, we will reverse NAT it. For example, even if we receive an ICMP
reply, we will match it against a ICMP request entry to do the reverse NAT.
We don't support ipv6 yet and don't want to be bothered by the extra fields
in src.addr and dst.addr either. So those FIELD_COMPARE are the exact
things we want to compare, nothing less.

Best regards,
Yin Lin

On Fri, Jun 23, 2017 at 2:27 PM, Anand Kumar  wrote:

> Update OvsNatKeyAreSame() and OvsHashNatKey() to include ICMP type and code
> fields, so that ICMP_ECHO_REQUEST packets are not matched with
> ICMP_ECHO_REPLY
> packets and vice versa.
>
> Signed-off-by: Anand Kumar 
> ---
>  datapath-windows/ovsext/Conntrack-nat.c | 32
> ++--
>  1 file changed, 10 insertions(+), 22 deletions(-)
>
> diff --git a/datapath-windows/ovsext/Conntrack-nat.c
> b/datapath-windows/ovsext/Conntrack-nat.c
> index ae6b92c..f3c282c 100644
> --- a/datapath-windows/ovsext/Conntrack-nat.c
> +++ b/datapath-windows/ovsext/Conntrack-nat.c
> @@ -13,17 +13,11 @@ PLIST_ENTRY ovsUnNatTable = NULL;
>  static __inline UINT32
>  OvsHashNatKey(const OVS_CT_KEY *key)
>  {
> -UINT32 hash = 0;
> -#define HASH_ADD(field) \
> -hash = OvsJhashBytes(&key->field, sizeof(key->field), hash)
> -
> -HASH_ADD(src.addr.ipv4_aligned);
> -HASH_ADD(dst.addr.ipv4_aligned);
> -HASH_ADD(src.port);
> -HASH_ADD(dst.port);
> -HASH_ADD(zone);
> -#undef HASH_ADD
> -return hash;
> +UINT32 hash[3];
> +hash[0] = OvsJhashBytes(&key->src, sizeof(key->src), 0);
> +hash[1] = OvsJhashBytes(&key->dst, sizeof(key->dst), 0);
> +hash[2] = OvsJhashBytes(&key->zone, sizeof(key->zone), 0);
> +return OvsJhashWords(hash, 3, OVS_HASH_BASIS);
>  }
>
>  /*
> @@ -35,17 +29,11 @@ OvsHashNatKey(const OVS_CT_KEY *key)
>  static __inline BOOLEAN
>  OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2)
>  {
> -// XXX: Compare IPv6 key as well
> -#define FIELD_COMPARE(field) \
> -if (key1->field != key2->field) return FALSE
> -
> -FIELD_COMPARE(src.addr.ipv4_aligned);
> -FIELD_COMPARE(dst.addr.ipv4_aligned);
> -FIELD_COMPARE(src.port);
> -FIELD_COMPARE(dst.port);
> -FIELD_COMPARE(zone);
> -return TRUE;
> -#undef FIELD_COMPARE
> +return ((NdisEqualMemory(&key1->src, &key2->src,
> + sizeof(struct ct_endpoint))) &&
> +(NdisEqualMemory(&key1->dst, &key2->dst,
> + sizeof(struct ct_endpoint))) &&
> +(key1->zone == key2->zone));
>  }
>
>  /*
> --
> 2.9.3.windows.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] datapath-windows: Include ICMP type and code fields to find a matching NAT entry

2017-06-23 Thread Anand Kumar
Update OvsNatKeyAreSame() and OvsHashNatKey() to include ICMP type and code
fields, so that ICMP_ECHO_REQUEST packets are not matched with ICMP_ECHO_REPLY
packets and vice versa.

Signed-off-by: Anand Kumar 
---
 datapath-windows/ovsext/Conntrack-nat.c | 32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack-nat.c 
b/datapath-windows/ovsext/Conntrack-nat.c
index ae6b92c..f3c282c 100644
--- a/datapath-windows/ovsext/Conntrack-nat.c
+++ b/datapath-windows/ovsext/Conntrack-nat.c
@@ -13,17 +13,11 @@ PLIST_ENTRY ovsUnNatTable = NULL;
 static __inline UINT32
 OvsHashNatKey(const OVS_CT_KEY *key)
 {
-UINT32 hash = 0;
-#define HASH_ADD(field) \
-hash = OvsJhashBytes(&key->field, sizeof(key->field), hash)
-
-HASH_ADD(src.addr.ipv4_aligned);
-HASH_ADD(dst.addr.ipv4_aligned);
-HASH_ADD(src.port);
-HASH_ADD(dst.port);
-HASH_ADD(zone);
-#undef HASH_ADD
-return hash;
+UINT32 hash[3];
+hash[0] = OvsJhashBytes(&key->src, sizeof(key->src), 0);
+hash[1] = OvsJhashBytes(&key->dst, sizeof(key->dst), 0);
+hash[2] = OvsJhashBytes(&key->zone, sizeof(key->zone), 0);
+return OvsJhashWords(hash, 3, OVS_HASH_BASIS);
 }
 
 /*
@@ -35,17 +29,11 @@ OvsHashNatKey(const OVS_CT_KEY *key)
 static __inline BOOLEAN
 OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2)
 {
-// XXX: Compare IPv6 key as well
-#define FIELD_COMPARE(field) \
-if (key1->field != key2->field) return FALSE
-
-FIELD_COMPARE(src.addr.ipv4_aligned);
-FIELD_COMPARE(dst.addr.ipv4_aligned);
-FIELD_COMPARE(src.port);
-FIELD_COMPARE(dst.port);
-FIELD_COMPARE(zone);
-return TRUE;
-#undef FIELD_COMPARE
+return ((NdisEqualMemory(&key1->src, &key2->src,
+ sizeof(struct ct_endpoint))) &&
+(NdisEqualMemory(&key1->dst, &key2->dst,
+ sizeof(struct ct_endpoint))) &&
+(key1->zone == key2->zone));
 }
 
 /*
-- 
2.9.3.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Ultimas Conexiones - Asegure su Lugar

2017-06-23 Thread CFDI
DIRIGIDO A: Contadores, Administradores, Abogados,
Empresarios, Estudiantes, Público en General.

Implicaciones de la Nueva Obligación de CFDI de Pagos y los Últimos Cambios en 
la FACTURA ELECTRÓNICA


El 5 de diciembre de 2016, el Servicio de Administración Tributaria (SAT) de 
México publicó de manera
anticipada dos especificaciones técnicas que darán lugar a actualizaciones en 
los CFDI a partir del 1 de
julio de 2017.

El participante conocerá los aspectos normativos y tecnológicos de los 
Comprobantes Fiscales
Digitales por Internet y sus complementos, desde la emisión, recepción, 
almacenamiento y
validación; hasta el análisis de catálogos, atributos y reglas de validación 
contenidas en el Anexo 20 de
la Resolución Miscelánea Fiscal 2017.



Temario :
 
- Disposiciones legales y administrativas.
- Disposiciones tecnológicas.
- Cancelación de CFDI.
- Sanciones para los proveedores.



¿Requiere la información a la Brevedad?
responda este email con la palabra: CFDI.
Junto con los siguientes datos: 

Nombre:

Teléfono:

Empresa:

centro telefónico: 018002129393
 
Coordinador de Evento


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] 答复: [PATCH] pkt reassemble: fix kernel panic for ovs reassemble

2017-06-23 Thread Joe Stringer
Hi Wang Zhike,

I'd like if others like Greg could take a look as well, since this
code is delicate. The more review it gets, the better. It seems like
maybe the version of your email that goes to the list does not get the
attachment. Perhaps you could try sending the patch using git
send-email or putting the patch on GitHub instead, and linking to it
here.

For what it's worth, I did run your patch for a while and it seemed
OK, but when I tried again today on an Ubuntu Trusty (Linux
3.13.0-119-generic) box, running make check-kmod, I saw an issue with
get_next_timer_interrupt():

[181250.892557] BUG: unable to handle kernel paging request at a03317e0
[181250.892557] IP: [] get_next_timer_interrupt+0x86/0x250
[181250.892557] PGD 1c11067 PUD 1c12063 PMD 1381a2067 PTE 0
[181250.892557] Oops:  [#1] SMP
[181250.892557] Modules linked in: nf_nat_ipv6 nf_nat_ipv4 nf_nat
gre(-) nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6
nf_defrag_ipv4 nf_conntrack_netlink nfnetlink nf_conntrack bonding
8021q garp stp mrp llc veth nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt kvm_intel kvm serio_raw netconsole configfs
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
ahci libahci [last unloaded: libcrc32c]
[181250.892557] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   OX
3.13.0-119-generic #166-Ubuntu
[181250.892557] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[181250.892557] task: 81c15480 ti: 81c0 task.ti:
81c0
[181250.892557] RIP: 0010:[]  []
get_next_timer_interrupt+0x86/0x250
[181250.892557] RSP: 0018:81c01e00  EFLAGS: 00010002
[181250.892557] RAX: a03317c8 RBX: 000102b245da RCX:
00db
[181250.892557] RDX: 81ebac58 RSI: 00db RDI:
000102b245db
[181250.892557] RBP: 81c01e48 R08: 00c88c1c R09:

[181250.892557] R10:  R11:  R12:
000142b245d9
[181250.892557] R13: 81eb9e80 R14: 000102b245da R15:
00cd63e8
[181250.892557] FS:  () GS:88013fc0()
knlGS:
[181250.892557] CS:  0010 DS:  ES:  CR0: 8005003b
[181250.892557] CR2: a03317e0 CR3: 3707f000 CR4:
06f0
[181250.892557] Stack:
[181250.892557]   81c01e30 810a3af5
88013fc13bc0
[181250.892557]  88013fc0dce0 000102b245da 
0063ae154000
[181250.892557]  00cd63e8 81c01ea8 810da655
a4d8c2cb6200
[181250.892557] Call Trace:
[181250.892557]  [] ? set_next_entity+0x95/0xb0
[181250.892557]  [] tick_nohz_stop_sched_tick+0x1e5/0x340
[181250.892557]  [] __tick_nohz_idle_enter+0xa1/0x160
[181250.892557]  [] tick_nohz_idle_enter+0x3d/0x70
[181250.892557]  [] cpu_startup_entry+0x87/0x2b0
[181250.892557]  [] rest_init+0x77/0x80
[181250.892557]  [] start_kernel+0x432/0x43d
[181250.892557]  [] ? repair_env_string+0x5c/0x5c
[181250.892557]  [] ? early_idt_handler_array+0x120/0x120
[181250.892557]  [] x86_64_start_reservations+0x2a/0x2c
[181250.892557]  [] x86_64_start_kernel+0x143/0x152
[181250.892557] Code: 8b 7d 10 4d 8b 75 18 4c 39 f7 78 5c 40 0f b6 cf
89 ce 48 63 c6 48 c1 e0 04 49 8d 54 05 00 48 8b 42 28 48 83 c2 28 48
39 d0 74 0e  40 18 01 74 24 48 8b 00 48 39 d0 75 f2 83 c6 01 40 0f
b6 f6
[181250.892557] RIP  [] get_next_timer_interrupt+0x86/0x250
[181250.892557]  RSP 
[181250.892557] CR2: a03317e0

It seems like perhaps a fragment timer signed up by OVS is still
remaining when the OVS module is unloaded, so it may attempt to clean
up an entry using OVS code but the OVS code has been unloaded at that
point. This might be related to IPv6 cvlan test - that seems to be
where my VM froze and went to 100% CPU, but I would think that the
IPv6 fragmentation cleanup test is a more likely to cause this, since
it leaves fragments behind in the cache after the test finishes. I've
only hit this when running all of the tests in make check-kmod.

Cheers,
Joe

On 22 June 2017 at 17:53, 王志克  wrote:
> Hi Joe,
>
> Please check the attachment. Thanks.
>
> Br,
> Wang Zhike
>
> -邮件原件-
> 发件人: Joe Stringer [mailto:j...@ovn.org]
> 发送时间: 2017年6月23日 8:20
> 收件人: 王志克
> 抄送: d...@openvswitch.org
> 主题: Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble
>
> On 21 June 2017 at 18:54, 王志克  wrote:
>> Ovs and kernel stack would add frag_queue to same netns_frags list.
>> As result, ovs and kernel may access the fraq_queue without correct
>> lock. Also the struct ipq may be different on kernel(older than 4.3),
>> which leads to invalid pointer access.
>>
>> The fix creates specific netns_frags for ovs.
>>
>> Signed-off-by: wangzhike 
>> ---
>
> Hi,
>
> It looks like the whitespace has been corrupted in this version of the patch 
> that you sent, I cannot apply it. Probably your email client mistreat

[ovs-dev] [PATCH v2] datapath-windows: Include ICMP type and code fields to find a matching ct entry

2017-06-23 Thread Anand Kumar
In conntrack lookup, ICMP type and code fields were not being used to
determine a matching entry. As a result, ICMP4_ECHO_REQUEST packet could
be tracked as ICMP4_ECHO_REPLY packet and vice versa, which is invalid.

To fix this, add ICMP type and code fields for matching a conntrack entry.

Signed-off-by: Anand Kumar 
Acked-by: Sairam Venugopal 
---
 datapath-windows/ovsext/Conntrack.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index 07a9583..e97d6ce 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -383,15 +383,13 @@ OvsDetectCtPacket(OvsForwardingContext *fwdCtx,
 BOOLEAN
 OvsCtKeyAreSame(OVS_CT_KEY ctxKey, OVS_CT_KEY entryKey)
 {
-return ((ctxKey.src.addr.ipv4 == entryKey.src.addr.ipv4) &&
-(ctxKey.src.addr.ipv4_aligned == entryKey.src.addr.ipv4_aligned) &&
-(ctxKey.src.port == entryKey.src.port) &&
-(ctxKey.dst.addr.ipv4 == entryKey.dst.addr.ipv4) &&
-(ctxKey.dst.addr.ipv4_aligned == entryKey.dst.addr.ipv4_aligned) &&
-(ctxKey.dst.port == entryKey.dst.port) &&
-(ctxKey.dl_type == entryKey.dl_type) &&
-(ctxKey.nw_proto == entryKey.nw_proto) &&
-(ctxKey.zone == entryKey.zone));
+return ((NdisEqualMemory(&ctxKey.src, &entryKey.src,
+ sizeof(struct ct_endpoint))) &&
+(NdisEqualMemory(&ctxKey.dst, &entryKey.dst,
+ sizeof(struct ct_endpoint))) &&
+(ctxKey.dl_type == entryKey.dl_type) &&
+(ctxKey.nw_proto == entryKey.nw_proto) &&
+(ctxKey.zone == entryKey.zone));
 }
 
 static __inline VOID
-- 
2.9.3.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] datapath-windows: Include ICMP type and code fields to find a matching ct entry

2017-06-23 Thread Anand Kumar
Thanks for the review. 
I will split up the patch and send out Conntract-Nat.c changes in a separate 
patch.

Regards,
Anand Kumar

On 6/23/17, 1:50 PM, "Sairam Venugopal"  wrote:

Hi Anand,

Can you split this patch up instead to handle Conntrack.c and 
Conntrack-Nat.c in a separate one? This way we can keep the testing isolated to 
the changes.

Acking Conntrack.c change:

Acked-by: Sairam Venugopal 




On 6/22/17, 11:54 AM, "ovs-dev-boun...@openvswitch.org on behalf of Anand 
Kumar"  
wrote:

>In conntrack lookup, ICMP type and code fields were not being used to
>determine a matching entry. As a result, ICMP4_ECHO_REQUEST packet could
>be tracked as ICMP4_ECHO_REPLY packet and vice versa, which is invalid.
>
>To fix this, add ICMP type and code fields for matching a conntrack entry.
>Also updated OvsNatKeyAreSame() and OvsHashNatKey() to include these two
>fields.
>
>Signed-off-by: Anand Kumar 
>---
> datapath-windows/ovsext/Conntrack-nat.c | 32 
++--
> datapath-windows/ovsext/Conntrack.c | 16 +++-
> 2 files changed, 17 insertions(+), 31 deletions(-)
>
>diff --git a/datapath-windows/ovsext/Conntrack-nat.c 
b/datapath-windows/ovsext/Conntrack-nat.c
>index ae6b92c..f3c282c 100644
>--- a/datapath-windows/ovsext/Conntrack-nat.c
>+++ b/datapath-windows/ovsext/Conntrack-nat.c
>@@ -13,17 +13,11 @@ PLIST_ENTRY ovsUnNatTable = NULL;
> static __inline UINT32
> OvsHashNatKey(const OVS_CT_KEY *key)
> {
>-UINT32 hash = 0;
>-#define HASH_ADD(field) \
>-hash = OvsJhashBytes(&key->field, sizeof(key->field), hash)
>-
>-HASH_ADD(src.addr.ipv4_aligned);
>-HASH_ADD(dst.addr.ipv4_aligned);
>-HASH_ADD(src.port);
>-HASH_ADD(dst.port);
>-HASH_ADD(zone);
>-#undef HASH_ADD
>-return hash;
>+UINT32 hash[3];
>+hash[0] = OvsJhashBytes(&key->src, sizeof(key->src), 0);
>+hash[1] = OvsJhashBytes(&key->dst, sizeof(key->dst), 0);
>+hash[2] = OvsJhashBytes(&key->zone, sizeof(key->zone), 0);
>+return OvsJhashWords(hash, 3, OVS_HASH_BASIS);
> }
> 
> /*
>@@ -35,17 +29,11 @@ OvsHashNatKey(const OVS_CT_KEY *key)
> static __inline BOOLEAN
> OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2)
> {
>-// XXX: Compare IPv6 key as well
>-#define FIELD_COMPARE(field) \
>-if (key1->field != key2->field) return FALSE
>-
>-FIELD_COMPARE(src.addr.ipv4_aligned);
>-FIELD_COMPARE(dst.addr.ipv4_aligned);
>-FIELD_COMPARE(src.port);
>-FIELD_COMPARE(dst.port);
>-FIELD_COMPARE(zone);
>-return TRUE;
>-#undef FIELD_COMPARE
>+return ((NdisEqualMemory(&key1->src, &key2->src,
>+ sizeof(struct ct_endpoint))) &&
>+(NdisEqualMemory(&key1->dst, &key2->dst,
>+ sizeof(struct ct_endpoint))) &&
>+(key1->zone == key2->zone));
> }
> 
> /*
>diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
>index 07a9583..e97d6ce 100644
>--- a/datapath-windows/ovsext/Conntrack.c
>+++ b/datapath-windows/ovsext/Conntrack.c
>@@ -383,15 +383,13 @@ OvsDetectCtPacket(OvsForwardingContext *fwdCtx,
> BOOLEAN
> OvsCtKeyAreSame(OVS_CT_KEY ctxKey, OVS_CT_KEY entryKey)
> {
>-return ((ctxKey.src.addr.ipv4 == entryKey.src.addr.ipv4) &&
>-(ctxKey.src.addr.ipv4_aligned == entryKey.src.addr.ipv4_aligned) 
&&
>-(ctxKey.src.port == entryKey.src.port) &&
>-(ctxKey.dst.addr.ipv4 == entryKey.dst.addr.ipv4) &&
>-(ctxKey.dst.addr.ipv4_aligned == entryKey.dst.addr.ipv4_aligned) 
&&
>-(ctxKey.dst.port == entryKey.dst.port) &&
>-(ctxKey.dl_type == entryKey.dl_type) &&
>-(ctxKey.nw_proto == entryKey.nw_proto) &&
>-(ctxKey.zone == entryKey.zone));
>+return ((NdisEqualMemory(&ctxKey.src, &entryKey.src,
>+ sizeof(struct ct_endpoint))) &&
>+(NdisEqualMemory(&ctxKey.dst, &entryKey.dst,
>+ sizeof(struct ct_endpoint))) &&
>+(ctxKey.dl_type == entryKey.dl_type) &&
>+(ctxKey.nw_proto == entryKey.nw_proto) &&
>+(ctxKey.zone == entryKey.zone));
> }
> 
> static __inline VOID
>-- 
>2.9.3.windows.1
>
>___
>dev mailing list
>d...@openvswitch.org

>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=RcKP7MCbcn9PTpNRinYsO9Q0x4MrsrHutzBQ0hweH2M&s=7sFDQthiIL809LnOAfhczTX9Lk7iZDvnFhR77N4DfbE&e=
 


_

Re: [ovs-dev] 答复: [PATCH] pkt reassemble: fix kernel panic for ovs reassemble

2017-06-23 Thread Greg Rose

On 06/22/2017 05:53 PM, 王志克 wrote:

Hi Joe,

Please check the attachment. Thanks.


The attachment is only this:

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

And the patch listed does not apply.  Please resend.

Thanks,

- Greg





Br,
Wang Zhike

-邮件原件-
发件人: Joe Stringer [mailto:j...@ovn.org]
发送时间: 2017年6月23日 8:20
收件人: 王志克
抄送: d...@openvswitch.org
主题: Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs reassemble

On 21 June 2017 at 18:54, 王志克  wrote:
> Ovs and kernel stack would add frag_queue to same netns_frags list.
> As result, ovs and kernel may access the fraq_queue without correct
> lock. Also the struct ipq may be different on kernel(older than 4.3),
> which leads to invalid pointer access.
>
> The fix creates specific netns_frags for ovs.
>
> Signed-off-by: wangzhike 
> ---

Hi,

It looks like the whitespace has been corrupted in this version of the patch 
that you sent, I cannot apply it. Probably your email client mistreats it when 
sending the email out. A reliable method to send patches correctly via email is 
to use the commandline client 'git send-email'. This is the preferred method. 
If you are unable to set that up, consider attaching the patch to the email (or 
send a pull request on GitHub).

Cheers,
Joe



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] datapath-windows: Include ICMP type and code fields to find a matching ct entry

2017-06-23 Thread Sairam Venugopal
Hi Anand,

Can you split this patch up instead to handle Conntrack.c and Conntrack-Nat.c 
in a separate one? This way we can keep the testing isolated to the changes.

Acking Conntrack.c change:

Acked-by: Sairam Venugopal 




On 6/22/17, 11:54 AM, "ovs-dev-boun...@openvswitch.org on behalf of Anand 
Kumar"  
wrote:

>In conntrack lookup, ICMP type and code fields were not being used to
>determine a matching entry. As a result, ICMP4_ECHO_REQUEST packet could
>be tracked as ICMP4_ECHO_REPLY packet and vice versa, which is invalid.
>
>To fix this, add ICMP type and code fields for matching a conntrack entry.
>Also updated OvsNatKeyAreSame() and OvsHashNatKey() to include these two
>fields.
>
>Signed-off-by: Anand Kumar 
>---
> datapath-windows/ovsext/Conntrack-nat.c | 32 ++--
> datapath-windows/ovsext/Conntrack.c | 16 +++-
> 2 files changed, 17 insertions(+), 31 deletions(-)
>
>diff --git a/datapath-windows/ovsext/Conntrack-nat.c 
>b/datapath-windows/ovsext/Conntrack-nat.c
>index ae6b92c..f3c282c 100644
>--- a/datapath-windows/ovsext/Conntrack-nat.c
>+++ b/datapath-windows/ovsext/Conntrack-nat.c
>@@ -13,17 +13,11 @@ PLIST_ENTRY ovsUnNatTable = NULL;
> static __inline UINT32
> OvsHashNatKey(const OVS_CT_KEY *key)
> {
>-UINT32 hash = 0;
>-#define HASH_ADD(field) \
>-hash = OvsJhashBytes(&key->field, sizeof(key->field), hash)
>-
>-HASH_ADD(src.addr.ipv4_aligned);
>-HASH_ADD(dst.addr.ipv4_aligned);
>-HASH_ADD(src.port);
>-HASH_ADD(dst.port);
>-HASH_ADD(zone);
>-#undef HASH_ADD
>-return hash;
>+UINT32 hash[3];
>+hash[0] = OvsJhashBytes(&key->src, sizeof(key->src), 0);
>+hash[1] = OvsJhashBytes(&key->dst, sizeof(key->dst), 0);
>+hash[2] = OvsJhashBytes(&key->zone, sizeof(key->zone), 0);
>+return OvsJhashWords(hash, 3, OVS_HASH_BASIS);
> }
> 
> /*
>@@ -35,17 +29,11 @@ OvsHashNatKey(const OVS_CT_KEY *key)
> static __inline BOOLEAN
> OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2)
> {
>-// XXX: Compare IPv6 key as well
>-#define FIELD_COMPARE(field) \
>-if (key1->field != key2->field) return FALSE
>-
>-FIELD_COMPARE(src.addr.ipv4_aligned);
>-FIELD_COMPARE(dst.addr.ipv4_aligned);
>-FIELD_COMPARE(src.port);
>-FIELD_COMPARE(dst.port);
>-FIELD_COMPARE(zone);
>-return TRUE;
>-#undef FIELD_COMPARE
>+return ((NdisEqualMemory(&key1->src, &key2->src,
>+ sizeof(struct ct_endpoint))) &&
>+(NdisEqualMemory(&key1->dst, &key2->dst,
>+ sizeof(struct ct_endpoint))) &&
>+(key1->zone == key2->zone));
> }
> 
> /*
>diff --git a/datapath-windows/ovsext/Conntrack.c 
>b/datapath-windows/ovsext/Conntrack.c
>index 07a9583..e97d6ce 100644
>--- a/datapath-windows/ovsext/Conntrack.c
>+++ b/datapath-windows/ovsext/Conntrack.c
>@@ -383,15 +383,13 @@ OvsDetectCtPacket(OvsForwardingContext *fwdCtx,
> BOOLEAN
> OvsCtKeyAreSame(OVS_CT_KEY ctxKey, OVS_CT_KEY entryKey)
> {
>-return ((ctxKey.src.addr.ipv4 == entryKey.src.addr.ipv4) &&
>-(ctxKey.src.addr.ipv4_aligned == entryKey.src.addr.ipv4_aligned) &&
>-(ctxKey.src.port == entryKey.src.port) &&
>-(ctxKey.dst.addr.ipv4 == entryKey.dst.addr.ipv4) &&
>-(ctxKey.dst.addr.ipv4_aligned == entryKey.dst.addr.ipv4_aligned) &&
>-(ctxKey.dst.port == entryKey.dst.port) &&
>-(ctxKey.dl_type == entryKey.dl_type) &&
>-(ctxKey.nw_proto == entryKey.nw_proto) &&
>-(ctxKey.zone == entryKey.zone));
>+return ((NdisEqualMemory(&ctxKey.src, &entryKey.src,
>+ sizeof(struct ct_endpoint))) &&
>+(NdisEqualMemory(&ctxKey.dst, &entryKey.dst,
>+ sizeof(struct ct_endpoint))) &&
>+(ctxKey.dl_type == entryKey.dl_type) &&
>+(ctxKey.nw_proto == entryKey.nw_proto) &&
>+(ctxKey.zone == entryKey.zone));
> }
> 
> static __inline VOID
>-- 
>2.9.3.windows.1
>
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=RcKP7MCbcn9PTpNRinYsO9Q0x4MrsrHutzBQ0hweH2M&s=7sFDQthiIL809LnOAfhczTX9Lk7iZDvnFhR77N4DfbE&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Action needed! American Express Card Number Authentication

2017-06-23 Thread bborysz...@citymail.cuny.edu
Please authenticate your card number





[American Express logo]


Hello, there



[American Express Card]


[Service shadow]

[Fraud Protection]


Card Failure



Hello there,

You need to authenticate your card number immediately if you wish to continue 
using it for future payment.

Login to Amexaccount/authenticate 
an d follow the instructions to verify your card to enable payment

We'll be here to help you with any step along the way. You can find answers to 
most questions and
get in touch with us at 
https://support.amex.com.



If you or an authorized party has already addressed this concern, please 
disregard this message.



Thank you for helping us to protect the security of your account.



American Express Account Protection Services


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Request to tag 2.7 branch with v2.7.1

2017-06-23 Thread Joe Stringer
On 23 June 2017 at 05:38, Russell Bryant  wrote:
> On Thu, Jun 8, 2017 at 12:12 PM, Ben Pfaff  wrote:
>> On Thu, Jun 08, 2017 at 04:36:21AM +0530, Numan Siddique wrote:
>>> Is it possible to tag the 2.7 branch with v2.7.1. Recently there were many
>>> back ports to 2.7 branch. OVS 2.7 is required for RDO [1] and the RDO
>>> community is expecting a new tag to build the latest OVS 2.7 branch.
>>
>> It's fine with me if we release 2.7.1.  Justin?
>
> Hi all, just wanted to check in on 2.7.1.  Last week Justin said there
> may be some more fixes from Joe to get in first.  Is anything still
> pending?  I also know some folks have been traveling this week.

I had been thinking about this patch:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/334446.html

However, it is in a delicate area of the code and no-one other than me
has had a chance to look it over yet. Therefore, I don't think it
should block the 2.7.1 release.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH RFC 2/4] dpif-netdev: Skip EMC lookup/insert for recirculated packets.

2017-06-23 Thread O Mahony, Billy
Hi Antonio,

This is a really interesting patch. Comments inline below. 

Thanks,
/Billy.

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> Sent: Monday, June 19, 2017 11:12 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH RFC 2/4] dpif-netdev: Skip EMC lookup/insert for
> recirculated packets.
> 
> From: Antonio Fischetti 
> 
> When OVS is configured as a firewall, with thousands of active concurrent
> connections, the EMC gets quicly saturated and may come under heavy
> thrashing for the reason that original and recirculated packets keep overwrite
> existing active EMC entries due to its limited size(8k).
> 
> This thrashing causes the EMC to be less efficient than the dcpls in terms of
> lookups and insertions.
> 
> This patch allows to use the EMC efficiently by allowing only the 'original'
> packets to hit EMC. All recirculated packets are sent to classifier directly.
> An empirical threshold (EMC_FULL_THRESHOLD - of 50%) for EMC occupancy
> is set to trigger this logic. By doing so when EMC utilization exceeds
> EMC_FULL_THRESHOLD.
>  - EMC Insertions are allowed just for original packets. EMC insertion
>and look up is skipped for recirculated packets.
>  - Recirculated packets are sent to classifier.
> 
> This patch depends on the previous one in this series. It's based on patch
> "dpif-netdev: add EMC entry count and %full figure to pmd-stats-show" at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327570.html
> 
> Signed-off-by: Antonio Fischetti 
> Signed-off-by: Bhanuprakash Bodireddy
> 
> Co-authored-by: Bhanuprakash Bodireddy
> 
> ---
> In our Connection Tracker testbench set up with
> 
>  table=0, priority=1 actions=drop
>  table=0, priority=10,arp actions=NORMAL  table=0, priority=100,ct_state=-
> trk,ip actions=ct(table=1)  table=1, ct_state=+new+trk,ip,in_port=1
> actions=ct(commit),output:2  table=1, ct_state=+est+trk,ip,in_port=1
> actions=output:2  table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
> table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
> 
> we saw the following performance improvement.
> 
> Measured packet Rx rate (regardless of packet loss). Bidirectional test with
> 64B UDP packets.
> Each row is a test with a different number of traffic streams. The traffic
> generator is set so that each stream establishes one UDP connection.
> Mpps columns reports the Rx rates on the 2 sides.
> 
>  Traffic |Orig| Orig  |  +changes  |   +changes
>  Streams |   [Mpps]   | [EMC entries] |   [Mpps]   | [EMC entries]
> -++---++---
>  10  |  3.4, 3.4  |  20   |  3.4, 3.4  |  20
> 100  |  2.6, 2.7  | 200   |  2.6, 2.7  | 201
>   1,000  |  2.4, 2.4  |2009   |  2.4, 2.4  |1994
>   2,000  |  2.2, 2.2  |3903   |  2.2, 2.2  |3900
>   3,000  |  2.1, 2.1  |5473   |  2.2, 2.2  |4798
>   4,000  |  2.0, 2.0  |6478   |  2.2, 2.2  |5663
>  10,000  |  1.8, 1.9  |8070   |  2.0, 2.0  |7347
> 100,000  |  1.7, 1.7  |8192   |  1.8, 1.8  |8192
> 

[[BO'M]] 
A few questions on the test:
Are all the pkts rxd being recirculated?
Are there any flows present where the pkts do not require recirculation? 
Was the rxd rss hash calculation offloaded to the NIC?
For the cases with larger numbers of flows (10K , 100K) did you investigate the 
results when the EMC is simply switched off? 

Say we have 3000 flows (the lowest figure at which we see a positive effect) 
that means 6000 flows are contending for places in the emc.  
Is the effect we see here to do with disabling recirculated packets in 
particular or just reducing contention on the emc in general. I know that the 
recirculated pkt hashes require software hashing albeit a small amount so they 
do make a good category of packet to drop from the emc when contention is 
severe.

Once there are too many entries contending  for any cache it's going to become 
a liability as the lookup_cost + (miss_rate * cost_of_miss) grows to be greater 
that the cost_of_miss. And in that case any scheme where a very cheap decision 
can be made to not insert a certain category of packets and to also not check 
the emc for that same category will reduce the cache miss rate.

But I'm not sure that is_recirculated is the best categorization on which to 
make that decision. Mainly because it is not controllable. In this test case 
50% pkts were recirculated so by ruling these packets out of eligibility for 
the EMC you get a really large reduction in EMC contention. However you can 
never know ahead of time if any packets will be recirc'd. You may have a 
situation where the EMC is totally swamped with 200K flows as above but none of 
them are re-circ'd packets so ruling out those packets will give no benefit. 


>  lib/dpif-netdev.c | 46 ++

Re: [ovs-dev] [PATCH V11 09/33] dpif: Save added ports in a port map for netdev flow api use

2017-06-23 Thread Darrell Ball


On 6/21/17, 4:27 AM, "ovs-dev-boun...@openvswitch.org on behalf of Roi Dayan" 
 wrote:



On 20/06/2017 03:07, Joe Stringer wrote:
> On 13 June 2017 at 08:03, Roi Dayan  wrote:
>> From: Paul Blakey 
>>
>> To use netdev flow offloading api, dpifs needs to iterate over
>> added ports. This addition inserts the added dpif ports in a hash map,
>> The map will also be used to translate dpif ports to netdevs.
>>
>> Signed-off-by: Paul Blakey 
>> Reviewed-by: Roi Dayan 
>> Reviewed-by: Simon Horman 
>> Acked-by: Flavio Leitner 
>
> Hi Roi, Simon,
>
> It looks like this patch has introduced a regression with regards to
> clean shutdown of OVS. Unfortunately I wasn't able to discover this
> until now because the tree has been a bit unstable lately with respect
> to running the system-traffic tests and OVS cleaning up after itself.
> I've bisected it down now, though.
>
> To observe the issue, run:
> # make check-system-userspace TESTSUITEFLAGS="1"
> # ip link show ovs-netdev
>
> If you find that the ovs-netdev device exists, then you are observing
> the problem.
>
> Typically when the 'check-system-userspace' tests run successfully,
> they should initiate complete and proper shutdown of OVS and removal
> of related state via "ovs-appctl exit --cleanup", which you can find
> invoked via the shell function kill_ovs_vswitchd which is queued up in
> _OVS_VSWITCHD_START via the OVS "on_exit" function. However, since
> this patch was merged the test no longer cleans up properly after
> itself.
>
> Note that if you run any subsequent tests without cleaning up, then
> they will fail. For instance, if you run tests 1-5 you'll see that 1
> succeeds and 2-5 fail. Subsequent runs of any test will fail until you
> have removed the ovs-netdev device.
>
> Would you have some time to look into why ovs-vswitchd is not cleaning
> up this device during "ovs-appctl exit --cleanup" after this patch was
> applied?
>
> Thanks,
> Joe
>

We'll check it. Thanks

I hit this issue last week and worked around it after spending time
to figure out what had happened.

I sent a patch here:

https://patchwork.ozlabs.org/patch/780147/

Darrell

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=IGjDuLukWZUY9GwM5Fyb1aRjObV2fhSpVYinM7HfBE8&s=7VuqjZDLHGzgi40lHuo_AEFmZY4cHYl-h5TxMnEjR_4&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [patch_v2] dpif: Fix cleanup of userspace datapath.

2017-06-23 Thread Darrell Ball
Hardware offload introduced extra tracking of netdev ports.  This
included ovs-netdev, which is really for internal infra usage for
the userpace datapath.  This breaks cleanup of the userspace
datapath.  One effect is that all userspace datapath system tests
fail except for the first one run. There is no need to do this
extra tracking of this port, hence it is skipped by also checking
for the port name as the type does not help in this case.  Adding
an extra type or field is another way to fix this, but this would
probably be overkill as this port is a constant and the misuse
potential very limited.

CC: Paul Blakey 
Fixes: 32b77c316d9982("dpif: Save added ports in a port map.")
Signed-off-by: Darrell Ball 
---

v1->v2: Add a utility function to check for "internal" ports.
Add an extra check for add port case.

 lib/dpif.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/lib/dpif.c b/lib/dpif.c
index 10bdd70..754b83b 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -100,6 +100,16 @@ static bool should_log_flow_message(const struct 
vlog_module *module,
 /* Incremented whenever tnl route, arp, etc changes. */
 struct seq *tnl_conf_seq;
 
+static bool
+dpif_is_internal_port(const char *type, const char *name)
+{
+/* ovs-netdev is a tap device that is used as an
+ * internal port for the userspace datapath, hence
+ * treat it as such. */
+return !strcmp(type, "internal") ||
+   !strcmp(name, "ovs-netdev");
+}
+
 static void
 dp_initialize(void)
 {
@@ -350,7 +360,7 @@ do_open(const char *name, const char *type, bool create, 
struct dpif **dpifp)
 struct netdev *netdev;
 int err;
 
-if (!strcmp(dpif_port.type, "internal")) {
+if (dpif_is_internal_port(dpif_port.type, dpif_port.name)) {
 continue;
 }
 
@@ -556,7 +566,9 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, 
odp_port_t *port_nop)
 VLOG_DBG_RL(&dpmsg_rl, "%s: added %s as port %"PRIu32,
 dpif_name(dpif), netdev_name, port_no);
 
-if (strcmp(netdev_get_type(netdev), "internal")) {
+if (!dpif_is_internal_port(netdev_get_type(netdev),
+   netdev_get_name(netdev))) {
+
 struct dpif_port dpif_port;
 
 dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev));
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 7/7] userspace: Introduce packet_type in OF 1.5 packet-out

2017-06-23 Thread Zoltán Balogh
Introducing packet_type in OF 1.5 packet-out.
Partly based on Jean Tourrilhes's work.

Add test cases for OF1.5 packet-out
Add negative test case for OF1.5 packet-out

Signed-off-by: Jean Tourrilhes 
Signed-off-by: Zoltan Balogh 
Co-authored-by: Jan Scheurich 
Signed-off-by: Ben Pfaff 
---
 lib/flow.c| 36 +--
 lib/ofp-parse.c   | 13 +
 lib/ofp-print.c   |  4 +--
 lib/ofp-util.c|  2 ++
 ofproto/ofproto.c |  3 ++
 tests/ofproto.at  | 81 +++
 utilities/ovs-ofctl.c |  1 +
 7 files changed, 123 insertions(+), 17 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index dbca4d03d..75a91cc6a 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -1441,6 +1441,8 @@ void
 flow_wildcards_init_for_packet(struct flow_wildcards *wc,
const struct flow *flow)
 {
+ovs_be16 dl_type = OVS_BE16_MAX;
+
 memset(&wc->masks, 0x0, sizeof wc->masks);
 
 /* Update this function whenever struct flow changes. */
@@ -1493,25 +1495,29 @@ flow_wildcards_init_for_packet(struct flow_wildcards 
*wc,
 /* actset_output wildcarded. */
 
 WC_MASK_FIELD(wc, packet_type);
-WC_MASK_FIELD(wc, dl_dst);
-WC_MASK_FIELD(wc, dl_src);
-WC_MASK_FIELD(wc, dl_type);
-
-/* No need to set mask of inner VLANs that don't exist. */
-for (int i = 0; i < FLOW_MAX_VLAN_HEADERS; i++) {
-/* Always show the first zero VLAN. */
-WC_MASK_FIELD(wc, vlans[i]);
-if (flow->vlans[i].tci == htons(0)) {
-break;
+if (flow->packet_type == htonl(PT_ETH)) {
+WC_MASK_FIELD(wc, dl_dst);
+WC_MASK_FIELD(wc, dl_src);
+WC_MASK_FIELD(wc, dl_type);
+/* No need to set mask of inner VLANs that don't exist. */
+for (int i = 0; i < FLOW_MAX_VLAN_HEADERS; i++) {
+/* Always show the first zero VLAN. */
+WC_MASK_FIELD(wc, vlans[i]);
+if (flow->vlans[i].tci == htons(0)) {
+break;
+}
 }
+dl_type = flow->dl_type;
+} else {
+dl_type = pt_ns_type_be(flow->packet_type);
 }
 
-if (flow->dl_type == htons(ETH_TYPE_IP)) {
+if (dl_type == htons(ETH_TYPE_IP)) {
 WC_MASK_FIELD(wc, nw_src);
 WC_MASK_FIELD(wc, nw_dst);
 WC_MASK_FIELD(wc, ct_nw_src);
 WC_MASK_FIELD(wc, ct_nw_dst);
-} else if (flow->dl_type == htons(ETH_TYPE_IPV6)) {
+} else if (dl_type == htons(ETH_TYPE_IPV6)) {
 WC_MASK_FIELD(wc, ipv6_src);
 WC_MASK_FIELD(wc, ipv6_dst);
 WC_MASK_FIELD(wc, ipv6_label);
@@ -1523,15 +1529,15 @@ flow_wildcards_init_for_packet(struct flow_wildcards 
*wc,
 WC_MASK_FIELD(wc, ct_ipv6_src);
 WC_MASK_FIELD(wc, ct_ipv6_dst);
 }
-} else if (flow->dl_type == htons(ETH_TYPE_ARP) ||
-   flow->dl_type == htons(ETH_TYPE_RARP)) {
+} else if (dl_type == htons(ETH_TYPE_ARP) ||
+   dl_type == htons(ETH_TYPE_RARP)) {
 WC_MASK_FIELD(wc, nw_src);
 WC_MASK_FIELD(wc, nw_dst);
 WC_MASK_FIELD(wc, nw_proto);
 WC_MASK_FIELD(wc, arp_sha);
 WC_MASK_FIELD(wc, arp_tha);
 return;
-} else if (eth_type_mpls(flow->dl_type)) {
+} else if (eth_type_mpls(dl_type)) {
 for (int i = 0; i < FLOW_MAX_MPLS_LABELS; i++) {
 WC_MASK_FIELD(wc, mpls_lse[i]);
 if (flow->mpls_lse[i] & htonl(MPLS_BOS_MASK)) {
diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c
index 8e2448b20..528b75b4f 100644
--- a/lib/ofp-parse.c
+++ b/lib/ofp-parse.c
@@ -667,6 +667,19 @@ parse_ofp_packet_out_str__(struct ofputil_packet_out *po, 
char *string,
 goto out;
 }
 match_set_in_port(&po->flow_metadata, in_port);
+} else if (!strcmp(name, "packet_type")) {
+char *ns = value;
+char *ns_type = strstr(value, ",");
+if (ns_type) {
+ovs_be32 packet_type;
+*ns_type = '\0';
+packet_type = PACKET_TYPE_BE(strtoul(ns, NULL, 0),
+ strtoul(++ns_type, NULL, 0));
+match_set_packet_type(&po->flow_metadata, packet_type);
+} else {
+error = xasprintf("%s(%s) can't be interpreted", name, value);
+goto out;
+}
 } else if (!strcmp(name, "packet")) {
 const char *error_msg = eth_from_hex(value, &packet);
 if (error_msg) {
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 8a6c54e1d..4370cb522 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -256,9 +256,9 @@ ofp_print_packet_out(struct ds *string, const struct 
ofp_header *oh,
 if (po.buffer_id == UINT32_MAX) {
 ds_put_format(string, " data_len=%"PRIuSIZE, po.packet_len);
 if (verbosity > 0 && po.packet_len > 0) {
-/* Packet Out can only carry Ethernet packets. */
+ovs_be32 po

[ovs-dev] [PATCH v4 5/7] tests: Added unit tests in packet-type-aware.at

2017-06-23 Thread Zoltán Balogh
From: Jan Scheurich 

First and second unit tests perform basic verification.

The third one is a triangular bridge setup test case. It tests dataplane
in non-PTAP and ptap bridges in conjunction with L2 and L3 GRE tunnels.
It uses veth ports, therefore requires root privileges.

A simplified version of the third test is added to system userspace unit tests.

 GRE tunneling test setup for PTAP bridge

 192.168.10.10   192.168.10.20 192.168.10.30
  n1   n2n3
  || |
   +--o--+  +--o--+   +--o--+
   |br-in1   |  |br-in2   |   |br-in3   |
   | |  |   (PTAP)|   | |
   +--o--+  +--o--+   +--o--+
 gre  gre   gre
   10.0.0.1(10.0.0.2)(10.0.0.3)
  (20.0.0.1)20.0.0.2 (20.0.0.3)
  (30.0.0.1) LOCAL (30.0.0.2) LOCAL   30.0.0.3  LOCAL
   +---o-+  +---o-+   +---o-+
   |br-p1|  |br-p2|   |br-p3|
   +--o--+  +--o--+   +--o--+
 p1-0 || p2-0| p3-0
 p0-1 || p0-2| p0-3
   +--oo-o--+
   |  br0   |
   ++

   GRE tunnel ports:
  No Bridge  NamePacket-Type Remote bridge & ports
 ---
  1020   br-in1  gre-12  legacy-l2   br-in2 2010 (ptap)
  1021   br-in1  gre-12_l3   legacy-l3 same
  1030   br-in1  gre-13  legacy-l2   br-in3 3010 (l2)
  2010   br-in2  gre-21  ptapbr-in1 1020 (l2), 1021 (l3)
  2030   br-in2  gre-23  ptapbr-in3 3020 (l2), 3021 (l3)
  3010   br-in1  gre-31  legacy-l2   br-in1 1030 (l2)
  3020   br-in1  gre-32  legacy-l2   br-in2 2010 (ptap)
  3021   br-in1  gre-32_l3   legacy-l3 same

Signed-off-by: Jan Scheurich 
Signed-off-by: Ben Pfaff 
---
 tests/automake.mk   |   6 +-
 tests/packet-type-aware.at  | 463 
 tests/system-userspace-packet-type-aware.at | 418 +
 tests/system-userspace-testsuite.at |   1 +
 tests/testsuite.at  |   1 +
 5 files changed, 887 insertions(+), 2 deletions(-)
 create mode 100644 tests/packet-type-aware.at
 create mode 100644 tests/system-userspace-packet-type-aware.at

diff --git a/tests/automake.mk b/tests/automake.mk
index a13b55e47..3118bbc27 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -98,7 +98,8 @@ TESTSUITE_AT = \
tests/ovn-sbctl.at \
tests/ovn-controller.at \
tests/ovn-controller-vtep.at \
-   tests/mcast-snooping.at
+   tests/mcast-snooping.at \
+   tests/packet-type-aware.at
 
 SYSTEM_KMOD_TESTSUITE_AT = \
tests/system-common-macros.at \
@@ -108,7 +109,8 @@ SYSTEM_KMOD_TESTSUITE_AT = \
 SYSTEM_USERSPACE_TESTSUITE_AT = \
tests/system-userspace-testsuite.at \
tests/system-ovn.at \
-   tests/system-userspace-macros.at
+   tests/system-userspace-macros.at \
+   tests/system-userspace-packet-type-aware.at
 
 SYSTEM_TESTSUITE_AT = \
tests/system-common-macros.at \
diff --git a/tests/packet-type-aware.at b/tests/packet-type-aware.at
new file mode 100644
index 0..110407857
--- /dev/null
+++ b/tests/packet-type-aware.at
@@ -0,0 +1,463 @@
+AT_BANNER([packet-type-aware pipeline])
+
+AT_SETUP([ptap - legal flow entries in ptap bridge])
+
+OVS_VSWITCHD_START
+
+AT_CHECK([
+ovs-ofctl del-flows br0
+ovs-ofctl -Oopenflow13 add-flow br0 
priority=1,dl_src=11:22:33:44:55:66,eth_type=0x1234,actions=drop
+ovs-ofctl -Oopenflow14 add-flow br0 
priority=1,ip,nw_dst=10.11.12.13,actions=drop
+ovs-ofctl -Oopenflow15 add-flow br0 priority=1,ipv6,nw_proto=6,actions=drop
+ovs-ofctl -Oopenflow14 add-flow br0 
priority=2,packet_type=\(0,0x0\),dl_src=11:22:33:44:55:66,dl_type=0x4567,actions=drop
+ovs-ofctl -Oopenflow15 add-flow br0 
priority=2,packet_type=\(0,0x0\),arp,arp_tpa=10.11.12.13,actions=drop
+ovs-ofctl -Oopenflow15 add-flow br0 
priority=3,packet_type=\(1,0x806\),arp_tpa=10.11.12.13,actions=drop
+ovs-ofctl -Oopenflow13 add-flow br0 
priority=3,packet_type=\(1,0x800\),nw_dst=10.11.12.13,actions=drop
+ovs-ofctl -Oopenflow14 add-flow br0 
priority=3,packet_type=\(1,0x86dd\),ipv6_dst=1234:5678::/32,actions=drop
+], [0])
+
+AT_CHECK([ovs-ofctl -Oopenflow15 dump-flows br

[ovs-dev] [PATCH v4 6/7] userspace: Complete Packet In handling

2017-06-23 Thread Zoltán Balogh
From: Jan Scheurich 

Send packet_in for non-Ethernet packets.
Include packet_type in Packet In for ptap bridges.

Signed-off-by: Jan Scheurich 
Signed-off-by: Ben Pfaff 
---
 lib/flow.c   | 4 
 lib/ofp-print.c  | 7 +++
 ofproto/ofproto-dpif-xlate.c | 5 -
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index 6be645730..dbca4d03d 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -994,6 +994,10 @@ flow_get_metadata(const struct flow *flow, struct match 
*flow_metadata)
 }
 
 match_set_in_port(flow_metadata, flow->in_port.ofp_port);
+if (flow->packet_type != htonl(PT_ETH)) {
+match_set_packet_type(flow_metadata, flow->packet_type);
+}
+
 if (flow->ct_state != 0) {
 match_set_ct_state(flow_metadata, flow->ct_state);
 if (is_ct_valid(flow, NULL, NULL) && flow->ct_nw_proto != 0) {
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index b1c412ea4..8a6c54e1d 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -218,10 +218,9 @@ ofp_print_packet_in(struct ds *string, const struct 
ofp_header *oh,
 }
 
 if (verbosity > 0) {
-/* Packet In can only carry Ethernet packets. */
-char *packet = ofp_packet_to_string(public->packet,
-public->packet_len,
-htonl(PT_ETH));
+char *packet = ofp_packet_to_string(
+public->packet, public->packet_len,
+public->flow_metadata.flow.packet_type);
 ds_put_cstr(string, packet);
 free(packet);
 }
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index ce364b361..1f4fe1dd6 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4259,11 +4259,6 @@ execute_controller_action(struct xlate_ctx *ctx, int len,
 return;
 }
 
-if (packet->packet_type != htonl(PT_ETH)) {
-dp_packet_delete(packet);
-return;
-}
-
 /* A packet sent by an action in a table-miss rule is considered an
  * explicit table miss.  OpenFlow before 1.3 doesn't have that concept so
  * it will get translated back to OFPR_ACTION for those versions. */
-- 
2.11.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 3/7] userspace: Add OXM field MFF_PACKET_TYPE

2017-06-23 Thread Zoltán Balogh
From: Jan Scheurich 

Allow packet type namespace OFPHTN_ETHERTYPE as alternative pre-requisite
for matching L3 protocols (MPLS, IP, IPv6, ARP etc).

Change the meta-flow definition of packet_type field to use the new
custom format MFS_PACKET_TYPE representing "(NS,NS_TYPE)".

Parsing routine for MFS_PACKET_TYPE added to meta-flow.c. Formatting
routine for field packet_type extracted from match_format() and moved to
flow.c to be used from meta-flow.c for formatting MFS_PACKET_TYPE.

Updated the ovs-fields man page source meta-flow.xml with documentation
for packet-type-aware bridges and added documentation for field packet_type.

Added packet_type to the matching properties in tests/ofproto.at.

If dl_type is unwildcarded due to later packet modification, make sure it
is cleared again if the original packet_type was not PT_ETH.

Signed-off-by: Jan Scheurich 
Signed-off-by: Ben Pfaff 
---
 build-aux/extract-ofp-fields|   3 +-
 include/openvswitch/match.h |   5 +
 include/openvswitch/meta-flow.h |  20 
 lib/flow.c  |  34 +-
 lib/flow.h  |  27 +++--
 lib/learn.c |   1 +
 lib/match.c |  98 +++--
 lib/meta-flow.c |  86 +--
 lib/meta-flow.xml   | 156 +++
 lib/nx-match.c  |  34 +-
 lib/odp-util.c  |  38 +++
 lib/ofp-parse.c |  12 +++
 lib/ofp-util.c  |  67 +---
 ofproto/ofproto-dpif-xlate.c|   6 +-
 ofproto/tunnel.c|   2 -
 tests/dpif-netdev.at|  89 
 tests/odp.at|   1 +
 tests/ofproto-dpif.at   | 230 
 tests/ofproto.at|   1 +
 tests/ovs-ofctl.at  |   2 +-
 tests/pmd.at|   8 +-
 tests/tunnel-push-pop-ipv6.at   |   2 +-
 tests/tunnel-push-pop.at|   2 +-
 tests/tunnel.at |  18 ++--
 24 files changed, 657 insertions(+), 285 deletions(-)

diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields
index d5b8a8202..24dd756ad 100755
--- a/build-aux/extract-ofp-fields
+++ b/build-aux/extract-ofp-fields
@@ -36,7 +36,8 @@ FORMATTING = {"decimal":("MFS_DECIMAL",  1,   
8),
   "OpenFlow 1.1+ port": ("MFS_OFP_PORT_OXM", 4,   4),
   "frag":   ("MFS_FRAG", 1,   1),
   "tunnel flags":   ("MFS_TNL_FLAGS",2,   2),
-  "TCP flags":  ("MFS_TCP_FLAGS",2,   2)}
+  "TCP flags":  ("MFS_TCP_FLAGS",2,   2),
+  "packet type":("MFS_PACKET_TYPE",  4,   4)}
 
 PREREQS = {"none": "MFP_NONE",
"Ethernet": "MFP_ETHERNET",
diff --git a/include/openvswitch/match.h b/include/openvswitch/match.h
index 70da928fe..aca725265 100644
--- a/include/openvswitch/match.h
+++ b/include/openvswitch/match.h
@@ -23,6 +23,7 @@
 
 struct ds;
 struct ofputil_port_map;
+struct mf_field;
 
 /* A flow classification match.
  *
@@ -119,6 +120,10 @@ void match_set_ct_ipv6_dst_masked(struct match *, const 
struct in6_addr *,
   const struct in6_addr *);
 
 void match_set_packet_type(struct match *, ovs_be32 packet_type);
+void match_set_default_packet_type(struct match *);
+bool match_has_default_packet_type(const struct match *);
+void match_add_ethernet_prereq(struct match *, const struct mf_field *);
+
 void match_set_skb_priority(struct match *, uint32_t skb_priority);
 void match_set_dl_type(struct match *, ovs_be16);
 void match_set_dl_src(struct match *, const struct eth_addr );
diff --git a/include/openvswitch/meta-flow.h b/include/openvswitch/meta-flow.h
index cbfd3ba65..fc109501d 100644
--- a/include/openvswitch/meta-flow.h
+++ b/include/openvswitch/meta-flow.h
@@ -133,6 +133,11 @@ struct ofputil_tlv_table_mod;
  *
  *   TCP flags: See the description of tcp_flags in ovs-ofctl(8).
  *
+ *   packet type: A pair of packet type namespace NS and NS_TYPE within
+ *   that namespace "(NS,NS_TYPE)". NS and NS_TYPE are formatted in
+ *   decimal or hexadecimal as and accept decimal and hexadecimal (with
+ *   0x prefix) at parsing.
+ *
  *   Prerequisites:
  *
  * The field's prerequisites.  The values should be straightfoward.
@@ -248,6 +253,20 @@ enum OVS_PACKED_ENUM mf_field_id {
  */
 MFF_RECIRC_ID,
 
+/* "packet_type".
+ *
+ * Define the packet type in OpenFlow 1.5+.
+ *
+ * Type: be32.
+ * Maskable: no.
+ * Formatting: packet type.
+ * Prerequisites: none.
+ * Access: read-only.
+ * NXM: none.
+ * OXM: OXM_OF_PACKET_TYPE(44) since OF1.5 and v2.8.
+ */
+MFF_PACKET_TYPE,
+
 /* "conj_id".
  *
  * ID for "conjunction" actions.  Please refer to ovs-ofctl(8)
@@ -1860,6 +1879,7 @@ enum OVS_PACKED_ENUM mf_string {
 MFS_FRAG, 

[ovs-dev] [PATCH v4 4/7] userspace: Handling of versatile tunnel ports

2017-06-23 Thread Zoltán Balogh
From: Ben Pfaff 

In netdev_gre_build_header(), GRE protocol and VXLAN next_potocol is set based
on packet_type of flow. If it's about an Ethernet packet, it is set to
ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set
according to the name space type.

Signed-off-by: Jan Scheurich 
Signed-off-by: Ben Pfaff 
---
 NEWS  |   6 +--
 lib/meta-flow.xml |  30 ++--
 lib/netdev-bsd.c  |   1 +
 lib/netdev-dpdk.c |   1 +
 lib/netdev-dummy.c|   1 +
 lib/netdev-linux.c|   1 +
 lib/netdev-native-tnl.c   |  23 ++---
 lib/netdev-provider.h |   6 +++
 lib/netdev-vport.c| 106 ++
 lib/netdev-vport.h|   1 -
 lib/netdev.c  |   8 
 lib/netdev.h  |  29 +++-
 ofproto/ofproto-dpif-xlate.c  |  35 --
 ofproto/ofproto-dpif.c|   4 +-
 ofproto/tunnel.c  |  27 ---
 tests/tunnel-push-pop-ipv6.at |   4 +-
 tests/tunnel-push-pop.at  |   4 +-
 vswitchd/vswitch.xml  |  94 ++---
 18 files changed, 279 insertions(+), 102 deletions(-)

diff --git a/NEWS b/NEWS
index 4ea7e628e..0f2604ff5 100644
--- a/NEWS
+++ b/NEWS
@@ -59,11 +59,9 @@ Post-v2.7.0
  * OVN services are no longer restarted automatically after upgrade.
- Add --cleanup option to command 'ovs-appctl exit' (see ovs-vswitchd(8)).
- L3 tunneling:
- * Add "layer3" options for tunnel ports that support non-Ethernet (L3)
-   payload (GRE, VXLAN-GPE).
+ * Use new tunnel port option "packet_type" to configure L2 vs. L3.
  * New vxlan tunnel extension "gpe" to support VXLAN-GPE tunnels.
- * Transparently pop and push Ethernet headers at transmit/reception
-   of packets to/from L3 tunnels.
+ * New support for non-Ethernet (L3) payloads in GRE and VXLAN-GPE.
- The BFD detection multiplier is now user-configurable.
- Add experimental support for hardware offloading
  * HW offloading is disabled by default.
diff --git a/lib/meta-flow.xml b/lib/meta-flow.xml
index 856e1ba8c..634ab69e4 100644
--- a/lib/meta-flow.xml
+++ b/lib/meta-flow.xml
@@ -26,19 +26,27 @@
 networking technology in use are called called root fields.
 Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
 and this remains the default mode of operation for Open vSwitch bridges.
-In this mode, when a packet is received from a non-Ethernet interfaces,
-such as a layer-3 LISP or GRE tunnel, Open vSwitch force-fits it to this
+When a packet is received from a non-Ethernet interfaces, such as a layer-3
+LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
 Ethernet-centric point of view by pretending that an Ethernet header is
 present whose Ethernet type that indicates the packet's actual type (and
 whose source and destination addresses are all-zero).
   
 
   
-Open vSwitch 2.8 and later supports the ``packet type-aware pipeline''
-concept introduced in OpenFlow 1.5.  A bridge configured to be packet
-type-aware can handle packets of multiple networking technologies, such as
-Ethernet, IP, ARP, MPLS, or NSH in parallel.  Such a bridge does not have
-any root fields.
+Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
+concept introduced in OpenFlow 1.5.  Such a pipeline does not have any root
+fields.  Instead, a new metadata field, ,
+indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
+or another type.  For backward compatibility, by default Open vSwitch 2.8
+imitates the behavior of Open vSwitch 2.7 and earlier.  Later versions of
+Open vSwitch may change the default, and in the meantime controllers can
+turn off this legacy behavior, on a port-by-port basis, by setting
+options:packet_type to ptap in the
+Interface table.  This is significant only for ports that can
+handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
+GRE tunnel ports.  See ovs-vwitchd.conf.db(5) for more
+information.
   
 
   
@@ -332,14 +340,6 @@ tcp,tp_src=0x07c0/0xfff0
 mplsm  eth_type=0x8848
   
 
-  
-These shorthand notations continue to work in packet type-aware bridges.
-The absence of a packet_type match implies
-packet_type=ethernet, so that shorthands match on Ethernet
-packets with the implied eth_type. Please note that the shorthand
-ip does not match packets of packet_type (1,0x800) for IPv4.
-  
-
 
   Evolution of OpenFlow Fields
 
diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
index f863a189c..6cc83d347 100644
--- a/lib/netdev-bsd.c
+++ b/lib/netdev-bsd.c
@@ -1517,6 +1517,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum 
netdev_flags off,
  \
 GET_FEATURES,

[ovs-dev] [PATCH v4 2/7] nx-match: Add context argument to nxm_put__().

2017-06-23 Thread Zoltán Balogh
From: Ben Pfaff 

An upcoming commit will need to pass an extra piece of data from
nx_put_raw() into all of its direct and indirect calls to nxm_put__().
This commit prepares for that by switching from a "struct ofpbuf *"
parameter to a context structure that, currently, contains just a
struct ofpbuf *.  The upcoming commit will add another member to the
context struct.

This commit has no visible effect on behavior.

Signed-off-by: Ben Pfaff 
---
 lib/nx-match.c | 232 +
 lib/nx-match.h |   6 +-
 lib/tun-metadata.c |   4 +-
 3 files changed, 131 insertions(+), 111 deletions(-)

diff --git a/lib/nx-match.c b/lib/nx-match.c
index 334ecd4a3..6278b7758 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -772,203 +772,222 @@ oxm_pull_field_array(const void *fields_data, size_t 
fields_len,
  * 'put' functions whose names end in 'm' add a field that might be wildcarded.
  * Other 'put' functions add exact-match fields.
  */
+
+struct nxm_put_ctx {
+struct ofpbuf *output;
+};
+
 void
-nxm_put__(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
-  const void *value, const void *mask, size_t n_bytes)
+nxm_put_entry_raw(struct ofpbuf *b,
+  enum mf_field_id field, enum ofp_version version,
+  const void *value, const void *mask, size_t n_bytes)
 {
 nx_put_header_len(b, field, version, !!mask, n_bytes);
 ofpbuf_put(b, value, n_bytes);
 if (mask) {
 ofpbuf_put(b, mask, n_bytes);
 }
+}
 
+static void
+nxm_put__(struct nxm_put_ctx *ctx,
+  enum mf_field_id field, enum ofp_version version,
+  const void *value, const void *mask, size_t n_bytes)
+{
+nxm_put_entry_raw(ctx->output, field, version, value, mask, n_bytes);
 }
 
 static void
-nxm_put(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
+nxm_put(struct nxm_put_ctx *ctx,
+enum mf_field_id field, enum ofp_version version,
 const void *value, const void *mask, size_t n_bytes)
 {
 if (!is_all_zeros(mask, n_bytes)) {
 bool masked = !is_all_ones(mask, n_bytes);
-nxm_put__(b, field, version, value, masked ? mask : NULL, n_bytes);
+nxm_put__(ctx, field, version, value, masked ? mask : NULL, n_bytes);
 }
 }
 
 static void
-nxm_put_8m(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
+nxm_put_8m(struct nxm_put_ctx *ctx,
+   enum mf_field_id field, enum ofp_version version,
uint8_t value, uint8_t mask)
 {
-nxm_put(b, field, version, &value, &mask, sizeof value);
+nxm_put(ctx, field, version, &value, &mask, sizeof value);
 }
 
 static void
-nxm_put_8(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
-  uint8_t value)
+nxm_put_8(struct nxm_put_ctx *ctx,
+  enum mf_field_id field, enum ofp_version version, uint8_t value)
 {
-nxm_put__(b, field, version, &value, NULL, sizeof value);
+nxm_put__(ctx, field, version, &value, NULL, sizeof value);
 }
 
 static void
-nxm_put_16m(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
+nxm_put_16m(struct nxm_put_ctx *ctx,
+enum mf_field_id field, enum ofp_version version,
 ovs_be16 value, ovs_be16 mask)
 {
-nxm_put(b, field, version, &value, &mask, sizeof value);
+nxm_put(ctx, field, version, &value, &mask, sizeof value);
 }
 
 static void
-nxm_put_16(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
-   ovs_be16 value)
+nxm_put_16(struct nxm_put_ctx *ctx,
+   enum mf_field_id field, enum ofp_version version, ovs_be16 value)
 {
-nxm_put__(b, field, version, &value, NULL, sizeof value);
+nxm_put__(ctx, field, version, &value, NULL, sizeof value);
 }
 
 static void
-nxm_put_32m(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
+nxm_put_32m(struct nxm_put_ctx *ctx,
+enum mf_field_id field, enum ofp_version version,
 ovs_be32 value, ovs_be32 mask)
 {
-nxm_put(b, field, version, &value, &mask, sizeof value);
+nxm_put(ctx, field, version, &value, &mask, sizeof value);
 }
 
 static void
-nxm_put_32(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
-   ovs_be32 value)
+nxm_put_32(struct nxm_put_ctx *ctx,
+   enum mf_field_id field, enum ofp_version version, ovs_be32 value)
 {
-nxm_put__(b, field, version, &value, NULL, sizeof value);
+nxm_put__(ctx, field, version, &value, NULL, sizeof value);
 }
 
 static void
-nxm_put_64m(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version,
+nxm_put_64m(struct nxm_put_ctx *ctx,
+enum mf_field_id field, enum ofp_version version,
 ovs_be64 value, ovs_be64 mask)
 {
-nxm_put(b, field, version, &value, &mask, sizeof value);
+nxm_put(ctx, field, version, &value, &mask, sizeof value);
 }
 
 static void
-nxm_put_128m(struct ofpbuf *b,
+nxm_put_128m(struct nxm_put_ctx *ctx,
 

[ovs-dev] [PATCH v4 1/7] ofpbuf: New function ofpbuf_insert().

2017-06-23 Thread Zoltán Balogh
From: Ben Pfaff 

This will receive its first users in an upcoming commit.

Signed-off-by: Ben Pfaff 
---
 include/openvswitch/ofpbuf.h |  1 +
 lib/ofpbuf.c | 18 ++
 2 files changed, 19 insertions(+)

diff --git a/include/openvswitch/ofpbuf.h b/include/openvswitch/ofpbuf.h
index bc25bb8a1..6142f4a58 100644
--- a/include/openvswitch/ofpbuf.h
+++ b/include/openvswitch/ofpbuf.h
@@ -141,6 +141,7 @@ void ofpbuf_reserve(struct ofpbuf *, size_t);
 void *ofpbuf_push_uninit(struct ofpbuf *b, size_t);
 void *ofpbuf_push_zeros(struct ofpbuf *, size_t);
 void *ofpbuf_push(struct ofpbuf *b, const void *, size_t);
+void ofpbuf_insert(struct ofpbuf *b, size_t offset, const void *data, size_t);
 
 static inline size_t ofpbuf_headroom(const struct ofpbuf *);
 static inline size_t ofpbuf_tailroom(const struct ofpbuf *);
diff --git a/lib/ofpbuf.c b/lib/ofpbuf.c
index f4a904064..9c0623688 100644
--- a/lib/ofpbuf.c
+++ b/lib/ofpbuf.c
@@ -461,6 +461,24 @@ ofpbuf_push(struct ofpbuf *b, const void *p, size_t size)
 return dst;
 }
 
+/* Inserts the 'n' bytes of 'data' into 'b' starting at the given 'offset',
+ * moving data forward as necessary to make room.
+ *
+ * 'data' must not point inside 'b'. */
+void
+ofpbuf_insert(struct ofpbuf *b, size_t offset, const void *data, size_t n)
+{
+if (offset < b->size) {
+ofpbuf_put_uninit(b, n);
+memmove((char *) b->data + offset + n, (char *) b->data + offset,
+b->size - offset);
+memcpy((char *) b->data + offset, data, n);
+} else {
+ovs_assert(offset == b->size);
+ofpbuf_put(b, data, n);
+}
+}
+
 /* Returns the data in 'b' as a block of malloc()'d memory and frees the buffer
  * within 'b'.  (If 'b' itself was dynamically allocated, e.g. with
  * ofpbuf_new(), then it should still be freed with, e.g., ofpbuf_delete().) */
-- 
2.11.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 0/7] Packet type aware pipeline

2017-06-23 Thread Zoltán Balogh

This series was started by Ben Pfaff, v3 can be found here:
https://patchwork.ozlabs.org/patch/778070/
https://patchwork.ozlabs.org/patch/778071/
https://patchwork.ozlabs.org/patch/778076/
https://patchwork.ozlabs.org/patch/778072/
https://patchwork.ozlabs.org/patch/778074/
https://patchwork.ozlabs.org/patch/778073/
https://patchwork.ozlabs.org/patch/778075/

Ben's series is based on this one:
https://patchwork.ozlabs.org/patch/770490/
https://patchwork.ozlabs.org/patch/770487/
https://patchwork.ozlabs.org/patch/770495/
https://patchwork.ozlabs.org/patch/770498/
https://patchwork.ozlabs.org/patch/770488/
https://patchwork.ozlabs.org/patch/770489/

v1->v2:
  - Squash fixup patches.
  - Apply changes agreed with Jan.
  - Not yet done: Figure out whether to really show packet_type in (some)
match_format() output.
  - New patch at the end unsuccessfully tries to re-enable packet-aware
test.  Either I don't have enough insight yet, or it just reveals a
bug or two.
  - 4 new patches at beginning.  First one is trivial.  Next 3 are intended
to make it easier to debug the packet aware test that is still failing.
Jan, you don't have to feel obligated to review these if you feel they
are off-topic; I will get separate reviews.

v2->v3:
  - Drop first two patches, which have been applied to master.
  - Drop patches 3 and 4, which have been moved to a new series.
  - Drop last patch, which was incomplete.  I'll hope that Jan or Zoltan
can pick it back up.
  - Apply Jan's and Zoltan's comments on a few other patches.

v3->v4:
  - Rebase to recent origin/master (0722f3410).
  - Re-introduce packet-type aware unit tests.
  - Unwildcard dl_type only for packet_type=PT_ETH.
  - Apply comments.

Ben Pfaff (3):
  ofpbuf: New function ofpbuf_insert().
  nx-match: Add context argument to nxm_put__().
  userspace: Handling of versatile tunnel ports

Jan Scheurich (3):
  userspace: Add OXM field MFF_PACKET_TYPE
  tests: Added unit tests in packet-type-aware.at
  userspace: Complete Packet In handling

Zoltán Balogh (1):
  userspace: Introduce packet_type in OF 1.5 packet-out

 NEWS|   6 +-
 build-aux/extract-ofp-fields|   3 +-
 include/openvswitch/match.h |   5 +
 include/openvswitch/meta-flow.h |  20 ++
 include/openvswitch/ofpbuf.h|   1 +
 lib/flow.c  |  74 -
 lib/flow.h  |  27 +-
 lib/learn.c |   1 +
 lib/match.c |  98 --
 lib/meta-flow.c |  86 +-
 lib/meta-flow.xml   | 156 --
 lib/netdev-bsd.c|   1 +
 lib/netdev-dpdk.c   |   1 +
 lib/netdev-dummy.c  |   1 +
 lib/netdev-linux.c  |   1 +
 lib/netdev-native-tnl.c |  23 +-
 lib/netdev-provider.h   |   6 +
 lib/netdev-vport.c  | 106 +--
 lib/netdev-vport.h  |   1 -
 lib/netdev.c|   8 +
 lib/netdev.h|  29 +-
 lib/nx-match.c  | 264 +---
 lib/nx-match.h  |   6 +-
 lib/odp-util.c  |  38 +--
 lib/ofp-parse.c |  25 ++
 lib/ofp-print.c |  11 +-
 lib/ofp-util.c  |  69 -
 lib/ofpbuf.c|  18 ++
 lib/tun-metadata.c  |   4 +-
 ofproto/ofproto-dpif-xlate.c|  46 +--
 ofproto/ofproto-dpif.c  |   4 +-
 ofproto/ofproto.c   |   3 +
 ofproto/tunnel.c|  29 +-
 tests/automake.mk   |   6 +-
 tests/dpif-netdev.at|  89 +++---
 tests/odp.at|   1 +
 tests/ofproto-dpif.at   | 230 +++---
 tests/ofproto.at|  82 +
 tests/ovs-ofctl.at  |   2 +-
 tests/packet-type-aware.at  | 463 
 tests/pmd.at|   8 +-
 tests/system-userspace-packet-type-aware.at | 418 +
 tests/system-userspace-testsuite.at |   1 +
 tests/testsuite.at  |   1 +
 tests/tunnel-push-pop-ipv6.at   |   6 +-
 tests/tunnel-push-pop.at|   6 +-
 tests/tunnel.at |  18 +-
 utilities/ovs-ofctl.c   |   1 +
 vswitchd/vswitch.xml|  94 +-
 49 files changed, 2087 insertions(+), 510 deletions(-)
___
dev mailing list
d...@openvswi

Re: [ovs-dev] [PATCH] dpif-netdev: Remove useless port checking.

2017-06-23 Thread Greg Rose

On 06/20/2017 12:53 AM, Ilya Maximets wrote:

Since commit ff073a71f9bb ("dpif-netdev: Use hmap instead of
list+array for tracking ports."), 'is_valid_port_number()' is
equal to 'port_no != ODPP_NONE', and the expression below will
never be true.

Signed-off-by: Ilya Maximets 
---
  lib/dpif-netdev.c | 7 ---
  1 file changed, 7 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f83b632..fca277a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2257,8 +2257,6 @@ static int
  dpif_netdev_flow_from_nlattrs(const struct nlattr *key, uint32_t key_len,
struct flow *flow, bool probe)
  {
-odp_port_t in_port;
-
  if (odp_flow_key_to_flow(key, key_len, flow)) {
  if (!probe) {
  /* This should not happen: it indicates that
@@ -2280,11 +2278,6 @@ dpif_netdev_flow_from_nlattrs(const struct nlattr *key, 
uint32_t key_len,
  return EINVAL;
  }

-in_port = flow->in_port.odp_port;
-if (!is_valid_port_number(in_port) && in_port != ODPP_NONE) {
-return EINVAL;
-}
-
  if (flow->ct_state & DP_NETDEV_CS_UNSUPPORTED_MASK) {
  return EINVAL;
  }


LGTM

Acked-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash when EMC is disabled.

2017-06-23 Thread O Mahony, Billy
Hi Antonio,

> -Original Message-
> From: Fischetti, Antonio
> Sent: Friday, June 23, 2017 3:10 PM
> To: O Mahony, Billy ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash
> when EMC is disabled.
> 
> Hi Billy,
> thanks a lot for you suggestions. Those would really help re-factoring the
> code by avoiding duplications.
> The thing is that this patch 1/4 is mainly a preparation for the next patch 
> 2/4.
> So I did these changes with the next patch 2/4 in mind.
> 
> The final result I meant to achieve in patch 2/4 is the following.
> EMC lookup is skipped - not only when EMC is disabled - but also when
> (we're processing recirculated packets) && (the EMC is 'enough' full).
> The purpose is to avoid EMC thrashing.
> 
> Below is how the code looks like after applying patches 1/4 and 2/4.
> Please let me know if you can find some similar optimizations to avoid code
> duplications, that would be great.
> 
> /*
>  * EMC lookup is skipped when one or both of the following
>  * two cases occurs:
>  *
>  *   - EMC is disabled.  This is detected from cur_min.
>  *
>  *   - The EMC occupancy exceeds EMC_FULL_THRESHOLD and the
>  * packet to be classified is being recirculated.  When this
>  * happens also EMC insertions are skipped for recirculated
>  * packets.  So that EMC is used just to store entries which
>  * are hit from the 'original' packets.  This way the EMC
>  * thrashing is mitigated with a benefit on performance.
>  */
> if (!md_is_valid) {
> pkt_metadata_init(&packet->md, port_no);
> miniflow_extract(packet, &key->mf);  <== this fn must be called 
> after
> pkt_metadta_init
> /* This is not a recirculated packet. */
> if (OVS_LIKELY(cur_min)) {
> /* EMC is enabled.  We can retrieve the 5-tuple hash
>  * without considering the recirc id. */
> if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> key->hash = dp_packet_get_rss_hash(packet);
> } else {
> key->hash = miniflow_hash_5tuple(&key->mf, 0);
> dp_packet_set_rss_hash(packet, key->hash);
> }
> flow = emc_lookup(flow_cache, key);
> } else {
> /* EMC is disabled, skip emc_lookup. */
> flow = NULL;
> }
> } else {
> /* Recirculated packets. */
> miniflow_extract(packet, &key->mf);
> if (flow_cache->n_entries & EMC_FULL_THRESHOLD) {
> /* EMC occupancy is over the threshold.  We skip EMC
>  * lookup for recirculated packets. */
> flow = NULL;
> } else {
> if (OVS_LIKELY(cur_min)) {
> key->hash = dpif_netdev_packet_get_rss_hash(packet,
> &key->mf);
> flow = emc_lookup(flow_cache, key);
> } else {
> flow = NULL;
> }
> }
> }
> 
> 
> Basically patch 1/4 is mostly a preliminary change for 2/4.
> 
> Yes, patch 1/4 also allows to avoid reading hash when EMC is disabled.
> Or - for packets that are not recirculated - avoids calling
> recirc_depth_get_unsafe() when reading the hash.
> 
> Also, as these functions are critical for performance, I tend to avoid adding
> new Booleans that require new if statements.
[[BO'M]] 

Can you investigate refactoring this patch with something like below.  I think 
this is equivalent.  The current patch duplicates miniflow_extract, emc_lookup 
across the md_is_valid and !md_is_valid branches. It also duplicates some of 
the internals of get_rss_hash out into the !md_is_valid case and is difficult 
to follow. 

If the following suggestion works  the change in emc_processing from patch 2/4 
can easily be grafted on to that. 

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..a7e854d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4442,7 +4442,8 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, struct 
dp_packet *packet_,

 static inline uint32_t
 dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
-const struct miniflow *mf)
+const struct miniflow *mf,
+bool use_recirc_depth)
 {
 uint32_t hash, recirc_depth;

@@ -4456,7 +4457,7 @@ dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
 /* The RSS hash must account for the recirculation depth to avoid
  * collisions in the exact match cache */
 recirc_depth = *recirc_depth_get_unsafe();
-if (OVS_UNLIKELY(recirc_depth)) {
+if (OVS_UNLIKELY(use_recirc_depth && recirc_depth)) {

Re: [ovs-dev] [PATCH] openvswitch.h: OVS_KEY_ATTR_PACKET_TYPE is userspace-only.

2017-06-23 Thread Greg Rose

On 06/19/2017 04:30 PM, Ben Pfaff wrote:

This wasn't clear before.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/334271.html
Signed-off-by: Ben Pfaff 
---
  datapath/linux/compat/include/linux/openvswitch.h | 4 
  1 file changed, 4 insertions(+)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 4c88de1d610d..d41b7770790c 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -365,7 +365,11 @@ enum ovs_key_attr {
  OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ovs_tunnel_info */
  #endif

+#ifndef __KERNEL__
+/* Only used within userspace data path. */
  OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
+#endif
+
  __OVS_KEY_ATTR_MAX
  };



Acked-by: Greg Rose 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] Fix some -Wimplicit-fallthrough warnings building with GCC 7

2017-06-23 Thread Timothy Redaelli
-Wimplicit-fallthrough warns when a switch case falls through and since this
warning is enabled by -Wextra it breaks building with --enable-Werror.

Added "/* fall through */" comment when needed in order to avoid the warning.

Signed-off-by: Timothy Redaelli 
---

>From GCC 7 Release Note:   
> 
-Wimplicit-fallthrough warns when a switch case falls through.  
This warning has five different levels. 
The compiler is able to parse a wide range of fallthrough comments, 
depending on the level. 
It also handles control-flow statements, such as ifs.   
It's possible to suppress the warning by either adding a fallthrough comment,   
or by using a null statement: __attribute__ ((fallthrough));
(C, C++), or [[fallthrough]]; (C++17), or [[gnu::fallthrough]]; (C++11/C++14).  
This warning is enabled by -Wextra.

 lib/fat-rwlock.c  |   1 +
 lib/flow.c|   2 +-
 lib/hash.c|  15 ++-
 lib/ofp-util.c|   1 +
 lib/rstp-state-machines.c | 102 +++---
 ofproto/bond.c|   1 +
 ovn/lib/lex.c |   3 +-
 ovn/utilities/ovn-nbctl.c |   2 +
 ovn/utilities/ovn-sbctl.c |   2 +
 tests/test-ovn.c  |   1 +
 utilities/ovs-vsctl.c |   2 +
 11 files changed, 78 insertions(+), 54 deletions(-)

diff --git a/lib/fat-rwlock.c b/lib/fat-rwlock.c
index 19b714907..d913b2088 100644
--- a/lib/fat-rwlock.c
+++ b/lib/fat-rwlock.c
@@ -270,6 +270,7 @@ fat_rwlock_unlock(const struct fat_rwlock *rwlock_)
 
 case 1:
 ovs_mutex_unlock(&this->mutex);
+/* fall through */
 default:
 this->depth--;
 break;
diff --git a/lib/flow.c b/lib/flow.c
index d73e796a2..75c124051 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -2032,7 +2032,7 @@ flow_mask_hash_fields(const struct flow *flow, struct 
flow_wildcards *wc,
 memset(&wc->masks.tp_src, 0xff, sizeof wc->masks.tp_src);
 memset(&wc->masks.tp_dst, 0xff, sizeof wc->masks.tp_dst);
 }
-/* no break */
+/* fall through */
 case NX_HASH_FIELDS_SYMMETRIC_L3L4:
 if (flow->dl_type == htons(ETH_TYPE_IP)) {
 memset(&wc->masks.nw_src, 0xff, sizeof wc->masks.nw_src);
diff --git a/lib/hash.c b/lib/hash.c
index aa898a3b8..321f9d927 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -282,37 +282,50 @@ hash_bytes128(const void *p_, size_t len, uint32_t basis, 
ovs_u128 *out)
 switch (len & 15) {
 case 15:
 k2 ^= ((uint64_t) tail[14]) << 48;
+/* fall through */
 case 14:
 k2 ^= ((uint64_t) tail[13]) << 40;
+/* fall through */
 case 13:
 k2 ^= ((uint64_t) tail[12]) << 32;
+/* fall through */
 case 12:
 k2 ^= ((uint64_t) tail[11]) << 24;
+/* fall through */
 case 11:
 k2 ^= ((uint64_t) tail[10]) << 16;
+/* fall through */
 case 10:
 k2 ^= ((uint64_t) tail[9]) << 8;
+/* fall through */
 case 9:
 k2 ^= ((uint64_t) tail[8]) << 0;
 k2 *= c2;
 k2 = hash_rot64(k2, 33);
 k2 *= c1;
 h2 ^= k2;
-
+/* fall through */
 case 8:
 k1 ^= ((uint64_t) tail[7]) << 56;
+/* fall through */
 case 7:
 k1 ^= ((uint64_t) tail[6]) << 48;
+/* fall through */
 case 6:
 k1 ^= ((uint64_t) tail[5]) << 40;
+/* fall through */
 case 5:
 k1 ^= ((uint64_t) tail[4]) << 32;
+/* fall through */
 case 4:
 k1 ^= ((uint64_t) tail[3]) << 24;
+/* fall through */
 case 3:
 k1 ^= ((uint64_t) tail[2]) << 16;
+/* fall through */
 case 2:
 k1 ^= ((uint64_t) tail[1]) << 8;
+/* fall through */
 case 1:
 k1 ^= ((uint64_t) tail[0]) << 0;
 k1 *= c1;
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index da171cdf6..f5fcb3232 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -9740,6 +9740,7 @@ ofputil_encode_group_mod(enum ofp_version ofp_version,
 switch (ofp_version) {
 case OFP10_VERSION:
 bad_group_cmd(gm->command);
+/* fall through */
 
 case OFP11_VERSION:
 case OFP12_VERSION:
diff --git a/lib/rstp-state-machines.c b/lib/rstp-state-machines.c
index 5643c1028..8fdaec953 100644
--- a/lib/rstp-state-machines.c
+++ b/lib/rstp-state-machines.c
@@ -401,7 +401,7 @@ updt_roles_tree__(struct rstp *r)
 break;
 default:
 OVS_NOT_REACHED();
-/* no break */
+/* fall through */
 }
 }
 seq_change(connectivity_seq_get());
@@ -442,7 +442,7 @@ port_role_selection_sm(struct rstp *r)
 case PORT_ROLE_SELECTION_SM_INIT_BRIDGE_EXEC:
 updt_role_disabled_tree(r);
 r->port_role_selection_sm_state = PORT_ROLE_SELECTION

Re: [ovs-dev] [PATCH v3] dpif-netdev: Fix insertion probability

2017-06-23 Thread Kevin Traynor
On 06/23/2017 04:31 PM, Ciara Loftus wrote:
> emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
> to generate a random number used to determine whether or not an emc
> entry should be inserted. This works for single-packet bursts as
> last_cycles is updated for each burst. However, for bursts > 1 packet,
> where the packets in the batch generate the same RSS hash,
> pmd->last_cycles remains constant for the entire burst also, and thus
> cannot be used as a random number for each packet in the burst.
> 
> This commit replaces the use of pmd->last_cycles with random_uint32()
> for this purpose and subsequently fixes the behavior of the
> emc_insert_inv_prob setting for high-throughput (large bursts)
> single-flow cases.
> 

Thanks for the fix.
Acked-by: Kevin Traynor 

I also ran a quick test (on V2)
Tested-by: Kevin Traynor 

> Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
> Reported-by: Kevin Traynor 
> Acked-by: Darrell Ball 
> Signed-off-by: Ciara Loftus 
> ---
> v2:
> - Remove unnecessary ORing in random number calcuation as per Ben's
>   suggestion.
> v3:
> - Coding standards fix
> 
>  lib/dpif-netdev.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 4e29085..2dbdd47 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2073,11 +2073,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread 
> *pmd,
>  uint32_t min;
>  atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
>  
> -#ifdef DPDK_NETDEV
> -if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
> -#else
> -if (min && (key->hash ^ random_uint32()) <= min) {
> -#endif
> +if (min && random_uint32() <= min) {
>  emc_insert(&pmd->flow_cache, key, flow);
>  }
>  }
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] dpif-netdev: Fix insertion probability

2017-06-23 Thread Ciara Loftus
emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
to generate a random number used to determine whether or not an emc
entry should be inserted. This works for single-packet bursts as
last_cycles is updated for each burst. However, for bursts > 1 packet,
where the packets in the batch generate the same RSS hash,
pmd->last_cycles remains constant for the entire burst also, and thus
cannot be used as a random number for each packet in the burst.

This commit replaces the use of pmd->last_cycles with random_uint32()
for this purpose and subsequently fixes the behavior of the
emc_insert_inv_prob setting for high-throughput (large bursts)
single-flow cases.

Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
Reported-by: Kevin Traynor 
Acked-by: Darrell Ball 
Signed-off-by: Ciara Loftus 
---
v2:
- Remove unnecessary ORing in random number calcuation as per Ben's
  suggestion.
v3:
- Coding standards fix

 lib/dpif-netdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..2dbdd47 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2073,11 +2073,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread 
*pmd,
 uint32_t min;
 atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
 
-#ifdef DPDK_NETDEV
-if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
-#else
-if (min && (key->hash ^ random_uint32()) <= min) {
-#endif
+if (min && random_uint32() <= min) {
 emc_insert(&pmd->flow_cache, key, flow);
 }
 }
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing cycles

2017-06-23 Thread Stokes, Ian
> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Sugesh Chandran
> Sent: Tuesday, May 30, 2017 7:47 PM
> To: d...@openvswitch.org; ktray...@redhat.com
> Subject: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing
> cycles
> 
> PMD threads polls and process packets from Rxq in round-robin fashion. It
> is not guaranteed optimal allocation of Rxq across the PMD threads all the
> time.
> The patch series in the following link are trying to offer uniform
> distribution of rxqs across the PMD threads.
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332734.html
> 
> However at first it is necessary for the user to know about the
> distribution of PMD cycles across its Rx queues. This patch is extending
> the existing command
> 'ovs-appctl dpif-netdev/pmd-rxq-show'
> to show the percentage of total PMD thread cycles spent on each port rx
> queues.
> 
> A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' with pmd cycle percentage :
> 
> pmd thread numa_id 0 core_id 2:
>   isolated : false
>   port: dpdk0
>   queue-id: 0
>   polling cycles(% of total polling cycles):88687867260 (24.40%)
>   processing cycles(% of total processing cycles):741227393476
> (65.87%)
> 
>   port: dpdkvhostuser1
>   queue-id: 0
>   polling cycles(% of total polling cycles):274003203688 (75.38%)
>   processing cycles(% of total processing cycles):384127428352
> (34.13%)
> 
> pmd thread numa_id 0 core_id 3:
>   isolated : false
>   port: dpdk1
>   queue-id: 0
>   polling cycles(% of total polling cycles):96022088320 (26.01%)
>   processing cycles(% of total processing cycles):716284951804
> (64.07%)
> 
>   port: dpdkvhostuser0
>   queue-id: 0
>   polling cycles(% of total polling cycles):273103986760 (73.98%)
>   processing cycles(% of total processing cycles):401626635996
> (35.93%)
> 
> Signed-off-by: Sugesh Chandran 
> ---
>  lib/dpif-netdev.c | 155 +
> -
>  1 file changed, 131 insertions(+), 24 deletions(-)
> 

Hi Sugesh,

Thanks for the patch. Overall I like the idea of this. A few issues to highlight

As is in this patch the rxq processing cycles are displayed as part of the 
'dpif-netdev/pmd-rxq-show' command.

There's is probably an argument to be made that they should be part of the 
'dpif-netdev/pmd-stats-show' command.
In testing however I found from a usability point of view it made more sense to 
have both the rxq distribution and processing cycles in the same location. (BTW 
This was particularly helpful when confirming rxq packet distribution behavior 
among rxqs on a NIC.)

One thing I notice, if you call 'dpif-netdev/pmd-rxq-show' before any traffic 
is passed it will display as follows

pmd thread numa_id 0 core_id 2:
isolated : false
port: dpdk0
queue-id: 0

port: dpdk1
queue-id: 0

pmd thread numa_id 0 core_id 3:
isolated : false
port: dpdk0
queue-id: 1

port: dpdk1
queue-id: 1

After Running traffic through all queues and PMDS it will display as

pmd thread numa_id 0 core_id 2: 
isolated : false
port: dpdk0 
queue-id: 1 
polling cycles(% of total polling cycles):796500980788 (53.12%)
processing cycles(% of total processing cycles):1310776644468 
(82.51%)

port: dpdk1
queue-id: 1
polling cycles(% of total polling cycles):702810745324 (46.88%)
processing cycles(% of total processing cycles):277877362268 
(17.49%)

pmd thread numa_id 0 core_id 3:
isolated : false   
port: dpdk0
queue-id: 0
polling cycles(% of total polling cycles):461913195316 (46.19%)
processing cycles(% of total processing cycles):572565498512 
(23.87%)

port: dpdk1
queue-id: 0
polling cycles(% of total polling cycles):538120516336 (53.81%)
processing cycles(% of total processing cycles):1826074812348 
(76.13%)

However if we stop traffic being received on rxqs assigned to core_id 2 for 
instance the last received stats are still displayed for core_id 2.
This could be misleading because if there is no traffic being received on the 
rxqs on that core then there is no difference in processing cycles between the 
rxqs as neither are receiving anything. It should maybe reset to blank or 0 in 
this case I think.

Thoughts?

Ian

> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index b50164b..e4fcb2e
> 100644
> --- a/lib/dp

Re: [ovs-dev] [PATCH v2] dpif-netdev: Fix insertion probability

2017-06-23 Thread Darrell Ball
Oops, one suggestion I forgot per coding standards

Use

+if (min && random_uint32() <= min) {

instead of

+if (min && (random_uint32() <= min)) {

Darrell

On 6/23/17, 8:19 AM, "ovs-dev-boun...@openvswitch.org on behalf of Darrell 
Ball"  wrote:

Thanks for the V2

Acked-by: Darrell Ball 

On 6/23/17, 8:07 AM, "ovs-dev-boun...@openvswitch.org on behalf of Ciara 
Loftus"  
wrote:

emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
to generate a random number used to determine whether or not an emc
entry should be inserted. This works for single-packet bursts as
last_cycles is updated for each burst. However, for bursts > 1 packet,
where the packets in the batch generate the same RSS hash,
pmd->last_cycles remains constant for the entire burst also, and thus
cannot be used as a random number for each packet in the burst.

This commit replaces the use of pmd->last_cycles with random_uint32()
for this purpose and subsequently fixes the behavior of the
emc_insert_inv_prob setting for high-throughput (large bursts)
single-flow cases.

Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
Reported-by: Kevin Traynor 
Signed-off-by: Ciara Loftus 
---
v2:
- Remove unnecessary ORing in random number calcuation as per Ben's
  suggestion.
---
 lib/dpif-netdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..bf489fc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2073,11 +2073,7 @@ emc_probabilistic_insert(struct 
dp_netdev_pmd_thread *pmd,
 uint32_t min;
 atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
 
-#ifdef DPDK_NETDEV
-if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
-#else
-if (min && (key->hash ^ random_uint32()) <= min) {
-#endif
+if (min && (random_uint32() <= min)) {
 emc_insert(&pmd->flow_cache, key, flow);
 }
 }
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=JrQoU6MRrRUr-vSHj8aty_n1K3x38W60xK_NbbUQcQU&s=4ZmKaLYcaKABSRAGlOcM6BHYVkO2fhhXFQyBer4MRxU&e=
 


___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=8WLbj6U88nUf_t6V6rYhdh9Kg9vv9SxBe5SqrK-UuPk&s=NVYZEPuUQ0phyjFsGeage-_jQNL2YpH5DZeuvyTzghI&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation for established connections.

2017-06-23 Thread Chandran, Sugesh
Hi Antonio,

Thank you for the patches, 

Please find my comments below.



Regards
_Sugesh


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Fischetti, Antonio
> Sent: Tuesday, June 6, 2017 5:15 PM
> To: Darrell Ball ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation for
> established connections.
> 
> Thanks Darrel for your useful comments, I've tried to replicate the usecases
> you mentioned, please find inline some details.
> 
> Regards,
> Antonio
> 
> > -Original Message-
> > From: Darrell Ball [mailto:db...@vmware.com]
> > Sent: Thursday, June 1, 2017 6:20 PM
> > To: Fischetti, Antonio ;
> > d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH RFC v2] Conntrack: Avoid recirculation
> > for established connections.
> >
> > Comments inline
> >
> > On 5/29/17, 8:24 AM, "ovs-dev-boun...@openvswitch.org on behalf of
> > Fischetti, Antonio"  > antonio.fische...@intel.com> wrote:
> >
> > Thanks Joe for your feedback and the interesting insights in
> > conntrack in your earlier communication.
> > We have added all the details that we considered for this first
> > implementation. Also, some answers are inline.
> >
> > The purpose of this implementation is to avoid recirculation just
> > for those packets that are part of established connections.
> >
> > This shouldn't affect the packet recirculation for actions other
> > than conntrack. For example in MPLS, after a pop_mpls action the
> > packet will still be recirculated to follow the usual datapath.
> >
> > Most importantly, the CT module isn't by-passed in this
> > implementation.
> >
> > Besides the recirculation, all other action[s] shall be executed
> > as-is on each packet.
> > Any new CT change or action set by the controller will be managed
> > as usual.
> >
> > For our testing we set up a simple firewall, below are the flows.
> >
> >
> >  Flow Setup
> >  --
> > table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)
> > table=0, priority=10,arp actions=NORMAL
> > table=0, priority=1 actions=drop
> > table=1, ct_state=+new+trk,ip,in_port=1 actions=ct(commit),output:2
> > table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
> > table=1, ct_state=+est+trk,ip,in_port=1 actions=output:2
> > table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1
> >
> >
> >  Basic idea
> >  --
> > With the above rules, all packets belonging to an established
> > connection will first hit the flow
> > "ct_state=-trk,ip actions=ct(table=1)"
> >
> > then on recirculation, they will hit
> > "ct_state=+est+trk,ip,in_port=.. actions=output:X".
> >
> > The basic idea is to do the following 2 steps.
> > 1. Combine the two sets of actions by removing the recirculation.
> >a) Original actions:
> > - "ct(zone=N), recirc(id)" [ i.e ct(table=1) ]
> > - "output:X"
> >b) Combined Actions after Removing recirculation:
> > - "ct(zone=N), output:X".
> >
> > 2. All the subsequent packets shall hit a flow with the combined
> > actions.
> >
> >
> > [Darrell]
> >
> > 1) This would be constraining on how rules are written such that we
> > can’t go back to
> >   the beginning of the pipeline, just because a packet is
> > established.
> >   Meaning we can’t run stateless or stateful rules earlier
> > in the pipeline.
> >   There are lots of possible combinations.
> >   One case is:
> >   We may run a bunch of firewall checks in table X, later
> > on (table X+1) run the non-filtered packets through conntrack
> >   and then want to go back to the firewall table.
> >
> 
> [Antonio]
> I've tried to replicate the case you described. I've used the following set of
> rules. They may have no practical meaning, the purpose is to have - at least
> for packets from port 1 - that they traverse tables #0 => 1 => 2 and then are
> sent back to table#0.
> 
> NXST_FLOW reply (xid=0x4):
>  table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)  table=1,
> ct_state=+new+trk,ip,in_port=1,nw_src=10.10.10.0/24
> actions=ct(commit),output:2  table=1, ct_state=+est+trk,ip,in_port=1
> actions=mod_nw_src:3.3.3.3,resubmit(,2)
>  table=2, ct_state=+est+trk,tcp,in_port=1 actions=drop  table=2,
> ct_state=+est+trk,udp,in_port=1 actions=resubmit(1,0)  table=0,
> ct_state=+est+trk,udp,in_port=1,nw_src=3.3.3.3 actions=output:2  table=1,
> ct_state=+new+trk,ip,in_port=2 actions=drop  table=1,
> ct_state=+est+trk,ip,in_port=2 actions=output:1  table=0, priority=10,arp
> actions=NORMAL
> 
> For packets from port1:
>  - table#0: untracked packets go through the CT, then are recirculated, ie go
> to table#1.
>  - table#1: new conn are committed.
>  - table#1: packets of est conn are modified in the IPsrc that becomes 
> 3.3.3.3,
> then sent to table#2.
>  -

Re: [ovs-dev] [PATCH v2] dpif-netdev: Fix insertion probability

2017-06-23 Thread Darrell Ball
Thanks for the V2

Acked-by: Darrell Ball 

On 6/23/17, 8:07 AM, "ovs-dev-boun...@openvswitch.org on behalf of Ciara 
Loftus"  
wrote:

emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
to generate a random number used to determine whether or not an emc
entry should be inserted. This works for single-packet bursts as
last_cycles is updated for each burst. However, for bursts > 1 packet,
where the packets in the batch generate the same RSS hash,
pmd->last_cycles remains constant for the entire burst also, and thus
cannot be used as a random number for each packet in the burst.

This commit replaces the use of pmd->last_cycles with random_uint32()
for this purpose and subsequently fixes the behavior of the
emc_insert_inv_prob setting for high-throughput (large bursts)
single-flow cases.

Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
Reported-by: Kevin Traynor 
Signed-off-by: Ciara Loftus 
---
v2:
- Remove unnecessary ORing in random number calcuation as per Ben's
  suggestion.
---
 lib/dpif-netdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..bf489fc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2073,11 +2073,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread 
*pmd,
 uint32_t min;
 atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
 
-#ifdef DPDK_NETDEV
-if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
-#else
-if (min && (key->hash ^ random_uint32()) <= min) {
-#endif
+if (min && (random_uint32() <= min)) {
 emc_insert(&pmd->flow_cache, key, flow);
 }
 }
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=JrQoU6MRrRUr-vSHj8aty_n1K3x38W60xK_NbbUQcQU&s=4ZmKaLYcaKABSRAGlOcM6BHYVkO2fhhXFQyBer4MRxU&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch_v1] dpif: Fix cleanup of userspace datapath.

2017-06-23 Thread Darrell Ball


On 6/23/17, 1:26 AM, "ovs-dev-boun...@openvswitch.org on behalf of Ben Pfaff" 
 wrote:

On Thu, Jun 22, 2017 at 09:27:11PM -0700, Darrell Ball wrote:
> Hardware offload introduced extra tracking of netdev ports.  This
> included ovs-netdev, which is really for internal infra usage for
> the userpace datapath.  This breaks cleanup of the userspace
> datapath.  There is no need to do this extra tracking of this port,
> hence it is skipped by checking for the port name as the type does
> not help here.  Adding an extra type or field is another way to fix
> this, but this would probably be overkill as this port is a constant
> and the misuse potential very limited.
> 
> CC: Paul Blakey 
> Fixes: 32b77c316d9982("dpif: Save added ports in a port map.")
> Signed-off-by: Darrell Ball 
> ---
>  lib/dpif.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 10bdd70..8624d34 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -350,7 +350,11 @@ do_open(const char *name, const char *type, bool 
create, struct dpif **dpifp)
>  struct netdev *netdev;
>  int err;
>  
> -if (!strcmp(dpif_port.type, "internal")) {
> +/* ovs-netdev is a tap device that is used as an
> + * internal port for the userspace datapath, hence
> + * don't track it here. */
> +if (!strcmp(dpif_port.type, "internal") ||
> +(!strcmp(dpif_port.name, "ovs-netdev"))) {

I don't understand this issue, so this is not a review, but please do
note this style point from coding-style.rst:

Do not parenthesize the operands of ``&&`` and ``||`` unless operator
precedence makes it necessary, or unless the operands are themselves
expressions that use ``&&`` and ``||``. Thus:

::

if (!isdigit((unsigned char)s[0])
|| !isdigit((unsigned char)s[1])
|| !isdigit((unsigned char)s[2])) {
printf("string %s does not start with 3-digit code\n", s);
}

but

::

if (rule && (!best || rule->priority > best->priority)) {
best = rule;
}

Thanks for the reminder

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=Cvf-v52TVTcXUOH8suwH429_Yw7GH70eQv8do78OqO4&s=WfZDuxBQ0mf0fynx4BWSxBWe1OeFFvg4G_aHRCkAlCs&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] dpif-netdev: Fix insertion probability

2017-06-23 Thread Ciara Loftus
emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
to generate a random number used to determine whether or not an emc
entry should be inserted. This works for single-packet bursts as
last_cycles is updated for each burst. However, for bursts > 1 packet,
where the packets in the batch generate the same RSS hash,
pmd->last_cycles remains constant for the entire burst also, and thus
cannot be used as a random number for each packet in the burst.

This commit replaces the use of pmd->last_cycles with random_uint32()
for this purpose and subsequently fixes the behavior of the
emc_insert_inv_prob setting for high-throughput (large bursts)
single-flow cases.

Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
Reported-by: Kevin Traynor 
Signed-off-by: Ciara Loftus 
---
v2:
- Remove unnecessary ORing in random number calcuation as per Ben's
  suggestion.
---
 lib/dpif-netdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 4e29085..bf489fc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2073,11 +2073,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread 
*pmd,
 uint32_t min;
 atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
 
-#ifdef DPDK_NETDEV
-if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
-#else
-if (min && (key->hash ^ random_uint32()) <= min) {
-#endif
+if (min && (random_uint32() <= min)) {
 emc_insert(&pmd->flow_cache, key, flow);
 }
 }
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9] netdev-dpdk: Increase pmd thread priority.

2017-06-23 Thread nickcooper-zhangtonghao
If we fail to set the priority, we should return the err code
and not return 0.


> On Jun 23, 2017, at 4:51 AM, Bhanuprakash Bodireddy 
>  wrote:
> 
> Increase the DPDK pmd thread scheduling priority by lowering the nice
> value. This will advise the kernel scheduler to prioritize pmd thread
> over other processes and will help PMD to provide deterministic
> performance in out-of-the-box deployments.
> 
> This patch sets the nice value of PMD threads to '-20'.
> 
>  $ ps -eLo comm,policy,psr,nice | grep pmd
> 
>   COMMAND  POLICY  PROCESSORNICE
>pmd62 TS3-20
>pmd63 TS0-20
>pmd64 TS1-20
>pmd65 TS2-20
> 
> Signed-off-by: Bhanuprakash Bodireddy  >
> Tested-by: Billy O'Mahony  >
> ---
> v8->v9:
> * Rebase
> 
> v7->v8:
> * Rebase
> * Update the documentation file @Documentation/intro/install/dpdk-advanced.rst
> 
> v6->v7:
> * Remove realtime scheduling policy logic.
> * Increase pmd thread scheduling priority by lowering nice value to -20.
> * Update doc accordingly.
> 
> v5->v6:
> * Prohibit spawning pmd thread on the lowest core in dpdk-lcore-mask if
>  lcore-mask and pmd-mask affinity are identical.
> * Updated Note section in INSTALL.DPDK-ADVANCED doc.
> * Tested below cases to verify system stability with pmd priority patch
> 
> v4->v5:
> * Reword Note section in DPDK-ADVANCED.md
> 
> v3->v4:
> * Document update
> * Use ovs_strerror for reporting errors in lib-numa.c
> 
> v2->v3:
> * Move set_priority() function to lib/ovs-numa.c
> * Apply realtime scheduling policy and priority to pmd thread only if
>  pmd-cpu-mask is passed.
> * Update INSTALL.DPDK-ADVANCED.
> 
> v1->v2:
> * Removed #ifdef and introduced dummy function "pmd_thread_setpriority"
>  in netdev-dpdk.h
> * Rebase
> 
> Documentation/intro/install/dpdk.rst |  8 +++-
> lib/dpif-netdev.c|  4 
> lib/ovs-numa.c   | 21 +
> lib/ovs-numa.h   |  1 +
> 4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst 
> b/Documentation/intro/install/dpdk.rst
> index e83f852..b5c26ba 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -453,7 +453,8 @@ affinitized accordingly.
>   to be affinitized to isolated cores for optimum performance.
> 
>   By setting a bit in the mask, a pmd thread is created and pinned to the
> -  corresponding CPU core. e.g. to run a pmd thread on core 2::
> +  corresponding CPU core with nice value set to -20.
> +  e.g. to run a pmd thread on core 2::
> 
>   $ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4
> 
> @@ -493,6 +494,11 @@ improvements as there will be more total CPU occupancy 
> available::
> 
> NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1
> 
> +  .. note::
> +It is recommended that the OVS control thread and pmd thread shouldn't be
> +pinned to the same core i.e 'dpdk-lcore-mask' and 'pmd-cpu-mask' cpu mask
> +settings should be non-overlapping.
> +
> DPDK Physical Port Rx Queues
> 
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index f83b632..6bbd786 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3712,6 +3712,10 @@ pmd_thread_main(void *f_)
> ovs_numa_thread_setaffinity_core(pmd->core_id);
> dpdk_set_lcore_id(pmd->core_id);
> poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
> +
> +/* Set pmd thread's nice value to -20 */
> +#define MIN_NICE -20
> +ovs_numa_thread_setpriority(MIN_NICE);
> reload:
> emc_cache_init(&pmd->flow_cache);
> 
> diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c
> index 98e97cb..a1921b3 100644
> --- a/lib/ovs-numa.c
> +++ b/lib/ovs-numa.c
> @@ -23,6 +23,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #endif /* __linux__ */
> @@ -570,3 +571,23 @@ int ovs_numa_thread_setaffinity_core(unsigned core_id 
> OVS_UNUSED)
> return EOPNOTSUPP;
> #endif /* __linux__ */
> }
> +
> +int
> +ovs_numa_thread_setpriority(int nice OVS_UNUSED)
> +{
> +if (dummy_numa) {
> +return 0;
> +}
> +
> +#ifndef _WIN32
> +int err;
> +err = setpriority(PRIO_PROCESS, 0, nice);
> +if (err) {
> +VLOG_ERR("Thread priority error %s",ovs_strerror(err));
> +}
> +
> +return 0;

return err;

if success, err == 0

> +#else
> +return EOPNOTSUPP;
> +#endif
> +}
> diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h
> index 6946cdc..e132483 100644
> --- a/lib/ovs-numa.h
> +++ b/lib/ovs-numa.h
> @@ -62,6 +62,7 @@ bool ovs_numa_dump_contains_core(const struct ovs_numa_dump 
> *,
> size_t ovs_numa_dump_count(const struct ovs_numa_dump *);
> void ovs_numa_dump_destroy(struct ovs_numa_dump *);
> int ovs_numa_thread_setaffinity_core(unsigned core_id);
> +int ovs_numa_thread_setpriority(int nice);
> 
> #define F

Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash when EMC is disabled.

2017-06-23 Thread Fischetti, Antonio
Hi Billy,
thanks a lot for you suggestions. Those would really help re-factoring
the code by avoiding duplications.
The thing is that this patch 1/4 is mainly a preparation for the 
next patch 2/4. So I did these changes with the next patch 2/4 in mind.

The final result I meant to achieve in patch 2/4 is the following.
EMC lookup is skipped - not only when EMC is disabled - but also when
(we're processing recirculated packets) && (the EMC is 'enough' full).
The purpose is to avoid EMC thrashing.

Below is how the code looks like after applying patches 1/4 and 2/4.
Please let me know if you can find some similar optimizations to 
avoid code duplications, that would be great.

/*
 * EMC lookup is skipped when one or both of the following
 * two cases occurs:
 *
 *   - EMC is disabled.  This is detected from cur_min.
 *
 *   - The EMC occupancy exceeds EMC_FULL_THRESHOLD and the
 * packet to be classified is being recirculated.  When this
 * happens also EMC insertions are skipped for recirculated
 * packets.  So that EMC is used just to store entries which
 * are hit from the 'original' packets.  This way the EMC
 * thrashing is mitigated with a benefit on performance.
 */
if (!md_is_valid) {
pkt_metadata_init(&packet->md, port_no);
miniflow_extract(packet, &key->mf);  <== this fn must be called 
after pkt_metadta_init
/* This is not a recirculated packet. */
if (OVS_LIKELY(cur_min)) {
/* EMC is enabled.  We can retrieve the 5-tuple hash
 * without considering the recirc id. */
if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
key->hash = dp_packet_get_rss_hash(packet);
} else {
key->hash = miniflow_hash_5tuple(&key->mf, 0);
dp_packet_set_rss_hash(packet, key->hash);
}
flow = emc_lookup(flow_cache, key);
} else {
/* EMC is disabled, skip emc_lookup. */
flow = NULL;
}
} else {
/* Recirculated packets. */
miniflow_extract(packet, &key->mf);
if (flow_cache->n_entries & EMC_FULL_THRESHOLD) {
/* EMC occupancy is over the threshold.  We skip EMC
 * lookup for recirculated packets. */
flow = NULL;
} else {
if (OVS_LIKELY(cur_min)) {
key->hash = dpif_netdev_packet_get_rss_hash(packet,
&key->mf);
flow = emc_lookup(flow_cache, key);
} else {
flow = NULL;
}
}
}


Basically patch 1/4 is mostly a preliminary change for 2/4.

Yes, patch 1/4 also allows to avoid reading hash when EMC is disabled.
Or - for packets that are not recirculated - avoids calling
recirc_depth_get_unsafe() when reading the hash.

Also, as these functions are critical for performance, I tend to avoid
adding new Booleans that require new if statements.


Thanks,
Antonio

> -Original Message-
> From: O Mahony, Billy
> Sent: Friday, June 23, 2017 1:54 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash
> when EMC is disabled.
> 
> Hi Antonio,
> 
> In this patch of the patchset there are three lines removed from the
> direct command flow:
> 
> -miniflow_extract(packet, &key->mf);
> -key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
> -flow = (cur_min == 0) ? NULL: emc_lookup(flow_cache, key);
> 
> Which are then replicated in several different branches for logic. This is
> a lot of duplication of logic.
> 
> I *think* (I haven't tested it) this can be re-written with less branching
> like this:
> 
>  if (!md_is_valid) {
>  pkt_metadata_init(&packet->md, port_no);
>  }
>  miniflow_extract(packet, &key->mf);
>  if (OVS_LIKELY(cur_min)) {
>  if (md_is_valid) {
>  key->hash = dpif_netdev_packet_get_rss_hash(packet,
> &key->mf);
>  }
>  else
>  {
>  if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
>  key->hash = dp_packet_get_rss_hash(packet);
>  } else {
>  key->hash = miniflow_hash_5tuple(&key->mf, 0);
>  dp_packet_set_rss_hash(packet, key->hash);
>  }
>  flow = emc_lookup(flow_cache, key);
>  }
>  } else {
>  flow = NULL;
>  }
> 
> Also if I'm understanding correctly the final effect of the patch is that
> in the case where !md_is_v

Re: [ovs-dev] [PATCH net-next v3 1/6] vxlan: refactor verification and application of configuration

2017-06-23 Thread Johannes Berg
On Fri, 2017-06-23 at 14:02 +0200, Matthias Schiffer wrote:
> 
> It seems though that rtnl_link_ops.newlink/changelink don't allow
> passing the extack yet... how do we proceed here? Treewide change
> (maybe by someone who knows their Coccinelle-fu?), or would the
> introduction of new versions of the newlink and changelink fields be
> more acceptable, so drivers can be moved to the new API one by one?

I think treewide change is easy enough, this seems to work:

@ops1@
identifier newfn, ops;
@@
static struct rtnl_link_ops ops = {
.newlink = newfn,
...
};

@@
identifier ops1.newfn;
identifier src_net, dev, tb, data;
@@
-int newfn(struct net *src_net, struct net_device *dev,
-      struct nlattr *tb[], struct nlattr *data[])
+int newfn(struct net *src_net, struct net_device *dev,
+      struct nlattr *tb[], struct nlattr *data[],
+      struct netlink_ext_ack *extack)
{...}

@ops2@
identifier chfn, ops;
@@
static struct rtnl_link_ops ops = {
.changelink = chfn,
...
};

@@
identifier ops2.chfn;
identifier dev, tb, data;
@@
-int chfn(struct net_device *dev,
-     struct nlattr *tb[], struct nlattr *data[])
+int chfn(struct net_device *dev,
+     struct nlattr *tb[], struct nlattr *data[],
+     struct netlink_ext_ack *extack)
{...}

I guess if there are any stragglers you'd find them by compile-testing
:)

johannes
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet fields.

2017-06-23 Thread O Mahony, Billy
Hi Antonio,

I'm not sure that this approach will work. Mainly the memcpy will not take 
account of any padding that the compiler may insert and the struct.  Maybe if 
the struct was defined as packed it could fix this objection.

Also anyone editing the structure in future would have to be aware that these 
elements need to be kept contiguous in order for packet_clone to keep working. 

Or if the relevant fields here were all placed in the their own nested struct 
then sizeof that nested struct could be used in the memcpy call as the sizeof 
nested_struct would account for whatever padding the compiler inserted. 

Does this change give much of a performance increase? 

Also I commented on 1/4 of this patchset about a cover letter - but if the 
patchset members are independent of each other then maybe they should just be 
separate patches.

Regards,
Billy.

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> Sent: Monday, June 19, 2017 11:12 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> fields.
> 
> From: Antonio Fischetti 
> 
> memcpy replaces the single copies inside
> dp_packet_clone_with_headroom().
> 
> Signed-off-by: Antonio Fischetti 
> ---
>  lib/dp-packet.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 67aa406..5b1d416 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -157,7 +157,7 @@ dp_packet_clone(const struct dp_packet *buffer)
>  return dp_packet_clone_with_headroom(buffer, 0);  }
> 
> -/* Creates and returns a new dp_packet whose data are copied from
> 'buffer'.   The
> +/* Creates and returns a new dp_packet whose data are copied from
> +'buffer'. The
>   * returned dp_packet will additionally have 'headroom' bytes of headroom.
> */  struct dp_packet *  dp_packet_clone_with_headroom(const struct
> dp_packet *buffer, size_t headroom) @@ -167,13 +167,14 @@
> dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t
> headroom)
>  new_buffer =
> dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
>   dp_packet_size(buffer),
>   headroom);
> -new_buffer->l2_pad_size = buffer->l2_pad_size;
> -new_buffer->l2_5_ofs = buffer->l2_5_ofs;
> -new_buffer->l3_ofs = buffer->l3_ofs;
> -new_buffer->l4_ofs = buffer->l4_ofs;
> -new_buffer->md = buffer->md;
> -new_buffer->cutlen = buffer->cutlen;
> -new_buffer->packet_type = buffer->packet_type;
> +memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
> +sizeof(new_buffer->l2_pad_size) +
> +sizeof(new_buffer->l2_5_ofs) +
> +sizeof(new_buffer->l3_ofs) +
> +sizeof(new_buffer->l4_ofs) +
> +sizeof(new_buffer->cutlen) +
> +sizeof(new_buffer->packet_type) +
> +sizeof(new_buffer->md));
>  #ifdef DPDK_NETDEV
>  new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;  #else
> --
> 2.4.11
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash when EMC is disabled.

2017-06-23 Thread O Mahony, Billy
Hi Antonio,

In this patch of the patchset there are three lines removed from the direct 
command flow:

-miniflow_extract(packet, &key->mf);
-key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
-flow = (cur_min == 0) ? NULL: emc_lookup(flow_cache, key);

Which are then replicated in several different branches for logic. This is a 
lot of duplication of logic. 

I *think* (I haven't tested it) this can be re-written with less branching like 
this:

 if (!md_is_valid) {
 pkt_metadata_init(&packet->md, port_no);
 }
 miniflow_extract(packet, &key->mf);
 if (OVS_LIKELY(cur_min)) {
 if (md_is_valid) {
 key->hash = dpif_netdev_packet_get_rss_hash(packet, 
&key->mf);
 }
 else
 {
 if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
 key->hash = dp_packet_get_rss_hash(packet);
 } else {
 key->hash = miniflow_hash_5tuple(&key->mf, 0);
 dp_packet_set_rss_hash(packet, key->hash);
 }
 flow = emc_lookup(flow_cache, key);
 }
 } else {
 flow = NULL;
 }

Also if I'm understanding correctly the final effect of the patch is that in 
the case where !md_is_valid it effectively replicates the work of 
dpif_netdev_packet_get_rss_hash() but leaving out the if (recirc_depth) block 
of that fn. This is effectively overriding the return value of 
recirc_depth_get_unsafe in dpif_netdev_packet_get_rss_hash() and 
forcing/assuming that it is zero. 

If so it would be less disturbing to the existing code to just add a bool arg 
to dpif_netdev_packet_get_rss_hash() called do_not_check_recirc_depth and use 
that to return early (before the if (recirc_depth) check). Also in that case 
the patch would require none of the  conditional logic changes (neither the 
original or that suggested in this email) and should be able to just set the 
proposed do_not_check_recirc_depth based on md_is_valid.

Also this is showing up as a patch set can you add a cover letter to outline 
the overall goal of the patchset.

Thanks,
Billy. 


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> Sent: Monday, June 19, 2017 11:12 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH 1/4] dpif-netdev: Avoid reading RSS hash when
> EMC is disabled.
> 
> From: Antonio Fischetti 
> 
> When EMC is disabled the reading of RSS hash is skipped.
> For packets that are not recirculated it retrieves the hash value without
> considering the recirc id.
> 
> This is mostly a preliminary change for the next patch in this series.
> 
> Signed-off-by: Antonio Fischetti 
> ---
> In our testbench we used monodirectional traffic with 64B UDP packets PDM
> threads:  2 Traffic gen. streams: 1
> 
> we saw the following performance improvement:
> 
> Orig   11.49 Mpps
> With Patch#1:  11.62 Mpps
> 
>  lib/dpif-netdev.c | 30 +-
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 02af32e..fd2ed52
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4584,13 +4584,33 @@ emc_processing(struct dp_netdev_pmd_thread
> *pmd,
> 
>  if (!md_is_valid) {
>  pkt_metadata_init(&packet->md, port_no);
> +miniflow_extract(packet, &key->mf);
> +/* This is not a recirculated packet. */
> +if (OVS_LIKELY(cur_min)) {
> +/* EMC is enabled.  We can retrieve the 5-tuple hash
> + * without considering the recirc id. */
> +if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
> +key->hash = dp_packet_get_rss_hash(packet);
> +} else {
> +key->hash = miniflow_hash_5tuple(&key->mf, 0);
> +dp_packet_set_rss_hash(packet, key->hash);
> +}
> +flow = emc_lookup(flow_cache, key);
> +} else {
> +/* EMC is disabled, skip emc_lookup. */
> +flow = NULL;
> +}
> +} else {
> +/* Recirculated packets. */
> +miniflow_extract(packet, &key->mf);
> +if (OVS_LIKELY(cur_min)) {
> +key->hash = dpif_netdev_packet_get_rss_hash(packet, &key-
> >mf);
> +flow = emc_lookup(flow_cache, key);
> +} else {
> +flow = NULL;
> +}
>  }
> -miniflow_extract(packet, &key->mf);
>  key->len = 0; /* Not computed yet. */
> -key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
> -
> -/* If EMC is disabled skip emc_lookup */
> -flow = (cur_min == 0) ? NULL: emc_lookup(flo

Re: [ovs-dev] Request to tag 2.7 branch with v2.7.1

2017-06-23 Thread Russell Bryant
On Thu, Jun 8, 2017 at 12:12 PM, Ben Pfaff  wrote:
> On Thu, Jun 08, 2017 at 04:36:21AM +0530, Numan Siddique wrote:
>> Is it possible to tag the 2.7 branch with v2.7.1. Recently there were many
>> back ports to 2.7 branch. OVS 2.7 is required for RDO [1] and the RDO
>> community is expecting a new tag to build the latest OVS 2.7 branch.
>
> It's fine with me if we release 2.7.1.  Justin?

Hi all, just wanted to check in on 2.7.1.  Last week Justin said there
may be some more fixes from Joe to get in first.  Is anything still
pending?  I also know some folks have been traveling this week.

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/3] dpctl: Add new 'ct-bkts' command.

2017-06-23 Thread antonio . fischetti
From: Antonio Fischetti 

With the command:
 ovs-appctl dpctl/ct-bkts
shows the number of connections per bucket.

By using a threshold:
 ovs-appctl dpctl/ct-bkts gt=N
for each bucket shows the number of connections when they
are greater than N.

Signed-off-by: Antonio Fischetti 
Signed-off-by: Bhanuprakash Bodireddy 
Co-authored-by: Bhanuprakash Bodireddy 
---
 lib/conntrack.c   |  5 ++--
 lib/conntrack.h   |  4 +--
 lib/ct-dpif.h |  4 +++
 lib/dpctl.c   | 75 +++
 lib/dpctl.man |  8 ++
 utilities/ovs-dpctl.c |  1 +
 6 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index c62ff2a..a1ea350 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1927,7 +1927,7 @@ conn_key_to_tuple(const struct conn_key *key, struct 
ct_dpif_tuple *tuple)
 
 static void
 conn_to_ct_dpif_entry(const struct conn *conn, struct ct_dpif_entry *entry,
-  long long now)
+  long long now, int bkt)
 {
 struct ct_l4_proto *class;
 long long expiration;
@@ -1950,6 +1950,7 @@ conn_to_ct_dpif_entry(const struct conn *conn, struct 
ct_dpif_entry *entry,
 if (class->conn_get_protoinfo) {
 class->conn_get_protoinfo(conn, &entry->protoinfo);
 }
+entry->bkt = bkt;
 }
 
 int
@@ -1987,7 +1988,7 @@ conntrack_dump_next(struct conntrack_dump *dump, struct 
ct_dpif_entry *entry)
 INIT_CONTAINER(conn, node, node);
 if ((!dump->filter_zone || conn->key.zone == dump->zone) &&
  (conn->conn_type != CT_CONN_TYPE_UN_NAT)) {
-conn_to_ct_dpif_entry(conn, entry, now);
+conn_to_ct_dpif_entry(conn, entry, now, dump->bucket);
 break;
 }
 /* Else continue, until we find an entry in the appropriate zone
diff --git a/lib/conntrack.h b/lib/conntrack.h
index 243aebb..fe1b21d 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -28,6 +28,7 @@
 #include "ovs-atomic.h"
 #include "ovs-thread.h"
 #include "packets.h"
+#include "ct-dpif.h"
 
 /* Userspace connection tracker
  * 
@@ -242,9 +243,6 @@ struct conntrack_bucket {
 long long next_cleanup OVS_GUARDED;
 };
 
-#define CONNTRACK_BUCKETS_SHIFT 8
-#define CONNTRACK_BUCKETS (1 << CONNTRACK_BUCKETS_SHIFT)
-
 struct conntrack {
 /* Independent buckets containing the connections */
 struct conntrack_bucket buckets[CONNTRACK_BUCKETS];
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index cd35f3e..b2a2f9e 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -20,6 +20,9 @@
 #include "openvswitch/types.h"
 #include "packets.h"
 
+#define CONNTRACK_BUCKETS_SHIFT 8
+#define CONNTRACK_BUCKETS (1 << CONNTRACK_BUCKETS_SHIFT)
+
 union ct_dpif_inet_addr {
 ovs_be32 ip;
 ovs_be32 ip6[4];
@@ -169,6 +172,7 @@ struct ct_dpif_entry {
 /* Timeout for this entry in seconds */
 uint32_t timeout;
 uint32_t mark;
+int bkt; /* Number of CT bucket. */
 };
 
 enum {
diff --git a/lib/dpctl.c b/lib/dpctl.c
index 4cca9c8..3081b0e 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1443,6 +1443,80 @@ dpctl_ct_stats_show(int argc, const char *argv[],
 dpif_close(dpif);
 return error;
 }
+
+#define CT_BKTS_GT "gt="
+static int
+dpctl_ct_bkts(int argc, const char *argv[],
+ struct dpctl_params *dpctl_p)
+{
+struct dpif *dpif;
+char *name;
+
+struct ct_dpif_dump_state *dump;
+struct ct_dpif_entry cte;
+uint16_t gt = 0; /* Threshold: display value when greater than gt. */
+uint16_t *pzone = NULL;
+int lastargc = 0;
+
+int conn_per_bkts[CONNTRACK_BUCKETS];
+int error;
+
+while (argc > 1 && lastargc != argc) {
+lastargc = argc;
+if (!strncmp(argv[argc - 1], CT_BKTS_GT, strlen(CT_BKTS_GT))) {
+if (ovs_scan(argv[argc - 1], CT_BKTS_GT"%"SCNu16, >)) {
+argc--;
+}
+}
+}
+
+name = (argc > 1) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
+if (!name) {
+return EINVAL;
+}
+
+error = parsed_dpif_open(name, false, &dpif);
+free(name);
+if (error) {
+dpctl_error(dpctl_p, error, "opening datapath");
+return error;
+}
+
+error = ct_dpif_dump_start(dpif, &dump, pzone);
+if (error) {
+dpctl_error(dpctl_p, error, "starting conntrack dump");
+dpif_close(dpif);
+return error;
+}
+
+memset(conn_per_bkts, 0, sizeof(conn_per_bkts));
+int tot_conn = 0;
+while (!ct_dpif_dump_next(dump, &cte)) {
+ct_dpif_entry_uninit(&cte);
+tot_conn++;
+conn_per_bkts[cte.bkt]++;
+}
+
+dpctl_print(dpctl_p, "Current Connections: %d\n", tot_conn);
+dpctl_print(dpctl_p, "|  Buckets  |"
+" Connections per Buckets |\n");
+dpctl_print(dpctl_p, "+---+"
+"-+");
+#define NUM_BKTS_PER_ROW 8
+fo

[ovs-dev] [PATCH 2/3] dpctl: add CT Stats for Connections per protocol.

2017-06-23 Thread antonio . fischetti
From: Antonio Fischetti 

Adds CT stats to report number of connections grouped by
protocol.
By using
 utilities/ovs-appctl dpctl/ct-stats-show
it can display something like:
Connections Stats:
Total: 1808
TCP: 1808

With the verbose options:
 utilities/ovs-appctl dpctl/ct-stats-show verbose
it can display:
Connections Stats:
Total: 2671
TCP: 2671
  Conn per TCP states:
  [ESTABLISHED]=1000
  [CLOSING]=1
  [TIME_WAIT]=1670

Signed-off-by: Antonio Fischetti 
Signed-off-by: Bhanuprakash Bodireddy 
Co-authored-by: Bhanuprakash Bodireddy 
---
 lib/ct-dpif.c |  14 +
 lib/ct-dpif.h |  18 ++-
 lib/dpctl.c   | 138 +-
 lib/dpctl.man |   7 +++
 utilities/ovs-dpctl.c |   2 +
 5 files changed, 177 insertions(+), 2 deletions(-)

diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 8f0b4ed..f8d2cf1 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -409,3 +409,17 @@ ct_dpif_format_helper(struct ds *ds, const char *title,
 ds_put_cstr(ds, helper->name);
 }
 }
+
+uint8_t
+ct_dpif_coalesce_tcp_state(uint8_t state)
+{
+return coalesce_tcp_state(state);
+}
+
+void
+ct_dpif_format_tcp_stat(struct ds * ds, int tcp_state, int conn_per_state)
+{
+ct_dpif_format_enum(ds, "\t  [", tcp_state, ct_dpif_tcp_state_string);
+ds_put_cstr(ds, "]");
+ds_put_format(ds, "=%u", conn_per_state);
+}
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index e8e159a..cd35f3e 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -70,7 +70,8 @@ struct ct_dpif_timestamp {
 CT_DPIF_TCP_STATE(CLOSING) \
 CT_DPIF_TCP_STATE(LAST_ACK) \
 CT_DPIF_TCP_STATE(FIN_WAIT_2) \
-CT_DPIF_TCP_STATE(TIME_WAIT)
+CT_DPIF_TCP_STATE(TIME_WAIT) \
+CT_DPIF_TCP_STATE(MAX_NUM)
 
 enum ct_dpif_tcp_state {
 #define CT_DPIF_TCP_STATE(STATE) CT_DPIF_TCPS_##STATE,
@@ -170,6 +171,19 @@ struct ct_dpif_entry {
 uint32_t mark;
 };
 
+enum {
+CT_STATS_UDP,
+CT_STATS_TCP,
+CT_STATS_SCTP,
+CT_STATS_ICMP,
+CT_STATS_ICMPV6,
+CT_STATS_UDPLITE,
+CT_STATS_DCCP,
+CT_STATS_IGMP,
+CT_STATS_OTHER,
+CT_STATS_MAX,
+};
+
 struct dpif;
 
 struct ct_dpif_dump_state {
@@ -185,5 +199,7 @@ void ct_dpif_entry_uninit(struct ct_dpif_entry *);
 void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
   bool verbose, bool print_stats);
 void ct_dpif_format_tuple(struct ds *, const struct ct_dpif_tuple *);
+uint8_t ct_dpif_coalesce_tcp_state(uint8_t state);
+void ct_dpif_format_tcp_stat(struct ds *, int, int);
 
 #endif /* CT_DPIF_H */
diff --git a/lib/dpctl.c b/lib/dpctl.c
index cde5341..4cca9c8 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1308,6 +1308,141 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 dpif_close(dpif);
 return error;
 }
+
+static int
+dpctl_ct_stats_show(int argc, const char *argv[],
+ struct dpctl_params *dpctl_p)
+{
+struct dpif *dpif;
+char *name;
+
+struct ct_dpif_dump_state *dump;
+struct ct_dpif_entry cte;
+uint16_t zone, *pzone = NULL;
+bool verbose = false;
+int lastargc = 0;
+
+int proto_stats[CT_STATS_MAX];
+int tcp_conn_per_states[CT_DPIF_TCPS_MAX_NUM];
+int error;
+
+while (argc > 1 && lastargc != argc) {
+lastargc = argc;
+if (!strncmp(argv[argc - 1], "verbose", 7)) {
+verbose = true;
+argc--;
+} else if (!strncmp(argv[argc - 1], "zone=", 5)) {
+if (ovs_scan(argv[argc - 1], "zone=%"SCNu16, &zone)) {
+pzone = &zone;
+argc--;
+}
+}
+}
+
+name = (argc > 1) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
+if (!name) {
+return EINVAL;
+}
+
+error = parsed_dpif_open(name, false, &dpif);
+free(name);
+if (error) {
+dpctl_error(dpctl_p, error, "opening datapath");
+return error;
+}
+
+memset(proto_stats, 0, sizeof(proto_stats));
+memset(tcp_conn_per_states, 0, sizeof(tcp_conn_per_states));
+error = ct_dpif_dump_start(dpif, &dump, pzone);
+if (error) {
+dpctl_error(dpctl_p, error, "starting conntrack dump");
+dpif_close(dpif);
+return error;
+}
+
+int tot_conn = 0;
+while (!ct_dpif_dump_next(dump, &cte)) {
+ct_dpif_entry_uninit(&cte);
+tot_conn++;
+switch (cte.tuple_orig.ip_proto) {
+case IPPROTO_ICMP:
+proto_stats[CT_STATS_ICMP]++;
+break;
+case IPPROTO_ICMPV6:
+proto_stats[CT_STATS_ICMPV6]++;
+break;
+case IPPROTO_TCP:
+proto_stats[CT_STATS_TCP]++;
+uint8_t tcp_state;
+/* We keep two separate tcp states, but we print just one. The 
Linux
+ * kernel connection tracker internally keeps only one state, so
+ * 'state_orig' and 'state_reply', will be the same. */
+tcp_state = MA

[ovs-dev] [PATCH 1/3] Fix: coding style and some typos.

2017-06-23 Thread antonio . fischetti
From: Antonio Fischetti 

Fixes some lines exceeding 80 chars and a couple of typos.

Signed-off-by: Antonio Fischetti 
---
 lib/conntrack.c   | 2 +-
 lib/dpctl.c   | 6 --
 lib/dpif-netdev.c | 2 +-
 utilities/ovs-dpctl.c | 6 --
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 90b154a..c62ff2a 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1415,7 +1415,7 @@ extract_l4_icmp6(struct conn_key *key, const void *data, 
size_t size,
  *
  * If 'related' is not NULL and an ICMP error packet is being
  * processed, the function will extract the key from the packet nested
- * in the ICMP paylod and set '*related' to true.
+ * in the ICMP payload and set '*related' to true.
  *
  * If 'related' is NULL, it means that we're already parsing a header nested
  * in an ICMP error.  In this case, we skip checksum and length validation. */
diff --git a/lib/dpctl.c b/lib/dpctl.c
index 2ad475b..cde5341 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1593,7 +1593,8 @@ static const struct dpctl_command all_commands[] = {
 { "set-if", "dp iface...", 2, INT_MAX, dpctl_set_if, DP_RW },
 { "dump-dps", "", 0, 0, dpctl_dump_dps, DP_RO },
 { "show", "[dp...]", 0, INT_MAX, dpctl_show, DP_RO },
-{ "dump-flows", "[dp] [filter=..] [type=..]", 0, 3, dpctl_dump_flows, 
DP_RO },
+{ "dump-flows", "[dp] [filter=..] [type=..]",
+0, 3, dpctl_dump_flows, DP_RO },
 { "add-flow", "[dp] flow actions", 2, 3, dpctl_add_flow, DP_RW },
 { "mod-flow", "[dp] flow actions", 2, 3, dpctl_mod_flow, DP_RW },
 { "get-flow", "[dp] ufid", 1, 2, dpctl_get_flow, DP_RO },
@@ -1606,7 +1607,8 @@ static const struct dpctl_command all_commands[] = {
 
 /* Undocumented commands for testing. */
 { "parse-actions", "actions", 1, INT_MAX, dpctl_parse_actions, DP_RO },
-{ "normalize-actions", "actions", 2, INT_MAX, dpctl_normalize_actions, 
DP_RO },
+{ "normalize-actions", "actions",
+2, INT_MAX, dpctl_normalize_actions, DP_RO },
 
 { NULL, NULL, 0, 0, NULL, DP_RO },
 };
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f83b632..825af19 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3020,7 +3020,7 @@ dpif_netdev_queue_to_priority(const struct dpif *dpif 
OVS_UNUSED,
 
 
 /* Creates and returns a new 'struct dp_netdev_actions', whose actions are
- * a copy of the 'ofpacts_len' bytes of 'ofpacts'. */
+ * a copy of the 'size' bytes of 'actions' input parameters. */
 struct dp_netdev_actions *
 dp_netdev_actions_create(const struct nlattr *actions, size_t size)
 {
diff --git a/utilities/ovs-dpctl.c b/utilities/ovs-dpctl.c
index 843d305..1cc92f5 100644
--- a/utilities/ovs-dpctl.c
+++ b/utilities/ovs-dpctl.c
@@ -177,8 +177,10 @@ usage(void *userdata OVS_UNUSED)
"  get-flow [DP] ufid:UFIDfetch flow corresponding to UFID\n"
"  del-flow [DP] FLOW delete FLOW from DP\n"
"  del-flows [DP] delete all flows from DP\n"
-   "  dump-conntrack [DP] [zone=ZONE]  display conntrack entries for 
ZONE\n"
-   "  flush-conntrack [DP] [zone=ZONE] delete all conntrack entries in 
ZONE\n"
+   "  dump-conntrack [DP] [zone=ZONE]  " \
+   "display conntrack entries for ZONE\n"
+   "  flush-conntrack [DP] [zone=ZONE] " \
+   "delete all conntrack entries in ZONE\n"
"Each IFACE on add-dp, add-if, and set-if may be followed by\n"
"comma-separated options.  See ovs-dpctl(8) for syntax, or the\n"
"Interface table in ovs-vswitchd.conf.db(5) for an options list.\n"
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 5/6] netdev-dpdk: Add netdev_dpdk_vhost_txq_flush function.

2017-06-23 Thread Ilya Maximets
> Add netdev_dpdk_vhost_txq_flush(), that flushes packets on vHost User
> port queues. Also add netdev_dpdk_vhost_tx_burst() function that
> uses rte_vhost_enqueue_burst() to enqueue burst of packets on vHost User
> ports.
> 
> Signed-off-by: Bhanuprakash Bodireddy 
> Signed-off-by: Antonio Fischetti 
> Co-authored-by: Antonio Fischetti 
> Acked-by: Eelco Chaudron 
> ---
>  lib/netdev-dpdk.c | 76 
> ---
>  1 file changed, 72 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 50a9a2c..47343e8 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -307,12 +307,22 @@ struct dpdk_tx_queue {
>  * pmd threads (see 'concurrent_txq'). */
>  int map;   /* Mapping of configured vhost-user queues
>  * to enabled by guest. */
> -int dpdk_pkt_cnt;  /* Number of buffered packets waiting to
> +union {
> +int dpdk_pkt_cnt;  /* Number of buffered packets waiting to
>be sent on DPDK tx queue. */
> -struct rte_mbuf *dpdk_burst_pkts[INTERIM_QUEUE_BURST_THRESHOLD];
> +int vhost_pkt_cnt; /* Number of buffered packets waiting to
> +  be sent on vhost port. */
> +};
> +
> +union {
> +struct rte_mbuf *dpdk_burst_pkts[INTERIM_QUEUE_BURST_THRESHOLD];
> /* Intermediate queue where packets can
>  * be buffered to amortize the cost of 
> MMIO
>  * writes. */
> +struct dp_packet *vhost_burst_pkts[INTERIM_QUEUE_BURST_THRESHOLD];
> +   /* Intermediate queue where packets can
> +* be buffered for vhost ports. */
> +};
>  };
>  
>  /* dpdk has no way to remove dpdk ring ethernet devices
> @@ -1719,6 +1729,63 @@ netdev_dpdk_vhost_update_tx_counters(struct 
> netdev_stats *stats,
>  }
>  }
>  
> +static int
> +netdev_dpdk_vhost_tx_burst(struct netdev_dpdk *dev, int qid)
> +{
> +struct dpdk_tx_queue *txq = &dev->tx_q[qid];
> +struct rte_mbuf **cur_pkts = (struct rte_mbuf **)txq->vhost_burst_pkts;
> +
> +int tx_vid = netdev_dpdk_get_vid(dev);
> +int tx_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
> +uint32_t sent = 0;
> +uint32_t retries = 0;
> +uint32_t sum, total_pkts;
> +
> +total_pkts = sum = txq->vhost_pkt_cnt;
> +do {
> +uint32_t ret;
> +ret = rte_vhost_enqueue_burst(tx_vid, tx_qid, &cur_pkts[sent], sum);
> +if (OVS_UNLIKELY(!ret)) {
> +/* No packets enqueued - do not retry. */
> +break;
> +} else {
> +/* Packet have been sent */
> +sent += ret;
> +
> +/* 'sum' packet have to be retransmitted */
> +sum -= ret;
> +}
> +} while (sum && (retries++ < VHOST_ENQ_RETRY_NUM));
> +
> +for (int i = 0; i < total_pkts; i++) {
> +dp_packet_delete(txq->vhost_burst_pkts[i]);
> +}
> +
> +/* Reset pkt count */
> +txq->vhost_pkt_cnt = 0;
> +
> +/* 'sum' refers to packets dropped */
> +return sum;
> +}
> +
> +/* Flush the txq if there are any packets available.
> + * dynamic_txqs/concurrent_txq is disabled for vHost User ports as
> + * 'OVS_VHOST_MAX_QUEUE_NUM' txqs are preallocated.
> + */

This comment is completely untrue. You may ignore 'concurrent_txq'
because you *must* lock the queue in any case because of dynamic
txq remapping inside netdev-dpdk. You must take the spinlock for
the 'dev->tx_q[qid % netdev->n_txq].map' before sending packets.

In current implementation you're able to call send and flush
simultaneously for the same queue from different threads because
'flush' doesn't care about queue remapping.

See '__netdev_dpdk_vhost_send' and 'netdev_dpdk_remap_txqs' for
detail.

Additionally, flushing logic will be broken in case of txq remapping
because you may have different underneath queue each time you
trying to send of flush.

Have you ever tested this with multiqueue vhost?
With disabling/enabling queues inside the guest?

Best regards, Ilya Maximets.

> +static int
> +netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
> +bool concurrent_txq OVS_UNUSED)
> +{
> +struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +struct dpdk_tx_queue *txq = &dev->tx_q[qid];
> +
> +if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
> +netdev_dpdk_vhost_tx_burst(dev, qid);
> +}
> +
> +return 0;
> +}
> +
>  static void
>  __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>   struct dp_packet **pkts, int cnt)
> @@ -3432,7 +3499,8 @@ static const struct netdev_class dpdk_vhost_class =
>  NULL,
>  netdev_dpdk_vhost_reconfigure,
>  netdev_dpdk_vhost_rxq_

Re: [ovs-dev] [PATCH net-next v3 1/6] vxlan: refactor verification and application of configuration

2017-06-23 Thread Matthias Schiffer
On 06/23/2017 12:23 PM, Johannes Berg wrote:
> On Fri, 2017-06-23 at 12:13 +0200, Matthias Schiffer wrote:
>>
>> I was told the extended netlink error facilities were not ready yet,
>> has that changed since the last release?
> 
> Yes, the facility is in the kernel tree now.
> 
>> Anyways, I will gladly work on improving the error handling if
>> someone can give me a pointer how these extended netlink errors are
>> used.
> 
> Just grep for 'netlink_ext_ack' :)
> 
> johannes
> 

Thanks for the hint.

It seems though that rtnl_link_ops.newlink/changelink don't allow passing
the extack yet... how do we proceed here? Treewide change (maybe by someone
who knows their Coccinelle-fu?), or would the introduction of new versions
of the newlink and changelink fields be more acceptable, so drivers can be
moved to the new API one by one?

Matthias

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/4] conntrack: pass current time to conntrack_execute.

2017-06-23 Thread Chandran, Sugesh
Hi Antonio,

The changes are LGTM.
Didn't find any issues with clang, gcc sparse builds and checkpatch.
Also in my testing didn't see any significant performance difference as such. 

Acked by: Sugesh Chandran 

Regards
_Sugesh


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> Sent: Monday, June 19, 2017 11:12 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH 3/4] conntrack: pass current time to
> conntrack_execute.
> 
> From: Antonio Fischetti 
> 
> Current time is passed to conntrack_execute so it doesn't have to recompute
> it again.
> 
> Signed-off-by: Antonio Fischetti 
> ---
>  lib/conntrack.c| 4 ++--
>  lib/conntrack.h| 3 ++-
>  lib/dpif-netdev.c  | 2 +-
>  tests/test-conntrack.c | 8 +---
>  4 files changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c index 90b154a..63ad273 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -839,11 +839,11 @@ conntrack_execute(struct conntrack *ct, struct
> dp_packet_batch *pkt_batch,
>const uint32_t *setmark,
>const struct ovs_key_ct_labels *setlabel,
>const char *helper,
> -  const struct nat_action_info_t *nat_action_info)
> +  const struct nat_action_info_t *nat_action_info,
> +  long long now)
>  {
>  struct dp_packet **pkts = pkt_batch->packets;
>  size_t cnt = pkt_batch->count;
> -long long now = time_msec();
>  struct conn_lookup_ctx ctx;
> 
>  if (helper) {
> diff --git a/lib/conntrack.h b/lib/conntrack.h index 243aebb..763a68c 100644
> --- a/lib/conntrack.h
> +++ b/lib/conntrack.h
> @@ -95,7 +95,8 @@ int conntrack_execute(struct conntrack *, struct
> dp_packet_batch *,
>uint16_t zone, const uint32_t *setmark,
>const struct ovs_key_ct_labels *setlabel,
>const char *helper,
> -  const struct nat_action_info_t *nat_action_info);
> +  const struct nat_action_info_t *nat_action_info,
> +  long long now);
> 
>  struct conntrack_dump {
>  struct conntrack *ct;
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 64a3cd4..1c22afd
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -5341,7 +5341,7 @@ dp_execute_cb(void *aux_, struct
> dp_packet_batch *packets_,
> 
>  conntrack_execute(&dp->conntrack, packets_, aux->flow->dl_type,
> force,
>commit, zone, setmark, setlabel, helper,
> -  nat_action_info_ref);
> +  nat_action_info_ref, now);
>  break;
>  }
> 
> diff --git a/tests/test-conntrack.c b/tests/test-conntrack.c index
> f79a9fc..b0ba9ca 100644
> --- a/tests/test-conntrack.c
> +++ b/tests/test-conntrack.c
> @@ -84,12 +84,13 @@ ct_thread_main(void *aux_)
>  struct dp_packet_batch *pkt_batch;
>  ovs_be16 dl_type;
>  size_t i;
> +long long now = time_msec();
> 
>  pkt_batch = prepare_packets(batch_size, change_conn, aux->tid,
> &dl_type);
>  ovs_barrier_block(&barrier);
>  for (i = 0; i < n_pkts; i += batch_size) {
>  conntrack_execute(&ct, pkt_batch, dl_type, false, true, 0, NULL, 
> NULL,
> -  NULL, NULL);
> +  NULL, NULL, now);
>  }
>  ovs_barrier_block(&barrier);
>  destroy_packets(pkt_batch);
> @@ -154,6 +155,7 @@ pcap_batch_execute_conntrack(struct conntrack *ct,
> {
>  struct dp_packet_batch new_batch;
>  ovs_be16 dl_type = htons(0);
> +long long now = time_msec();
> 
>  dp_packet_batch_init(&new_batch);
> 
> @@ -172,7 +174,7 @@ pcap_batch_execute_conntrack(struct conntrack *ct,
> 
>  if (flow.dl_type != dl_type) {
>  conntrack_execute(ct, &new_batch, dl_type, false, true, 0,
> -  NULL, NULL, NULL, NULL);
> +  NULL, NULL, NULL, NULL, now);
>  dp_packet_batch_init(&new_batch);
>  }
>  new_batch.packets[new_batch.count++] = packet;; @@ -180,7 +182,7
> @@ pcap_batch_execute_conntrack(struct conntrack *ct,
> 
>  if (!dp_packet_batch_is_empty(&new_batch)) {
>  conntrack_execute(ct, &new_batch, dl_type, false, true, 0, NULL, 
> NULL,
> -  NULL, NULL);
> +  NULL, NULL, now);
>  }
> 
>  }
> --
> 2.4.11
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] packets: Do not initialize ct_orig_tuple.

2017-06-23 Thread Stokes, Ian
> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Bhanuprakash Bodireddy
> Sent: Thursday, June 22, 2017 10:09 PM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH] packets: Do not initialize ct_orig_tuple.
> 
> Commit "odp: Support conntrack orig tuple key." introduced new fields in
> struct 'pkt_metadata'.  pkt_metadata_init() is called for every packet in
> the userspace datapath.  When testing a simple single flow case with DPDK,
> we observe a lower throughput after the above commit (it was 14.88 Mpps
> before, it is 13 Mpps after).
> 
> This patch skips initializing ct_orig_tuple in pkt_metadata_init().
> It should be enough to initialize ct_state, because nobody should look at
> ct_orig_tuple unless ct_state is != 0.
> 
> It's discussed at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332419.html
> 
> Fixes: daf4d3c18da4("odp: Support conntrack orig tuple key.")
> Signed-off-by: Daniele Di Proietto 
> Signed-off-by: Bhanuprakash Bodireddy 
> Co-authored-by: Bhanuprakash Bodireddy 
> ---
> Original RFC was posted by Daniele here:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329679.html
> 
> In this patch moved the offset from ct_orig_tuple to 'ct_orig_tuple_ipv6'.
> This patch fixes the performance drop(~2.3Mpps for P2P - 64 byte pkts)
> with OvS-DPDK on Master.
> 
>  lib/packets.h | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/packets.h b/lib/packets.h index a9d5e84..94c3dcc 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -126,10 +126,19 @@ pkt_metadata_init_tnl(struct pkt_metadata *md)
> static inline void  pkt_metadata_init(struct pkt_metadata *md, odp_port_t
> port)  {
> +/* This is called for every packet in userspace datapath and affects
> + * performance if all the metadata is initialized. Hence absolutely
> + * necessary fields should be zeroed out.
> + *
> + * Initialize only the first 17 bytes of metadata (till ct_state).
> + * Once the ct_state is zeroed out rest of ct fields will not be
> looked
> + * at unless ct_state != 0.
> + */
> +memset(md, 0, offsetof(struct pkt_metadata, ct_orig_tuple_ipv6));
> +
>  /* It can be expensive to zero out all of the tunnel metadata.
> However,
>   * we can just zero out ip_dst and the rest of the data will never be
>   * looked at. */
> -memset(md, 0, offsetof(struct pkt_metadata, in_port));
>  md->tunnel.ip_dst = 0;
>  md->tunnel.ipv6_dst = in6addr_any;
>  md->in_port.odp_port = port;

Thanks for the patch Bhanu,

I've tested this and can confirm the performance improvement.

I'm not an expert on conntrack so not sure of the knock on effect (if any) it 
might have with that regard, maybe Darrell could comment on that aspect, but I 
would like to see this performance fix pushed.

Tested-by: Ian Stokes 
> --
> 2.4.11
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2] netdev-dpdk: use rte_eth_dev_set_mtu

2017-06-23 Thread Kavanagh, Mark B
>From: Stokes, Ian
>Sent: Friday, June 23, 2017 11:11 AM
>To: Kavanagh, Mark B ; ovs-dev@openvswitch.org; 
>Varghese, Vipin
>; acon...@redhat.com
>Subject: RE: [ovs-dev] [PATCH V2] netdev-dpdk: use rte_eth_dev_set_mtu
>
>
>
>> -Original Message-
>> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
>> boun...@openvswitch.org] On Behalf Of Mark Kavanagh
>> Sent: Monday, June 12, 2017 12:05 PM
>> To: ovs-dev@openvswitch.org; Varghese, Vipin ;
>> acon...@redhat.com
>> Subject: [ovs-dev] [PATCH V2] netdev-dpdk: use rte_eth_dev_set_mtu
>>
>> DPDK provides an API to set the MTU of compatible physical devices -
>> rte_eth_dev_set_mtu(). Prior to DPDK v16.07 however, this API was not
>> implemented in some DPDK PMDs (i40e, specifically). To allow the use of
>> jumbo frames with affected NICs in OvS-DPDK, MTU configuration was
>> achieved by setting the jumbo frame flag, and corresponding maximum
>> permitted Rx frame size, in an rte_eth_conf structure for the NIC port,
>> and subsequently invoking rte_eth_dev_configure() with that configuration.
>>
>> However, that method does not set the MTU field of the underlying DPDK
>> structure (rte_eth_dev) for the corresponding physical device;
>> consequently, rte_eth_dev_get_mtu() reports the incorrect MTU for an OvS-
>> DPDK phy device with non-standard MTU.
>>
>> Resolve this issue by invoking rte_eth_dev_set_mtu() when setting up or
>> modifying the MTU of a DPDK phy port.
>>
>> Fixes: 0072e93 ("netdev-dpdk: add support for jumbo frames")
>> Reported-by: Aaron Conole 
>> Reported-by: Vipin Varghese 
>> Signed-off-by: Mark Kavanagh 
>> ---
>>
>> V2->v1:
>> - add 'reported-by' tag for Aaron Conole
>> - change VLOG_INFO to VLOG_ERR
>>
>>  lib/netdev-dpdk.c | 14 +++---
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 810800e..c141dc2
>> 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -649,13 +649,6 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int
>> n_rxq, int n_txq)
>>  int i;
>>  struct rte_eth_conf conf = port_conf;
>>
>> -if (dev->mtu > ETHER_MTU) {
>> -conf.rxmode.jumbo_frame = 1;
>> -conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
>> -} else {
>> -conf.rxmode.jumbo_frame = 0;
>> -conf.rxmode.max_rx_pkt_len = 0;
>> -}
>>  conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
>>NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
>>  /* A device may report more queues than it makes available (this has
>> @@ -675,6 +668,13 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int
>> n_rxq, int n_txq)
>>  break;
>>  }
>>
>> +diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
>> +if (diag) {
>> +VLOG_ERR("Interface %s MTU (%d) setup error: %s",
>> +dev->up.name, dev->mtu, rte_strerror(-diag));
>> +break;
>> +}
>> +
>>  for (i = 0; i < n_txq; i++) {
>>  diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
>>dev->socket_id, NULL);
>> --
>
>Hi Mark,
>
>I've tested this patch with both an Intel X710 and an Intel 82599ES. The X710 
>seems fine
>however for the 82599ES I found that setting an MTU greater than 1500 resulted 
>in the errors
>below and the port was left in a down state no longer able to pass traffic.
>
>2017-06-23T09:47:49Z|00089|netdev_dpdk|ERR|Interface dpdk0 MTU (9000) setup 
>error: Invalid
>argument
>2017-06-23T09:47:49Z|00090|netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:2) 
>configure error:
>Invalid argument
>2017-06-23T09:47:49Z|00091|dpif_netdev|ERR|Failed to set interface dpdk0 new 
>configuration
>2017-06-23T09:47:49Z|00092|ofproto|WARN|br0: cannot get STP status on 
>nonexistent port 1
>2017-06-23T09:47:49Z|00093|ofproto|WARN|br0: cannot get RSTP status on 
>nonexistent port 1
>
>I could not reproduce the issue once I reverted the patch so it seems specific 
>to the use of
>rte_eth_dev_set_mtu() with the 82599ES.

Hey Ian,

Thanks for testing this patch; I actually became aware of this issue yesterday 
evening, when Sugesh alerted me to the same issue with Niantic.

As you're probably aware, each NIC has its own set of callback functions, each 
of which is invoked in response to an rte_eth_dev_xyz function call.
In this case, the rte_eth_dev_set_mtu function invokes ixgbe_dev_mtu_set, which 
checks if scattered_rx has previously been enabled for the NIC; if not, as is 
the case here, the MTU is refused.

The solution is most likely to enable scattered_rx in the rte_eth_dev's 
eth_conf.rx_mode - in any event, I'll include the fix in v3, which I'll push 
next week.

Thanks again,
Mark

>
>Ian
>
>> 1.9.3
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
d

Re: [ovs-dev] [PATCH net-next v3 1/6] vxlan: refactor verification and application of configuration

2017-06-23 Thread Johannes Berg
On Fri, 2017-06-23 at 12:13 +0200, Matthias Schiffer wrote:
> 
> I was told the extended netlink error facilities were not ready yet,
> has that changed since the last release?

Yes, the facility is in the kernel tree now.

> Anyways, I will gladly work on improving the error handling if
> someone can give me a pointer how these extended netlink errors are
> used.

Just grep for 'netlink_ext_ack' :)

johannes
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/5] lib/dp-packet: init the mbuf to zero when build with DPDK

2017-06-23 Thread Kavanagh, Mark B
>From: Michael Qiu [mailto:qdy220091...@gmail.com]
>Sent: Monday, June 19, 2017 6:29 AM
>To: d...@openvswitch.org
>Cc: Kavanagh, Mark B ; b...@ovn.org; 
>db...@vmware.com; Michael Qiu
>
>Subject: [PATCH 1/5] lib/dp-packet: init the mbuf to zero when build with DPDK
>
>From: Michael Qiu 
>
>When building with DPDK, and using xmalloc() to get a new packet,
>field mbuf of the packet will not be initialized, but it's very important for
>DPDK port when copying the data to DPDK mbuf, because if ol_flags
>and other info are random values, DPDK driver may hang.
>
>Signed-off-by: Michael Qiu 

Hi Michael,

Thanks for the updated version of the patchset.

This patch LGTM - I also compiled this with gcc, clang, and sparse without 
issue. Checkpatch reports no obvious problems either.

Acked-by: Mark Kavanagh ---
> lib/dp-packet.c | 3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>index 67aa406..ee2c449 100644
>--- a/lib/dp-packet.c
>+++ b/lib/dp-packet.c
>@@ -134,6 +134,9 @@ struct dp_packet *
> dp_packet_new(size_t size)
> {
> struct dp_packet *b = xmalloc(sizeof *b);
>+#ifdef DPDK_NETDEV
>+memset(&(b->mbuf), 0, sizeof(struct rte_mbuf));
>+#endif
> dp_packet_init(b, size);
> return b;
> }
>--
>1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v3 1/6] vxlan: refactor verification and application of configuration

2017-06-23 Thread Matthias Schiffer
On 06/23/2017 10:52 AM, Jiri Benc wrote:
> This patchset looks good overall (would send my Acked-by for most of
> this but I'm late).
> 
> On Mon, 19 Jun 2017 10:03:55 +0200, Matthias Schiffer wrote:
>> Log messages in these
>> functions are removed, as it is generally unexpected to find error output
>> for netlink requests in the kernel log. Userspace should be able to handle
>> errors based on the error codes returned via netlink just fine.
> 
> However, this is not really true. It's impossible to find out what went
> wrong when you use e.g. iproute2 to configure a vxlan link.
> 
> We really need to convert the kernel log messages to the extended
> netlink errors. Since you removed them prematurely, could you please
> work on that?
> 
> Thanks,
> 
>  Jiri
> 

I was told the extended netlink error facilities were not ready yet, has
that changed since the last release?

Off the top of my head, I can't think of any other setting I can do with
iproute2 that will write its errors in the kernel log; but there are quite
a lot settings that will just return a very unspecific error code. Isn't it
more common for the userspace tool to handle diagnostics in such cases?

Anyways, I will gladly work on improving the error handling if someone can
give me a pointer how these extended netlink errors are used.

Matthias

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2] netdev-dpdk: use rte_eth_dev_set_mtu

2017-06-23 Thread Stokes, Ian


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Mark Kavanagh
> Sent: Monday, June 12, 2017 12:05 PM
> To: ovs-dev@openvswitch.org; Varghese, Vipin ;
> acon...@redhat.com
> Subject: [ovs-dev] [PATCH V2] netdev-dpdk: use rte_eth_dev_set_mtu
> 
> DPDK provides an API to set the MTU of compatible physical devices -
> rte_eth_dev_set_mtu(). Prior to DPDK v16.07 however, this API was not
> implemented in some DPDK PMDs (i40e, specifically). To allow the use of
> jumbo frames with affected NICs in OvS-DPDK, MTU configuration was
> achieved by setting the jumbo frame flag, and corresponding maximum
> permitted Rx frame size, in an rte_eth_conf structure for the NIC port,
> and subsequently invoking rte_eth_dev_configure() with that configuration.
> 
> However, that method does not set the MTU field of the underlying DPDK
> structure (rte_eth_dev) for the corresponding physical device;
> consequently, rte_eth_dev_get_mtu() reports the incorrect MTU for an OvS-
> DPDK phy device with non-standard MTU.
> 
> Resolve this issue by invoking rte_eth_dev_set_mtu() when setting up or
> modifying the MTU of a DPDK phy port.
> 
> Fixes: 0072e93 ("netdev-dpdk: add support for jumbo frames")
> Reported-by: Aaron Conole 
> Reported-by: Vipin Varghese 
> Signed-off-by: Mark Kavanagh 
> ---
> 
> V2->v1:
> - add 'reported-by' tag for Aaron Conole
> - change VLOG_INFO to VLOG_ERR
> 
>  lib/netdev-dpdk.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 810800e..c141dc2
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -649,13 +649,6 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int
> n_rxq, int n_txq)
>  int i;
>  struct rte_eth_conf conf = port_conf;
> 
> -if (dev->mtu > ETHER_MTU) {
> -conf.rxmode.jumbo_frame = 1;
> -conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
> -} else {
> -conf.rxmode.jumbo_frame = 0;
> -conf.rxmode.max_rx_pkt_len = 0;
> -}
>  conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
>NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
>  /* A device may report more queues than it makes available (this has
> @@ -675,6 +668,13 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int
> n_rxq, int n_txq)
>  break;
>  }
> 
> +diag = rte_eth_dev_set_mtu(dev->port_id, dev->mtu);
> +if (diag) {
> +VLOG_ERR("Interface %s MTU (%d) setup error: %s",
> +dev->up.name, dev->mtu, rte_strerror(-diag));
> +break;
> +}
> +
>  for (i = 0; i < n_txq; i++) {
>  diag = rte_eth_tx_queue_setup(dev->port_id, i, dev->txq_size,
>dev->socket_id, NULL);
> --

Hi Mark,

I've tested this patch with both an Intel X710 and an Intel 82599ES. The X710 
seems fine however for the 82599ES I found that setting an MTU greater than 
1500 resulted in the errors below and the port was left in a down state no 
longer able to pass traffic.

2017-06-23T09:47:49Z|00089|netdev_dpdk|ERR|Interface dpdk0 MTU (9000) setup 
error: Invalid argument
2017-06-23T09:47:49Z|00090|netdev_dpdk|ERR|Interface dpdk0(rxq:1 txq:2) 
configure error: Invalid argument
2017-06-23T09:47:49Z|00091|dpif_netdev|ERR|Failed to set interface dpdk0 new 
configuration
2017-06-23T09:47:49Z|00092|ofproto|WARN|br0: cannot get STP status on 
nonexistent port 1
2017-06-23T09:47:49Z|00093|ofproto|WARN|br0: cannot get RSTP status on 
nonexistent port 1

I could not reproduce the issue once I reverted the patch so it seems specific 
to the use of rte_eth_dev_set_mtu() with the 82599ES.

Ian

> 1.9.3
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 6/6] netdev: Fix null pointer dereference reported by clang.

2017-06-23 Thread Kavanagh, Mark B
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] 
>On Behalf Of
>Bhanuprakash Bodireddy
>Sent: Monday, June 19, 2017 7:54 PM
>To: d...@openvswitch.org
>Subject: [ovs-dev] [PATCH 6/6] netdev: Fix null pointer dereference reported 
>by clang.
>
>Clang reports that array access from 'dumps' variable result in null pointer
>dereference.
>
>Signed-off-by: Bhanuprakash Bodireddy 

Hi Bhanu,

LGTM - I also compiled this with gcc, clang, and sparse without issue. 
Checkpatch reports no obvious problems either.

Acked-by: Mark Kavanagh 

One thing - what version of clang are you using? My version (3.4.2) didn't 
detect any of the issues in this patchset. Alternatively, are there additional 
flags that you use when compiling with clang?

Cheers,
Mark


>---
> lib/netdev.c | 16 +---
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
>diff --git a/lib/netdev.c b/lib/netdev.c
>index 001b7b3..336c141 100644
>--- a/lib/netdev.c
>+++ b/lib/netdev.c
>@@ -2290,14 +2290,16 @@ netdev_ports_flow_dump_create(const void *obj, int 
>*ports)
>
> dumps = count ? xzalloc(sizeof *dumps * count) : NULL;
>
>-HMAP_FOR_EACH(data, node, &port_to_netdev) {
>-if (data->obj == obj) {
>-if (netdev_flow_dump_create(data->netdev, &dumps[i])) {
>-continue;
>-}
>+if (dumps) {
>+HMAP_FOR_EACH(data, node, &port_to_netdev) {
>+if (data->obj == obj) {
>+if (netdev_flow_dump_create(data->netdev, &dumps[i])) {
>+continue;
>+}
>
>-dumps[i]->port = data->dpif_port.port_no;
>-i++;
>+dumps[i]->port = data->dpif_port.port_no;
>+i++;
>+}
> }
> }
> ovs_mutex_unlock(&netdev_hmap_mutex);
>--
>2.4.11
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 5/6] test-conntrack: Fix dead store reported by clang.

2017-06-23 Thread Kavanagh, Mark B
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] 
>On Behalf Of
>Bhanuprakash Bodireddy
>Sent: Monday, June 19, 2017 7:54 PM
>To: d...@openvswitch.org
>Subject: [ovs-dev] [PATCH 5/6] test-conntrack: Fix dead store reported by 
>clang.
>
>Clang reports that value store to 'batch_size' is never read.
>
>Signed-off-by: Bhanuprakash Bodireddy 

Hi Bhanu,

LGTM - I also compiled this with gcc, clang, and sparse without issue. 
Checkpatch reports no obvious problems either.

Acked-by: Mark Kavanagh 

Thanks,
Mark


>---
> tests/test-conntrack.c | 1 -
> 1 file changed, 1 deletion(-)
>
>diff --git a/tests/test-conntrack.c b/tests/test-conntrack.c
>index f79a9fc..5d2f8b8 100644
>--- a/tests/test-conntrack.c
>+++ b/tests/test-conntrack.c
>@@ -197,7 +197,6 @@ test_pcap(struct ovs_cmdl_context *ctx)
> return;
> }
>
>-batch_size = 1;
> if (ctx->argc > 2) {
> batch_size = strtoul(ctx->argv[2], NULL, 0);
> if (batch_size == 0 || batch_size > NETDEV_MAX_BURST) {
>--
>2.4.11
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 4/6] dpif-netlink-rtnl: Fix dead store reported by clang.

2017-06-23 Thread Kavanagh, Mark B
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] 
>On Behalf Of
>Bhanuprakash Bodireddy
>Sent: Monday, June 19, 2017 7:54 PM
>To: d...@openvswitch.org
>Subject: [ovs-dev] [PATCH 4/6] dpif-netlink-rtnl: Fix dead store reported by 
>clang.
>
>Clang reports variable 'ifmsg' never been used in the function.

Hi Bhanu,

LGTM - I also compiled this with gcc, clang, and sparse without issue. 
Checkpatch reports no obvious problems either.

Acked-by: Mark Kavanagh 

Cheers,
Mark

>
>Signed-off-by: Bhanuprakash Bodireddy 
>---
> lib/dpif-netlink-rtnl.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
>diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
>index c3c31eb..17ae24a 100644
>--- a/lib/dpif-netlink-rtnl.c
>+++ b/lib/dpif-netlink-rtnl.c
>@@ -140,11 +140,9 @@ rtnl_policy_parse(const char *kind, struct ofpbuf *reply,
> {
> struct nlattr *linkinfo[ARRAY_SIZE(linkinfo_policy)];
> struct nlattr *rtlink[ARRAY_SIZE(rtlink_policy)];
>-struct ifinfomsg *ifmsg;
> int error = 0;
>
>-ifmsg = ofpbuf_at(reply, NLMSG_HDRLEN, sizeof *ifmsg);
>-if (!nl_policy_parse(reply, NLMSG_HDRLEN + sizeof *ifmsg,
>+if (!nl_policy_parse(reply, NLMSG_HDRLEN + sizeof(struct ifinfomsg),
>  rtlink_policy, rtlink, ARRAY_SIZE(rtlink_policy))
> || !nl_parse_nested(rtlink[IFLA_LINKINFO], linkinfo_policy,
> linkinfo, ARRAY_SIZE(linkinfo_policy))
>--
>2.4.11
>
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] odp-util: Use port names in output in more places.

2017-06-23 Thread Ben Pfaff
Thanks for looking at these and testing them.  I applied them to master.

(There is still some room for improvement in these areas, but I think
that this is a useful step forward.)

On Fri, Jun 23, 2017 at 08:11:50AM +, Jan Scheurich wrote:
> Acked-by: Jan Scheurich 
> Tested-by: Jan Scheurich 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org 
> > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
> > Sent: Tuesday, 20 June, 2017 04:26
> > To: d...@openvswitch.org
> > Cc: Ben Pfaff 
> > Subject: [ovs-dev] [PATCH 2/2] odp-util: Use port names in output in more 
> > places.
> > 
> > Until now, ODP output only showed port names for in_port matches.  This
> > commit shows them in other places port numbers appear.
> > 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  lib/dpctl.c  |  8 ++--
> >  lib/dpif-netdev.c|  2 +-
> >  lib/dpif.c   |  4 +-
> >  lib/odp-util.c   | 98 
> > +++-
> >  lib/odp-util.h   |  5 ++-
> >  ofproto/ofproto-dpif-trace.c |  2 +-
> >  ofproto/ofproto-dpif.c   |  2 +-
> >  tests/test-odp.c |  2 +-
> >  8 files changed, 74 insertions(+), 49 deletions(-)
> > 
> > diff --git a/lib/dpctl.c b/lib/dpctl.c
> > index 1e3bf0a517db..1543c1cf3240 100644
> > --- a/lib/dpctl.c
> > +++ b/lib/dpctl.c
> > @@ -756,7 +756,7 @@ format_dpif_flow(struct ds *ds, const struct dpif_flow 
> > *f, struct hmap *ports,
> >  ds_put_cstr(ds, ", offloaded:yes");
> >  }
> >  ds_put_cstr(ds, ", actions:");
> > -format_odp_actions(ds, f->actions, f->actions_len);
> > +format_odp_actions(ds, f->actions, f->actions_len, ports);
> >  }
> > 
> >  static char *supported_dump_types[] = {
> > @@ -1348,7 +1348,7 @@ dpctl_parse_actions(int argc, const char *argv[], 
> > struct dpctl_params* dpctl_p)
> >  }
> > 
> >  ds_init(&s);
> > -format_odp_actions(&s, actions.data, actions.size);
> > +format_odp_actions(&s, actions.data, actions.size, NULL);
> >  dpctl_print(dpctl_p, "%s\n", ds_cstr(&s));
> >  ds_destroy(&s);
> > 
> > @@ -1513,7 +1513,7 @@ dpctl_normalize_actions(int argc, const char *argv[],
> > 
> >  if (dpctl_p->verbosity) {
> >  ds_clear(&s);
> > -format_odp_actions(&s, odp_actions.data, odp_actions.size);
> > +format_odp_actions(&s, odp_actions.data, odp_actions.size, NULL);
> >  dpctl_print(dpctl_p, "input actions: %s\n", ds_cstr(&s));
> >  }
> > 
> > @@ -1583,7 +1583,7 @@ dpctl_normalize_actions(int argc, const char *argv[],
> >  }
> > 
> >  ds_clear(&s);
> > -format_odp_actions(&s, af->actions.data, af->actions.size);
> > +format_odp_actions(&s, af->actions.data, af->actions.size, NULL);
> >  dpctl_puts(dpctl_p, false, ds_cstr(&s));
> > 
> >  ofpbuf_uninit(&af->actions);
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index f97e97ab2931..97fab8319b9c 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -2417,7 +2417,7 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
> >  mask_buf.data, mask_buf.size,
> >  NULL, &ds, false);
> >  ds_put_cstr(&ds, ", actions:");
> > -format_odp_actions(&ds, actions, actions_len);
> > +format_odp_actions(&ds, actions, actions_len, NULL);
> > 
> >  VLOG_DBG("%s", ds_cstr(&ds));
> > 
> > diff --git a/lib/dpif.c b/lib/dpif.c
> > index 10bdd7055d5f..2ed0ba02f2ce 100644
> > --- a/lib/dpif.c
> > +++ b/lib/dpif.c
> > @@ -1714,7 +1714,7 @@ log_flow_message(const struct dpif *dpif, int error,
> >  }
> >  if (actions || actions_len) {
> >  ds_put_cstr(&ds, ", actions:");
> > -format_odp_actions(&ds, actions, actions_len);
> > +format_odp_actions(&ds, actions, actions_len, NULL);
> >  }
> >  vlog(module, flow_message_log_level(error), "%s", ds_cstr(&ds));
> >  ds_destroy(&ds);
> > @@ -1802,7 +1802,7 @@ log_execute_message(const struct dpif *dpif,
> >(subexecute ? "sub-"
> > : dpif_execute_needs_help(execute) ? "super-"
> > : ""));
> > -format_odp_actions(&ds, execute->actions, execute->actions_len);
> > +format_odp_actions(&ds, execute->actions, execute->actions_len, 
> > NULL);
> >  if (error) {
> >  ds_put_format(&ds, " failed (%s)", ovs_strerror(error));
> >  }
> > diff --git a/lib/odp-util.c b/lib/odp-util.c
> > index d15095558243..6c2ab6cc5799 100644
> > --- a/lib/odp-util.c
> > +++ b/lib/odp-util.c
> > @@ -203,7 +203,8 @@ format_generic_odp_action(struct ds *ds, const struct 
> > nlattr *a)
> >  }
> > 
> >  static void
> > -format_odp_sample_action(struct ds *ds, const struct nlattr *attr)
> > +format_odp_sample_action(struct ds *ds, const struct nlattr *attr,
> > + const struct hmap 

Re: [ovs-dev] [PATCH v9] netdev-dpdk: Increase pmd thread priority.

2017-06-23 Thread O Mahony, Billy
Acked-by: Billy O'Mahony  

> -Original Message-
> From: Bodireddy, Bhanuprakash
> Sent: Thursday, June 22, 2017 9:51 PM
> To: d...@openvswitch.org
> Cc: O Mahony, Billy ; Bodireddy, Bhanuprakash
> 
> Subject: [PATCH v9] netdev-dpdk: Increase pmd thread priority.
> 
> Increase the DPDK pmd thread scheduling priority by lowering the nice value.
> This will advise the kernel scheduler to prioritize pmd thread over other
> processes and will help PMD to provide deterministic performance in out-of-
> the-box deployments.
> 
> This patch sets the nice value of PMD threads to '-20'.
> 
>   $ ps -eLo comm,policy,psr,nice | grep pmd
> 
>COMMAND  POLICY  PROCESSORNICE
> pmd62 TS3-20
> pmd63 TS0-20
> pmd64 TS1-20
> pmd65 TS2-20
> 
> Signed-off-by: Bhanuprakash Bodireddy
> 
> Tested-by: Billy O'Mahony 
> ---
> v8->v9:
> * Rebase
> 
> v7->v8:
> * Rebase
> * Update the documentation file @Documentation/intro/install/dpdk-
> advanced.rst
> 
> v6->v7:
> * Remove realtime scheduling policy logic.
> * Increase pmd thread scheduling priority by lowering nice value to -20.
> * Update doc accordingly.
> 
> v5->v6:
> * Prohibit spawning pmd thread on the lowest core in dpdk-lcore-mask if
>   lcore-mask and pmd-mask affinity are identical.
> * Updated Note section in INSTALL.DPDK-ADVANCED doc.
> * Tested below cases to verify system stability with pmd priority patch
> 
> v4->v5:
> * Reword Note section in DPDK-ADVANCED.md
> 
> v3->v4:
> * Document update
> * Use ovs_strerror for reporting errors in lib-numa.c
> 
> v2->v3:
> * Move set_priority() function to lib/ovs-numa.c
> * Apply realtime scheduling policy and priority to pmd thread only if
>   pmd-cpu-mask is passed.
> * Update INSTALL.DPDK-ADVANCED.
> 
> v1->v2:
> * Removed #ifdef and introduced dummy function
> "pmd_thread_setpriority"
>   in netdev-dpdk.h
> * Rebase
> 
>  Documentation/intro/install/dpdk.rst |  8 +++-
>  lib/dpif-netdev.c|  4 
>  lib/ovs-numa.c   | 21 +
>  lib/ovs-numa.h   |  1 +
>  4 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst
> b/Documentation/intro/install/dpdk.rst
> index e83f852..b5c26ba 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -453,7 +453,8 @@ affinitized accordingly.
>to be affinitized to isolated cores for optimum performance.
> 
>By setting a bit in the mask, a pmd thread is created and pinned to the
> -  corresponding CPU core. e.g. to run a pmd thread on core 2::
> +  corresponding CPU core with nice value set to -20.
> +  e.g. to run a pmd thread on core 2::
> 
>$ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x4
> 
> @@ -493,6 +494,11 @@ improvements as there will be more total CPU
> occupancy available::
> 
>  NIC port0 <-> OVS <-> VM <-> OVS <-> NIC port 1
> 
> +  .. note::
> +It is recommended that the OVS control thread and pmd thread shouldn't
> be
> +pinned to the same core i.e 'dpdk-lcore-mask' and 'pmd-cpu-mask' cpu
> mask
> +settings should be non-overlapping.
> +
>  DPDK Physical Port Rx Queues
>  
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index f83b632..6bbd786
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3712,6 +3712,10 @@ pmd_thread_main(void *f_)
>  ovs_numa_thread_setaffinity_core(pmd->core_id);
>  dpdk_set_lcore_id(pmd->core_id);
>  poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list);
> +
> +/* Set pmd thread's nice value to -20 */ #define MIN_NICE -20
> +ovs_numa_thread_setpriority(MIN_NICE);
>  reload:
>  emc_cache_init(&pmd->flow_cache);
> 
> diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c index 98e97cb..a1921b3 100644
> --- a/lib/ovs-numa.c
> +++ b/lib/ovs-numa.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #endif /* __linux__ */
> @@ -570,3 +571,23 @@ int ovs_numa_thread_setaffinity_core(unsigned
> core_id OVS_UNUSED)
>  return EOPNOTSUPP;
>  #endif /* __linux__ */
>  }
> +
> +int
> +ovs_numa_thread_setpriority(int nice OVS_UNUSED) {
> +if (dummy_numa) {
> +return 0;
> +}
> +
> +#ifndef _WIN32
> +int err;
> +err = setpriority(PRIO_PROCESS, 0, nice);
> +if (err) {
> +VLOG_ERR("Thread priority error %s",ovs_strerror(err));
> +}
> +
> +return 0;
> +#else
> +return EOPNOTSUPP;
> +#endif
> +}
> diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h index 6946cdc..e132483 100644
> --- a/lib/ovs-numa.h
> +++ b/lib/ovs-numa.h
> @@ -62,6 +62,7 @@ bool ovs_numa_dump_contains_core(const struct
> ovs_numa_dump *,  size_t ovs_numa_dump_count(const struct
> ovs_numa_dump *);  void ovs_numa_dump_destroy(struct
> ovs_numa_dump *);  int ovs_numa_thread_setaffinity_core(unsigned
> core_id);
> +int ovs_num

Re: [ovs-dev] [PATCH net-next v3 1/6] vxlan: refactor verification and application of configuration

2017-06-23 Thread Jiri Benc
This patchset looks good overall (would send my Acked-by for most of
this but I'm late).

On Mon, 19 Jun 2017 10:03:55 +0200, Matthias Schiffer wrote:
> Log messages in these
> functions are removed, as it is generally unexpected to find error output
> for netlink requests in the kernel log. Userspace should be able to handle
> errors based on the error codes returned via netlink just fine.

However, this is not really true. It's impossible to find out what went
wrong when you use e.g. iproute2 to configure a vxlan link.

We really need to convert the kernel log messages to the extended
netlink errors. Since you removed them prematurely, could you please
work on that?

Thanks,

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Fail to netdev_open internal iface with error "File exists"

2017-06-23 Thread Eelco Chaudron

On 06/23/2017 02:12 AM, Ben Pfaff wrote:

On Thu, Jun 22, 2017 at 04:04:44PM -0300, Flavio Leitner wrote:

On Thu, Jun 22, 2017 at 01:04:59AM +0800, Huanle Han wrote:

Hi,all

I get this problem with latest(dbd8112) branch-2.7 code on my Ubuntu.
root@ubuntu:/var/log/# ovs-vsctl show
adf2ea99-0c53-4180-914f-7dadaa71302b
 Bridge test
 Port test
 Interface test
 type: internal
 Bridge "manage"
 Port "manage"
 Interface "manage"
 type: internal
 error: "could not open network device manage (File exists)"
 Port "veth0"
 Interface "veth0"
 Port "eth0"
 Interface "eth0"
 ovs_version: "2.7.0"

How to reproduce:
1. add bridge "manage", up and add ip on it
2. restart ovs-vswitchd
3. "ovs-vsctl show" displays error message.

The reason:
In following "netdev_open" call on ovs-vswitchd start, input "type" is NULL
and "manage" is opened as a "system" netdev_class iface incorrectly.

#0  netdev_open (name=0x7fffe2bc "manage", type=0x0,
netdevp=0x7fffc3b0) at ../lib/netdev.c:396
#1  0x0052c492 in get_src_addr (ip6_dst=0x7fffe2ac,
output_bridge=0x7fffe2bc "manage", psrc=0x8f3490) at
../lib/ovs-router.c:141
#2  0x0052c85d in ovs_router_insert__ (priority=104 'h',
ip6_dst=0x7fffe29c, plen=104 'h', output_bridge=0x7fffe2bc
"manage", gw=0x7fffe2ac) at ../lib/ovs-router.c:202
#3  0x0052c980 in ovs_router_insert (ip_dst=0x7fffe29c,
plen=104 'h', output_bridge=0x7fffe2bc "manage", gw=0x7fffe2ac) at
../lib/ovs-router.c:228
#4  0x0058f63a in route_table_handle_msg (change=0x7fffe290) at
../lib/route-table.c:295
#5  0x0058f1da in route_table_reset () at ../lib/route-table.c:174
#6  0x0058f034 in route_table_init () at ../lib/route-table.c:110
#7  0x00495838 in dp_initialize () at ../lib/dpif.c:126
#8  0x00495b40 in dp_enumerate_types (types=0x7fffe3a0) at
../lib/dpif.c:244
#9  0x0042eb1c in enumerate_types (types=0x7fffe3a0) at
../ofproto/ofproto-dpif.c:267
#10 0x0041b81c in ofproto_enumerate_types (types=0x7fffe3a0) at
../ofproto/ofproto.c:432
#11 0x0040df1e in bridge_run__ () at ../vswitchd/bridge.c:3020
#12 0x0040e196 in bridge_run () at ../vswitchd/bridge.c:3082
#13 0x004138ef in main (argc=1, argv=0x7fffe578) at
../vswitchd/ovs-vswitchd.c:119

After then, ovs fails to netdev_open "manage" with type == "internal".
"File exists" error is reported.
I think commit d3b8f50(netdev: Fix netdev_open() to adhere to class type if
given) introduces this problem. It need be improved.

Your analysis is correct.  One solution is to wait vswitchd to
configure first and only then enable ovs-route.  The problem is that
more modules might use netdev_open() and it doesn't sound like a good
idea to have a control per module.

Another option is to map the device's info to a class instead of using
"system", so we could use internal classes for vports, for instance.
However, that doesn't guarantee it will match with what is configured
in the DB.

This sounds like the kind of problem that I expected the following
commit might cause.

 commit d3b8f5052292b3ba9084ffed097e90b87f2950f5
 Author: Eelco Chaudron 
 Date:   Thu Jun 1 14:38:09 2017 +0200

 netdev: Fix netdev_open() to adhere to class type if given

If we can't fix it somehow, we might need to revert.


Reverting the above patch does not solve the real issue here. If we 
revert we

donot get the error, but the wrong class is used, hence the wrong callbacks
get called.

The main issue is with netdev_open() being called with type = NULL before
the interface is actually configured in the system. We could track these
"auto" generated interfaces, and once netdev_open() gets called with a
valid type reconfigure (re-create) it. netdev_remove() could work here
but not sure if its safe due to reference counting.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch_v1] dpif: Fix cleanup of userspace datapath.

2017-06-23 Thread Ben Pfaff
On Thu, Jun 22, 2017 at 09:27:11PM -0700, Darrell Ball wrote:
> Hardware offload introduced extra tracking of netdev ports.  This
> included ovs-netdev, which is really for internal infra usage for
> the userpace datapath.  This breaks cleanup of the userspace
> datapath.  There is no need to do this extra tracking of this port,
> hence it is skipped by checking for the port name as the type does
> not help here.  Adding an extra type or field is another way to fix
> this, but this would probably be overkill as this port is a constant
> and the misuse potential very limited.
> 
> CC: Paul Blakey 
> Fixes: 32b77c316d9982("dpif: Save added ports in a port map.")
> Signed-off-by: Darrell Ball 
> ---
>  lib/dpif.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 10bdd70..8624d34 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -350,7 +350,11 @@ do_open(const char *name, const char *type, bool create, 
> struct dpif **dpifp)
>  struct netdev *netdev;
>  int err;
>  
> -if (!strcmp(dpif_port.type, "internal")) {
> +/* ovs-netdev is a tap device that is used as an
> + * internal port for the userspace datapath, hence
> + * don't track it here. */
> +if (!strcmp(dpif_port.type, "internal") ||
> +(!strcmp(dpif_port.name, "ovs-netdev"))) {

I don't understand this issue, so this is not a review, but please do
note this style point from coding-style.rst:

Do not parenthesize the operands of ``&&`` and ``||`` unless operator
precedence makes it necessary, or unless the operands are themselves
expressions that use ``&&`` and ``||``. Thus:

::

if (!isdigit((unsigned char)s[0])
|| !isdigit((unsigned char)s[1])
|| !isdigit((unsigned char)s[2])) {
printf("string %s does not start with 3-digit code\n", s);
}

but

::

if (rule && (!best || rule->priority > best->priority)) {
best = rule;
}
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] odp-util: Use port names in output in more places.

2017-06-23 Thread Jan Scheurich
Acked-by: Jan Scheurich 
Tested-by: Jan Scheurich 

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Tuesday, 20 June, 2017 04:26
> To: d...@openvswitch.org
> Cc: Ben Pfaff 
> Subject: [ovs-dev] [PATCH 2/2] odp-util: Use port names in output in more 
> places.
> 
> Until now, ODP output only showed port names for in_port matches.  This
> commit shows them in other places port numbers appear.
> 
> Signed-off-by: Ben Pfaff 
> ---
>  lib/dpctl.c  |  8 ++--
>  lib/dpif-netdev.c|  2 +-
>  lib/dpif.c   |  4 +-
>  lib/odp-util.c   | 98 
> +++-
>  lib/odp-util.h   |  5 ++-
>  ofproto/ofproto-dpif-trace.c |  2 +-
>  ofproto/ofproto-dpif.c   |  2 +-
>  tests/test-odp.c |  2 +-
>  8 files changed, 74 insertions(+), 49 deletions(-)
> 
> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index 1e3bf0a517db..1543c1cf3240 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -756,7 +756,7 @@ format_dpif_flow(struct ds *ds, const struct dpif_flow 
> *f, struct hmap *ports,
>  ds_put_cstr(ds, ", offloaded:yes");
>  }
>  ds_put_cstr(ds, ", actions:");
> -format_odp_actions(ds, f->actions, f->actions_len);
> +format_odp_actions(ds, f->actions, f->actions_len, ports);
>  }
> 
>  static char *supported_dump_types[] = {
> @@ -1348,7 +1348,7 @@ dpctl_parse_actions(int argc, const char *argv[], 
> struct dpctl_params* dpctl_p)
>  }
> 
>  ds_init(&s);
> -format_odp_actions(&s, actions.data, actions.size);
> +format_odp_actions(&s, actions.data, actions.size, NULL);
>  dpctl_print(dpctl_p, "%s\n", ds_cstr(&s));
>  ds_destroy(&s);
> 
> @@ -1513,7 +1513,7 @@ dpctl_normalize_actions(int argc, const char *argv[],
> 
>  if (dpctl_p->verbosity) {
>  ds_clear(&s);
> -format_odp_actions(&s, odp_actions.data, odp_actions.size);
> +format_odp_actions(&s, odp_actions.data, odp_actions.size, NULL);
>  dpctl_print(dpctl_p, "input actions: %s\n", ds_cstr(&s));
>  }
> 
> @@ -1583,7 +1583,7 @@ dpctl_normalize_actions(int argc, const char *argv[],
>  }
> 
>  ds_clear(&s);
> -format_odp_actions(&s, af->actions.data, af->actions.size);
> +format_odp_actions(&s, af->actions.data, af->actions.size, NULL);
>  dpctl_puts(dpctl_p, false, ds_cstr(&s));
> 
>  ofpbuf_uninit(&af->actions);
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index f97e97ab2931..97fab8319b9c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2417,7 +2417,7 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
>  mask_buf.data, mask_buf.size,
>  NULL, &ds, false);
>  ds_put_cstr(&ds, ", actions:");
> -format_odp_actions(&ds, actions, actions_len);
> +format_odp_actions(&ds, actions, actions_len, NULL);
> 
>  VLOG_DBG("%s", ds_cstr(&ds));
> 
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 10bdd7055d5f..2ed0ba02f2ce 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1714,7 +1714,7 @@ log_flow_message(const struct dpif *dpif, int error,
>  }
>  if (actions || actions_len) {
>  ds_put_cstr(&ds, ", actions:");
> -format_odp_actions(&ds, actions, actions_len);
> +format_odp_actions(&ds, actions, actions_len, NULL);
>  }
>  vlog(module, flow_message_log_level(error), "%s", ds_cstr(&ds));
>  ds_destroy(&ds);
> @@ -1802,7 +1802,7 @@ log_execute_message(const struct dpif *dpif,
>(subexecute ? "sub-"
> : dpif_execute_needs_help(execute) ? "super-"
> : ""));
> -format_odp_actions(&ds, execute->actions, execute->actions_len);
> +format_odp_actions(&ds, execute->actions, execute->actions_len, 
> NULL);
>  if (error) {
>  ds_put_format(&ds, " failed (%s)", ovs_strerror(error));
>  }
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index d15095558243..6c2ab6cc5799 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -203,7 +203,8 @@ format_generic_odp_action(struct ds *ds, const struct 
> nlattr *a)
>  }
> 
>  static void
> -format_odp_sample_action(struct ds *ds, const struct nlattr *attr)
> +format_odp_sample_action(struct ds *ds, const struct nlattr *attr,
> + const struct hmap *portno_names)
>  {
>  static const struct nl_policy ovs_sample_policy[] = {
>  [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 },
> @@ -229,19 +230,20 @@ format_odp_sample_action(struct ds *ds, const struct 
> nlattr *attr)
>  ds_put_cstr(ds, "actions(");
>  nla_acts = nl_attr_get(a[OVS_SAMPLE_ATTR_ACTIONS]);
>  len = nl_attr_get_size(a[OVS_SAMPLE_ATTR_ACTIONS]);
> -format_odp_actions(ds, nla_acts, len);
> +format_odp_actions(ds, nla_acts, l

Re: [ovs-dev] [PATCH 1/2] ovs-dpctl: New --names option to use port names in flow dumps.

2017-06-23 Thread Jan Scheurich
Thanks for implementing this!

It's a great usability improvement for trouble-shooting datapath issues and 
makes it easier to write unit test cases checking on datapath flows.

Acked-by: Jan Scheurich 
Tested-by: Jan Scheurich 

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Tuesday, 20 June, 2017 04:26
> To: d...@openvswitch.org
> Cc: Ben Pfaff 
> Subject: [ovs-dev] [PATCH 1/2] ovs-dpctl: New --names option to use port 
> names in flow dumps.
> 
> Until now, printing names in "ovs-dpctl dump-flows" was tied to the overall
> output verbosity, which in practice meant that to see port names a user had
> to see a distracting amount of verbosity.  This decouples names from
> verbosity.
> 
> I'd like to make showing names the default for interactive usage, but so
> far names aren't accepted in input so that would frustrate cut-and-paste,
> which is an important use of "ovs-dpctl dump-flows" output.
> 
> Signed-off-by: Ben Pfaff 
> ---
>  lib/dpctl.c  | 76 
> ++--
>  lib/dpctl.h  |  3 ++
>  lib/dpctl.man|  7 +++--
>  lib/odp-util.c   |  4 +--
>  ofproto/ofproto-dpif.c   | 48 +-
>  utilities/ovs-dpctl.8.in |  9 +-
>  utilities/ovs-dpctl.c| 20 +
>  7 files changed, 125 insertions(+), 42 deletions(-)
> 
> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index 7f44d025dcf1..1e3bf0a517db 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -49,6 +49,8 @@
>  #include "unixctl.h"
>  #include "util.h"
>  #include "openvswitch/ofp-parse.h"
> +#include "openvswitch/vlog.h"
> +VLOG_DEFINE_THIS_MODULE(dpctl);
> 
>  typedef int dpctl_command_handler(int argc, const char *argv[],
>struct dpctl_params *);
> @@ -762,6 +764,36 @@ static char *supported_dump_types[] = {
>  "ovs",
>  };
> 
> +static struct hmap *
> +dpctl_get_portno_names(struct dpif *dpif, const struct dpctl_params *dpctl_p)
> +{
> +if (dpctl_p->names) {
> +struct hmap *portno_names = xmalloc(sizeof *portno_names);
> +hmap_init(portno_names);
> +
> +struct dpif_port_dump port_dump;
> +struct dpif_port dpif_port;
> +DPIF_PORT_FOR_EACH (&dpif_port, &port_dump, dpif) {
> +odp_portno_names_set(portno_names, dpif_port.port_no,
> + dpif_port.name);
> +}
> +
> +return portno_names;
> +} else {
> +return NULL;
> +}
> +}
> +
> +static void
> +dpctl_free_portno_names(struct hmap *portno_names)
> +{
> +if (portno_names) {
> +odp_portno_names_destroy(portno_names);
> +hmap_destroy(portno_names);
> +free(portno_names);
> +}
> +}
> +
>  static int
>  dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
>  {
> @@ -774,10 +806,6 @@ dpctl_dump_flows(int argc, const char *argv[], struct 
> dpctl_params *dpctl_p)
>  struct flow flow_filter;
>  struct flow_wildcards wc_filter;
> 
> -struct dpif_port_dump port_dump;
> -struct dpif_port dpif_port;
> -struct hmap portno_names;
> -
>  struct dpif_flow_dump_thread *flow_dump_thread;
>  struct dpif_flow_dump *flow_dump;
>  struct dpif_flow f;
> @@ -807,15 +835,14 @@ dpctl_dump_flows(int argc, const char *argv[], struct 
> dpctl_params *dpctl_p)
>  goto out_free;
>  }
> 
> -
> -hmap_init(&portno_names);
> -DPIF_PORT_FOR_EACH (&dpif_port, &port_dump, dpif) {
> -odp_portno_names_set(&portno_names, dpif_port.port_no, 
> dpif_port.name);
> -}
> +struct hmap *portno_names = dpctl_get_portno_names(dpif, dpctl_p);
> 
>  if (filter) {
>  struct ofputil_port_map port_map;
>  ofputil_port_map_init(&port_map);
> +
> +struct dpif_port_dump port_dump;
> +struct dpif_port dpif_port;
>  DPIF_PORT_FOR_EACH (&dpif_port, &port_dump, dpif) {
>  ofputil_port_map_put(&port_map,
>   u16_to_ofp(odp_to_u32(dpif_port.port_no)),
> @@ -890,7 +917,7 @@ dpctl_dump_flows(int argc, const char *argv[], struct 
> dpctl_params *dpctl_p)
>  }
>  pmd_id = f.pmd_id;
>  }
> -format_dpif_flow(&ds, &f, &portno_names, type, dpctl_p);
> +format_dpif_flow(&ds, &f, portno_names, type, dpctl_p);
> 
>  dpctl_print(dpctl_p, "%s\n", ds_cstr(&ds));
>  }
> @@ -903,8 +930,7 @@ dpctl_dump_flows(int argc, const char *argv[], struct 
> dpctl_params *dpctl_p)
>  ds_destroy(&ds);
> 
>  out_dpifclose:
> -odp_portno_names_destroy(&portno_names);
> -hmap_destroy(&portno_names);
> +dpctl_free_portno_names(portno_names);
>  dpif_close(dpif);
>  out_free:
>  free(filter);
> @@ -1032,11 +1058,8 @@ dpctl_get_flow(int argc, const char *argv[], struct 
> dpctl_params *dpctl_p)
>  {
>  const char *key_s = argv[argc - 1];
>  struc

Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW checksum offload support for DPDK pnic

2017-06-23 Thread Chandran, Sugesh


Regards
_Sugesh

From: Gao Zhenyu [mailto:sysugaozhe...@gmail.com]
Sent: Wednesday, June 21, 2017 9:32 AM
To: Chandran, Sugesh 
Cc: b...@ovn.org; u9012...@gmail.com; ktray...@redhat.com; Kavanagh, Mark B 
; d...@openvswitch.org
Subject: Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW checksum 
offload support for DPDK pnic

I get it.  Maybe caculating it in OVS part is doable as well.
So, how about adding more options to let people choose HW-tcp-cksum(reduce cpu 
cycles) or SW-tcp-cksum(may be better performance)?
Then we have NO-TCP-CKSUM, SW-TCP-CKSUM, HW-TCP-CKSUM.
[Sugesh] In OVS-DPDK, I am not sure about the advantage of having HW checksum. 
Because even if you save CPU cycles, that will get used for non vector tx.
So I would prefer to keep these options only if there are really a need for 
that.
BTW, when will DPDK support tx checksum offload with vectorization?
[Sugesh] I don’t see any plan to do that in near future. Could be worth to ask 
in DPDK mailing list as well.

Thanks
Zhenyu Gao


2017-06-21 16:03 GMT+08:00 Chandran, Sugesh 
mailto:sugesh.chand...@intel.com>>:


Regards
_Sugesh

From: Gao Zhenyu 
[mailto:sysugaozhe...@gmail.com]
Sent: Monday, June 19, 2017 1:23 PM
To: Chandran, Sugesh 
mailto:sugesh.chand...@intel.com>>
Cc: b...@ovn.org; 
u9012...@gmail.com; 
ktray...@redhat.com; Kavanagh, Mark B 
mailto:mark.b.kavan...@intel.com>>; 
d...@openvswitch.org
Subject: Re: [ovs-dev] [RFC PATCH v1] net-dpdk: Introducing TX tcp HW checksum 
offload support for DPDK pnic

Thanks for that comments.

[Sugesh] Any reason, why this patch does only the TCP checksum offload?? The 
command line option says tx_checksum offload (it could be mistakenly considered 
for full checksum offload).
[Zhenyu Gao] DPDK nic supports many hw offload feature like IPv4,IPV6,TCP, 
UDP,VXLAN,GRE. I would like to make them work step by step. A huge patch may 
introduce more potential issues.
TCP offload is a basic and essential feature so I prefer to implement it first.
[Sugesh] Ok, Fine!


[Sugesh] What is the performance improvement offered with this feature? Do you 
have any numbers to share?
[Zhenyu Gao]I think DPDK uses non-vector functions when Tx checksum offload is 
enabled. Will it give enough performance improvement to mitigate that cost?
It is a draft patch to collect advise and suggestions. In my draft testing, it 
doesn't show improvment or regression
In ovs-dpdk + veth environment, veth support tcp cksum offload by default, but 
it introduces tcp connection issue because veth believes it supports cksum and 
offload to ovs, but dpdk side doesn't do the offloading.
So I have to use ethtool -K eth1 tx off to disable all tx offloading if using 
original ovs-dpdk. That means we cannot consume TSO as well.
[Sugesh] This is a concern. We have to consider other usecases as well. Most of 
the high performance ovs-dpdk applications doesn’t use any kernel/veth pair 
interfaces in OVS-DPDK datapath.


It is a ovs-dpdk + veth environment. So it consumes sendmsg/ recvmsg on RX/TX 
in ovs-dpdk side. The netperf was executed on ovs-dpdk + veth side.
The veth side enabled tx-tcp hw cksum, disabled tso.Bottleneck was not in 
cksum, and running testing in a vhost VM is more reasonable.
[Sugesh] I agree with you. But its worthwhile to know what is the performance 
delta. Also if the cost of vectorization is high, we may consider to do the 
checksum calculation in software itself. I feel x86 instructions can do 
checksum calculation pretty efficient. Have you consider that option?

[root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_RR -l 10
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
10.100.85.247 () port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size SizeTime Rate
bytes  Bytes  bytesbytes   secs.per sec

16384  87380  11   10.0015001.87(HW tcp-cksum)   15062.72(No HW 
tcp-cksum)
16384  87380


[root@16ee46e4b793 ~]# netperf -H 10.100.85.247 -t TCP_STREAM -l 10
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.100.85.247 
() port 0 AF_INET
Recv   SendSend
Socket Socket  Message  Elapsed
Size   SizeSize Time Throughput
bytes  bytes   bytessecs.10^6bits/sec
 87380  16384  1638410.02 263.41(HW tcp-cksum)   265.31(No HW tcp-cksum)

I would like to keep it disabled in default setting unless we implement more tx 
offloading like TSO.(Do you have concern on it?)  BTW, I think I can rename 
NETDEV_TX_CHECKSUM_OFFLOAD into NETDEV_TX_TCP_CHECKSUM_OFFLOAD.
Please let me know if you get any questions. :)
[Sugesh] On Rx checksum offload case, it works with vector instructions. The 
latest DPDK support rx checksum offload with vectorization.
Thanks

2017-06-19 17:26 GMT+08:00 Chandran, Sugesh 
mailto:sugesh.ch

Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to Accelerate Megaflow Search

2017-06-23 Thread Fischetti, Antonio
Hi All,
thanks for your feedback. We published a patchset v1 at

http://patchwork.ozlabs.org/patch/775505/

please feel free to review.

Thanks,
Antonio

> -Original Message-
> From: Wang, Yipeng1
> Sent: Wednesday, May 3, 2017 12:04 AM
> To: Darrell Ball ; d...@openvswitch.org; ja...@ovn.org;
> jan.scheur...@ericsson.com
> Cc: Tai, Charlie ; Wang, Ren ;
> Gobriel, Sameh ; Fischetti, Antonio
> 
> Subject: RE: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to
> Accelerate Megaflow Search
> 
> Thank you Darrell for the comment, we collect some data with the scalar
> version,  please see my reply inlined.  Our newest results show good
> speedup for both scalar and AVX version.
> 
> We are still waiting for more feedback before implementing version 2.
> Please feel free to comment on the patch.
> 
> Thank you.
> 
> > -Original Message-
> > From: Darrell Ball [mailto:db...@vmware.com]
> > Sent: Wednesday, April 26, 2017 10:04 PM
> > To: Wang, Yipeng1 ; d...@openvswitch.org
> > Cc: Tai, Charlie ; Wang, Ren
> ;
> > Gobriel, Sameh 
> > Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor
> to
> > Accelerate Megaflow Search
> >
> >
> >
> > On 4/14/17, 6:10 PM, "Wang, Yipeng1"  wrote:
> >
> > Thank you Darrell for the comments. Please take a look at my reply
> inlined.
> >
> >
> >
> > > -Original Message-
> >
> > > From: Darrell Ball [mailto:db...@vmware.com]
> >
> > > Sent: Thursday, April 13, 2017 10:36 PM
> >
> > > To: Wang, Yipeng1 ; d...@openvswitch.org
> >
> > > Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo
> Distributor
> > to
> >
> > > Accelerate Megaflow Search
> >
> > >
> >
> > >
> >
> > >
> >
> > > On 4/6/17, 2:48 PM, "ovs-dev-boun...@openvswitch.org on behalf of
> >
> > > yipeng1.w...@intel.com"  > behalf of
> >
> > > yipeng1.w...@intel.com> wrote:
> >
> > >
> >
> > > From: Yipeng Wang 
> >
> > >
> >
> > > The Datapath Classifier uses tuple space search for flow
> classification.
> >
> > > The rules are arranged into a set of tuples/subtables (each
> with a
> >
> > > distinct mask).  Each subtable is implemented as a hash table
> and
> > lookup
> >
> > > is done with flow keys formed by selecting the bits from the
> packet
> > header
> >
> > > based on each subtable's mask. Tuple space search will
> sequentially
> > search
> >
> > > each subtable until a match is found. With a large number of
> subtables,
> > a
> >
> > > sequential search of the subtables could consume a lot of CPU
> cycles.
> > In
> >
> > > a testbench with a uniform traffic pattern equally distributed
> across 20
> >
> > > subtables, we measured that up to 65% of total execution time
> is
> > attributed
> >
> > > to the megaflow cache lookup.
> >
> > >
> >
> > > This patch presents the idea of the two-layer hierarchical
> lookup,
> > where a
> >
> > > low overhead first level of indirection is accessed first, we
> call this
> >
> > > level cuckoo distributor (CD). If a flow key has been inserted
> in the flow
> >
> > > table the first level will indicate with high probability that
> which
> >
> > > subtable to look into. A lookup is performed on the second
> level (the
> >
> > > target subtable) to retrieve the result. If the key doesn’t
> have a match,
> >
> > > then we revert back to the sequential search of subtables.
> >
> > >
> >
> > > This patch can improve the already existing Subtable Ranking
> when
> > traffic
> >
> > > data has high entropy. Subtable Ranking helps minimize the
> number of
> >
> > > traversed subtables when most of the traffic hit the same
> subtable.
> >
> > > However, in the case of high entropy traffic such as traffic
> coming from
> >
> > > a physical port, multiple subtables could be hit with a
> similar frequency.
> >
> > > In this case the average subtable lookups per hit would be
> much
> > greater
> >
> > > than 1. In addition, CD can adaptively turn off when it finds
> the traffic
> >
> > > mostly hit one subtable. Thus, CD will not be an overhead when
> > Subtable
> >
> > > Ranking works well.
> >
> > >
> >
> > > Scheme:
> >
> > >
> >
> > >  ---
> >
> > > |  CD   |
> >
> > >  ---
> >
> > >\
> >
> > > \
> >
> > >  -  - -
> >
> > > |sub  ||sub  |...|sub  |
> >
> > > |table||table|   |table|
> >
> > >  -  - -
> >
> > >
> >
> > > Evaluation:
> >
> > >
> >
> > > We create set of rules with various src IP. We feed traffic
> containing 1
> >
> > > million flows with various src IP and dst IP. All the flows
> hit 10/20/30
> >
> > > rules creating 10/20/30 subtables.
> >
>

[ovs-dev] Bug#865616: upgrade becomes stuck running ifup

2017-06-23 Thread Daniel Pocock
Package: openvswitch-switch
Version: 2.6.2~pre+git20161223-3

I ran a jessie to stretch upgrade


When dpkg was setting up openvswitch-switch it became stuck

Looking at the process table I found:

23766 ?S  0:00 ifup --allow=ovs ovsbr0

The interface was already up and I was accessing the host over that
interface for the upgrade.

Killing ifup allowed the upgrade to proceed.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev