Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Ilya Maximets
Hello everyone. I just returned from the holidays.
Thanks for working on this. Find my replies inline.

Best regards, Ilya Maximets.

On 08.01.2018 20:58, Kevin Traynor wrote:
> On 01/08/2018 05:31 PM, Stokes, Ian wrote:
>>>
>>> Hi Ian,
>>>
 -Original Message-
 From: Stokes, Ian [mailto:ian.sto...@intel.com]
 Sent: Monday, 08 January, 2018 17:09
 To: Jan Scheurich ; Kevin Traynor
 ; Ilya Maximets 
 Cc: d...@openvswitch.org
 Subject: RE: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle
 count and rebased patches

> Hi Ian,
>
> That's why I brought up my original question before Christmas, but
> apparently too late ☹.

 Apologies, Christmas was a bit hectic with the last push for output
>>> batching and finishing for the holidays so I missed following up on it.
>>>
>>> No worries, I understand. Actually Ilya responded and agreed to my
>>> proposal for deciding on an order. But we didn't have time to discuss the
>>> best order. He had v9 out before Christmas based on dpdk_merge (later
>>> merged to master) and I took that for rebase.

I have a few concerns about this rebase. It wan't trivial and you made few
decisions which I would argue with. For example, adding a new parameter
to 'dp_netdev_process_rxq_port'. It's not needed. I'll reply later to
the patch itself. There are also few style issues. I personally prefer to
rebase my patches myself to avoid such situations where I disagree with
rebase made by someone else especially without mentioning in commit-message.

>>>
>
> Myself, was hoping that we could get all three changes: PMD
> performance metrics, time-based tx batching and Kevin's enhancement
> for the pmd-rxq- show command into 2.9.
>

 I think this is the ideal situation, but does require sign off from
 Ilya and Kevin as it will be their features it will impact on, I guess
>>> they can answer and give an idea on bandwidth to cooperate on something
>>> like this.

> PMD performance metrics are for sure around long enough to warrant
>>> that.

It's not a first time I hear that something is ready to be merged because
it was submitted long time ago and has few iterations of rebasing. But
the truth is that our "OVS-DPDK" community is really small and we don't have
enough active reviewers/time to handle all the huge patch series that
constantly appears on the mail list. It's sad, but true.

About "PMD performance metrics" patch-set: v4 is the only iteration that was
really reviewed. And these reviews was mostly on a high level. At a first
glance v5 has at least 2 real bugs and, IMHO, design of some points is not
really good. I'll reply with details to the series soon.

> And they are highly wanted too.

 To my mind the detailed PMD logs patchset has a lot of content to work
 through and there are reviews in progress as well as re-work requests
 from previous reviews of the v4. I was waiting to see these completed
>>> before focusing on it myself as I haven't had the bandwidth to date to
>>> take it on.
>>>
>>> All pending review comments have been incorporated in v5 posted on Jan 4th
>>> (http://patchwork.ozlabs.org/cover/855572/). That version is rebased to
>>> master after merging  the non-tima-based output batching.
>>
>> Apologies, Jan, I've spotted the v5 now (it was v4 assigned to myself in 
>> patchwork so that was what I had been planning to look at).
>>
>>>
>>> I am just now preparing a v6 that adds the missing documentation for the
>>> new commands to the ovs-vswitchd man page.
>>
>> Excellent, is there an ETA on the v6? I think Billy is already reviewing the 
>> v5 but I'll confirm, the switch to v6 should be painless if it's mainly 
>> documentation.
>>
>>>

>
> Since all three are heavily affecting same parts of the code the
> order of merging matters a lot. If we want to make this happen in
> reasonable time we need to avoid constant manual re-basing for all of
>>> us.
>
> That’s why I have taken the initiative to serialize them into one
> particular order for merging. That was a painful exercise and I am
> not looking forward to doing it again in a different order.

 Agreed and thanks for taking the initiative. I do think this is
>>> important if all three are to make it in.
> 
> +1. thanks for this Jan.
> 

>
> So I would greatly appreciate if we could agree on the proposed
> order and discuss how we review/test the resulting overall
> contribution with the ambition to get all parts into 2.9.

 I'm open to others input here, based on the points you've raised I
 would suggest the following (only a suggestion, please feel free to
 counter):

 1: Output Time batching (it's v9 and is well understood at this point,
 I would think would be little re-work needed if the cases suit the PMD
>>> balancing 

Re: [ovs-dev] [PATCH] netdev-dpdk: fix port addition for ports sharing same PCI id

2018-01-08 Thread Yuanhan Liu
On Mon, Jan 08, 2018 at 01:10:10PM +, Stokes, Ian wrote:
> Hi Yuanhan,
> 
> Thanks for working on this, I've done some testing with ani40e device and a 
> review of the current patch, please find comments below.
> 
> > On 12/20/2017 04:03 PM, Yuanhan Liu wrote:
> > > Some NICs have only one PCI address associated with multiple ports.
> > > This patch extends the dpdk-devargs option's format to cater for such
> > devices.
> > >
> > > To achieve that, this patch uses a new syntax that will be adapted and
> > > implemented in future DPDK release (likely, v18.05):
> > > http://dpdk.org/ml/archives/dev/2017-December/084234.html
> > >
> > > And since it's the DPDK duty to parse the (complete and full) syntax
> > > and this patch is more likely to serve as an intermediate workaround,
> > > here I take a simpler and shorter syntax from it (note it's allowed to
> > > have only one category being provided):
> > > class=eth,mac=00:11:22:33:44:55:66
> > >
> > > Also, old compatibility is kept. Users can still go on with using the
> > > PCI id to add a port (if that's enough for them). Meaning, this patch
> > > will not break anything.
> 
> I've validated that both the old method/new method both work with i40e 
> devices as expected.

Thanks a lot for testing!

> > Hi Yuanhan, I think there would need to be some doc updates also for a new
> > syntax.
>
> Fully agree, would need updates to the relevant OVS DPDK related docs as well 
> as an explanation regarding why there are 2 approaches to adding ports.

Sure, I will add it in v2. Documentation/howto/dpdk.rst is the one to
update, right?

> > 
> > How settled/agreed is the syntax in DPDK now? Ideally it is totally
> > settled and we use it for these types of devices.

It's not 100% settled, but I think we are really close to it.

> > But if not...then considering we will continue to keep compatibility with
> > older simpler "pci" anyway, maybe it would be safer to add something
> > simple now like "pci","mac" or just "mac" and keep compatibility for that
> > when the new syntax is finally agreed in DPDK. It may mean some parsing to
> > distinguish pci from mac, or vdev from pci,mac though.
> > 
> > Anyway, agreed syntax in DPDK is better so hopefully that can be done.
> > 
> > > This patch is basically based on the one from Ciara:
> > >
> > > https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339496.htm
> > > l
> > >
> > > Cc: Loftus Ciara 
> > > Cc: Thomas Monjalon 
> > > Cc: Kevin Traynor 
> > > Signed-off-by: Yuanhan Liu 
> > > ---
> > >  lib/netdev-dpdk.c | 77
> > > ---
> > >  1 file changed, 62 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> > > 45fcc74..4e5cc25 100644
> > > --- a/lib/netdev-dpdk.c
> > > +++ b/lib/netdev-dpdk.c
> > > @@ -1205,30 +1205,77 @@ netdev_dpdk_lookup_by_port_id(dpdk_port_t
> > port_id)
> > >  return NULL;
> > >  }
> > >
> > > +static int
> > > +netdev_dpdk_str_to_ether(const char *mac, struct ether_addr *ea) {
> > > +unsigned int bytes[6];
> > > +int i;
> > > +
> > > +if (sscanf(mac, "%x:%x:%x:%x:%x:%x",
> > > +   [0], [1], [2],
> > > +   [3], [4], [5]) != 6) {
> > > +return -1;
> 
> Should an error be logged here?

I thought an error (which will be threw out when port finding is failed) is
enough? Just like we didn't do the sanity check to pci id.

Anyway, I have no objection to add an error log to make it more explicit.

> I flag the same question when checking the return type for this function 
> later but something to think about. The error log could be here or after the 
> return type check but I do think it's useful.

I will put the error log here (inside the if clause).

> > > +}
> > > +
> > > +for (i = 0; i < 6; i++) {
> > > +ea->addr_bytes[i] = bytes[i];
> > > +}
> > > +
> > > +return 0;
> > > +}
> > > +
> > > +static dpdk_port_t
> > > +netdev_dpdk_get_port_by_mac(const char *mac_str) {
> > > +int i;
> > > +struct ether_addr mac;
> > > +struct ether_addr port_mac;
> > > +
> > > +netdev_dpdk_str_to_ether(mac_str, );
> 
> I think there should be an error check for the call to 
> netdev_dpdk_str_to_ether() above and an associated error log, in the case 
> where sscanf fails within netdev_dpdk_str_to_ether() (i.e. Mac is too long, 
> too short etc.) it will help with the logs to zero in on the issue.

Right, I missed the sanity check here.

> > > +for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > +if (!rte_eth_dev_is_valid_port(i)) {
> > > +continue;
> > > +}
> > > +
> > > +rte_eth_macaddr_get(i, _mac);
> > > +if (is_same_ether_addr(, _mac)) {
> > > +return i;
> > > +}
> > > +}
> > > +
> > > +return DPDK_ETH_PORT_ID_INVALID;
> > > +}
> > > +
> 
> In general for 

Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-08 Thread Ed Swierk via dev
On 1/6/18 10:57, Pravin Shelar wrote:
> On Fri, Jan 5, 2018 at 10:59 PM, Ed Swierk  wrote:
>>
>>
>> On Jan 5, 2018 22:17, "Pravin Shelar"  wrote:
>>
>> On Fri, Jan 5, 2018 at 3:20 PM, Ed Swierk 
>> wrote:
>>> On Fri, Jan 5, 2018 at 10:14 AM, Ed Swierk 
>>> wrote:
 On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
> OVS already pull all required headers in skb linear data, so no need
> to redo all of it. only check required is the ip-checksum validation.
> I think we could avoid it in most of cases by checking skb length to
> ipheader length before verifying the ip header-checksum.

 Shouldn't the IP header checksum be verified even earlier, like in
 key_extract(), before actually using any of the fields in the IP
 header?
>>>
>>> Something like this for verifying the IP header checksum (not tested):
>>>
>> AFAIU openflow does not need this verification, so it is not required
>> in flow extract.
>>
>>
>> Okay. How about my proposed trimming implementation, caching the pad length
>> in the ovs cb?
>>
> Caching the length is not that simple, OVS actions can change the
> length. Keeping it consistent with packet would be more work, so lets
> calculate it in ovs-ct function.
> 

Something like this?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a38c80e..282325d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4084,6 +4084,8 @@ struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
 unsigned int transport_len,
 __sum16(*skb_chkf)(struct sk_buff *skb));
 
+int skb_network_trim(struct sk_buff *skb);
+
 /**
  * skb_head_is_locked - Determine if the skb->head is locked down
  * @skb: skb to check
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 08f5740..c68e927 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4740,6 +4740,41 @@ struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
 }
 EXPORT_SYMBOL(skb_checksum_trimmed);
 
+/**
+ * skb_network_trim - trim skb to length specified by the network header
+ * @skb: the skb to trim
+ *
+ * Trims the skb to the length specified by the network header,
+ * removing any trailing padding. Leaves the skb alone if the protocol
+ * is not IP or IPv6. Frees the skb on error.
+ * 
+ * Caller needs to pull the skb to the network header.
+ */
+int skb_network_trim(struct sk_buff *skb)
+{
+   unsigned int len;
+   int err;
+
+   switch (skb->protocol) {
+   case htons(ETH_P_IP):
+   len = ntohs(ip_hdr(skb)->tot_len);
+   break;
+   case htons(ETH_P_IPV6):
+   len = sizeof(struct ipv6hdr)
+   + ntohs(ipv6_hdr(skb)->payload_len);
+   break;
+   default:
+   len = skb->len;
+   }
+
+   err = pskb_trim_rcsum(skb, len);
+   if (unlikely(err))
+   kfree_skb(skb);
+
+   return err;
+}
+EXPORT_SYMBOL(skb_network_trim);
+
 void __skb_warn_lro_forwarding(const struct sk_buff *skb)
 {
net_warn_ratelimited("%s: received packets cannot be forwarded while 
LRO is enabled\n",
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index b27c5c6..73418d3 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1112,6 +1112,10 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
nh_ofs = skb_network_offset(skb);
skb_pull_rcsum(skb, nh_ofs);
 
+   err = skb_network_trim(skb);
+   if (err)
+   return err;
+
if (key->ip.frag != OVS_FRAG_TYPE_NONE) {
err = handle_fragments(net, key, info->zone.id, skb);
if (err)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RESEND PATCH 2/3] net: ovs: remove unused hardirq.h

2018-01-08 Thread David Miller
From: "Yang Shi" 
Date: Tue, 09 Jan 2018 03:52:53 +0800

> Preempt counter APIs have been split out, currently, hardirq.h just
> includes irq_enter/exit APIs which are not used by openvswitch at all.
> 
> So, remove the unused hardirq.h.
> 
> Signed-off-by: Yang Shi 
> Acked-by: Pravin B Shelar 

Applied.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto:fix abort issue when delete bridge

2018-01-08 Thread Zhanghaibo (Euler)
Ben,

According to current implementation, There is no problem if rule A has a 
non-NULL struct rule_dpif's 'new_rule'  B, because there is rcu synchronization 
during bridge deletion, and ovsrcu_postone() is called twice by 
ofproto_destroy(), which could also help to make sure ofproto will not be 
released before release rule B (see function ofproto_destroy() and 
ofproto_destroy_defer__() for detail). 

But if rule A has a non-NULL 'new_rule'  B, and rule B has a non-NULL 
'new_rule' C also, there will be problem, because ofproto has been released 
when release rule C, but ofproto will be used when release a rule in function 
ofproto_rule_destroy__().

I copied the bt information below for reference.

 (gdb) bt
#0  0x7f811052d157 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f811052e848 in __GI_abort () at abort.c:90
#2  0x006218a9 in PAT_abort ()
#3  0x0061ebbd in patchIllInsHandler ()
#4  
#5  0x00470169 in rule_destroy_cb (rule=0x7f80601ec280) at 
ofproto/ofproto.c:2851
#6  0x0052d546 in ovsrcu_call_postponed () at lib/ovs_rcu.c:323
#7  0x0052d714 in ovsrcu_postpone_thread (arg=) at 
lib/ovs_rcu.c:338
#8  0x0052fa81 in ovsthread_wrapper (aux_=) at 
lib/ovs_thread.c:651
#9  0x7f8111a3edc5 in start_thread (arg=0x7f80c67fc700) at 
pthread_create.c:308
#10 0x7f81105ef75d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb) f 5
#5  0x00470169 in rule_destroy_cb (rule=0x7f80601ec280) at 
ofproto/ofproto.c:2851
2851rule->ofproto->ofproto_class->rule_destruct(rule);



-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org] 
Sent: 2018年1月9日 5:05
To: Zhanghaibo (Euler) 
Cc: d...@openvswitch.org; Chengwentao (Vintorcheng) 
Subject: Re: [ovs-dev] [PATCH] ofproto:fix abort issue when delete bridge

This seems like the wrong approach to me.  If there's a problem that struct 
rule_dpif's 'new_rule' member can sometimes point to a rule that gets freed, 
then one would normally fix that through some kind of synchronization between 
the rules (for example, one could make sure that RCU keeps ->new_rule from 
being freed while it is still in use), not by adding a refcount on a data 
structure multiple levels higher.

On Mon, Jan 08, 2018 at 11:09:58AM +0800, Haibo Zhang wrote:
> When delete a birdge, all the rule of the bridge wil be removed. When 
> destruct a rule, it is possible that the rule has a non-NULL new_rule A, and 
> the new_rule A might has a non-NULL new_rule B, and the new_rule B might has 
> a non-NULL new_rule C... in this case, the ofproto has been freed before rule 
> B or C was freed, and it will cause crash issue when free rule B or C using 
> rcu mechanism. To fix the issue, a reference count is introduced to ofproto 
> to make sure all rules of the ofproto were freed completely before the 
> ofproto was freed.
> 
> Signed-off-by: Haibo Zhang 
> ---
>  ofproto/ofproto-dpif.c | 14 --
>  ofproto/ofproto-provider.h |  9 -
>  ofproto/ofproto.c  | 26 --
>  3 files changed, 36 insertions(+), 13 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 
> 43b7b89..968a51a 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -4254,16 +4254,26 @@ static struct rule_dpif *rule_dpif_cast(const 
> struct rule *rule)  }
>  
>  static struct rule *
> -rule_alloc(void)
> +rule_alloc(struct ofproto * ofproto)
>  {
> +struct rule * rule_;
>  struct rule_dpif *rule = xzalloc(sizeof *rule);
> -return >up;
> +
> +if (OVS_UNLIKELY(!rule)) {
> +return NULL;
> +}
> +
> +rule_ = >up;
> +*CONST_CAST(struct ofproto **, _->ofproto) = ofproto;
> +ofproto_ref(ofproto);
> +return rule_;
>  }
>  
>  static void
>  rule_dealloc(struct rule *rule_)
>  {
>  struct rule_dpif *rule = rule_dpif_cast(rule_);
> +ofproto_unref(rule_->ofproto);
>  free(rule);
>  }
>  
> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h 
> index 9dc73c4..ce3e0f7 100644
> --- a/ofproto/ofproto-provider.h
> +++ b/ofproto/ofproto-provider.h
> @@ -133,6 +133,11 @@ struct ofproto {
>  /* Variable length mf_field mapping. Stores all configured variable 
> length
>   * meta-flow fields (struct mf_field) in a switch. */
>  struct vl_mff_map vl_mff_map;
> +/* Number of references.
> + * bridge keep one reference
> + * Any rule trying to keep ofproto from being freed should hold its own
> + * reference. */
> +struct ovs_refcount ref_count;
>  };
>  
>  void ofproto_init_tables(struct ofproto *, int n_tables); @@ -433,6 
> +438,8 @@ struct rule {  void ofproto_rule_ref(struct rule *);  bool 
> ofproto_rule_try_ref(struct rule *);  void ofproto_rule_unref(struct 
> rule *);
> +void ofproto_ref(struct ofproto *);
> +void ofproto_unref(struct 

[ovs-dev] Cobranza judicial y extrajudicial

2018-01-08 Thread No puede perderse este seminario
Un último recurso para la recuperación de la cartera morosa 

Cobranza judicial y extrajudicial 
25 de enero- CP. y MAN. Fernando García Zárate - 9am-6pm

En el proceso de cobranza extrajudicial, el propósito es llegar a un acuerdo 
sin la necesidad de ingresar a un proceso de cobranza judicial. La idea es 
recuperar las deudas morosas en el más corto plazo posible, mejorando los 
títulos actuales, constituyendo nuevas garantías y prendas, y resguardando la 
documentación original de las empresas; como último recurso podemos acudir a la 
cobranza judicial, cuyo objetivo es realizar el cobro ejecutivo estableciendo 
demandas, de cheques, letras, pagarés y facturas; cobros especiales para 
petición de quiebras y convenios judiciales; constitución y ejecución de 
prendas y garantías hipotecarias; querellas por estafa y apropiación indebida, 
entre otros, son parte de las acciones y gestiones, que en su caso, deberían 
contemplarse. 

BENEFICIOS DE ASISTIR: 

• Estar al día en las formas de cobro extrajudicial. 
• Conocer los títulos de crédito como garantía. 
• Manejar plazos judiciales para demandar.
• Conocer los procedimientos judiciales de los diversos juicios.
• Estar al tanto del procedimiento de los embargos. 
• Conocer el procedimiento de ejecución de las sentencias. 

¿Requiere la información a la Brevedad? responda este email con la palabra: 
Cobranza + nombre - teléfono - correo.


centro telefónico:018002120744


 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 4/4] nsh: add dec_nsh_ttl action

2018-01-08 Thread Ben Pfaff
On Sat, Jan 06, 2018 at 01:47:54PM +0800, Yi Yang wrote:
> NSH ttl is a 6-bit field ranged from 0 to 63, it should be
> decremented by 1 every hop, if it is 0 or it is so after
> decremented, the packet should be dropped and a packet-in
> message is sent to main controller.
> 
> Signed-off-by: Yi Yang 

This introduces lots of sparse warnings:

../include/openvswitch/nsh.h:314:14: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:14: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:14: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:40: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:40: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:40: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:37: warning: incorrect type in return 
expression (different base types)
../include/openvswitch/nsh.h:314:37:expected restricted ovs_be32
../include/openvswitch/nsh.h:314:37:got unsigned int
../include/openvswitch/nsh.h:327:30: warning: incorrect type in initializer 
(different base types)
../include/openvswitch/nsh.h:327:30:expected restricted ovs_be32 [usertype] 
path_hdr
../include/openvswitch/nsh.h:327:30:got unsigned int
../include/openvswitch/nsh.h:328:19: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:14: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:14: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:14: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:40: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:40: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:40: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:37: warning: incorrect type in return 
expression (different base types)
../include/openvswitch/nsh.h:314:37:expected restricted ovs_be32
../include/openvswitch/nsh.h:314:37:got unsigned int
../include/openvswitch/nsh.h:334:30: warning: incorrect type in initializer 
(different base types)
../include/openvswitch/nsh.h:334:30:expected restricted ovs_be32 [usertype] 
path_hdr
../include/openvswitch/nsh.h:334:30:got unsigned int
../include/openvswitch/nsh.h:335:13: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:14: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:14: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:14: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:40: warning: cast to restricted ovs_be32
../include/openvswitch/nsh.h:314:40: warning: cast from restricted ovs_be16
../include/openvswitch/nsh.h:314:40: warning: restricted ovs_be32 degrades to 
integer
../include/openvswitch/nsh.h:314:37: warning: incorrect type in return 
expression (different base types)
../include/openvswitch/nsh.h:314:37:expected restricted ovs_be32
../include/openvswitch/nsh.h:314:37:got unsigned int

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 2/4] nsh: add new flow key 'ttl'

2018-01-08 Thread Ben Pfaff
On Sat, Jan 06, 2018 at 01:47:52PM +0800, Yi Yang wrote:
> IETF NSH draft added a new filed ttl in NSH header, this patch
> is to add new nsh key 'ttl' for it.
> 
> Signed-off-by: Yi Yang 

Thanks for v7!

The field assignments in meta-flow.h seem wrong to me:

- The TTL field is new in v2.9, so it shouldn't say v2.8.

- The existing fields should not be renumbered because that
  breaks OpenFlow wire format compatibility with anything that
  already knows how to talk to OVS 2.8.  Please keep the
  existing numbering.

Why does nsh_16aligned_be32 exist?  Please use get_16aligned_be32, which
is identical.

In meta-flow.xml, please properly document the NSH fields, following the
pattern set by the other documentation in the file.

I see several uses of memcpy() for copying a struct.  Please use an
assignment to copy structs.

This statement looks suspicious, since the target and the "sizeof" are
different:
memset(nsh, 0, sizeof(nsh->context));

I'm concerned about how this patch introduces different structs with
identical layouts and then uses memcpy() to copy between them.  This is
a trap for unsuspecting developers who change one structure or the other
(even by reordering fields).  It would probably be better to figure out
a way to either use the same struct in each case, or to do
member-by-member copies.  Another way would be to use assertions to make
sure that the structures really are identical.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] openvswitch: Trim off padding before L3+ netfilter processing

2018-01-08 Thread Pravin Shelar
On Sat, Jan 6, 2018 at 10:57 AM, Pravin Shelar  wrote:
> On Fri, Jan 5, 2018 at 10:59 PM, Ed Swierk  wrote:
>>
>>
>> On Jan 5, 2018 22:17, "Pravin Shelar"  wrote:
>>
>> On Fri, Jan 5, 2018 at 3:20 PM, Ed Swierk 
>> wrote:
>>> On Fri, Jan 5, 2018 at 10:14 AM, Ed Swierk 
>>> wrote:
 On Thu, Jan 4, 2018 at 7:36 PM, Pravin Shelar  wrote:
> OVS already pull all required headers in skb linear data, so no need
> to redo all of it. only check required is the ip-checksum validation.
> I think we could avoid it in most of cases by checking skb length to
> ipheader length before verifying the ip header-checksum.

 Shouldn't the IP header checksum be verified even earlier, like in
 key_extract(), before actually using any of the fields in the IP
 header?
>>>
>>> Something like this for verifying the IP header checksum (not tested):
>>>
>> AFAIU openflow does not need this verification, so it is not required
>> in flow extract.
>>
>>
>> Okay. How about my proposed trimming implementation, caching the pad length
>> in the ovs cb?
>>
> Caching the length is not that simple, OVS actions can change the
> length. Keeping it consistent with packet would be more work, so lets
> calculate it in ovs-ct function.

You could make it specific for skb-len-trimming, something like
boolean flag. so that it is easy to reason with.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofp-errors: Send as much of a message as possible in an error reply.

2018-01-08 Thread Justin Pettit

> On Jan 6, 2018, at 10:33 AM, Ben Pfaff  wrote:
> 
> /* Creates and returns an OpenFlow message of type OFPT_ERROR that conveys the
> diff --git a/ofproto/bundles.c b/ofproto/bundles.c
> index 849f99a15e40..195b9f90ef11 100644
> --- a/ofproto/bundles.c
> +++ b/ofproto/bundles.c
> @@ -51,13 +51,10 @@ ofp_bundle_create(uint32_t id, uint16_t flags, const 
> struct ofp_header *oh)
> bundle->id = id;
> bundle->flags = flags;
> bundle->state = BS_OPEN;
> +bundle->msg = xmemdup(oh, ntohs(oh->length));
> 
> ovs_list_init(>msg_list);
> 
> return bundle;
> }
> 
> @@ -71,6 +68,7 @@ ofp_bundle_remove__(struct ofconn *ofconn, struct 
> ofp_bundle *bundle)
> }
> 
> ofconn_remove_bundle(ofconn, bundle);
> +free(bundle->msg);
> free(bundle);
> }

There are two other places that free 'bundle' directly in the file, so you may 
want to handle those as well.

> diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> index 7f33fbddfe3f..ae53bfb04a83 100644
> --- a/ofproto/connmgr.c
> +++ b/ofproto/connmgr.c
> @@ -1256,7 +1256,7 @@ bundle_remove_expired(struct ofconn *ofconn, long long 
> int now)
> 
> HMAP_FOR_EACH_SAFE (b, next, node, >bundles) {
> if (b->used <= limit) {
> -ofconn_send_error(ofconn, >ofp_msg, OFPERR_OFPBFC_TIMEOUT);
> +ofconn_send_error(ofconn, b->msg, OFPERR_OFPBFC_TIMEOUT);

The comment for ofconn_send_error() should be updated to indicate it's not 
limited to 64 bytes anymore.


Acked-by: Justin Pettit 

This seems like it will be helpful.  Thanks!

--Justin


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/4] nsh: rework NSH netlink keys and actions

2018-01-08 Thread Ben Pfaff
On Sat, Jan 06, 2018 at 01:47:51PM +0800, Yi Yang wrote:
> This patch changes OVS_KEY_ATTR_NSH
> to nested attribute and adds three new NSH sub attribute keys:
> 
> OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
> OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
> OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata
> 
> Its intention is to align to NSH kernel implementation.

I applied this to master, thanks!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [patch v2 3/3] tests: Add dpctl test for conntrack nconns/maxconns.

2018-01-08 Thread Darrell Ball
Signed-off-by: Darrell Ball 
---
 tests/system-kmod-macros.at  | 19 +
 tests/system-traffic.at  | 85 
 tests/system-userspace-macros.at | 12 ++
 3 files changed, 116 insertions(+)

diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index 34db21a..12b0adf 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -104,3 +104,22 @@ m4_define([CHECK_CONNTRACK_NAT])
 # feature. Will remove this check after both kernel and userspace datapath
 # support it.
 m4_define([CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE])
+
+# CHECK_CT_DPIF_SET_GET_MAXCONNS()
+#
+# Perform requirements checks for running ovs-dpctl ct-set-maxconns or
+# ovs-dpctl ct-get-maxconns. The kernel datapath does not support this
+# feature.
+m4_define([CHECK_CT_DPIF_SET_GET_MAXCONNS],
+[
+AT_SKIP_IF([:])
+])
+
+# CHECK_CT_DPIF_GET_NCONNS()
+#
+# Perform requirements checks for running ovs-dpctl ct-get-nconns. The
+# kernel datapath does not support this feature.
+m4_define([CHECK_CT_DPIF_GET_NCONNS],
+[
+AT_SKIP_IF([:])
+])
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index eda7de5..a9e1fb9 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -936,6 +936,91 @@ NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 
| FORMAT_PING], [0],
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - get_nconns and get/set_maxconns])
+CHECK_CONNTRACK()
+CHECK_CT_DPIF_SET_GET_MAXCONNS()
+CHECK_CT_DPIF_GET_NCONNS()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from 
ns1->ns0.
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,icmp,action=ct(commit),2
+priority=100,in_port=2,icmp,ct_state=-trk,action=ct(table=0)
+priority=100,in_port=2,icmp,ct_state=+trk+est,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl Pings from ns0->ns1 should work fine.
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
+icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=,type=8,code=0),reply=(src=10.1.1.2,dst=10.1.1.1,id=,type=0,code=0)
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-set-maxconns one-bad-dp], [2], [], [dnl
+ovs-vswitchd: maxconns missing or malformed (Invalid argument)
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-set-maxconns a], [2], [], [dnl
+ovs-vswitchd: maxconns missing or malformed (Invalid argument)
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-set-maxconns one-bad-dp 10], [2], [], [dnl
+ovs-vswitchd: opening datapath (Address family not supported by protocol)
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-maxconns one-bad-dp], [2], [], [dnl
+ovs-vswitchd: opening datapath (Address family not supported by protocol)
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-nconns one-bad-dp], [2], [], [dnl
+ovs-vswitchd: opening datapath (Address family not supported by protocol)
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-nconns], [], [dnl
+1
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
+300
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-set-maxconns 10], [], [dnl
+setting maxconns successful
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
+10
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-nconns], [], [dnl
+0
+])
+
+AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
+10
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP(["/could not create datapath one-bad-dp of unknown 
type system/d"])
+AT_CLEANUP
+
 AT_SETUP([conntrack - IPv6 ping])
 CHECK_CONNTRACK()
 OVS_TRAFFIC_VSWITCHD_START()
diff --git a/tests/system-userspace-macros.at b/tests/system-userspace-macros.at
index f220612..20a8635 100644
--- a/tests/system-userspace-macros.at
+++ b/tests/system-userspace-macros.at
@@ -109,3 +109,15 @@ m4_define([CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE],
 [
 AT_SKIP_IF([:])
 ])
+
+# CHECK_CT_DPIF_SET_GET_MAXCONNS()
+#
+# Perform requirements checks for running ovs-dpctl ct-set-maxconns or
+# ovs-dpctl ct-get-maxconns. The userspace datapath does support this feature.
+m4_define([CHECK_CT_DPIF_SET_GET_MAXCONNS])
+
+# CHECK_CT_DPIF_GET_NCONNS()
+#
+# Perform requirements checks for running ovs-dpctl ct-get-nconns. The
+# userspace datapath does support this feature.
+m4_define([CHECK_CT_DPIF_GET_NCONNS])
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org

[ovs-dev] [patch v2 2/3] dpctl conntrack: Add get number of connections.

2018-01-08 Thread Darrell Ball
A get command is added for number of conntrack connections.
This command is only supported in the userspace datapath
at this time.

Signed-off-by: Darrell Ball 
Signed-off-by: Antonio Fischetti 
Co-authored-by: Antonio Fischetti 
---
 NEWS|  1 +
 lib/conntrack.c |  7 +++
 lib/conntrack.h |  1 +
 lib/ct-dpif.c   |  8 
 lib/ct-dpif.h   |  1 +
 lib/dpctl.c | 22 ++
 lib/dpctl.man   |  5 +
 lib/dpif-netdev.c   |  9 +
 lib/dpif-netlink.c  |  1 +
 lib/dpif-provider.h |  2 ++
 10 files changed, 57 insertions(+)

diff --git a/NEWS b/NEWS
index 886aacc..edfc5c2 100644
--- a/NEWS
+++ b/NEWS
@@ -33,6 +33,7 @@ Post-v2.8.0
  * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
- Add dpctl/ct-set-maxconns and dpctl/ct-get-maxconns commands; supported in
  userspace datapath.
+   - Add dpctl/ct-get-nconns commands; supported in userspace datapath.
 
 
 v2.8.0 - 31 Aug 2017
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 0b15dd6..0a706ce 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2576,6 +2576,13 @@ conntrack_get_maxconns(struct conntrack *ct, uint32_t 
*maxconns)
 return 0;
 }
 
+int
+conntrack_get_nconns(struct conntrack *ct, uint32_t *nconns)
+{
+*nconns = atomic_count_get(>n_conn);
+return 0;
+}
+
 /* This function must be called with the ct->resources read lock taken. */
 static struct alg_exp_node *
 expectation_lookup(struct hmap *alg_expectations,
diff --git a/lib/conntrack.h b/lib/conntrack.h
index c7f9b77..e453170 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -116,6 +116,7 @@ int conntrack_dump_done(struct conntrack_dump *);
 int conntrack_flush(struct conntrack *, const uint16_t *zone);
 int conntrack_set_maxconns(struct conntrack *ct, uint32_t maxconns);
 int conntrack_get_maxconns(struct conntrack *ct, uint32_t *maxconns);
+int conntrack_get_nconns(struct conntrack *ct, uint32_t *nconns);
 
 /* 'struct ct_lock' is a wrapper for an adaptive mutex.  It's useful to try
  * different types of locks (e.g. spinlocks) */
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 21f900f..5fa3a97 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -156,6 +156,14 @@ ct_dpif_get_maxconns(struct dpif *dpif, uint32_t *maxconns)
 : EOPNOTSUPP);
 }
 
+int
+ct_dpif_get_nconns(struct dpif *dpif, uint32_t *nconns)
+{
+return (dpif->dpif_class->ct_get_nconns
+? dpif->dpif_class->ct_get_nconns(dpif, nconns)
+: EOPNOTSUPP);
+}
+
 void
 ct_dpif_entry_uninit(struct ct_dpif_entry *entry)
 {
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index 37001b4..09e7698 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -199,6 +199,7 @@ int ct_dpif_flush(struct dpif *, const uint16_t *zone,
   const struct ct_dpif_tuple *);
 int ct_dpif_set_maxconns(struct dpif *dpif, uint32_t maxconns);
 int ct_dpif_get_maxconns(struct dpif *dpif, uint32_t *maxconns);
+int ct_dpif_get_nconns(struct dpif *dpif, uint32_t *nconns);
 void ct_dpif_entry_uninit(struct ct_dpif_entry *);
 void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
   bool verbose, bool print_stats);
diff --git a/lib/dpctl.c b/lib/dpctl.c
index 0fe86e0..87f0412 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1725,6 +1725,27 @@ dpctl_ct_get_maxconns(int argc, const char *argv[],
 return error;
 }
 
+static int
+dpctl_ct_get_nconns(int argc, const char *argv[],
+struct dpctl_params *dpctl_p)
+{
+struct dpif *dpif;
+int error = dpctl_ct_open_dp(argc, argv, dpctl_p, , 2);
+if (!error) {
+uint32_t nconns;
+error = ct_dpif_get_nconns(dpif, );
+
+if (!error) {
+dpctl_print(dpctl_p, "%u\n", nconns);
+} else {
+dpctl_error(dpctl_p, error, "nconns could not be retrieved");
+}
+dpif_close(dpif);
+}
+
+return error;
+}
+
 /* Undocumented commands for unit testing. */
 
 static int
@@ -2023,6 +2044,7 @@ static const struct dpctl_command all_commands[] = {
 { "ct-bkts", "[dp] [gt=N]", 0, 2, dpctl_ct_bkts, DP_RO },
 { "ct-set-maxconns", "[dp] maxconns", 1, 2, dpctl_ct_set_maxconns, DP_RW },
 { "ct-get-maxconns", "[dp]", 0, 1, dpctl_ct_get_maxconns, DP_RO },
+{ "ct-get-nconns", "[dp]", 0, 1, dpctl_ct_get_nconns, DP_RO },
 { "help", "", 0, INT_MAX, dpctl_help, DP_RO },
 { "list-commands", "", 0, INT_MAX, dpctl_list_commands, DP_RO },
 
diff --git a/lib/dpctl.man b/lib/dpctl.man
index b859c5a..9e9d2dc 100644
--- a/lib/dpctl.man
+++ b/lib/dpctl.man
@@ -263,3 +263,8 @@ datapath.
 \*(DX\fBct\-get\-maxconns\fR [\fIdp\fR]
 Read the maximum limit of connection tracker connections.
 Only supported for userspace datapath.
+.
+.TP
+\*(DX\fBct\-get\-nconns\fR [\fIdp\fR]
+Read the current number of connection tracker connections.
+Only supported for userspace datapath.
diff 

[ovs-dev] [patch v2 1/3] dpctl conntrack: Add get and set maxconns command.

2018-01-08 Thread Darrell Ball
Get and set dpctl commands are added for conntrack maxconns.
These commands are only supported in the userspace
datapath at this time.

Signed-off-by: Darrell Ball 
Signed-off-by: Antonio Fischetti 
Co-authored-by: Antonio Fischetti 
---
 NEWS|  3 +++
 lib/conntrack.c | 14 +++
 lib/conntrack.h |  2 ++
 lib/ct-dpif.c   | 16 
 lib/ct-dpif.h   |  2 ++
 lib/dpctl.c | 72 +
 lib/dpctl.man   | 16 
 lib/dpif-netdev.c   | 18 ++
 lib/dpif-netlink.c  |  2 ++
 lib/dpif-provider.h |  4 +++
 10 files changed, 149 insertions(+)

diff --git a/NEWS b/NEWS
index a7f2def..886aacc 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,9 @@ Post-v2.8.0
  * All the netdev-dpdk appctl commands described in ovs-vswitchd man page.
- vswitchd:
  * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
+   - Add dpctl/ct-set-maxconns and dpctl/ct-get-maxconns commands; supported in
+ userspace datapath.
+
 
 v2.8.0 - 31 Aug 2017
 
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 6d078f5..0b15dd6 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2562,6 +2562,20 @@ conntrack_flush(struct conntrack *ct, const uint16_t 
*zone)
 return 0;
 }
 
+int
+conntrack_set_maxconns(struct conntrack *ct, uint32_t maxconns)
+{
+atomic_store_relaxed(>n_conn_limit, maxconns);
+return 0;
+}
+
+int
+conntrack_get_maxconns(struct conntrack *ct, uint32_t *maxconns)
+{
+atomic_read_relaxed(>n_conn_limit, maxconns);
+return 0;
+}
+
 /* This function must be called with the ct->resources read lock taken. */
 static struct alg_exp_node *
 expectation_lookup(struct hmap *alg_expectations,
diff --git a/lib/conntrack.h b/lib/conntrack.h
index 990f6c2..c7f9b77 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -114,6 +114,8 @@ int conntrack_dump_next(struct conntrack_dump *, struct 
ct_dpif_entry *);
 int conntrack_dump_done(struct conntrack_dump *);
 
 int conntrack_flush(struct conntrack *, const uint16_t *zone);
+int conntrack_set_maxconns(struct conntrack *ct, uint32_t maxconns);
+int conntrack_get_maxconns(struct conntrack *ct, uint32_t *maxconns);
 
 /* 'struct ct_lock' is a wrapper for an adaptive mutex.  It's useful to try
  * different types of locks (e.g. spinlocks) */
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index 239c848..21f900f 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -140,6 +140,22 @@ ct_dpif_flush(struct dpif *dpif, const uint16_t *zone,
 : EOPNOTSUPP);
 }
 
+int
+ct_dpif_set_maxconns(struct dpif *dpif, uint32_t maxconns)
+{
+return (dpif->dpif_class->ct_set_maxconns
+? dpif->dpif_class->ct_set_maxconns(dpif, maxconns)
+: EOPNOTSUPP);
+}
+
+int
+ct_dpif_get_maxconns(struct dpif *dpif, uint32_t *maxconns)
+{
+return (dpif->dpif_class->ct_get_maxconns
+? dpif->dpif_class->ct_get_maxconns(dpif, maxconns)
+: EOPNOTSUPP);
+}
+
 void
 ct_dpif_entry_uninit(struct ct_dpif_entry *entry)
 {
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index 5e2de53..37001b4 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -197,6 +197,8 @@ int ct_dpif_dump_next(struct ct_dpif_dump_state *, struct 
ct_dpif_entry *);
 int ct_dpif_dump_done(struct ct_dpif_dump_state *);
 int ct_dpif_flush(struct dpif *, const uint16_t *zone,
   const struct ct_dpif_tuple *);
+int ct_dpif_set_maxconns(struct dpif *dpif, uint32_t maxconns);
+int ct_dpif_get_maxconns(struct dpif *dpif, uint32_t *maxconns);
 void ct_dpif_entry_uninit(struct ct_dpif_entry *);
 void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
   bool verbose, bool print_stats);
diff --git a/lib/dpctl.c b/lib/dpctl.c
index b769544..0fe86e0 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1655,6 +1655,76 @@ dpctl_ct_bkts(int argc, const char *argv[],
 return error;
 }
 
+static int
+dpctl_ct_open_dp(int argc, const char *argv[],
+ struct dpctl_params *dpctl_p, struct dpif **dpif,
+ uint8_t max_args)
+{
+int error = 0;
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < max_args - we retrieve it from the
+ * current setup, assuming only one exists. */
+char *dpname = argc >= max_args ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
+if (!dpname) {
+error = EINVAL;
+dpctl_error(dpctl_p, error, "datapath not found");
+} else {
+error = parsed_dpif_open(dpname, false, dpif);
+free(dpname);
+if (error) {
+dpctl_error(dpctl_p, error, "opening datapath");
+}
+}
+return error;
+}
+
+static int
+dpctl_ct_set_maxconns(int argc, const char *argv[],
+  struct dpctl_params *dpctl_p)
+{
+struct dpif *dpif;
+int error = 

[ovs-dev] [patch v2 0/3] dpctl conntrack: Add nconns/maxconns commands.

2018-01-08 Thread Darrell Ball
Commands are added to:
Get the number of conntrack connections.
Get the maximum limit of conntrack connections.
Set the maximum limit of conntrack connections.

These commands are only supported in the userspace
datapath at this time.

A supporting test is added.

v1->v2: Use atomic_store_relaxed instead of atomic_init
Add NEWS items.
Add missing comments to the documentation that new
commands are only supported for the userspace datapath. 

Darrell Ball (3):
  dpctl conntrack: Add get and set maxconns command.
  dpctl conntrack: Add get number of connections.
  tests: Add dpctl test for conntrack nconns/maxconns.

 NEWS |  4 ++
 lib/conntrack.c  | 21 +
 lib/conntrack.h  |  3 ++
 lib/ct-dpif.c| 24 ++
 lib/ct-dpif.h|  3 ++
 lib/dpctl.c  | 94 
 lib/dpctl.man| 21 +
 lib/dpif-netdev.c| 27 
 lib/dpif-netlink.c   |  3 ++
 lib/dpif-provider.h  |  6 +++
 tests/system-kmod-macros.at  | 19 
 tests/system-traffic.at  | 85 
 tests/system-userspace-macros.at | 12 +
 13 files changed, 322 insertions(+)

-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] Initial support for new SIP Alg.

2018-01-08 Thread Tiago Lam

Hi Darrell,

Thanks for your initial review.

I'll work on incorporating your comments into v2 (which I'm planning to 
submit already with some tests). A couple of small comments are in-line.


On 01/05/2018 06:40 AM, Darrell Ball wrote:

Thanks for the series/work; I’ll be reviewing this series, but focusing on the 
high level aspects initially.

Some high level comments:

I noticed that your own comments in the series often pointed out various issues 
with this series, such as
assuming TCP transport, which is ‘unusual’, no NAT support, single media 
session only, lack of testing support etc.

Although SIP does not store as much dynamic state as other algs, it still needs 
to have a separate hidden portion of memory
for that. Embedding that in struct conn at top level is ‘not ideal’.

As I mentioned I would in the below referred thread, I sent a patch to support 
more algs, including SIP.
https://patchwork.ozlabs.org/patch/853010/
My patch is generic plumbing for additional algs, including NAT aspects, for 
internal future alg requirements and SIP.
It will help this series and remove a bunch of duplicated code.


That was going to be part of my next iteration as well - I've seen you 
just sent v3 of the series, so I'll work based on that.


I mainly wanted to get what I had "out there" in order to start syncing, 
so worked mainly based on master (and the previous "conntrack: Alg 
improvements." patch that was merged back in December, 
https://patchwork.ozlabs.org/cover/844311/).



Patch 2: sip_delete_conn() should not exist; when a conn is deleted, just check 
for SIP state to clean and call a SIP function
   Same idea for sip_set_conn_state()
   handle_sip() should be in conntrack-sip.c
   sip_expectation_create() – remove function and use my patch, 
calling with correct  arguments since it is more
   general and handles 
NAT.

>

Patch 3 should not be a separate patch; it is basic SIP and clearer to fold 
into patch 2.



Sure, I'll merge that (3/3) into 2/3 in v2 as well.


I did not review Patch 1 yet in enough depth to provide useful comments.

BTW, I already have some testing infra for SIP, but I am not sure I will submit 
soon since I am doing something else at the
moment. I’ll hold off on this for now.

Maybe an RFC label would be better initially.


Good point. Thought of doing it but then missed it when actually 
sending. Will do in v2, thanks.




Darrell


On 12/22/17, 11:54 AM, "ovs-dev-boun...@openvswitch.org on behalf of Tiago Lam" 
 wrote:

 This patch-set is an initial approach at implementing the new SIP Alg,
 mentioned by Aaron at [1].
 
 I'm mostly interested in getting to know your thoughts of how this is

 headed. There are a couple of points that are worth bringing up:
 - As mentioned in patches 1/3 and 2/3, this is still a preliminary
   implementation, and some work will be needed to move away from some
   assuptions, like assuming the SIP traffic is always going over IPv4
   and TCP;
 - At the moment, the sip state is being stored in the conn struct. I
   followed the example of seq_skew_dir here, which is also stored there,
   but realise this is not ideal. It seems storing it somewhere agnostic
   will be ideal in the future, to avoid polluting that struct with
   different Alg's details;

Embed it deeper into struct conn as pointer to SIP stuff


That's what I've at the moment, having set a sip_state struct for that.

Tiago.



 - The SIP helpers functions and structures are in conntrack-sip.h and
   conntrack-sip.c. This can create confusion when comparing to
   conntrack-tcp.c and other protocols since SIP is an Alg and is at a
   different level.

SIP is complex and deserves separate files.
I don’t think it is confusing – it is fine.

 
 With regards to testing, for now, this has been tested manually, by

 setting up the flows mentioned in patch 2/3 and having two VMs connected
 to OvS, both using SIPp to simulate real traffic both ways. I'm going to
 have a look at how this can be automated and added to
 tests/system-traffic.at, together with the rest of the already existing
 tests.
 
 [1] [CONNTRACK] Discussions at OvS 2017:

 
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-2DNovember_341089.html=DwICAg=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=DDHX2MTCsXS7GD8ie27aEdUDgGRK2EIntHQAxtrkWmI=md5csJDVqD97O6SvpYWNjbuQZYN2sfYKe4cF1-dzt1A=

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Refactor datapath flow cache

2018-01-08 Thread Wang, Yipeng1
I think this is an interesting idea.

One caveat is that in this case we use the rules' field to infer the flows' 
field. If the rule
does not consider a field within which the flow has high entropy, it still does 
not help.
Or if the rule considers many fields but the flow does not have high entropy 
there, then
we add the incremental software hash overhead for little benefit.

I am not sure if that is common in real cases, I am more concerned with the 
second scenario, 
what do you think?

We should profile a little bit to see how much overhead the incremental CRC will
cost.

Thanks
Yipeng

>-Original Message-
>
>I like the idea of incremental hash using CRC. As for adding new
>field, can we do it on-demand without relying on user to choose which
>fields?
>
>Assuming that in the beginning, 5-tuple covers unions of all fields
>used in all subtables, so RSS works fine. As subtables increase, there
>might be new subtables using new fields, such as vlan, or tunnel
>metadata. At the moment (or periodically) when we detect new subtable
>using new field, then we consider adding new field into the hash
>function. So for simply traffic using 5-tuple, RSS remains effective
>and unchanged, for tunneled traffic or traffic matching more fields,
>this on-demand hashing approach can alleviate the collision. Of
>course, changing the hashing function requires rehashing all existing
>elements, so we should do it less frequent.
>
>Regards,
>William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] Save performance stats to database.

2018-01-08 Thread Mark Michelson

On 01/04/2018 06:38 PM, Ben Pfaff wrote:

On Mon, Dec 18, 2017 at 04:51:38PM -0600, Mark Michelson wrote:

This commit builds on the performance library by storing compiled
statistics in the OVS vswitchd database.

This can make it more useful for administrators to keep tabs on
performance in real time and make adjustments if necessary.

Signed-off-by: Mark Michelson 


It seems somewhat odd to put this table in the OVS database, when the
(initial) intent is for the performance measurements to come from OVN
instead.


Yeah, this was a case where my thought process was that I was adding 
something that is an OVS-level library, so it made the most sense to add 
the data to the OVS database.


Further, since my first use of the library was within ovn-controller, it 
made sense to be adding the data to databases that are local to each 
ovn-controller. That again led me to using the OVS database.




performance_destroy() needs to be called within a database transaction.

Database transactions don't always succeed; sometimes they fail and need
to be retried.  I think that this will happen naturally with
performance_run(), but I think that the deletion from
performance_destroy() will not be retried if it fails (because the
internal object has been destroyed).  This is not ideal.

One way to fix that might be to make performance_run() something that
runs once, in the daemon's main loop, instead of once per performance
object.  Then, performance_run() could compare the database table to
'performances', adding missing rows, updating existing rows, and
deleting rows that should be abandoned.

The database schema seems pretty rough to me.  It is going to be hard to
modify and extend.  I can think of a few ways to make it more flexible.
One would be to have columns (name, age, samples, average, min, max,
percentiles), where percentiles would be a map from a real (a
percentile) to its score, and each performance object would have three
rows, with ages 1, 5, and 10, (You might consider using columns of type
'real'.)  Another would be to have (name, samples, average, min, max,
p95), where average, min, max, and p95 were maps from an age to the
value for that age.  There are other possibilities too.

Thanks,

Ben.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] Add performance measuring API

2018-01-08 Thread Ben Pfaff
OK, thanks, your feedback makes sense.

On Mon, Jan 08, 2018 at 02:55:10PM -0600, Mark Michelson wrote:
> Thanks for the detailed feedback Ben. I'll work up a new revision that
> addresses what you have found here. There are a couple of in-line comments
> below.
> 
> On 01/04/2018 05:58 PM, Ben Pfaff wrote:
> >On Mon, Dec 18, 2017 at 04:51:37PM -0600, Mark Michelson wrote:
> >>This is similar to the existing coverage and perf-counter APIs in OVS.
> >>However, rather than keeping counters, this is aimed at timing how long
> >>operations take to perform. "Operations" in this case can be anything
> >>from a loop iteration, to a function, to something more complex.
> >>
> >>The library will keep track of how long it takes to perform the
> >>particular operations and will maintain statistics of those running
> >>times.
> >>
> >>Statistics for a particular operation can be queried from the command
> >>line by using ovs-appctl -t  performance/show .
> >>
> >>The API is designed to be pretty small. The expected steps are as
> >>follows:
> >>
> >>1) Create a performance measurement, providing a unique name, using
> >>performance_create()
> >>2) Add calls to start_sample() and end_sample() to mark the start and
> >>stop of the operation you wish to measure.
> >>3) Periodically call performance_run() in order to compile statistics.
> >>4) Upon completion (likely program shutdown), call performance_destroy()
> >>to clean up.
> >>
> >>Signed-off-by: Mark Michelson 
> >
> >Thanks.  I guess that this will be useful from time to time
> >
> >It would be helpful to have a comment on each public function explaining
> >how to use it.  In particular, the meaning of parameters isn't
> >necessarily obvious (e.g. I guess that sample_rate is in msec?).
> >
> >In performance_init(), usually we would use ovsthread_once instead of
> >raw pthread_once().
> >
> >In performance_create(), you can use xzalloc() instead of xcalloc().
> >
> >Usually we put struct and data definitions at the top of a file, unless
> >there's a good reason otherwise.  In this case, I'd move struct stats
> >and struct performance and 'performances' to the top of the file.
> >
> >This doesn't look thread-safe; that would require a mutex (etc.) to
> >protect 'performances'.  Or it could be thread-local.
> >
> >I think that 'performances' should be static.
> >
> >Usually, in simple cases like this we put comments next to struct
> >members, e.g.:
> >
> >struct sample {
> > long long int start_time;   /* Time when we started this sample. */
> > long long int end_time; /* Time when we ended this sample. */
> > long long int elapsed;  /* Elapsed time: end_time - start_time. */
> >};
> >
> >It bothers me a little to see find_earliest use a size_t to iterate
> >through the samples and then return -1 to indicate failure.
> >
> >There should probably be documentation on the precision here.  I think
> >it's 1 ms, so only fairly long kinds of activities can be timed with any
> >accuracy.
> >
> >Maybe percentile() should document the valid range for 'percentile'.  In
> >particular, a 'percentile' of 100 will read past the end of the array.
> >
> >I'd prefer if the names of start_sample() and end_sample() started with
> >performance_, for consistency.
> >
> >sort_times() could use xmemdup().
> >
> >Usually we line up function declarations more like this:
> > static int
> > get_stats(const struct sample_vec *vec, long long int age_ms,
> >   struct stats *stats)
> >with the second line of parameters indented just past the (.
> >
> >In many cases, we've found that for "show" kinds of commands, it is
> >helpful to allow an empty set of parameters to show all of the entities
> >of that type.
> >
> >unixctl_command_register() should pass "NAME" as the second argument, to
> >let the user know what parameter is expected (or "[NAME]", if you make
> >it optional).
> >
> >Is there value in calculating stats periodically in performance_run()?
> >It looks to me like nothing is lost if they are calculated only when
> >requested via performance/show.
> 
> In this patch alone, no there is not any value in calculating stats in
> performance_run(). However, the next patch in the series stores the
> statistics in the database, meaning that they could be queried outside of
> the performance/show command. (Note, I have not yet read your response to
> that patch)
> 
> >
> >I think that running cull_old_times() in each call to performance_run()
> >means that we will normally cull off only a single element, which is
> >expensive due to the copying.  Maybe there should be a strategy to cull
> >less frequently so that on average maybe half of the elements are
> >culled
> 
> It actually won't usually cull one element. The thing to remember is that
> even though you may call performance_run() often, cull_old_times() will only
> be run if the sample rate timer is expired. So if you're measuring the
> performance of something that runs 

Re: [ovs-dev] [PATCH v2] ofproto: Delete all groups and meters when (un)configuring a controller.

2018-01-08 Thread Ben Pfaff
On Tue, Dec 26, 2017 at 11:12:02AM -0800, Gregory Rose wrote:
> On 12/26/2017 9:33 AM, Gregory Rose wrote:
> >On 12/22/2017 2:09 PM, Ben Pfaff wrote:
> >>On Tue, Nov 07, 2017 at 07:04:02PM -0800, Ben Pfaff wrote:
> >>>Open vSwitch has always deleted all flows from the flow table whenever
> >>>a
> >>>controller is configured or whenever all the controllers are
> >>>unconfigured.
> >>>After this commit, OVS additionally deletes all OpenFlow groups and
> >>>meters.
> >>>
> >>>Suggested-by: Periyasamy Palanisamy
> >>>
> >>>Suggested-by: Jan Scheurich 
> >>>Signed-off-by: Ben Pfaff 
> >>>---
> >>>v1->v2: Clear the meter table too (thanks Jan).
> >>Anyone want to review this?
> 
> I can't seem to find the original patch in my mailbox but I applied it from
> patchworks and tested it
> with 'make check' and 'make check-kmod'.  The patch itself looks good to me
> and it passes checkpatch.
> 
> Tested-by: Greg Rose 
> Reviewed-by: Greg Rose 
> 

Thanks for the review.  I applied this to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ofproto: Delete all groups and meters when (un)configuring a controller.

2018-01-08 Thread Ben Pfaff
Thanks for the review.  I agree with your comments.  I fixed them and
applied this to master.

On Fri, Jan 05, 2018 at 11:21:55AM +, Jan Scheurich wrote:
> Just one small observation below. Otherwise LGTM.
> 
> I have tested the patch and it worked for all cases. I couldn't test the case 
> that the switch loses connection to a controller in stand-alone fail mode.
> 
> Acked-by: Jan Scheurich   
> Tested-by: Jan Scheurich 
> 
> > -Original Message-
> > From: Ben Pfaff [mailto:b...@ovn.org]
> > Sent: Wednesday, 08 November, 2017 04:04
> > To: d...@openvswitch.org
> > Cc: Ben Pfaff ; Periyasamy Palanisamy 
> > ; Jan Scheurich
> > 
> > Subject: [PATCH v2] ofproto: Delete all groups and meters when 
> > (un)configuring a controller.
> > 
> > Open vSwitch has always deleted all flows from the flow table whenever a
> > controller is configured or whenever all the controllers are unconfigured.
> > After this commit, OVS additionally deletes all OpenFlow groups and meters.
> > 
> > Suggested-by: Periyasamy Palanisamy 
> > Suggested-by: Jan Scheurich 
> > Signed-off-by: Ben Pfaff 
> > ---
> > v1->v2: Clear the meter table too (thanks Jan).
> > 
> >  AUTHORS.rst  |  1 +
> >  NEWS |  3 +++
> >  ofproto/ofproto.c| 26 +--
> >  tests/ofproto.at | 58 
> > 
> >  vswitchd/vswitch.xml | 15 +++---
> >  5 files changed, 90 insertions(+), 13 deletions(-)
> > 
> > diff --git a/AUTHORS.rst b/AUTHORS.rst
> > index 139e99b330d8..26f81508d3d5 100644
> > --- a/AUTHORS.rst
> > +++ b/AUTHORS.rst
> > @@ -519,6 +519,7 @@ Pasi Kärkkäinen pa...@iki.fi
> >  Patrik Andersson R  patrik.r.anders...@ericsson.com
> >  Paulo Cravero   pcrav...@as2594.net
> >  Pawan Shuklashuk...@vmware.com
> > +Periyasamy Palanisamy   periyasamy.palanis...@ericsson.com
> >  Peter Amidonpe...@picnicpark.org
> >  Peter Balland   pe...@nicira.com
> >  Peter Phaal peter.ph...@inmon.com
> > diff --git a/NEWS b/NEWS
> > index 047f34b9f402..7ef4d6728d28 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -1,5 +1,8 @@
> >  Post-v2.8.0
> >  
> > +   - ovs-vswitchd:
> > + * Configuring a controller, or unconfiguring all controllers, now 
> > deletes
> > +   all groups and meters (as well as all flows).
> > - OVN:
> >   * The "requested-chassis" option for a logical switch port now 
> > accepts a
> > chassis "hostname" in addition to a chassis "name".
> > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> > index 82c2bb27d348..e762888b746e 100644
> > --- a/ofproto/ofproto.c
> > +++ b/ofproto/ofproto.c
> > @@ -251,6 +251,8 @@ static void delete_flows__(struct rule_collection *,
> > const struct openflow_mod_requester *)
> >  OVS_REQUIRES(ofproto_mutex);
> > 
> > +static void ofproto_group_delete_all__(struct ofproto *)
> > +OVS_REQUIRES(ofproto_mutex);
> >  static bool ofproto_group_exists(const struct ofproto *, uint32_t 
> > group_id);
> >  static void handle_openflow(struct ofconn *, const struct ofpbuf *);
> >  static enum ofperr ofproto_flow_mod_init(struct ofproto *,
> > @@ -1566,6 +1568,8 @@ ofproto_flush__(struct ofproto *ofproto)
> >  }
> >  delete_flows__(, OFPRR_DELETE, NULL);
> >  }
> > +ofproto_group_delete_all__(ofproto);
> > +meter_delete_all(ofproto);
> 
> When the ofproto instance is destroyed meter_delete_all() appears to be now 
> invoked twice:
> first from ofproto_destroy() directly and then a second time from here. It 
> doesn't seem to do any harm,
> but I guess it would be cleaner to remove the meter_delete_all() call from 
> ofproto_destroy().
> 
> It seems that the hmap_destroy(>meters) call in ofproto_destroy() should 
> also better be moved to the rcu-deferred ofproto_destroy__() function.
> 
> >  /* XXX: Concurrent handler threads may insert new learned flows based 
> > on
> >   * learn actions of the now deleted flows right after we release
> >   * 'ofproto_mutex'. */
> > @@ -7348,20 +7352,30 @@ ofproto_group_mod_finish(struct ofproto *ofproto,
> >   *
> >   * This is intended for use within an ofproto provider's 'destruct'
> >   * function. */
> > -void
> > -ofproto_group_delete_all(struct ofproto *ofproto)
> > -OVS_EXCLUDED(ofproto_mutex)
> > +static void
> > +ofproto_group_delete_all__(struct ofproto *ofproto)
> > +OVS_REQUIRES(ofproto_mutex)
> >  {
> >  struct ofproto_group_mod ogm;
> > -
> >  ogm.gm.command = OFPGC11_DELETE;
> >  ogm.gm.group_id = OFPG_ALL;
> > -
> > -ovs_mutex_lock(_mutex);
> >  ogm.version = ofproto->tables_version + 1;
> > +

Re: [ovs-dev] [PATCH] ofproto:fix abort issue when delete bridge

2018-01-08 Thread Ben Pfaff
This seems like the wrong approach to me.  If there's a problem that
struct rule_dpif's 'new_rule' member can sometimes point to a rule that
gets freed, then one would normally fix that through some kind of
synchronization between the rules (for example, one could make sure that
RCU keeps ->new_rule from being freed while it is still in use), not by
adding a refcount on a data structure multiple levels higher.

On Mon, Jan 08, 2018 at 11:09:58AM +0800, Haibo Zhang wrote:
> When delete a birdge, all the rule of the bridge wil be removed. When 
> destruct a rule, it is possible that the rule has a non-NULL new_rule A, and 
> the new_rule A might has a non-NULL new_rule B, and the new_rule B might has 
> a non-NULL new_rule C... in this case, the ofproto has been freed before rule 
> B or C was freed, and it will cause crash issue when free rule B or C using 
> rcu mechanism. To fix the issue, a reference count is introduced to ofproto 
> to make sure all rules of the ofproto were freed completely before the 
> ofproto was freed.
> 
> Signed-off-by: Haibo Zhang 
> ---
>  ofproto/ofproto-dpif.c | 14 --
>  ofproto/ofproto-provider.h |  9 -
>  ofproto/ofproto.c  | 26 --
>  3 files changed, 36 insertions(+), 13 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 43b7b89..968a51a 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -4254,16 +4254,26 @@ static struct rule_dpif *rule_dpif_cast(const struct 
> rule *rule)
>  }
>  
>  static struct rule *
> -rule_alloc(void)
> +rule_alloc(struct ofproto * ofproto)
>  {
> +struct rule * rule_;
>  struct rule_dpif *rule = xzalloc(sizeof *rule);
> -return >up;
> +
> +if (OVS_UNLIKELY(!rule)) {
> +return NULL;
> +}
> +
> +rule_ = >up;
> +*CONST_CAST(struct ofproto **, _->ofproto) = ofproto;
> +ofproto_ref(ofproto);
> +return rule_;
>  }
>  
>  static void
>  rule_dealloc(struct rule *rule_)
>  {
>  struct rule_dpif *rule = rule_dpif_cast(rule_);
> +ofproto_unref(rule_->ofproto);
>  free(rule);
>  }
>  
> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
> index 9dc73c4..ce3e0f7 100644
> --- a/ofproto/ofproto-provider.h
> +++ b/ofproto/ofproto-provider.h
> @@ -133,6 +133,11 @@ struct ofproto {
>  /* Variable length mf_field mapping. Stores all configured variable 
> length
>   * meta-flow fields (struct mf_field) in a switch. */
>  struct vl_mff_map vl_mff_map;
> +/* Number of references.
> + * bridge keep one reference
> + * Any rule trying to keep ofproto from being freed should hold its own
> + * reference. */
> +struct ovs_refcount ref_count;
>  };
>  
>  void ofproto_init_tables(struct ofproto *, int n_tables);
> @@ -433,6 +438,8 @@ struct rule {
>  void ofproto_rule_ref(struct rule *);
>  bool ofproto_rule_try_ref(struct rule *);
>  void ofproto_rule_unref(struct rule *);
> +void ofproto_ref(struct ofproto *);
> +void ofproto_unref(struct ofproto *);
>  
>  static inline const struct rule_actions * rule_get_actions(const struct rule 
> *);
>  static inline bool rule_is_table_miss(const struct rule *);
> @@ -1288,7 +1295,7 @@ struct ofproto_class {
>   * ->rule_destruct() must uninitialize derived state.
>   *
>   * Rule destruction must not fail. */
> -struct rule *(*rule_alloc)(void);
> +struct rule *(*rule_alloc)(struct ofproto *);
>  enum ofperr (*rule_construct)(struct rule *rule)
>  /* OVS_REQUIRES(ofproto_mutex) */;
>  void (*rule_insert)(struct rule *rule, struct rule *old_rule,
> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> index 84eb18e..d68de6b 100644
> --- a/ofproto/ofproto.c
> +++ b/ofproto/ofproto.c
> @@ -523,6 +523,7 @@ ofproto_create(const char *datapath_name, const char 
> *datapath_type,
>  ofproto->connmgr = connmgr_create(ofproto, datapath_name, datapath_name);
>  ofproto->min_mtu = INT_MAX;
>  cmap_init(>groups);
> +ovs_refcount_init(>ref_count);
>  ovs_mutex_unlock(_mutex);
>  ofproto->ogf.types = 0xf;
>  ofproto->ogf.capabilities = OFPGFC_CHAINING | OFPGFC_SELECT_LIVENESS |
> @@ -1616,14 +1617,20 @@ ofproto_destroy__(struct ofproto *ofproto)
>  ofproto->ofproto_class->dealloc(ofproto);
>  }
>  
> -/* Destroying rules is doubly deferred, must have 'ofproto' around for them.
> - * - 1st we defer the removal of the rules from the classifier
> - * - 2nd we defer the actual destruction of the rules. */
> -static void
> -ofproto_destroy_defer__(struct ofproto *ofproto)
> -OVS_EXCLUDED(ofproto_mutex)
> +void
> +ofproto_ref(struct ofproto *ofproto)
>  {
> -ovsrcu_postpone(ofproto_destroy__, ofproto);
> +if (ofproto) {
> +ovs_refcount_ref(>ref_count);
> +}
> +}
> +
> +void
> +ofproto_unref(struct ofproto *ofproto)
> +{
> +if (ofproto && ovs_refcount_unref_relaxed(>ref_count) == 1) {
> +

[ovs-dev] [patch v3 4/5] conntrack: Some style improvements.

2018-01-08 Thread Darrell Ball
Fix up some instances where variable declarations were not close
enough to their use, as these were missed before.  This is the
preferred art in OVS code and flagged heavily in code reviews.
This is highly desirable due to code clarity reasons.

There are also some cases where newlines were not needed by prior art
and some cases where they were needed but missed.

There was one case where there was a missing space after "}".

There were a few cases where for loop index declarations could be
folded into the loop.

One function was missing some const qualifiers.

Signed-off-by: Darrell Ball 
---

v2->v3: Incorporate review comments from Flavio.
A separate following patch is split out for reordering
of ip fragmentation checks.

 lib/conntrack.c | 144 +++-
 1 file changed, 60 insertions(+), 84 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index a036a12..0902e0e 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -249,8 +249,8 @@ conn_key_cmp(const struct conn_key *key1, const struct 
conn_key *key2)
 }
 
 static void
-ct_print_conn_info(struct conn *c, char *log_msg, enum vlog_level vll,
-   bool force, bool rl_on)
+ct_print_conn_info(const struct conn *c, const char *log_msg,
+   enum vlog_level vll, bool force, bool rl_on)
 {
 #define CT_VLOG(RL_ON, LEVEL, ...)  \
 do {\
@@ -308,7 +308,6 @@ ct_print_conn_info(struct conn *c, char *log_msg, enum 
vlog_level vll,
 void
 conntrack_init(struct conntrack *ct)
 {
-unsigned i, j;
 long long now = time_msec();
 
 ct_rwlock_init(>resources_lock);
@@ -319,13 +318,13 @@ conntrack_init(struct conntrack *ct)
 ovs_list_init(>alg_exp_list);
 ct_rwlock_unlock(>resources_lock);
 
-for (i = 0; i < CONNTRACK_BUCKETS; i++) {
+for (unsigned i = 0; i < CONNTRACK_BUCKETS; i++) {
 struct conntrack_bucket *ctb = >buckets[i];
 
 ct_lock_init(>lock);
 ct_lock_lock(>lock);
 hmap_init(>connections);
-for (j = 0; j < ARRAY_SIZE(ctb->exp_lists); j++) {
+for (unsigned j = 0; j < ARRAY_SIZE(ctb->exp_lists); j++) {
 ovs_list_init(>exp_lists[j]);
 }
 ct_lock_unlock(>lock);
@@ -345,12 +344,10 @@ conntrack_init(struct conntrack *ct)
 void
 conntrack_destroy(struct conntrack *ct)
 {
-unsigned i;
-
 latch_set(>clean_thread_exit);
 pthread_join(ct->clean_thread, NULL);
 latch_destroy(>clean_thread_exit);
-for (i = 0; i < CONNTRACK_BUCKETS; i++) {
+for (unsigned i = 0; i < CONNTRACK_BUCKETS; i++) {
 struct conntrack_bucket *ctb = >buckets[i];
 struct conn *conn;
 
@@ -415,10 +412,11 @@ write_ct_md(struct dp_packet *pkt, uint16_t zone, const 
struct conn *conn,
 pkt->md.ct_label = alg_exp->master_label;
 key = _exp->master_key;
 }
+
 pkt->md.ct_orig_tuple_ipv6 = false;
+
 if (key) {
 if (key->dl_type == htons(ETH_TYPE_IP)) {
-
 pkt->md.ct_orig_tuple.ipv4 = (struct ovs_key_ct_tuple_ipv4) {
 key->src.addr.ipv4_aligned,
 key->dst.addr.ipv4_aligned,
@@ -650,7 +648,6 @@ reverse_nat_packet(struct dp_packet *pkt, const struct conn 
*conn)
 struct ip_header *inner_l3 = (struct ip_header *) (icmp + 1);
 extract_l3_ipv4(_key, inner_l3, tail - ((char *)inner_l3) - pad,
 _l4, false);
-
 pkt->l3_ofs += (char *) inner_l3 - (char *) nh;
 pkt->l4_ofs += inner_l4 - (char *) icmp;
 
@@ -661,6 +658,7 @@ reverse_nat_packet(struct dp_packet *pkt, const struct conn 
*conn)
 packet_set_ipv4_addr(pkt, _l3->ip_dst,
  conn->key.dst.addr.ipv4_aligned);
 }
+
 reverse_pat_packet(pkt, conn);
 icmp->icmp_csum = 0;
 icmp->icmp_csum = csum(icmp, tail - (char *) icmp - pad);
@@ -775,20 +773,16 @@ nat_clean(struct conntrack *ct, struct conn *conn,
   struct conntrack_bucket *ctb)
 OVS_REQUIRES(ctb->lock)
 {
-long long now = time_msec();
 ct_rwlock_wrlock(>resources_lock);
 nat_conn_keys_remove(>nat_conn_keys, >rev_key, ct->hash_basis);
 ct_rwlock_unlock(>resources_lock);
 ct_lock_unlock(>lock);
-
 uint32_t hash_rev_conn = conn_key_hash(>rev_key, ct->hash_basis);
 unsigned bucket_rev_conn = hash_to_bucket(hash_rev_conn);
-
 ct_lock_lock(>buckets[bucket_rev_conn].lock);
 ct_rwlock_wrlock(>resources_lock);
-
+long long now = time_msec();
 struct conn *rev_conn = conn_lookup(ct, >rev_key, now);
-
 struct nat_conn_key_node *nat_conn_key_node =
 nat_conn_keys_lookup(>nat_conn_keys, >rev_key,
  ct->hash_basis);
@@ -802,8 +796,8 @@ nat_clean(struct conntrack *ct, struct conn *conn,
 _conn->node);
 free(rev_conn);
 }
-

Re: [ovs-dev] [PATCH 1/3] Add performance measuring API

2018-01-08 Thread Mark Michelson
Thanks for the detailed feedback Ben. I'll work up a new revision that 
addresses what you have found here. There are a couple of in-line 
comments below.


On 01/04/2018 05:58 PM, Ben Pfaff wrote:

On Mon, Dec 18, 2017 at 04:51:37PM -0600, Mark Michelson wrote:

This is similar to the existing coverage and perf-counter APIs in OVS.
However, rather than keeping counters, this is aimed at timing how long
operations take to perform. "Operations" in this case can be anything
from a loop iteration, to a function, to something more complex.

The library will keep track of how long it takes to perform the
particular operations and will maintain statistics of those running
times.

Statistics for a particular operation can be queried from the command
line by using ovs-appctl -t  performance/show .

The API is designed to be pretty small. The expected steps are as
follows:

1) Create a performance measurement, providing a unique name, using
performance_create()
2) Add calls to start_sample() and end_sample() to mark the start and
stop of the operation you wish to measure.
3) Periodically call performance_run() in order to compile statistics.
4) Upon completion (likely program shutdown), call performance_destroy()
to clean up.

Signed-off-by: Mark Michelson 


Thanks.  I guess that this will be useful from time to time

It would be helpful to have a comment on each public function explaining
how to use it.  In particular, the meaning of parameters isn't
necessarily obvious (e.g. I guess that sample_rate is in msec?).

In performance_init(), usually we would use ovsthread_once instead of
raw pthread_once().

In performance_create(), you can use xzalloc() instead of xcalloc().

Usually we put struct and data definitions at the top of a file, unless
there's a good reason otherwise.  In this case, I'd move struct stats
and struct performance and 'performances' to the top of the file.

This doesn't look thread-safe; that would require a mutex (etc.) to
protect 'performances'.  Or it could be thread-local.

I think that 'performances' should be static.

Usually, in simple cases like this we put comments next to struct
members, e.g.:

struct sample {
 long long int start_time;   /* Time when we started this sample. */
 long long int end_time; /* Time when we ended this sample. */
 long long int elapsed;  /* Elapsed time: end_time - start_time. */
};

It bothers me a little to see find_earliest use a size_t to iterate
through the samples and then return -1 to indicate failure.

There should probably be documentation on the precision here.  I think
it's 1 ms, so only fairly long kinds of activities can be timed with any
accuracy.

Maybe percentile() should document the valid range for 'percentile'.  In
particular, a 'percentile' of 100 will read past the end of the array.

I'd prefer if the names of start_sample() and end_sample() started with
performance_, for consistency.

sort_times() could use xmemdup().

Usually we line up function declarations more like this:
 static int
 get_stats(const struct sample_vec *vec, long long int age_ms,
   struct stats *stats)
with the second line of parameters indented just past the (.

In many cases, we've found that for "show" kinds of commands, it is
helpful to allow an empty set of parameters to show all of the entities
of that type.

unixctl_command_register() should pass "NAME" as the second argument, to
let the user know what parameter is expected (or "[NAME]", if you make
it optional).

Is there value in calculating stats periodically in performance_run()?
It looks to me like nothing is lost if they are calculated only when
requested via performance/show.


In this patch alone, no there is not any value in calculating stats in 
performance_run(). However, the next patch in the series stores the 
statistics in the database, meaning that they could be queried outside 
of the performance/show command. (Note, I have not yet read your 
response to that patch)




I think that running cull_old_times() in each call to performance_run()
means that we will normally cull off only a single element, which is
expensive due to the copying.  Maybe there should be a strategy to cull
less frequently so that on average maybe half of the elements are
culled


It actually won't usually cull one element. The thing to remember is 
that even though you may call performance_run() often, cull_old_times() 
will only be run if the sample rate timer is expired. So if you're 
measuring the performance of something that runs constantly and 
typically finishes in sub 1-second, and your sample rate is ten seconds, 
then you'll only cull old times every ten seconds. So you could 
potentially remove a bunch of elements.


That all being said, I will still look at potentially optimizing this.



Thanks,

Ben.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [patch v3 5/5] conntrack: Reorder sanity checks in extract_l3_ipvx().

2018-01-08 Thread Darrell Ball
The functions extract_l3_ipv4 and extract_l3_ipv6 check for
unsupported ip fragments and return early.  The checks were after
an assignment that would not be needed when early return happens.
This is slightly inefficient, but mostly reads poorly.
Hence, reorder the ip fragment checks before the assignments.

Signed-off-by: Darrell Ball 
---
 lib/conntrack.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 0902e0e..a068910 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1517,11 +1517,11 @@ extract_l3_ipv4(struct conn_key *key, const void *data, 
size_t size,
 return false;
 }
 
-*new_data = (char *) data + ip_len;
-}
+if (IP_IS_FRAGMENT(ip->ip_frag_off)) {
+return false;
+}
 
-if (IP_IS_FRAGMENT(ip->ip_frag_off)) {
-return false;
+*new_data = (char *) data + ip_len;
 }
 
 if (validate_checksum && csum(data, ip_len) != 0) {
@@ -1561,14 +1561,14 @@ extract_l3_ipv6(struct conn_key *key, const void *data, 
size_t size,
 return false;
 }
 
-if (new_data) {
-*new_data = data;
-}
-
 if (nw_frag) {
 return false;
 }
 
+if (new_data) {
+*new_data = data;
+}
+
 key->src.addr.ipv6 = ip6->ip6_src;
 key->dst.addr.ipv6 = ip6->ip6_dst;
 key->nw_proto = nw_proto;
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [patch v3 3/5] conntrack: Add additional alg support.

2018-01-08 Thread Darrell Ball
In order to support more algs with different requirements,
expectation handling is allowed to handle more cases, such as
a wildcard source ip as in the case of SIP.  NAT can also be
skipped in some alg cases.
Expectation_create() was otherwise simplified in the process.

Some renaming was done to support the above changes.

Signed-off-by: Darrell Ball 
---

 lib/conntrack-private.h |  6 ++--
 lib/conntrack.c | 89 ++---
 2 files changed, 58 insertions(+), 37 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index 60e2902..a344801 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -82,9 +82,9 @@ struct alg_exp_node {
  * connection label and mark. */
 ovs_u128 master_label;
 uint32_t master_mark;
-/* True if the expectation is for passive mode, as is
- * one option for FTP. */
-bool passive_mode;
+/* True if for NAT application, the alg replaces the dest address;
+ * otherwise, the source address is replaced.  */
+bool nat_rpl_dst;
 };
 
 struct conn {
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 6bb30ce..a036a12 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -71,6 +71,7 @@ enum ct_alg_ctl_type {
 CT_ALG_CTL_NONE,
 CT_ALG_CTL_FTP,
 CT_ALG_CTL_TFTP,
+CT_ALG_CTL_SIP,
 };
 
 static bool conn_key_extract(struct conntrack *, struct dp_packet *,
@@ -128,8 +129,8 @@ extract_l3_ipv6(struct conn_key *key, const void *data, 
size_t size,
 const char **new_data);
 
 static struct alg_exp_node *
-expectation_lookup(struct hmap *alg_expectations,
-   const struct conn_key *key, uint32_t basis);
+expectation_lookup(struct hmap *alg_expectations, const struct conn_key *key,
+   uint32_t basis, bool src_ip_wc);
 
 static int
 repl_ftp_v4_addr(struct dp_packet *pkt, ovs_be32 v4_addr_rep,
@@ -168,7 +169,7 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 static void
 handle_tftp_ctl(struct conntrack *ct,
 const struct conn_lookup_ctx *ctx OVS_UNUSED,
-struct dp_packet *pkt OVS_UNUSED,
+struct dp_packet *pkt,
 const struct conn *conn_for_expectation,
 long long now OVS_UNUSED,
 enum ftp_ctl_pkt ftp_ctl OVS_UNUSED, bool nat OVS_UNUSED);
@@ -503,6 +504,15 @@ get_alg_ctl_type(const struct dp_packet *pkt, ovs_be16 
tp_src, ovs_be16 tp_dst,
 return CT_ALG_CTL_NONE;
 }
 
+static bool
+alg_src_ip_wc(enum ct_alg_ctl_type alg_ctl_type)
+{
+if (alg_ctl_type == CT_ALG_CTL_SIP) {
+return true;
+}
+return false;
+}
+
 static void
 handle_alg_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
struct dp_packet *pkt, enum ct_alg_ctl_type ct_alg_ctl,
@@ -889,7 +899,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
 nc->nat_info = xmemdup(nat_action_info, sizeof *nc->nat_info);
 
 if (alg_exp) {
-if (alg_exp->passive_mode) {
+if (alg_exp->nat_rpl_dst) {
 nc->rev_key.dst.addr = alg_exp->alg_nat_repl_addr;
 nc->nat_info->nat_action = NAT_ACTION_SRC;
 } else {
@@ -1249,7 +1259,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
 
 ct_rwlock_rdlock(>resources_lock);
 alg_exp = expectation_lookup(>alg_expectations, >key,
- ct->hash_basis);
+ ct->hash_basis,
+ alg_src_ip_wc(ct_alg_ctl));
 if (alg_exp) {
 alg_exp_entry = *alg_exp;
 alg_exp = _exp_entry;
@@ -2530,11 +2541,14 @@ conntrack_flush(struct conntrack *ct, const uint16_t 
*zone)
 
 /* This function must be called with the ct->resources read lock taken. */
 static struct alg_exp_node *
-expectation_lookup(struct hmap *alg_expectations,
-   const struct conn_key *key, uint32_t basis)
+expectation_lookup(struct hmap *alg_expectations, const struct conn_key *key,
+   uint32_t basis, bool src_ip_wc)
 {
 struct conn_key check_key = *key;
 check_key.src.port = ALG_WC_SRC_PORT;
+if (src_ip_wc) {
+memset(_key.src.addr, 0, sizeof check_key.src.addr);
+}
 struct alg_exp_node *alg_exp_node;
 
 uint32_t alg_exp_conn_key_hash = conn_key_hash(_key, basis);
@@ -2650,33 +2664,38 @@ expectation_clean(struct conntrack *ct, const struct 
conn_key *master_key,
 }
 
 static void
-expectation_create(struct conntrack *ct,
-   ovs_be16 dst_port,
-   enum ct_alg_mode mode,
-   const struct conn *master_conn)
+expectation_create(struct conntrack *ct, ovs_be16 dst_port,
+   const struct conn *master_conn, bool reply, bool src_ip_wc,
+   bool skip_nat)
 {
 struct ct_addr src_addr;
 struct 

[ovs-dev] [patch v3 2/5] conntrack: Fix alg expectation cleanup.

2018-01-08 Thread Darrell Ball
Presently, alg expectations are removed by being time expired.
This was intended to happen before the control connections and
was intended to minimize the extra work involved for tracking and
removing the expectations.  This is not the best option since it
should be possible to remove expectations when a control connection
is removed and a new api is in the works to do this. Also, conceptually
an expectation should not exist without a control connection context
and it can be argued that this should be a strict requirement.

The approach is changed to remove the expectations when the control
connections are removed.  The previous code to expire the expectations
is removed at the same time.

Fixes: bd5e81a0e ("Userspace Datapath: Add ALG infra and FTP.")
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341683.html

Signed-off-by: Darrell Ball 
---

v2->v3: Use hindex map in lieu of hmap for efficiency: Ben P.

 lib/conntrack-private.h |   7 +-
 lib/conntrack.c | 191 
 lib/conntrack.h |   8 +-
 3 files changed, 136 insertions(+), 70 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index ac0198f..60e2902 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -68,11 +68,10 @@ struct nat_conn_key_node {
  * connection. The expectation is created by the control
  * connection. */
 struct alg_exp_node {
+/* Node in alg_expectations. */
 struct hmap_node node;
-/* Expiry list node for an expectation. */
-struct ovs_list exp_node;
-/* The time when this expectation will expire. */
-long long expiration;
+/* Node in alg_expectation_refs. */
+struct hindex_node node_ref;
 /* Key of data connection to be created. */
 struct conn_key key;
 /* Corresponding key of the control connection. */
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 6d078f5..6bb30ce 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -140,7 +140,7 @@ static enum ftp_ctl_pkt
 process_ftp_ctl_v4(struct conntrack *ct,
struct dp_packet *pkt,
const struct conn *conn_for_expectation,
-   long long now, ovs_be32 *v4_addr_rep,
+   ovs_be32 *v4_addr_rep,
char **ftp_data_v4_start,
size_t *addr_offset_from_ftp_data_start);
 
@@ -148,6 +148,10 @@ static enum ftp_ctl_pkt
 detect_ftp_ctl_type(const struct conn_lookup_ctx *ctx,
 struct dp_packet *pkt);
 
+static void
+expectation_clean(struct conntrack *ct, const struct conn_key *master_key,
+  uint32_t basis);
+
 static struct ct_l4_proto *l4_protos[] = {
 [IPPROTO_TCP] = _proto_tcp,
 [IPPROTO_UDP] = _proto_other,
@@ -166,8 +170,8 @@ handle_tftp_ctl(struct conntrack *ct,
 const struct conn_lookup_ctx *ctx OVS_UNUSED,
 struct dp_packet *pkt OVS_UNUSED,
 const struct conn *conn_for_expectation,
-long long now, enum ftp_ctl_pkt ftp_ctl OVS_UNUSED,
-bool nat OVS_UNUSED);
+long long now OVS_UNUSED,
+enum ftp_ctl_pkt ftp_ctl OVS_UNUSED, bool nat OVS_UNUSED);
 
 typedef void (*alg_helper)(struct conntrack *ct,
const struct conn_lookup_ctx *ctx,
@@ -190,8 +194,6 @@ long long ct_timeout_val[] = {
 
 /* The maximum TCP or UDP port number. */
 #define CT_MAX_L4_PORT 65535
-/* Alg expectation timeout. */
-#define CT_ALG_EXP_TIMEOUT (30 * 1000)
 /* String buffer used for parsing FTP string messages.
  * This is sized about twice what is needed to leave some
  * margin of error. */
@@ -312,6 +314,7 @@ conntrack_init(struct conntrack *ct)
 ct_rwlock_wrlock(>resources_lock);
 hmap_init(>nat_conn_keys);
 hmap_init(>alg_expectations);
+hindex_init(>alg_expectation_refs);
 ovs_list_init(>alg_exp_list);
 ct_rwlock_unlock(>resources_lock);
 
@@ -373,8 +376,10 @@ conntrack_destroy(struct conntrack *ct)
 HMAP_FOR_EACH_POP (alg_exp_node, node, >alg_expectations) {
 free(alg_exp_node);
 }
+
 ovs_list_poison(>alg_exp_list);
 hmap_destroy(>alg_expectations);
+hindex_destroy(>alg_expectation_refs);
 ct_rwlock_unlock(>resources_lock);
 ct_rwlock_destroy(>resources_lock);
 }
@@ -512,16 +517,6 @@ handle_alg_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 
 static void
-alg_exp_init_expiration(struct conntrack *ct,
-struct alg_exp_node *alg_exp_node,
-long long now)
-OVS_REQ_WRLOCK(ct->resources_lock)
-{
-alg_exp_node->expiration = now + CT_ALG_EXP_TIMEOUT;
-ovs_list_push_back(>alg_exp_list, _exp_node->exp_node);
-}
-
-static void
 pat_packet(struct dp_packet *pkt, const struct conn *conn)
 {
 if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
@@ -809,6 +804,9 @@ conn_clean(struct conntrack *ct, struct conn *conn,
  

[ovs-dev] [patch v3 1/5] hindex: Add hindex_next_node_with_hash.

2018-01-08 Thread Darrell Ball
Add hindex_next_node_with_hash() api which gets a next hindex
node with the same hash as the parameter node or null if there
is no such next node.  This api will be used is a subsequent
patch.

Signed-off-by: Darrell Ball 
---
 lib/hindex.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/hindex.h b/lib/hindex.h
index 876c5a9..f70d086 100644
--- a/lib/hindex.h
+++ b/lib/hindex.h
@@ -154,6 +154,15 @@ hindex_node_with_hash(const struct hindex *hindex, size_t 
hash)
 return node;
 }
 
+/* Returns the next node in 'hindex' with the same 'hash' of node, or a null
+ * pointer if no more nodes have that hash value.  Node must be a valid
+ * non-null node. */
+static inline struct hindex_node *
+hindex_next_node_with_hash(const struct hindex_node *node)
+{
+return node->s;
+}
+
 /* Iteration. */
 
 /* Iterates through every node in HINDEX. */
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RESEND PATCH 2/3] net: ovs: remove unused hardirq.h

2018-01-08 Thread Yang Shi
Preempt counter APIs have been split out, currently, hardirq.h just
includes irq_enter/exit APIs which are not used by openvswitch at all.

So, remove the unused hardirq.h.

Signed-off-by: Yang Shi 
Acked-by: Pravin B Shelar 
Cc: "David S. Miller" 
Cc: d...@openvswitch.org
---
 net/openvswitch/vport-internal_dev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/openvswitch/vport-internal_dev.c 
b/net/openvswitch/vport-internal_dev.c
index 04a3128..2f47c65 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -16,7 +16,6 @@
  * 02110-1301, USA
  */
 
-#include 
 #include 
 #include 
 #include 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.

2018-01-08 Thread Ben Pfaff
On Mon, Jan 08, 2018 at 08:28:45AM -0800, Ben Pfaff wrote:
> On Mon, Jan 08, 2018 at 11:55:26AM +, Weglicki, MichalX wrote:
> > Hi Ben, 
> > > I don't like the idea implied in the code in a few places that name[] in
> > > struct netdev_custom_counter might not be null-terminated.  I think that
> > > we should ensure that it is always null terminated.  Otherwise there is
> > > a pitfall for carelessly written code.
> > To be honest I'm not sure what I could do here, each time statistics are
> > Requested from netdev, whole buffer is set to "0", so even it 
> > Particular netdev implementation would return string which 
> > Is not null-terminated, it will be as all other characters in 
> > counter name field, will be "\0". When message is decoded from 
> > open flow buffer, proper check is done, so this part of the code is safe. 
> > If there is anything else you would like me to do, just let me know. 
> 
> Maybe I misinterpreted some code.  I'll take another look.
> 
> Thanks for v3!

Actually, could you respond to Kevin Traynor first?
https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/342747.html
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] lex: Fix parsing of long tokens.

2018-01-08 Thread Ben Pfaff
On Fri, Jan 05, 2018 at 10:20:21AM -0800, Gregory Rose wrote:
> On 1/2/2018 11:15 AM, Ben Pfaff wrote:
> >When a token is longer than the built-in 256-byte buffer, a buffer is
> >malloc()'d but it was not properly null-terminated.
> >
> >Found by afl-fuzz.
> >
> >Reported-by: Bhargava Shastry 
> >Signed-off-by: Ben Pfaff 
> >---
> >  ovn/lib/lex.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/ovn/lib/lex.c b/ovn/lib/lex.c
> >index 6f2b570f5c65..2f49af0e91e2 100644
> >--- a/ovn/lib/lex.c
> >+++ b/ovn/lib/lex.c
> >@@ -89,7 +89,7 @@ lex_token_strcpy(struct lex_token *token, const char *s, 
> >size_t length)
> >  ? token->buffer
> >  : xmalloc(length + 1));
> >  memcpy(token->s, s, length);
> >-token->buffer[length] = '\0';
> >+token->s[length] = '\0';
> >  }
> >  void
> 
> Reviewed-by: Greg Rose 

Thanks, applied to master and backported as far as necessary.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] OVN pacemaker: Fix issues when started as pacemaker container bundles

2018-01-08 Thread Ben Pfaff
Russell, I'm going to assume that you will review this.

Thanks,

Ben.

On Mon, Jan 08, 2018 at 01:05:56PM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> When OVN dbs are created as a pacemaker container bundle resource with
> meta attribute "container-attribute-target=host" defined, the OVN OCF script
> is not working properly. It should use the function provided by the OCF lib
> 'ocf_attribute_target' [1] to get the physical hostname and use that to set 
> the
> master/slave scores. This patch makes use of this function when setting the
> scores. Also fixes other issues seen and deletes the local unused function
> 'ovsdb_server_find_active_peers'.
> 
> [1] - Please see this commit in ResourceAgents for more information on
> 'ocf_attribute_target'
> https://github.com/ClusterLabs/resource-agents/commit/9bd94137d77f770967d35db5de716590cfaf0435
> 
> Signed-off-by: Numan Siddique 
> CC: Russell Bryant 
> ---
>  ovn/utilities/ovndb-servers.ocf | 51 
> ++---
>  1 file changed, 27 insertions(+), 24 deletions(-)
> 
> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
> index f256aefe9..164b6bce6 100755
> --- a/ovn/utilities/ovndb-servers.ocf
> +++ b/ovn/utilities/ovndb-servers.ocf
> @@ -26,7 +26,12 @@ 
> INACTIVE_PROBE=${OCF_RESKEY_inactive_probe_interval:-${INACTIVE_PROBE_DEFAULT}}
>  # a master is promoted and the IPAddr2 resource is started.
>  INVALID_IP_ADDRESS=192.0.2.254
>  
> -host_name=$(ocf_local_nodename)
> +host_name=$(ocf_attribute_target)
> +if [ "x$host_name" = "x" ]; then
> +# function ocf_attribute_target may not be available if the pacemaker
> +# version is old. Fall back to ocf_local_nodename.
> +host_name=$(ocf_local_nodename)
> +fi
>  : ${slave_score=5}
>  : ${master_score=10}
>  
> @@ -142,7 +147,7 @@ ovsdb_server_notify() {
>  fi
>  
>  ocf_log debug "ovndb_server: notified of event $type_op"
> -if [ "x${OCF_RESKEY_CRM_meta_notify_promote_uname}" = "x${host_name}" ]; 
> then
> +if [ "x$(ovsdb_server_last_known_master)" = "x${host_name}" ]; then
>  # Record ourselves so that the agent has a better chance of doing
>  # the right thing at startup
>  ocf_log debug "ovndb_server: $host_name is the master"
> @@ -220,31 +225,20 @@ ovsdb_server_find_active_master() {
>  esac
>  }
>  
> -ovsdb_server_find_active_peers() {
> -# Do we have any peers that are not stopping
> -for peer in ${OCF_RESKEY_CRM_meta_notify_slave_uname}; do
> -found=0
> -for old in ${OCF_RESKEY_CRM_meta_notify_stop_uname}; do
> -if [ $peer = $old ]; then
> -found=1
> -fi
> -done
> -if [ $found = 0 ]; then
> -# Rely on master-max=1
> -# Pacemaker will demote any additional ones it finds before 
> starting new copies
> -echo "$peer"
> -return
> -fi
> -done
> +ovsdb_server_last_known_master()
> +{
> +if [ -z "$MASTER_HOST" ]; then
> +MASTER_HOST="$(${CRM_ATTR_REPL_INFO} --query  -q  2>/dev/null)"
> +fi
> +echo "$MASTER_HOST"
>  }
>  
>  ovsdb_server_master_update() {
> -
>  case $1 in
>  $OCF_SUCCESS)
> -$CRM_MASTER -v ${slave_score};;
> +$CRM_MASTER -N $host_name -v ${slave_score};;
>  $OCF_RUNNING_MASTER)
> -$CRM_MASTER -v ${master_score};;
> +$CRM_MASTER -N $host_name -v ${master_score};;
>  #*) $CRM_MASTER -D;;
>  esac
>  }
> @@ -349,12 +343,17 @@ ovsdb_server_start() {
>  # When the start action is called, it is possible for the
>  # ovsdb-server's to be started as active. This could happen
>  # if the node owns the $MASTER_IP. At this point, pacemaker
> -# has not promoted this node yet. So return OCF_SUCCESS.
> +# has not promoted this node yet. Demote it and check for
> +# status again.
>  # Let pacemaker promote it in subsequent actions.
>  # As per the OCF guidelines, only monitor action should 
> return
>  # OCF_RUNNING_MASTER.
>  # 
> http://www.linux-ha.org/doc/dev-guides/_literal_ocf_running_master_literal_8.html
> -return $OCF_SUCCESS;;
> +${OVN_CTL} demote_ovnnb \
> +--db-nb-sync-from-addr=${INVALID_IP_ADDRESS}
> +${OVN_CTL} demote_ovnsb \
> +--db-sb-sync-from-addr=${INVALID_IP_ADDRESS}
> +;;
>  $OCF_ERR_GENERIC)return $rc;;
>  # Otherwise loop, waiting for the service to start, until
>  # the cluster times the operation out
> @@ -373,7 +372,11 @@ ovsdb_server_stop() {
>  
>  ovsdb_server_check_status ignore_northd
>  case $? in
> -$OCF_NOT_RUNNING)return ${OCF_SUCCESS};;
> +

Re: [ovs-dev] [PATCH v1] ovn: Add sanity check when adding tunnel port

2018-01-08 Thread Ben Pfaff
On Mon, Jan 08, 2018 at 11:36:22AM +, Zhenyu Gao wrote:
> 1. ovs cannot create two ports with same tunnel type and options:remote_ip
> 2. add santity check to detect if two chassises have same encap ip
> 3. add a sset to store tunnel ports' type & encap ip information
> 
> Signed-off-by: Zhenyu Gao 

Thanks for working on improving OVN.

This code leaks memory because it never destroys tunnel_type_ips.

I am unsure whether this handles changes properly.  Deletions happen
only at the end of encaps_run().  Insertions happen earlier.  What would
happen if a pass through encaps_run() was supposed to swap two tunnels
that would otherwise conflict?

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] rhel: Add force-reload-kmod support in ovs-systemd-reload

2018-01-08 Thread Ben Pfaff
On Fri, Dec 22, 2017 at 04:00:50PM +0100, Timothy Redaelli wrote:
> On Fedora / RHEL7 "ovs-ctl force-reload-mod" doesn't work due to systemd
> dependencies so this series adds support for "force-reload-mod" inside the
> rhel-only "ovs-systemd-reload" script.
> 
> Timothy Redaelli (3):
>   utilities: move some functions from ovs-ctl.in to ovs-lib.in
>   rhel: use the functions in ovs-lib.in in ovs-systemd-reload
>   rhel: add "force-reload-kmod" support in "ovs-systemd-reload"
> 
>  ...sr_share_openvswitch_scripts_ovs-systemd-reload |  41 +++--
>  utilities/ovs-ctl.in   | 173 
> -
>  utilities/ovs-lib.in   | 173 
> +
>  3 files changed, 200 insertions(+), 187 deletions(-)

Thanks for the series.  Greg, thanks for the reviews and testing.

I applied this to master.  (I fixed up the white space issues that Greg
pointed out.)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-08 Thread Ben Pfaff
Let's step back and consider the options.  Duplicate flow matches can
happen, either because of bugs at any given level of the code, or
because of user-provided data that can't practically be validated before
it is passed down to the next level.

Consider just the ovn-northd level.  ovn-northd can do string
comparisons to determine whether two flow matches are identical, but
flow matches can be duplicates without being the same string (due to
white space differences, order of clauses, different ways to express the
same condition, implied versus explicit prerequisites, and so on).  You
quickly get into a question of the big-O to determine whether two
Boolean expressions are the same.  Furthermore, ovn-northd doesn't
currently parse flow matches (and it's probably better if it doesn't).

Worse, for correctness, it is not enough to know whether two flow
matches are identical.  Instead, you have to know whether they overlap.
Consider "ip4" versus "ip4 && ip4.src == 1.2.3.4".  These expressions
overlap because they both match ipv4 packets with source address
1.2.3.4; if they have the same priority, then the treatment of the
packet is ambiguous.  Most people would say that, in this case, the more
specific match should "win", but that's not always obvious (what if the
matches were "ip4 && ip4.src == 1.2.3.4" and "ip4 && ip4.dst ==
1.2.3.4"?), OpenFlow says the behavior is unspecified, and OVS doesn't
have predictable behavior in this case.  I believe that determining
whether a group of matches overlap requires superlinear time.

Some of this is a little easier at the ovn-controller level.
ovn-controller converts flow matches into OpenFlow matches, and
duplicates are more likely to be found out at that point.  I say "more
likely" because simple differences like white space, etc. will not
matter.  Some kinds of overlap will be found out too.

So it might be worthwhile to think more about the particular bug, and
determine whether whatever observed bad behavior can be better
suppressed at a lower level.

On Mon, Jan 08, 2018 at 09:43:08AM -0800, Han Zhou wrote:
> If both ACLs have same priority, match, ..., but different actions, it is a
> misconfiguration in NB. What could northd do here besides raising an error
> log?
> 
> Another point, would this type of check increase the difficulty of probably
> future incremental-processing in northd?
> 
> From my point of view, it might be better just keep northd simple, and let
> clients handle the correctness, and let ovn-controller to do the final
> check. In this case, could Neutron maintain multiple sg-rule ids in
> external-ids of the same ACL entry?
> 
> Thanks,
> Han
> 
> On Mon, Jan 8, 2018 at 8:52 AM, Miguel Angel Ajo Pelayo  > wrote:
> 
> > Right!
> >
> > We didn't hit that issue, but it'd make sense to fix in this patch I guess.
> >
> > We could modify the hashing function to not include the action (not sure if
> > it does now..),
> > and also have a separate search function that ignores the action.
> >
> > On Mon, Jan 8, 2018 at 5:39 PM Ben Pfaff  wrote:
> >
> > > I suspect that this doesn't go far enough, because it includes actions
> > > in the hash, so that it would fail to deduplicate two identical ACLs
> > > with different actions (e.g. "drop" versus "allow").
> > >
> > > On Fri, Jan 05, 2018 at 10:43:16AM +, Daniel Alvarez wrote:
> > > > When there are two ACLs in a Logical Switch with same direction,
> > > > priority, match and action fields, ovn-northd will generate the
> > > > exact same logical flow for them into SB database. This will make
> > > > ovn-controller log messages (INFO) saying that the duplicate flow
> > > > is going to be dropped.
> > > >
> > > > This patch avoids adding duplicate lflows into SB database so that
> > > > ovn-controller doesn't have to process them.
> > > >
> > > > Signed-off-by: Daniel Alvarez 
> > > > ---
> > > >
> > > > This patch is needed as part of the consistency work we're doing in the
> > > > OpenStack integration [0]. In our effort to ensure consistency across
> > > > objects in Neutron and OVN databases we find some special cases like
> > > > security group rules which match OVN ACLs but not in 1:1 relationship.
> > > > Until now, two identical security group rules beloning each to a
> > > > different security group would generate a single ACL in NB database.
> > > > With this behavior, there's no way to map the ACL in OVN to the
> > > > corresponding Neutron object.
> > > >
> > > > By implementing [0] we're trying to ensure this mapping so we make use
> > > > of the external_ids column of every table for this purpose. It may
> > happen
> > > > that we'll have two identical ACLs but each referencing a different
> > > > Neutron object in their external_ids field. However, this will make
> > > > ovn-northd to generate two duplicate lflows into SB database which will
> > > > make ovn-controller drop them when installing the actual flows. With
> > this
> > 

Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.

2018-01-08 Thread Kevin Traynor
On 12/19/2017 03:00 PM, Kevin Traynor wrote:
> On 12/19/2017 02:35 PM, Weglicki, MichalX wrote:
>>> -Original Message-
>>> From: Kevin Traynor [mailto:ktray...@redhat.com]
>>> Sent: Tuesday, December 19, 2017 3:07 PM
>>> To: Weglicki, MichalX ; d...@openvswitch.org
>>> Subject: Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.
>>>
>>> On 12/05/2017 02:55 PM, Michal Weglicki wrote:
 - New get_custom_stats interface function is added to netdev. It
   allows particular netdev implementation to expose custom
   counters in dictionary format (counter name/counter value).
 - New statistics are retrieved using experimenter code and
   are printed as a result to ofctl dump-ports.
 - New counters are available for OpenFlow 1.4+.
 - New statistics are printed to output via ofctl only if those
   are present in reply message.
 - New statistics definition is added to include/openflow/intel-ext.h.
 - Custom statistics are implemented only for dpdk-physical
   port type.
 - DPDK-physical implementation uses xstats to collect statistics.
   Only dropped and error counters are exposed.

>>> Hi Michal - why only dropped and error counters? why not just expose
>>> them all. For example, IIUC this would report management dropped packets
>>> but there would not be a stat for management rx/tx successful packets.
>>>
>>> Kevin.
>> Hi Kevin - those counters were of biggest value to us at the point of making 
>> this 
>> patch, sending all counters (where for IXGBE is about 150) will produce 
>> some movement on the network. I think that biggest advantage of this 
>> particular patch is that it introduces a mechanism to expose 
>> any counters, counters list can be extended in the future 
>> if necessary. However I'm not sure if sending all counters 
>> is good idea, as there could be thousands of it in the future - in this 
>> solution, we have some kind of control over the data size. 
>>
> Ok thanks, that makes sense. I would like to suggest that *_management_*
> be added as part of this as I think it's only 2 additional stats and
> I've seen at least one user saying they needed this information.
> 

Hi Michal, this does not look to be in v3, do you think it should be
added as a separate patch? or they should not be reported?

thanks,
Kevin.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Kevin Traynor
On 01/08/2018 05:31 PM, Stokes, Ian wrote:
>>
>> Hi Ian,
>>
>>> -Original Message-
>>> From: Stokes, Ian [mailto:ian.sto...@intel.com]
>>> Sent: Monday, 08 January, 2018 17:09
>>> To: Jan Scheurich ; Kevin Traynor
>>> ; Ilya Maximets 
>>> Cc: d...@openvswitch.org
>>> Subject: RE: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle
>>> count and rebased patches
>>>
 Hi Ian,

 That's why I brought up my original question before Christmas, but
 apparently too late ☹.
>>>
>>> Apologies, Christmas was a bit hectic with the last push for output
>> batching and finishing for the holidays so I missed following up on it.
>>
>> No worries, I understand. Actually Ilya responded and agreed to my
>> proposal for deciding on an order. But we didn't have time to discuss the
>> best order. He had v9 out before Christmas based on dpdk_merge (later
>> merged to master) and I took that for rebase.
>>

 Myself, was hoping that we could get all three changes: PMD
 performance metrics, time-based tx batching and Kevin's enhancement
 for the pmd-rxq- show command into 2.9.

>>>
>>> I think this is the ideal situation, but does require sign off from
>>> Ilya and Kevin as it will be their features it will impact on, I guess
>> they can answer and give an idea on bandwidth to cooperate on something
>> like this.
>>>
 PMD performance metrics are for sure around long enough to warrant
>> that.
 And they are highly wanted too.
>>>
>>> To my mind the detailed PMD logs patchset has a lot of content to work
>>> through and there are reviews in progress as well as re-work requests
>>> from previous reviews of the v4. I was waiting to see these completed
>> before focusing on it myself as I haven't had the bandwidth to date to
>> take it on.
>>
>> All pending review comments have been incorporated in v5 posted on Jan 4th
>> (http://patchwork.ozlabs.org/cover/855572/). That version is rebased to
>> master after merging  the non-tima-based output batching.
> 
> Apologies, Jan, I've spotted the v5 now (it was v4 assigned to myself in 
> patchwork so that was what I had been planning to look at).
> 
>>
>> I am just now preparing a v6 that adds the missing documentation for the
>> new commands to the ovs-vswitchd man page.
> 
> Excellent, is there an ETA on the v6? I think Billy is already reviewing the 
> v5 but I'll confirm, the switch to v6 should be painless if it's mainly 
> documentation.
> 
>>
>>>

 Since all three are heavily affecting same parts of the code the
 order of merging matters a lot. If we want to make this happen in
 reasonable time we need to avoid constant manual re-basing for all of
>> us.

 That’s why I have taken the initiative to serialize them into one
 particular order for merging. That was a painful exercise and I am
 not looking forward to doing it again in a different order.
>>>
>>> Agreed and thanks for taking the initiative. I do think this is
>> important if all three are to make it in.

+1. thanks for this Jan.

>>>

 So I would greatly appreciate if we could agree on the proposed
 order and discuss how we review/test the resulting overall
 contribution with the ambition to get all parts into 2.9.
>>>
>>> I'm open to others input here, based on the points you've raised I
>>> would suggest the following (only a suggestion, please feel free to
>>> counter):
>>>
>>> 1: Output Time batching (it's v9 and is well understood at this point,
>>> I would think would be little re-work needed if the cases suit the PMD
>> balancing already in place).
>>> 2: Detailed PMD logs (from what I understand there is a review in
>>> press from Billy, and re-work requests from Aaron, we could roll these
>> changes into the next rebase on top of the time output batching).
>>
>> As stated above, v5 is already out since Thursday and used as my #1.
> 
> Ok.
> 
>>
>>> 3: Kevin's PMD patches: probably the smallest of the set and lowest risk
>> IMO as you seem familiar already with these Jan?
>>
>> The patch was IMHU unnecessarily big and is now, after my simplifications,
>> rather small.
> 
> Ok, that lowers the risk a little bit more, sounds good to me but will wait 
> for an ACK from Kevin on this before signing off myself after 
> reviewing/testing.
> 

I like the new scheme of not collecting the idle cycles per rxq,
especially with the changes that will be there now for tx-batching/pmd
metrics.  There was some data structure simplification in the original
patches which got dropped, so I'd like to take a look at that area and
see if there's something that I can add. Possibly a couple of minor
issues/changes too, but I need to review more/test.

>>
>>>
>>> Does this seem acceptable?
>>
>> Actually, as I have already prepared the patches in the other order
>>   1. Detailed PMD metrics (Jan)
>>   2. Time-based output batching (Ilya)
>>   3. Refactor cycle counting 

Re: [ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-08 Thread Han Zhou
If both ACLs have same priority, match, ..., but different actions, it is a
misconfiguration in NB. What could northd do here besides raising an error
log?

Another point, would this type of check increase the difficulty of probably
future incremental-processing in northd?

>From my point of view, it might be better just keep northd simple, and let
clients handle the correctness, and let ovn-controller to do the final
check. In this case, could Neutron maintain multiple sg-rule ids in
external-ids of the same ACL entry?

Thanks,
Han

On Mon, Jan 8, 2018 at 8:52 AM, Miguel Angel Ajo Pelayo  wrote:

> Right!
>
> We didn't hit that issue, but it'd make sense to fix in this patch I guess.
>
> We could modify the hashing function to not include the action (not sure if
> it does now..),
> and also have a separate search function that ignores the action.
>
> On Mon, Jan 8, 2018 at 5:39 PM Ben Pfaff  wrote:
>
> > I suspect that this doesn't go far enough, because it includes actions
> > in the hash, so that it would fail to deduplicate two identical ACLs
> > with different actions (e.g. "drop" versus "allow").
> >
> > On Fri, Jan 05, 2018 at 10:43:16AM +, Daniel Alvarez wrote:
> > > When there are two ACLs in a Logical Switch with same direction,
> > > priority, match and action fields, ovn-northd will generate the
> > > exact same logical flow for them into SB database. This will make
> > > ovn-controller log messages (INFO) saying that the duplicate flow
> > > is going to be dropped.
> > >
> > > This patch avoids adding duplicate lflows into SB database so that
> > > ovn-controller doesn't have to process them.
> > >
> > > Signed-off-by: Daniel Alvarez 
> > > ---
> > >
> > > This patch is needed as part of the consistency work we're doing in the
> > > OpenStack integration [0]. In our effort to ensure consistency across
> > > objects in Neutron and OVN databases we find some special cases like
> > > security group rules which match OVN ACLs but not in 1:1 relationship.
> > > Until now, two identical security group rules beloning each to a
> > > different security group would generate a single ACL in NB database.
> > > With this behavior, there's no way to map the ACL in OVN to the
> > > corresponding Neutron object.
> > >
> > > By implementing [0] we're trying to ensure this mapping so we make use
> > > of the external_ids column of every table for this purpose. It may
> happen
> > > that we'll have two identical ACLs but each referencing a different
> > > Neutron object in their external_ids field. However, this will make
> > > ovn-northd to generate two duplicate lflows into SB database which will
> > > make ovn-controller drop them when installing the actual flows. With
> this
> > > patch we'll avoid duplicate flows to be inserted in SB database in such
> > > cases.
> > >
> > > [0]
> > https://docs.openstack.org/networking-ovn/latest/
> contributor/design/database_consistency.html
> > >
> > >  ovn/northd/ovn-northd.c | 11 +++
> > >  tests/ovn-northd.at | 24 
> > >  2 files changed, 35 insertions(+)
> > >
> > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > > index 7e6b1d9..cc64861 100644
> > > --- a/ovn/northd/ovn-northd.c
> > > +++ b/ovn/northd/ovn-northd.c
> > > @@ -428,6 +428,13 @@ struct macam_node {
> > >  struct eth_addr mac_addr; /* Allocated MAC address. */
> > >  };
> > >
> > > +static struct ovn_lflow *ovn_lflow_find(struct hmap *lflows,
> > > +struct ovn_datapath *od,
> > > +enum ovn_stage stage,
> > > +uint16_t priority,
> > > +const char *match,
> > > +const char *actions);
> > > +
> > >  static void
> > >  cleanup_macam(struct hmap *macam)
> > >  {
> > > @@ -2298,6 +2305,10 @@ ovn_lflow_add_at(struct hmap *lflow_map, struct
> > ovn_datapath *od,
> > >   const char *stage_hint, const char *where)
> > >  {
> > >  ovs_assert(ovn_stage_to_datapath_type(stage) ==
> > ovn_datapath_get_type(od));
> > > +
> > > +if (ovn_lflow_find(lflow_map, od, stage, priority, match,
> actions))
> > {
> > > +return;
> > > +}
> > >
> > >  struct ovn_lflow *lflow = xmalloc(sizeof *lflow);
> > >  ovn_lflow_init(lflow, od, stage, priority,
> > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > index 954e259..ba96c81 100644
> > > --- a/tests/ovn-northd.at
> > > +++ b/tests/ovn-northd.at
> > > @@ -152,3 +152,27 @@ ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
> > >  AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
> > >
> > >  AT_CLEANUP
> > > +
> > > +AT_SETUP([ovn -- check that duplicate acls don't generate duplicate
> > lflows])
> > > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > > +ovn_start
> > > +
> > > +ovn-nbctl ls-add S1
> > > +
> > > +# 

Re: [ovs-dev] [RFC PATCH 6/8] dpif-netdev: Refactor cycle counting

2018-01-08 Thread Jan Scheurich
Hi Kevin,

Thanks for the feedback. See below.

> > @@ -3429,16 +3381,17 @@ dp_netdev_process_rxq_port(struct 
> > dp_netdev_pmd_thread *pmd,
> >  struct pmd_perf_stats *s = >perf_stats;
> >  struct dp_packet_batch batch;
> >  int error;
> > -int batch_cnt = 0, output_cnt = 0;
> > +int batch_cnt = 0;
> >
> >  dp_packet_batch_init();
> > -
> > -cycles_count_intermediate(pmd, NULL, 0);
> >  pmd->ctx.last_rxq = rxq;
> > -pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_BUSY;
> > +
> > +/* Measure duration for polling and processing rx burst. */
> > +uint64_t cycles = cycles_counter(pmd);
> 
> hi Jan - not a full review. Consider this function is called from
> dp_netdev_input()...dp_execute_cb(), if the code below is hit, the above
> measurement will make for double counting.
> 
> 
> #ifdef DPDK_NETDEV
> if (OVS_UNLIKELY(!dp_packet_batch_is_empty(>output_pkts)
>  && packets_->packets[0]->source
> != p->output_pkts.packets[0]->source)) {
> /* XXX: netdev-dpdk assumes that all packets in a single
>  *  output batch has the same source. Flush here to
>  *  avoid memory access issues. */
> dp_netdev_pmd_flush_output_on_port(pmd, p);
> }
> #endif
> if (dp_packet_batch_size(>output_pkts)
> + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST) {
> /* Flush here to avoid overflow. */
> dp_netdev_pmd_flush_output_on_port(pmd, p);
> }
> 
> 
> I wouldn't be too concerned about the first one because it's unlikely. I
> would be more concerned about the second one, as it is quite likely that
> multiple rxqs are sending to a single port and the cycles for tx could
> be significant. Take 2 rxq's sending packets to a vhost port, unless we
> got batches of 32 on each rxq there would be double counting for one of
> the queues everytime. There could be more cases like this also, as there
> is flush from a lot of places. I didn't compare, but from memory I don't
> think this would be an issues in Ilya's patches.
> 
> start/intermediate/stop type functions might be better than storing
> cycles locally in functions, because you could stop and start the count
> from any function you need. In this case, you could avoid the double
> counting by something like stop/call
> dp_netdev_pmd_flush_output_on_port()/start. That might start to make the
> code more like Ilya's, so it's probably good to get his review.

You are right, I overlooked the possibility for a tx burst triggered 
immediately by
executing an output action. This needs fixing, and I agree that a scheme which
is able to "suspend" an ongoing measurement at any time will be needed for 
that. Let me have another look at the previous scheme to see if I can simplify
that. If not we can stick to the original solution.

> 
> >  error = netdev_rxq_recv(rxq->rx, );
> >
> >  if (!error) {
> > +/* At least one packet received. */
> >  *recirc_depth_get() = 0;
> >  pmd_thread_ctx_time_update(pmd);
> >  batch_cnt = batch.count;
> > @@ -3448,7 +3401,7 @@ dp_netdev_process_rxq_port(struct 
> > dp_netdev_pmd_thread *pmd,
> >  histogram_add_sample(>pkts_per_batch, batch_cnt);
> >  /* Update the maximum Rx queue fill level. */
> >  uint32_t qfill = batch.qfill;
> > -switch (netdev_dpdk_get_type(netdev_rxq_get_netdev(rx))) {
> > +switch (netdev_dpdk_get_type(netdev_rxq_get_netdev(rxq->rx))) {
> 
> It looks like this is fixing an error from a previous patch

Good spot. 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-08 Thread Miguel Angel Ajo Pelayo
Right!

We didn't hit that issue, but it'd make sense to fix in this patch I guess.

We could modify the hashing function to not include the action (not sure if
it does now..),
and also have a separate search function that ignores the action.

On Mon, Jan 8, 2018 at 5:39 PM Ben Pfaff  wrote:

> I suspect that this doesn't go far enough, because it includes actions
> in the hash, so that it would fail to deduplicate two identical ACLs
> with different actions (e.g. "drop" versus "allow").
>
> On Fri, Jan 05, 2018 at 10:43:16AM +, Daniel Alvarez wrote:
> > When there are two ACLs in a Logical Switch with same direction,
> > priority, match and action fields, ovn-northd will generate the
> > exact same logical flow for them into SB database. This will make
> > ovn-controller log messages (INFO) saying that the duplicate flow
> > is going to be dropped.
> >
> > This patch avoids adding duplicate lflows into SB database so that
> > ovn-controller doesn't have to process them.
> >
> > Signed-off-by: Daniel Alvarez 
> > ---
> >
> > This patch is needed as part of the consistency work we're doing in the
> > OpenStack integration [0]. In our effort to ensure consistency across
> > objects in Neutron and OVN databases we find some special cases like
> > security group rules which match OVN ACLs but not in 1:1 relationship.
> > Until now, two identical security group rules beloning each to a
> > different security group would generate a single ACL in NB database.
> > With this behavior, there's no way to map the ACL in OVN to the
> > corresponding Neutron object.
> >
> > By implementing [0] we're trying to ensure this mapping so we make use
> > of the external_ids column of every table for this purpose. It may happen
> > that we'll have two identical ACLs but each referencing a different
> > Neutron object in their external_ids field. However, this will make
> > ovn-northd to generate two duplicate lflows into SB database which will
> > make ovn-controller drop them when installing the actual flows. With this
> > patch we'll avoid duplicate flows to be inserted in SB database in such
> > cases.
> >
> > [0]
> https://docs.openstack.org/networking-ovn/latest/contributor/design/database_consistency.html
> >
> >  ovn/northd/ovn-northd.c | 11 +++
> >  tests/ovn-northd.at | 24 
> >  2 files changed, 35 insertions(+)
> >
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index 7e6b1d9..cc64861 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -428,6 +428,13 @@ struct macam_node {
> >  struct eth_addr mac_addr; /* Allocated MAC address. */
> >  };
> >
> > +static struct ovn_lflow *ovn_lflow_find(struct hmap *lflows,
> > +struct ovn_datapath *od,
> > +enum ovn_stage stage,
> > +uint16_t priority,
> > +const char *match,
> > +const char *actions);
> > +
> >  static void
> >  cleanup_macam(struct hmap *macam)
> >  {
> > @@ -2298,6 +2305,10 @@ ovn_lflow_add_at(struct hmap *lflow_map, struct
> ovn_datapath *od,
> >   const char *stage_hint, const char *where)
> >  {
> >  ovs_assert(ovn_stage_to_datapath_type(stage) ==
> ovn_datapath_get_type(od));
> > +
> > +if (ovn_lflow_find(lflow_map, od, stage, priority, match, actions))
> {
> > +return;
> > +}
> >
> >  struct ovn_lflow *lflow = xmalloc(sizeof *lflow);
> >  ovn_lflow_init(lflow, od, stage, priority,
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 954e259..ba96c81 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -152,3 +152,27 @@ ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
> >  AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
> >
> >  AT_CLEANUP
> > +
> > +AT_SETUP([ovn -- check that duplicate acls don't generate duplicate
> lflows])
> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > +ovn_start
> > +
> > +ovn-nbctl ls-add S1
> > +
> > +# Insert a duplicate ACL into NB database.
> > +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> > +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl
> @acl
> > +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> > +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl
> @acl
> > +
> > +# Check that there are two entries in ACL table in NB database.
> > +AT_CHECK([ovn-nbctl find ACL match='"tcp.dst == 22"' | \
> > +grep _uuid | wc -l], [0], [2
> > +])
> > +
> > +# Now make sure that only one logical flow is added to SB database.
> > +AT_CHECK([ovn-sbctl find Logical_Flow match='"tcp.dst == 22"' | \
> > +grep _uuid | wc -l], [0], [1
> > +])
> > +
> > +AT_CLEANUP
> > --
> > 1.8.3.1
> >
> > ___
> > dev mailing 

Re: [ovs-dev] [RFC PATCH 6/8] dpif-netdev: Refactor cycle counting

2018-01-08 Thread Kevin Traynor
On 01/04/2018 08:02 PM, Jan Scheurich wrote:
> Simplify the historically grown TSC cycle counting in PMD threads.
> Cycles are currently counted for the following purposes:
> 
> 1. Measure PMD ustilization
> 
> PMD utilization is defined as ratio of cycles spent in busy iterations
> (at least one packet received or sent) over the total number of cycles.
> 
> This is already done in pmd_perf_start_iteration() and
> pmd_perf_end_iteration() based on a TSC timestamp saved in current
> iteration at start_iteration() and the actual TSC at end_iteration().
> No dependency on intermediate cycle accounting.
> 
> 2. Measure the processing load per RX queue
> 
> This comprises cycles spend on polling and processing packets received
> from the rx queue and the cycles spent on delayed sending of these packets
> to tx queues (with time-based batching).
> 
> 3. Measure the cycles spend on processing upcalls
> 
> These are part of the processing cycles of PMD and rxq but are also
> measured separately for the purpose of supervising upcall performance
> and load.
> 
> The previous scheme using cycles_count_start(), cycles_count_intermediate()
> and cycles-count_end() originally introduced to simplify cycle counting
> and saving calls to rte_get_tsc_cycles() was rather obscuring things.
> 
> Replaced this with dedicated pairs of cycles_count() around each task to
> be measured and accounting the difference in cycles as appropriate for
> the task.
> 
> Each call to cycles_count(pmd) will now store the read TSC counter in
> pmd->ctx.last_cycles, so that users with lower accuracy requirements can
> read that value instead of calling cycles_count().
> 
> Signed-off-by: Jan Scheurich 
> ---
>  lib/dpif-netdev.c | 132 
> --
>  1 file changed, 39 insertions(+), 93 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index d16ba93..5d23128 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -522,11 +522,9 @@ struct dp_netdev_pmd_thread_ctx {
>  /* Latest measured time. See 'pmd_thread_ctx_time_update()'. */
>  long long now;
>  /* Used to count cycles. See 'cycles_count_end()' */
> -unsigned long long last_cycles;
> +uint64_t last_cycles;
>  /* RX queue from which last packet was received. */
>  struct dp_netdev_rxq *last_rxq;
> -/* Indicates how should be treated last counted cycles. */
> -enum pmd_cycles_counter_type current_pmd_cycles_type;
>  };
>  
>  /* PMD: Poll modes drivers.  PMD accesses devices via polling to eliminate
> @@ -3246,69 +3244,16 @@ dp_netdev_actions_free(struct dp_netdev_actions 
> *actions)
>  free(actions);
>  }
>  
> -static inline unsigned long long
> -cycles_counter(void)
> +static inline uint64_t
> +cycles_counter(struct dp_netdev_pmd_thread *pmd)
>  {
>  #ifdef DPDK_NETDEV
> -return rte_get_tsc_cycles();
> +return pmd->ctx.last_cycles = rte_get_tsc_cycles();
>  #else
> -return 0;
> +return pmd->ctx.last_cycles = 0;
>  #endif
>  }
>  
> -/* Fake mutex to make sure that the calls to cycles_count_* are balanced */
> -extern struct ovs_mutex cycles_counter_fake_mutex;
> -
> -/* Start counting cycles.  Must be followed by 'cycles_count_end()'.
> - * Counting starts from the idle type state.  */
> -static inline void
> -cycles_count_start(struct dp_netdev_pmd_thread *pmd)
> -OVS_ACQUIRES(_counter_fake_mutex)
> -OVS_NO_THREAD_SAFETY_ANALYSIS
> -{
> -pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE;
> -pmd->ctx.last_cycles = cycles_counter();
> -}
> -
> -/* Stop counting cycles and add them to the counter of the current type. */
> -static inline void
> -cycles_count_end(struct dp_netdev_pmd_thread *pmd)
> -OVS_RELEASES(_counter_fake_mutex)
> -OVS_NO_THREAD_SAFETY_ANALYSIS
> -{
> -unsigned long long interval = cycles_counter() - pmd->ctx.last_cycles;
> -enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type;
> -
> -pmd_perf_update_counter(>perf_stats, type, interval);
> -}
> -
> -/* Calculate the intermediate cycle result and add to the counter of
> - * the current type */
> -static inline void
> -cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
> -  struct dp_netdev_rxq **rxqs, int n_rxqs)
> -OVS_NO_THREAD_SAFETY_ANALYSIS
> -{
> -unsigned long long new_cycles = cycles_counter();
> -unsigned long long interval = new_cycles - pmd->ctx.last_cycles;
> -enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type;
> -int i;
> -
> -pmd->ctx.last_cycles = new_cycles;
> -
> -pmd_perf_update_counter(>perf_stats, type, interval);
> -if (n_rxqs && (type == PMD_CYCLES_POLL_BUSY)) {
> -/* Add to the amount of current processing cycles. */
> -interval /= n_rxqs;
> -for (i = 0; i < n_rxqs; i++) {
> -if (rxqs[i]) {
> -non_atomic_ullong_add([i]->cycles[RXQ_CYCLES_PROC_CURR],
> -   

Re: [ovs-dev] [PATCH V2] Documentation: Update Faucet tutorial.

2018-01-08 Thread Ben Pfaff
On Sat, Jan 06, 2018 at 04:45:17PM +1300, Brad Cowie wrote:
> Updates Faucet tutorial to work with newer versions than 1.6.7:
> 
>  * Tutorial now shows how to check out latest tag from Faucet's git.
> 
>  * Set minimum_ip_size_check flag to False so that the
>payloadless packets generated by ofproto/trace aren't dropped by Faucet.
> 
>  * Update Faucet ACL syntax
> 
>  * Update output from commands/log files to reflect changes in the
>Faucet pipeline and to use OpenFlow 1.3 format.
> 
> Signed-off-by: Brad Cowie 

It's very gracious of you to update the tutorial.  I really didn't
expect that, but I very much appreciate it.  Thanks a lot!  I applied
this to OVS master (so it'll be in 2.9).
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] OVN: remove useless ds_clear() on actions ds

2018-01-08 Thread Ben Pfaff
On Fri, Jan 05, 2018 at 06:52:00PM +0100, Lorenzo Bianconi wrote:
> Remove ds_clear() on actions dynamic string in build_acls()
> since they have just been initialized to DS_EMPTY_INITIALIZER
> 
> Signed-off-by: Lorenzo Bianconi 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-08 Thread Ben Pfaff
I suspect that this doesn't go far enough, because it includes actions
in the hash, so that it would fail to deduplicate two identical ACLs
with different actions (e.g. "drop" versus "allow").

On Fri, Jan 05, 2018 at 10:43:16AM +, Daniel Alvarez wrote:
> When there are two ACLs in a Logical Switch with same direction,
> priority, match and action fields, ovn-northd will generate the
> exact same logical flow for them into SB database. This will make
> ovn-controller log messages (INFO) saying that the duplicate flow
> is going to be dropped.
> 
> This patch avoids adding duplicate lflows into SB database so that
> ovn-controller doesn't have to process them.
> 
> Signed-off-by: Daniel Alvarez 
> ---
> 
> This patch is needed as part of the consistency work we're doing in the
> OpenStack integration [0]. In our effort to ensure consistency across
> objects in Neutron and OVN databases we find some special cases like
> security group rules which match OVN ACLs but not in 1:1 relationship.
> Until now, two identical security group rules beloning each to a
> different security group would generate a single ACL in NB database.
> With this behavior, there's no way to map the ACL in OVN to the
> corresponding Neutron object.
> 
> By implementing [0] we're trying to ensure this mapping so we make use
> of the external_ids column of every table for this purpose. It may happen
> that we'll have two identical ACLs but each referencing a different
> Neutron object in their external_ids field. However, this will make
> ovn-northd to generate two duplicate lflows into SB database which will
> make ovn-controller drop them when installing the actual flows. With this
> patch we'll avoid duplicate flows to be inserted in SB database in such
> cases.
> 
> [0] 
> https://docs.openstack.org/networking-ovn/latest/contributor/design/database_consistency.html
> 
>  ovn/northd/ovn-northd.c | 11 +++
>  tests/ovn-northd.at | 24 
>  2 files changed, 35 insertions(+)
> 
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 7e6b1d9..cc64861 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -428,6 +428,13 @@ struct macam_node {
>  struct eth_addr mac_addr; /* Allocated MAC address. */
>  };
>  
> +static struct ovn_lflow *ovn_lflow_find(struct hmap *lflows,
> +struct ovn_datapath *od,
> +enum ovn_stage stage,
> +uint16_t priority,
> +const char *match,
> +const char *actions);
> +
>  static void
>  cleanup_macam(struct hmap *macam)
>  {
> @@ -2298,6 +2305,10 @@ ovn_lflow_add_at(struct hmap *lflow_map, struct 
> ovn_datapath *od,
>   const char *stage_hint, const char *where)
>  {
>  ovs_assert(ovn_stage_to_datapath_type(stage) == 
> ovn_datapath_get_type(od));
> +   
> +if (ovn_lflow_find(lflow_map, od, stage, priority, match, actions)) {
> +return;
> +}
>  
>  struct ovn_lflow *lflow = xmalloc(sizeof *lflow);
>  ovn_lflow_init(lflow, od, stage, priority,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 954e259..ba96c81 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -152,3 +152,27 @@ ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
>  AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
>  
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- check that duplicate acls don't generate duplicate lflows])
> +AT_SKIP_IF([test $HAVE_PYTHON = no])
> +ovn_start
> +
> +ovn-nbctl ls-add S1
> +
> +# Insert a duplicate ACL into NB database.
> +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
> +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
> +
> +# Check that there are two entries in ACL table in NB database.
> +AT_CHECK([ovn-nbctl find ACL match='"tcp.dst == 22"' | \
> +grep _uuid | wc -l], [0], [2
> +])
> +
> +# Now make sure that only one logical flow is added to SB database.
> +AT_CHECK([ovn-sbctl find Logical_Flow match='"tcp.dst == 22"' | \
> +grep _uuid | wc -l], [0], [1
> +])
> +
> +AT_CLEANUP
> -- 
> 1.8.3.1
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Jan Scheurich
Hi Ian,

> -Original Message-
> From: Stokes, Ian [mailto:ian.sto...@intel.com]
> Sent: Monday, 08 January, 2018 17:09
> To: Jan Scheurich ; Kevin Traynor 
> ; Ilya Maximets 
> Cc: d...@openvswitch.org
> Subject: RE: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and 
> rebased patches
> 
> > Hi Ian,
> >
> > That's why I brought up my original question before Christmas, but
> > apparently too late ☹.
> 
> Apologies, Christmas was a bit hectic with the last push for output batching 
> and finishing for the holidays so I missed following up on it.

No worries, I understand. Actually Ilya responded and agreed to my proposal for 
deciding on an order. But we didn't have time to discuss the best order. He had 
v9 out before Christmas based on dpdk_merge (later merged to master) and I took 
that for rebase.

> >
> > Myself, was hoping that we could get all three changes: PMD performance
> > metrics, time-based tx batching and Kevin's enhancement for the pmd-rxq-
> > show command into 2.9.
> >
> 
> I think this is the ideal situation, but does require sign off from Ilya and 
> Kevin as it will be their features it will impact on, I guess they can
> answer and give an idea on bandwidth to cooperate on something like this.
> 
> > PMD performance metrics are for sure around long enough to warrant that.
> > And they are highly wanted too.
> 
> To my mind the detailed PMD logs patchset has a lot of content to work 
> through and there are reviews in progress as well as re-work
> requests from previous reviews of the v4. I was waiting to see these 
> completed before focusing on it myself as I haven't had the bandwidth
> to date to take it on.

All pending review comments have been incorporated in v5 posted on Jan 4th 
(http://patchwork.ozlabs.org/cover/855572/). That version is rebased to master 
after merging  the non-tima-based output batching.

I am just now preparing a v6 that adds the missing documentation for the new 
commands to the ovs-vswitchd man page.

> 
> >
> > Since all three are heavily affecting same parts of the code the order of
> > merging matters a lot. If we want to make this happen in reasonable time
> > we need to avoid constant manual re-basing for all of us.
> >
> > That’s why I have taken the initiative to serialize them into one
> > particular order for merging. That was a painful exercise and I am not
> > looking forward to doing it again in a different order.
> 
> Agreed and thanks for taking the initiative. I do think this is important if 
> all three are to make it in.
> 
> >
> > So I would greatly appreciate if we could agree on the proposed order and
> > discuss how we review/test the resulting overall contribution with the
> > ambition to get all parts into 2.9.
> 
> I'm open to others input here, based on the points you've raised I would 
> suggest the following (only a suggestion, please feel free to
> counter):
> 
> 1: Output Time batching (it's v9 and is well understood at this point, I 
> would think would be little re-work needed if the cases suit the PMD
> balancing already in place).
> 2: Detailed PMD logs (from what I understand there is a review in press from 
> Billy, and re-work requests from Aaron, we could roll these
> changes into the next rebase on top of the time output batching).

As stated above, v5 is already out since Thursday and used as my #1.

> 3: Kevin's PMD patches: probably the smallest of the set and lowest risk IMO 
> as you seem familiar already with these Jan?

The patch was IMHU unnecessarily big and is now, after my simplifications, 
rather small.

> 
> Does this seem acceptable?

Actually, as I have already prepared the patches in the other order
  1. Detailed PMD metrics (Jan)
  2. Time-based output batching (Ilya)
  3. Refactor cycle counting (Jan)
  4. PMD load in pmd-rxq-show (Kevin/Jan)
I would prefer not to swap 1 and 2. I am afraid it will imply a few extra hours 
to sort out the conflicts again. Time none of us has and which would be better 
spent on reviewing/testing.
 
The end result should anyway be the same. This is what we need to focus on. We 
are just talking about different intermediate steps on the way that will not 
live for more than a few hours (or days).

> It's probably a good topic for the community meeting Wednesday but I'd like 
> to try and get agreement before then if we are to coordinate
> to get these upstreamed.

I hope we can agree on a common approach earlier than the meeting. If mail I 
too slow, I can call for a short dedicated Skype call to discuss and agree.

> 
> Ian
> >
> > Thanks, Jan
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org
> > > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Stokes, Ian
> > > Sent: Monday, 08 January, 2018 16:04
> > > To: Jan Scheurich ; Kevin Traynor
> > > ; Ilya Maximets 
> > > 

Re: [ovs-dev] why so many junk mails in ovs community's dev-maillist? 转发:Response !!!

2018-01-08 Thread Ben Pfaff
I don't know.  Maybe Linux Foundation does not have a good spam filter.

On Mon, Jan 08, 2018 at 09:54:11AM +0800, xiucai wrote:
> Hi,
> 
> 
> In other maillist had never seen junk mails such as python, qemu.
> 
> 
> But in ovs, you know.
> 
> 
> I responsed the mail below because i regarded it as a real donate, that 
> is good to ovs.
> 
> 
> But when 'maria~' responses "introduce *self in full details", you know 
> again,  seems junk mail again.
> 
> 
> embarrassed
> 
> 
> -- 原始邮件 --
> 发件人: "mail2mariaelisabeth";;
> 发送时间: 2018年1月4日(星期四) 晚上10:17
> 收件人: "xiucai";
> 
> 主题: Response !!!
> 
> 
> 
> Thank you for the prompt response, please don't be bothered as to why you 
> have been contacted for this and its genuineness as i have done so from a 
> pure motive, I am Maria Elisabeth Schaeffler, a German citizen, wife of late 
> Georg W Schaeffler,73 years old, you can see here 
> :forbes.com/profile/maria-elisabeth-schaeffler.
> 
> The intention of this email is to be of immense blessing to people, with my 
> age, i can't continue to amass wealth without giving out,i am indulging 
> myself with such act that will be considered a blessing to other, it is my 
> hope that as you receive this money and that it would be of great use to you 
> and also that you can also assist others from it.
> 
> I crave your indulgence to keep this offer discreet as i don't want to make 
> this a public affair, so if you are willing to accept the gift, I will make 
> you a bonafide beneficiary, As i send this message to you, I have also sent 
> to churches, orphanages and charitable organization. now with all this being 
> said, if you are ready to receive your part of €1,700,000{one Million, Seven 
> Hundred Thousand Euro}from my cede of €50Million,kindly introduce yourself in 
> full details,{Full Name, Address, Occupation}thereafter i shall grant you a 
> letter to authorize the money to you from the bank, please my dear ,this is 
> my personal wealth as it doesn't need many formal procedure, and for the sake 
> that it might seem too easy for you to receive this gift don't justify that 
> life is easy, I would crave your indulgence not to refer any person 
> whatsoever to me so that i can have my peace. I am doing this as a 
> free-spirit gift and I made the contact myself to you, therefore, don’t refer 
> any person and don’t make a public/media show of this as i would not like any 
> publicity of any sort.
> 
> I will wait for your email to do further. Stay bless
> 
> Maria-Elisabeth Schaeffler
> SCHAEFFLER AG,
> HERZOGENAURACH
> 
> On Thu, Jan 4, 2018 at 11:33 AM, xiucai  wrote:
> Really?
> 
> 
> -- Original --
> From:  "Maria-Elisabeth";;
> Date:  Wed, Jan 3, 2018 07:23 PM
> To:  "Recipients";
> 
> Subject:  Re: [ovs-dev] Donation!!!
> 
> 
> 
> I am Maria-Elisabeth Schaeffler, a German citizen, wife of late Georg W. 
> Schaeffler, 75 years old. You can see here: 
> en.wikipedia.org/wiki/Maria-Elisabeth_Schaeffler I intend to give to you a 
> portion of my Wealth as a free-will financial donation to you. Respond now to 
> partake.
> 
> Regards,
> Maria-Elisabeth Schaeffler.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.

2018-01-08 Thread Ben Pfaff
On Mon, Jan 08, 2018 at 11:55:26AM +, Weglicki, MichalX wrote:
> Hi Ben, 
> > I don't like the idea implied in the code in a few places that name[] in
> > struct netdev_custom_counter might not be null-terminated.  I think that
> > we should ensure that it is always null terminated.  Otherwise there is
> > a pitfall for carelessly written code.
> To be honest I'm not sure what I could do here, each time statistics are
> Requested from netdev, whole buffer is set to "0", so even it 
> Particular netdev implementation would return string which 
> Is not null-terminated, it will be as all other characters in 
> counter name field, will be "\0". When message is decoded from 
> open flow buffer, proper check is done, so this part of the code is safe. 
> If there is anything else you would like me to do, just let me know. 

Maybe I misinterpreted some code.  I'll take another look.

Thanks for v3!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Stokes, Ian
> Hi Ian,
> 
> That's why I brought up my original question before Christmas, but
> apparently too late ☹.

Apologies, Christmas was a bit hectic with the last push for output batching 
and finishing for the holidays so I missed following up on it.

> 
> Myself, was hoping that we could get all three changes: PMD performance
> metrics, time-based tx batching and Kevin's enhancement for the pmd-rxq-
> show command into 2.9.
> 

I think this is the ideal situation, but does require sign off from Ilya and 
Kevin as it will be their features it will impact on, I guess they can answer 
and give an idea on bandwidth to cooperate on something like this.

> PMD performance metrics are for sure around long enough to warrant that.
> And they are highly wanted too.

To my mind the detailed PMD logs patchset has a lot of content to work through 
and there are reviews in progress as well as re-work requests from previous 
reviews of the v4. I was waiting to see these completed before focusing on it 
myself as I haven't had the bandwidth to date to take it on.

> 
> Since all three are heavily affecting same parts of the code the order of
> merging matters a lot. If we want to make this happen in reasonable time
> we need to avoid constant manual re-basing for all of us.
> 
> That’s why I have taken the initiative to serialize them into one
> particular order for merging. That was a painful exercise and I am not
> looking forward to doing it again in a different order.

Agreed and thanks for taking the initiative. I do think this is important if 
all three are to make it in.

> 
> So I would greatly appreciate if we could agree on the proposed order and
> discuss how we review/test the resulting overall contribution with the
> ambition to get all parts into 2.9.

I'm open to others input here, based on the points you've raised I would 
suggest the following (only a suggestion, please feel free to counter):

1: Output Time batching (it's v9 and is well understood at this point, I would 
think would be little re-work needed if the cases suit the PMD balancing 
already in place).
2: Detailed PMD logs (from what I understand there is a review in press from 
Billy, and re-work requests from Aaron, we could roll these changes into the 
next rebase on top of the time output batching).
3: Kevin's PMD patches: probably the smallest of the set and lowest risk IMO as 
you seem familiar already with these Jan?

Does this seem acceptable?

It's probably a good topic for the community meeting Wednesday but I'd like to 
try and get agreement before then if we are to coordinate to get these 
upstreamed.

Ian
> 
> Thanks, Jan
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org
> > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Stokes, Ian
> > Sent: Monday, 08 January, 2018 16:04
> > To: Jan Scheurich ; Kevin Traynor
> > ; Ilya Maximets 
> > Cc: d...@openvswitch.org
> > Subject: Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle
> > count and rebased patches
> >
> > > Hi guys,
> > >
> > > It would be great to get your feedback on the proposal for combining
> > > (and
> > > simplifying) our patches.
> > >
> > > Do you agree with a) the cycle counting refactoring and b) the
> > > simplification of rxq pmd load calculation?
> > >
> > > For actual merge I would suggest to re-order the changes and take
> > > the cycle counting refactoring before the time-based output
> > > batching. But that is perhaps just a matter of taste affecting some
> > > intermediate commits and not so important for the end-result...
> > >
> > > How should we proceed with including these to the dpdk_merge branch?
> > > - One patch series at a time in the order now laid out in my RFC
> > > series
> > > - One large series comprising all 4 contributions?
> > >
> >
> > Hi Jan,
> >
> > From my side I was hoping to review/test Ilya's v9 time based output
> > batching first with a view to upstreaming that feature for the 2.9
> release.
> >
> > My worry was that by attaching it to other feature changes it may not
> get up streamed.
> >
> > I've only had a cursory look at the Kevin's percentage pad patches.
> >
> > I was thinking of working to the following on dpdk_merge
> >
> > 1: Review/Upstream Ilya's time based output.
> > https://patchwork.ozlabs.org/project/openvswitch/list/?series=19865
> > 2: Review/Upstream Kevin's
> > https://patchwork.ozlabs.org/user/todo/openvswitch/?series=18301
> >
> > Just a question, in your mail below I see 'Detailed PMD performance
> > metrics and supervision' is included in [1], Billy O'Mahony is
> > currently reviewing the latest revision of this from what I'm aware of,
> this a pre-requisite for you before upstreaming Ilya and Kevin's patches?
> >
> > I'd like to hear from Ilya on Kevin on this also? If they are happy to
> > combine the work and review/validate as a group then it may be possible.
> >
> > Thanks
> > Ian
> >
> > > 

Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Jan Scheurich
Hi Ian,

That's why I brought up my original question before Christmas, but apparently 
too late ☹.

Myself, was hoping that we could get all three changes: PMD performance 
metrics, time-based tx batching and Kevin's enhancement for the pmd-rxq-show 
command into 2.9.

PMD performance metrics are for sure around long enough to warrant that. And 
they are highly wanted too.

Since all three are heavily affecting same parts of the code the order of 
merging matters a lot. If we want to make this happen in reasonable time we 
need to avoid constant manual re-basing for all of us.

That’s why I have taken the initiative to serialize them into one particular 
order for merging. That was a painful exercise and I am not looking forward to 
doing it again in a different order.

So I would greatly appreciate if we could agree on the proposed order and 
discuss how we review/test the resulting overall contribution with the ambition 
to get all parts into 2.9.

Thanks, Jan

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Stokes, Ian
> Sent: Monday, 08 January, 2018 16:04
> To: Jan Scheurich ; Kevin Traynor 
> ; Ilya Maximets 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and 
> rebased patches
> 
> > Hi guys,
> >
> > It would be great to get your feedback on the proposal for combining (and
> > simplifying) our patches.
> >
> > Do you agree with a) the cycle counting refactoring and b) the
> > simplification of rxq pmd load calculation?
> >
> > For actual merge I would suggest to re-order the changes and take the
> > cycle counting refactoring before the time-based output batching. But that
> > is perhaps just a matter of taste affecting some intermediate commits and
> > not so important for the end-result...
> >
> > How should we proceed with including these to the dpdk_merge branch?
> > - One patch series at a time in the order now laid out in my RFC series
> > - One large series comprising all 4 contributions?
> >
> 
> Hi Jan,
> 
> From my side I was hoping to review/test Ilya's v9 time based output batching 
> first with a view to upstreaming that feature for the 2.9
> release.
> 
> My worry was that by attaching it to other feature changes it may not get up 
> streamed.
> 
> I've only had a cursory look at the Kevin's percentage pad patches.
> 
> I was thinking of working to the following on dpdk_merge
> 
> 1: Review/Upstream Ilya's time based output. 
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=19865
> 2: Review/Upstream Kevin's 
> https://patchwork.ozlabs.org/user/todo/openvswitch/?series=18301
> 
> Just a question, in your mail below I see 'Detailed PMD performance metrics 
> and supervision' is included in [1], Billy O'Mahony is
> currently reviewing the latest revision of this from what I'm aware of, this 
> a pre-requisite for you before upstreaming Ilya and Kevin's
> patches?
> 
> I'd like to hear from Ilya on Kevin on this also? If they are happy to 
> combine the work and review/validate as a group then it may be
> possible.
> 
> Thanks
> Ian
> 
> > Regards, Jan
> >
> > > -Original Message-
> > > From: jan.scheur...@web.de [mailto:jan.scheur...@web.de] On Behalf Of
> > > Jan Scheurich
> > > Sent: Thursday, 04 January, 2018 21:03
> > > To: d...@openvswitch.org
> > > Cc: Jan Scheurich 
> > > Subject: [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased
> > > patches
> > >
> > > This RFC patch series contains three contributions:
> > >
> > > 1. A rebase of Ilya's series "[PATCH v9,0/5] Output packet batching
> > (Time-based)."
> > > (http://patchwork.ozlabs.org/cover/852003/) on top of "[PATCH v5,0/3]
> > dpif-netdev:
> > > Detailed PMD performance metrics and supervision"
> > (http://patchwork.ozlabs.org/cover/855572/).
> > >
> > > 2. Refactoring and simplification of the PMD cycle counting id dpif-
> > netdev.c.
> > >
> > > 3. A rebase and simplification of Kevin's patches
> > > (http://patchwork.ozlabs.org/patch/847972/)
> > > to display PMD usage per queue in the ovs-appctl pmd-rxq-show command.
> > >
> > > The patches pass check_patch and "make check" and I have done some
> > > basic tests of the simplified rxq cycle counting and the PMD usage
> > reporting in pmd-rxq-show.
> > >
> > > This is my proposal how to combine the three existing patch series
> > > together with the simplified cycle counting the into branch dpdk_merge
> > for release in OSV 2.9.
> > >
> > > Before merging the last two patches should probably be combined.
> > >
> > >
> > > Ilya Maximets (5):
> > >   dpif-netdev: Use microsecond granularity.
> > >   dpif-netdev: Count cycles on per-rxq basis.
> > >   dpif-netdev: Time based output batching.
> > >   docs: Describe output packet batching in DPDK guide.
> > >   NEWS: Mark output packet batching support.
> > >
> > 

Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Stokes, Ian
> Hi guys,
> 
> It would be great to get your feedback on the proposal for combining (and
> simplifying) our patches.
> 
> Do you agree with a) the cycle counting refactoring and b) the
> simplification of rxq pmd load calculation?
> 
> For actual merge I would suggest to re-order the changes and take the
> cycle counting refactoring before the time-based output batching. But that
> is perhaps just a matter of taste affecting some intermediate commits and
> not so important for the end-result...
> 
> How should we proceed with including these to the dpdk_merge branch?
> - One patch series at a time in the order now laid out in my RFC series
> - One large series comprising all 4 contributions?
> 

Hi Jan,

>From my side I was hoping to review/test Ilya's v9 time based output batching 
>first with a view to upstreaming that feature for the 2.9 release.

My worry was that by attaching it to other feature changes it may not get up 
streamed.

I've only had a cursory look at the Kevin's percentage pad patches.

I was thinking of working to the following on dpdk_merge

1: Review/Upstream Ilya's time based output. 
https://patchwork.ozlabs.org/project/openvswitch/list/?series=19865
2: Review/Upstream Kevin's 
https://patchwork.ozlabs.org/user/todo/openvswitch/?series=18301

Just a question, in your mail below I see 'Detailed PMD performance metrics and 
supervision' is included in [1], Billy O'Mahony is currently reviewing the 
latest revision of this from what I'm aware of, this a pre-requisite for you 
before upstreaming Ilya and Kevin's patches?

I'd like to hear from Ilya on Kevin on this also? If they are happy to combine 
the work and review/validate as a group then it may be possible.

Thanks
Ian

> Regards, Jan
> 
> > -Original Message-
> > From: jan.scheur...@web.de [mailto:jan.scheur...@web.de] On Behalf Of
> > Jan Scheurich
> > Sent: Thursday, 04 January, 2018 21:03
> > To: d...@openvswitch.org
> > Cc: Jan Scheurich 
> > Subject: [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased
> > patches
> >
> > This RFC patch series contains three contributions:
> >
> > 1. A rebase of Ilya's series "[PATCH v9,0/5] Output packet batching
> (Time-based)."
> > (http://patchwork.ozlabs.org/cover/852003/) on top of "[PATCH v5,0/3]
> dpif-netdev:
> > Detailed PMD performance metrics and supervision"
> (http://patchwork.ozlabs.org/cover/855572/).
> >
> > 2. Refactoring and simplification of the PMD cycle counting id dpif-
> netdev.c.
> >
> > 3. A rebase and simplification of Kevin's patches
> > (http://patchwork.ozlabs.org/patch/847972/)
> > to display PMD usage per queue in the ovs-appctl pmd-rxq-show command.
> >
> > The patches pass check_patch and "make check" and I have done some
> > basic tests of the simplified rxq cycle counting and the PMD usage
> reporting in pmd-rxq-show.
> >
> > This is my proposal how to combine the three existing patch series
> > together with the simplified cycle counting the into branch dpdk_merge
> for release in OSV 2.9.
> >
> > Before merging the last two patches should probably be combined.
> >
> >
> > Ilya Maximets (5):
> >   dpif-netdev: Use microsecond granularity.
> >   dpif-netdev: Count cycles on per-rxq basis.
> >   dpif-netdev: Time based output batching.
> >   docs: Describe output packet batching in DPDK guide.
> >   NEWS: Mark output packet batching support.
> >
> > Jan Scheurich (2):
> >   dpif-netdev: Refactor cycle counting
> >   dpif-netdev: Add percentage of pmd/core used by each rxq.
> >
> > Kevin Traynor (1):
> >   dpif-netdev: Reset the rxq current cycle counter on reload.
> >
> >  Documentation/howto/dpdk.rst |  12 ++
> >  Documentation/intro/install/dpdk.rst |  58 ++
> >  NEWS |   3 +
> >  lib/dpif-netdev.c| 350 ++--
> ---
> >  tests/pmd.at |  51 +++--
> >  vswitchd/vswitch.xml |  16 ++
> >  6 files changed, 349 insertions(+), 141 deletions(-)
> >
> > --
> > 1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and VS2017

2018-01-08 Thread aserdean
Thanks Ben! Applied on master.

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Friday, January 5, 2018 5:14 PM
> To: aserd...@ovn.org
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and
> VS2017
> 
> OK.
> 
> Acked-by: Ben Pfaff 
> 
> 
> On Fri, Jan 05, 2018 at 03:59:02PM +0200, aserd...@ovn.org wrote:
> > Sure. Do you mind if I add your acked-by on each one?
> >
> > Thanks,
> > Alin.
> >
> > > -Original Message-
> > > From: Ben Pfaff [mailto:b...@ovn.org]
> > > Sent: Thursday, January 4, 2018 11:37 PM
> > > To: Alin Gabriel Serdean 
> > > Cc: d...@openvswitch.org
> > > Subject: Re: [ovs-dev] [PATCH 0/3] Fixes while compiling with 1709 and
> > > VS2017
> > >
> > > On Mon, Nov 06, 2017 at 02:07:15PM -0800, Ben Pfaff wrote:
> > > > On Mon, Nov 06, 2017 at 05:51:54PM +0200, Alin Gabriel Serdean
wrote:
> > > > > This series includes fixes that were found while compiling using
WDK
> > > > > 1709 and Visual Studio 2017.
> > > > >
> > > > > Alin Gabriel Serdean (3):
> > > > >   datapath-windows: Add directory to .gitignore
> > > > >   datapath-windows: Change include type in Iphelper.h
> > > > >   datapath-windows: Add include directory to ovsext project
> > > >
> > > > All of these seem reasonable to me.
> > >
> > > Do you want to apply these?
> >

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 2/2] OVN: Add support for periodic router advertisements.

2018-01-08 Thread Russell Bryant
On Mon, Jan 8, 2018 at 3:24 AM, Miguel Angel Ajo Pelayo
 wrote:
> Awesome!, do you believe it would be possible to have this on the 2.9
> series too?
>
> Having the periodic router advertisements on the next openstack release was
> one of our items
> towards parity with the reference solution in the land of IPv6.

The 2.9 branch has not been created yet, so anything in master now
will be in 2.9.

>
>
>
> On Fri, Jan 5, 2018 at 6:05 PM Ben Pfaff  wrote:
>
>> On Wed, Nov 29, 2017 at 03:59:48PM -0600, Mark Michelson wrote:
>> > This change adds three new options to the Northbound
>> > Logical_Router_Port's ipv6_ra_configs option:
>> >
>> > * send_periodic: If set to "true", then OVN will send periodic router
>> > advertisements out of this router port.
>> > * max_interval: The maximum amount of time to wait between sending
>> > periodic router advertisements.
>> > * min_interval: The minimum amount of time to wait between sending
>> > periodic router advertisements.
>> >
>> > When send_periodic is true, then IPv6 RA configs, as well as some layer
>> > 2 and layer 3 information about the router port, are copied to the
>> > southbound database. From there, ovn-controller can use this information
>> > to know when to send periodic RAs and what to send in them.
>> >
>> > Because periodic RAs originate from each ovn-controller, the new
>> > keep-local flag is set on the packet so that ports don't receive an
>> > overabundance of RAs.
>> >
>> > Signed-off-by: Mark Michelson 
>>
>> Thanks a lot for the revised series.
>>
>> I folded in the following changes and applied this series to master.
>>
>> diff --git a/lib/packets.h b/lib/packets.h
>> index 8819f829970e..395599f08c92 100644
>> --- a/lib/packets.h
>> +++ b/lib/packets.h
>> @@ -1020,7 +1020,12 @@ BUILD_ASSERT_DECL(RA_MSG_LEN == sizeof(struct
>> ovs_ra_msg));
>>   * 6.2.1
>>   */
>>  #define ND_RA_MAX_INTERVAL_DEFAULT 600
>> -#define ND_RA_MIN_INTERVAL_DEFAULT(max) ((max) >= 9 ? (max) / 3 : (max) *
>> 3 / 4)
>> +
>> +static inline int
>> +nd_ra_min_interval_default(int max)
>> +{
>> +return max >= 9 ? max / 3 : max * 3 / 4;
>> +}
>>
>>  /*
>>   * Use the same struct for MLD and MLD2, naming members as the defined
>> fields in
>> @@ -1420,7 +1425,7 @@ void compose_nd_ra(struct dp_packet *,
>> const struct in6_addr *ipv6_dst,
>> uint8_t cur_hop_limit, uint8_t mo_flags,
>> ovs_be16 router_lt, ovs_be32 reachable_time,
>> -   ovs_be32 retrans_timer, ovs_be32 mtu);
>> +   ovs_be32 retrans_timer, uint32_t mtu);
>>  void packet_put_ra_prefix_opt(struct dp_packet *,
>>uint8_t plen, uint8_t la_flags,
>>ovs_be32 valid_lifetime,
>> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
>> index cf414b8f229b..7542db3f4854 100644
>> --- a/ovn/controller/pinctrl.c
>> +++ b/ovn/controller/pinctrl.c
>> @@ -1167,7 +1167,7 @@ ipv6_ra_update_config(const struct
>> sbrec_port_binding *pb)
>>  config->max_interval = smap_get_int(>options,
>> "ipv6_ra_max_interval",
>>  ND_RA_MAX_INTERVAL_DEFAULT);
>>  config->min_interval = smap_get_int(>options,
>> "ipv6_ra_min_interval",
>> -ND_RA_MIN_INTERVAL_DEFAULT(config->max_interval));
>> +nd_ra_min_interval_default(config->max_interval));
>>  config->mtu = smap_get_int(>options, "ipv6_ra_mtu",
>> ND_MTU_DEFAULT);
>>  config->la_flags = ND_PREFIX_ON_LINK;
>>
>> @@ -1194,7 +1194,7 @@ ipv6_ra_update_config(const struct
>> sbrec_port_binding *pb)
>>  }
>>
>>  /* All nodes multicast addresses */
>> -config->eth_dst = ETH_ADDR_C(33,33,00,00,00,01);
>> +config->eth_dst = (struct eth_addr) ETH_ADDR_C(33,33,00,00,00,01);
>>  ipv6_parse("ff02::1", >ipv6_dst);
>>
>>  const char *eth_addr = smap_get(>options, "ipv6_ra_src_eth");
>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> index fc14dc8c38eb..e3ddc1fd9bc1 100644
>> --- a/ovn/northd/ovn-northd.c
>> +++ b/ovn/northd/ovn-northd.c
>> @@ -4486,7 +4486,7 @@ copy_ra_to_sb(struct ovn_port *op, const char
>> *address_mode)
>>  smap_add_format(, "ipv6_ra_max_interval", "%d", max_interval);
>>
>>  int min_interval = smap_get_int(>nbrp->ipv6_ra_configs,
>> -"min_interval", ND_RA_MIN_INTERVAL_DEFAULT(max_interval));
>> +"min_interval", nd_ra_min_interval_default(max_interval));
>>  if (min_interval > ND_RA_MIN_INTERVAL_MAX(max_interval)) {
>>  min_interval = ND_RA_MIN_INTERVAL_MAX(max_interval);
>>  }
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



-- 
Russell Bryant

Re: [ovs-dev] [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches

2018-01-08 Thread Jan Scheurich
Hi guys,

It would be great to get your feedback on the proposal for combining (and 
simplifying) our patches.

Do you agree with a) the cycle counting refactoring and b) the simplification 
of rxq pmd load calculation?

For actual merge I would suggest to re-order the changes and take the cycle 
counting refactoring before the time-based output batching. But that is perhaps 
just a matter of taste affecting some intermediate commits and not so important 
for the end-result...

How should we proceed with including these to the dpdk_merge branch? 
- One patch series at a time in the order now laid out in my RFC series
- One large series comprising all 4 contributions?

Regards, Jan

> -Original Message-
> From: jan.scheur...@web.de [mailto:jan.scheur...@web.de] On Behalf Of Jan 
> Scheurich
> Sent: Thursday, 04 January, 2018 21:03
> To: d...@openvswitch.org
> Cc: Jan Scheurich 
> Subject: [RFC PATCH 0/8] dpif-netdev: Refactor cycle count and rebased patches
> 
> This RFC patch series contains three contributions:
> 
> 1. A rebase of Ilya's series "[PATCH v9,0/5] Output packet batching 
> (Time-based)."
> (http://patchwork.ozlabs.org/cover/852003/) on top of "[PATCH v5,0/3] 
> dpif-netdev:
> Detailed PMD performance metrics and supervision" 
> (http://patchwork.ozlabs.org/cover/855572/).
> 
> 2. Refactoring and simplification of the PMD cycle counting id dpif-netdev.c.
> 
> 3. A rebase and simplification of Kevin's patches 
> (http://patchwork.ozlabs.org/patch/847972/)
> to display PMD usage per queue in the ovs-appctl pmd-rxq-show command.
> 
> The patches pass check_patch and "make check" and I have done some basic 
> tests of the
> simplified rxq cycle counting and the PMD usage reporting in pmd-rxq-show.
> 
> This is my proposal how to combine the three existing patch series together 
> with the
> simplified cycle counting the into branch dpdk_merge for release in OSV 2.9.
> 
> Before merging the last two patches should probably be combined.
> 
> 
> Ilya Maximets (5):
>   dpif-netdev: Use microsecond granularity.
>   dpif-netdev: Count cycles on per-rxq basis.
>   dpif-netdev: Time based output batching.
>   docs: Describe output packet batching in DPDK guide.
>   NEWS: Mark output packet batching support.
> 
> Jan Scheurich (2):
>   dpif-netdev: Refactor cycle counting
>   dpif-netdev: Add percentage of pmd/core used by each rxq.
> 
> Kevin Traynor (1):
>   dpif-netdev: Reset the rxq current cycle counter on reload.
> 
>  Documentation/howto/dpdk.rst |  12 ++
>  Documentation/intro/install/dpdk.rst |  58 ++
>  NEWS |   3 +
>  lib/dpif-netdev.c| 350 
> ++-
>  tests/pmd.at |  51 +++--
>  vswitchd/vswitch.xml |  16 ++
>  6 files changed, 349 insertions(+), 141 deletions(-)
> 
> --
> 1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: fix port addition for ports sharing same PCI id

2018-01-08 Thread Stokes, Ian
Hi Yuanhan,

Thanks for working on this, I've done some testing with ani40e device and a 
review of the current patch, please find comments below.

> On 12/20/2017 04:03 PM, Yuanhan Liu wrote:
> > Some NICs have only one PCI address associated with multiple ports.
> > This patch extends the dpdk-devargs option's format to cater for such
> devices.
> >
> > To achieve that, this patch uses a new syntax that will be adapted and
> > implemented in future DPDK release (likely, v18.05):
> > http://dpdk.org/ml/archives/dev/2017-December/084234.html
> >
> > And since it's the DPDK duty to parse the (complete and full) syntax
> > and this patch is more likely to serve as an intermediate workaround,
> > here I take a simpler and shorter syntax from it (note it's allowed to
> > have only one category being provided):
> > class=eth,mac=00:11:22:33:44:55:66
> >
> > Also, old compatibility is kept. Users can still go on with using the
> > PCI id to add a port (if that's enough for them). Meaning, this patch
> > will not break anything.

I've validated that both the old method/new method both work with i40e devices 
as expected.

> >
> 
> Hi Yuanhan, I think there would need to be some doc updates also for a new
> syntax.

Fully agree, would need updates to the relevant OVS DPDK related docs as well 
as an explanation regarding why there are 2 approaches to adding ports.

> 
> How settled/agreed is the syntax in DPDK now? Ideally it is totally
> settled and we use it for these types of devices.
> 
> But if not...then considering we will continue to keep compatibility with
> older simpler "pci" anyway, maybe it would be safer to add something
> simple now like "pci","mac" or just "mac" and keep compatibility for that
> when the new syntax is finally agreed in DPDK. It may mean some parsing to
> distinguish pci from mac, or vdev from pci,mac though.
> 
> Anyway, agreed syntax in DPDK is better so hopefully that can be done.
> 
> > This patch is basically based on the one from Ciara:
> >
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339496.htm
> > l
> >
> > Cc: Loftus Ciara 
> > Cc: Thomas Monjalon 
> > Cc: Kevin Traynor 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  lib/netdev-dpdk.c | 77
> > ---
> >  1 file changed, 62 insertions(+), 15 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> > 45fcc74..4e5cc25 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -1205,30 +1205,77 @@ netdev_dpdk_lookup_by_port_id(dpdk_port_t
> port_id)
> >  return NULL;
> >  }
> >
> > +static int
> > +netdev_dpdk_str_to_ether(const char *mac, struct ether_addr *ea) {
> > +unsigned int bytes[6];
> > +int i;
> > +
> > +if (sscanf(mac, "%x:%x:%x:%x:%x:%x",
> > +   [0], [1], [2],
> > +   [3], [4], [5]) != 6) {
> > +return -1;

Should an error be logged here? I flag the same question when checking the 
return type for this function later but something to think about. The error log 
could be here or after the return type check but I do think it's useful.

> > +}
> > +
> > +for (i = 0; i < 6; i++) {
> > +ea->addr_bytes[i] = bytes[i];
> > +}
> > +
> > +return 0;
> > +}
> > +
> > +static dpdk_port_t
> > +netdev_dpdk_get_port_by_mac(const char *mac_str) {
> > +int i;
> > +struct ether_addr mac;
> > +struct ether_addr port_mac;
> > +
> > +netdev_dpdk_str_to_ether(mac_str, );

I think there should be an error check for the call to 
netdev_dpdk_str_to_ether() above and an associated error log, in the case where 
sscanf fails within netdev_dpdk_str_to_ether() (i.e. Mac is too long, too short 
etc.) it will help with the logs to zero in on the issue.

> > +for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > +if (!rte_eth_dev_is_valid_port(i)) {
> > +continue;
> > +}
> > +
> > +rte_eth_macaddr_get(i, _mac);
> > +if (is_same_ether_addr(, _mac)) {
> > +return i;
> > +}
> > +}
> > +
> > +return DPDK_ETH_PORT_ID_INVALID;
> > +}
> > +

In general for this function I'd like to see a comment added explaining the 
behavior now that there are 2 methods to add ports.

> >  static dpdk_port_t
> >  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >  const char *devargs, char **errp)  {
> > -/* Get the name up to the first comma. */
> > -char *name = xmemdup0(devargs, strcspn(devargs, ","));
> > +char *name;
> >  dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
> >
> > -if (rte_eth_dev_get_port_by_name(name, _port_id)
> > -|| !rte_eth_dev_is_valid_port(new_port_id)) {
> > -/* Device not found in DPDK, attempt to attach it */
> > -if (!rte_eth_dev_attach(devargs, _port_id)) {
> > -/* Attach successful */
> > -

Re: [ovs-dev] [PATCH V2] OF1.5/EXT-334 OXS/Extensible Flow Entry Statistics Support

2018-01-08 Thread Jan Scheurich
Hi Satyavalli,

Please find my responses below.

Regards, Jan

From: Satyavalli Rama [mailto:satyavalli.r...@tcs.com]
Sent: Friday, 05 January, 2018 12:25
To: Ben Pfaff ; Jan Scheurich 
Cc: SatyaValli ; d...@openvswitch.org; 
manasa.cherukupa...@tcs.com; p.pava...@tcs.com; Harivelam Lavanya 
; muttamsetty.su...@tcs.com
Subject: Re: [ovs-dev] [PATCH V2] OF1.5/EXT-334 OXS/Extensible Flow Entry 
Statistics Support

Hi Jan and Ben,

Please find the inline responses.


-Ben Pfaff > wrote: -
To: Jan Scheurich 
>
From: Ben Pfaff >
Date: 01/05/2018 02:35AM
Cc: SatyaValli >, 
"d...@openvswitch.org" 
>, Manasa Cherukupally 
>, Pavani 
Panthagada >, Lavanya Harivelam 
>, Surya 
Muttamsetty >, 
SatyaValli >
Subject: Re: [ovs-dev] [PATCH V2] OF1.5/EXT-334 OXS/Extensible Flow Entry 
Statistics Support
On Wed, Jan 03, 2018 at 04:24:06PM +, Jan Scheurich wrote:
> > > >
> > > > This Patch provides implementation Existing flow entry statistics are
> > > > redefined as standard OXS(OpenFlow Extensible Statistics) fields for
> > > > displaying the arbitrary flow stats.The existing Flow Stats were renamed
> > > > as Flow Description.
> > > >
> > > > To support this implementation below messages are newly added
> > > >
> > > > OFPRAW_OFPT15_FLOW_REMOVED,
> > > > OFPRAW_OFPST15_FLOW_REQUEST,
> > > > OFPRAW_OFPST15_FLOW_DESC_REQUEST,
> > > > OFPRAW_OFPST15_AGGREGATE_REQUEST,
> > > > OFPRAW_OFPST15_FLOW_REPLY,
> > > > OFPRAW_OFPST15_FLOW_DESC_REPLY,
> > > > OFPRAW_OFPST15_AGGREGATE_REPLY,
> > > >
> > > > The current commit adds support for the new feature in flow statistics
> > > > multipart messages,aggregate multipart messages and OXS support for flow
> > > > removal message, individual flow description messages.
> > > >
> > > > "ovs-ofctl dump-flows" needs to be provided with the arbitrary OXS 
> > > > fields
> > > > for displaying the desired flow stats.
> > > >
> > > > Below are Commands to display OXS stats field wise
> > > >
> > > > Flow Statistics Multipart
> > > > ovs-ofctl dump-flows -O OpenFlow15  idle_time
> > > > ovs-ofctl dump-flows -O OpenFlow15  packet_count
> > > > ovs-ofctl dump-flows -O OpenFlow15  byte_count
> > >
> > > This would break backward compatibility for one of the most frequently 
> > > used OVS CLI commands. Why don't you introduce a new
> > command such as "ovs-ofctl dump-flow-stats" for the new OXS stats?
> >
> > I think you might be misinterpreting the meaning here.  It doesn't
> > appear to break compatibility, at least not in a major way, since it
> > doesn't do a lot of updates to the tests that would otherwise be
> > required.
>
> Perhaps I am missing the point of some of these changes. I understand that 
> OVS needs to support the new extensible OXS flow stats syntax in OpenFlow 1.5 
> and the differentiated MP request/reply pairs OFPMP_FLOW_DESC (replacing the 
> former OFPMP_FLOW) and OFPMP_FLOW_STATS (just fetching flow stats per flow 
> w/o the rest of the flow data).
>
> But I don't understand why this should have any impact on the existing CLI 
> command "ovs-ofctl dump-flows" and its output. This tool expressly fetches 
> and displays the complete flow dump from OVS, including match, 
> instructions/actions and statistics. When using OF 1.5 it should 
> transparently apply OFPMP_FLOW_DESC MP request/reply to fetch the data, up to 
> OF 1.4 it should use the original OFPMP_FLOW.
>
> I can't see any ovs-ofctl use case that would justify the use of the new 
> OFPMP_FLOW_STATS request/reply. The removed data in the reply compared to the 
> full flow description are mainly the instructions, the full match is still 
> there to identify each flow. So cutting down the transferred data volume can 
> hardly be the reason (Note, this may still be different for real OF 1.5 
> controllers).
>
> If you believe we should have an ovs-ofctl command anyhow, e.g. for testing 
> purposes, I suggest to introduce a new command or add an option to dump-flows 
> to force use of this particular MP message. The output would be limited to 
> flow match and stats in that case.
>

As per our understanding and from previous review comments we treated OF1.5+ 
has two different ways to request and get replies for Flow Stats: FLOW_DESC and 
FLOW_STATS (which will be even used for Flow Stats Trigger). And we've 
supported this with the help of two commands


Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.

2018-01-08 Thread Weglicki, MichalX
Hi Ben, 

I've just sent V3 of the patch, but please find my comments inline. 

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Wednesday, December 20, 2017 12:42 AM
> To: Weglicki, MichalX 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2] netdev: Custom statistics.
> 
> On Tue, Dec 05, 2017 at 02:55:20PM +, Michal Weglicki wrote:
> > - New get_custom_stats interface function is added to netdev. It
> >   allows particular netdev implementation to expose custom
> >   counters in dictionary format (counter name/counter value).
> > - New statistics are retrieved using experimenter code and
> >   are printed as a result to ofctl dump-ports.
> > - New counters are available for OpenFlow 1.4+.
> > - New statistics are printed to output via ofctl only if those
> >   are present in reply message.
> > - New statistics definition is added to include/openflow/intel-ext.h.
> > - Custom statistics are implemented only for dpdk-physical
> >   port type.
> > - DPDK-physical implementation uses xstats to collect statistics.
> >   Only dropped and error counters are exposed.
> >
> > v1->v2:
> > - Buffer overrun check in parse_intel_port_custom_property.
> > - ofputil_append_ofp14_port_stats uses "postappend" instead
> >   of "reserve" during message creation.
> > - NEWS update.
> > - DPDK documentation update.
> > - Compilation and sparse warnings corrections.
> >
> > Signed-off-by: Michal Weglicki 
> 
> Thank you for the updated patch.
> 
> I still see some "sparse" warnings (there's an ovs-dev thread about
> trouble with "sparse"--maybe you should join it):
> 
> ../lib/ofp-util.c:8105:27: warning: incorrect type in assignment 
> (different base types)
> ../lib/ofp-util.c:8105:27:expected unsigned long long [unsigned] 
> [usertype] counter_value
> ../lib/ofp-util.c:8105:27:got restricted ovs_be64
> ../lib/ofp-util.c:8333:54: warning: incorrect type in argument 1 
> (different base types)
> ../lib/ofp-util.c:8333:54:expected restricted ovs_be64 [usertype] 
> 
> ../lib/ofp-util.c:8333:54:got unsigned long long [unsigned] 
> [addressable] [usertype] counter_value
> 
It is corrected. 

> I don't think that the new ofproto_class function port_get_custom_stats
> is needed.  The generic ofproto code already knows that every port is
> associated with a netdev.  I think that append_port_stats() in ofproto.c
> can just call netdev_get_custom_stats() directly rather than through
> this additional level of indirection.
It is corrected. 

> 
> I don't like the idea implied in the code in a few places that name[] in
> struct netdev_custom_counter might not be null-terminated.  I think that
> we should ensure that it is always null terminated.  Otherwise there is
> a pitfall for carelessly written code.
To be honest I'm not sure what I could do here, each time statistics are
Requested from netdev, whole buffer is set to "0", so even it 
Particular netdev implementation would return string which 
Is not null-terminated, it will be as all other characters in 
counter name field, will be "\0". When message is decoded from 
open flow buffer, proper check is done, so this part of the code is safe. 
If there is anything else you would like me to do, just let me know. 

> 
> I would like to see some tests.  The most obvious need is a new test for
> ofp-print.at that exercises parsing and printing a stats message that
> includes some of these custom stats.
I've added custom counters to current test case. 

> 
> I have a few other minor suggestions.  I'm appending them as an
> incremental patch.  Many of these are minor style points.  I think that
> most of them will be self-explanatory.  Please let me know if they make
> sense.
[MW] Those changes are applied. 

> 
> I'll look forward to v3.  (After I'm happy with this, I'll probably ask
> Ian to review it to be added to his dpdk merge branch.)
> 
> --8<--cut here-->8--
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 07be4a21cd2c..3ccbec11f592 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1205,7 +1205,6 @@ netdev_dpdk_configure_xstats(struct netdev_dpdk *dev)
>  int xstats_no;
>  const char *name;
> 
> -
>  /* Retrieving all XSTATS names. If something will go wrong
>   * or amount of counters will be equal 0, rte_xstats_names
>   * buffer will be marked as NULL, and any further xstats
> @@ -1216,7 +1215,6 @@ netdev_dpdk_configure_xstats(struct netdev_dpdk *dev)
>  rte_xstats = NULL;
> 
>  if (dev->rte_xstats_names == NULL || dev->rte_xstats_ids == NULL) {
> -
>  dev->rte_xstats_names_size =
>  rte_eth_xstats_get_names(dev->port_id, NULL, 0);
> 
> @@ -1239,8 +1237,7 @@ netdev_dpdk_configure_xstats(struct netdev_dpdk *dev)
>  VLOG_WARN("Cannot get XSTATS names for port: %"PRIu8,
> 

[ovs-dev] [PATCH v3] netdev: Custom statistics.

2018-01-08 Thread Michal Weglicki
- New get_custom_stats interface function is added to netdev. It
  allows particular netdev implementation to expose custom
  counters in dictionary format (counter name/counter value).
- New statistics are retrieved using experimenter code and
  are printed as a result to ofctl dump-ports.
- New counters are available for OpenFlow 1.4+.
- New statistics are printed to output via ofctl only if those
  are present in reply message.
- New statistics definition is added to include/openflow/intel-ext.h.
- Custom statistics are implemented only for dpdk-physical
  port type.
- DPDK-physical implementation uses xstats to collect statistics.
  Only dropped and error counters are exposed.

v1->v2:
- Buffer overrun check in parse_intel_port_custom_property.
- ofputil_append_ofp14_port_stats uses "postappend" instead
  of "reserve" during message creation.
- NEWS update.
- DPDK documentation update.
- Compilation and sparse warnings corrections.
v2->v3
- Netdev statistics and custom statistics are inserted into
  ovsdb at the same time (in single transaction).
- ofproto_class function has been removed, direct call is
  used instead.
- Current "dump ports" test case has been adjusted to check
  also custom counters.
- Some other minor corrections.

Co-authored-by: Ben Pfaff 
Signed-off-by: Ben Pfaff 
Signed-off-by: Michal Weglicki 
---
 Documentation/howto/dpdk.rst   |  15 ++--
 NEWS   |   5 ++
 include/openflow/intel-ext.h   |  28 ++
 include/openvswitch/netdev.h   |  17 
 include/openvswitch/ofp-util.h |   1 +
 lib/netdev-dpdk.c  | 192 -
 lib/netdev-dummy.c |  36 
 lib/netdev-linux.c |   1 +
 lib/netdev-provider.h  |  13 +++
 lib/netdev-vport.c |   1 +
 lib/netdev.c   |  27 ++
 lib/netdev.h   |   2 +
 lib/ofp-print.c|  18 
 lib/ofp-util.c | 106 +--
 lib/util.c |  13 +++
 lib/util.h |   2 +
 ofproto/ofproto.c  |   3 +
 tests/ofproto.at   |   6 +-
 vswitchd/bridge.c  |  36 ++--
 19 files changed, 502 insertions(+), 20 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 2393c2f..4061e40 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -311,12 +311,16 @@ performance of non-tunnel traffic, specifically for 
smaller size packet.
 
 .. _extended-statistics:
 
-Extended Statistics

+Extended & Custom Statistics
+
 
 DPDK Extended Statistics API allows PMD to expose unique set of statistics.
 The Extended statistics are implemented and supported only for DPDK physical
-and vHost ports.
+and vHost ports. Custom statistics are dynamic set of counters which can
+vary depenend on a driver. Those statistics are implemented
+for DPDK physical ports and contain all "dropped" and "error"
+counters from XSTATS. XSTATS counters list can be found here:
+`__.
 
 To enable statistics, you have to enable OpenFlow 1.4 support for OVS.
 Configure bridge br0 to support OpenFlow version 1.4::
@@ -333,8 +337,9 @@ Query the port statistics by explicitly specifying -O 
OpenFlow14 option::
 
 $ ovs-ofctl -O OpenFlow14 dump-ports br0
 
-Note: vHost ports supports only partial statistics. RX packet size based
-counter are only supported and doesn't include TX packet size counters.
+Note about "Extended Statistics": vHost ports supports only partial
+statistics. RX packet size based counter are only supported and
+doesn't include TX packet size counters.
 
 .. _port-hotplug:
 
diff --git a/NEWS b/NEWS
index a7f2def..28b9d5a 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,11 @@ Post-v2.8.0
  * Add support for vHost IOMMU
  * New debug appctl command 'netdev-dpdk/get-mempool-info'.
  * All the netdev-dpdk appctl commands described in ovs-vswitchd man page.
+ * Custom statistics:
+- DPDK physical ports now return custom set of "dropped" and "error"
+  statistics.
+- ovs-ofctl dump-ports command now prints new of set custom statistics
+  if available (for OpenFlow 1.4+).
- vswitchd:
  * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
 
diff --git a/include/openflow/intel-ext.h b/include/openflow/intel-ext.h
index 974e63e..3d73171 100644
--- a/include/openflow/intel-ext.h
+++ b/include/openflow/intel-ext.h
@@ -27,9 +27,11 @@
 
 enum intel_port_stats_subtype {
 INTEL_PORT_STATS_RFC2819 = 1,
+INTEL_PORT_STATS_CUSTOM
 };
 
 #define INTEL_PORT_STATS_RFC2819_SIZE 184
+#define INTEL_PORT_STATS_CUSTOM_SIZE 16
 
 /* Struct implements custom property type based on
  * 'ofp_prop_experimenter'. */
@@ -70,4 +72,30 @@ struct 

Re: [ovs-dev] [PATCH v2 3/3] xlate: call tnl_neigh_snoop() from terminate_native_tunnel()

2018-01-08 Thread Zoltán Balogh
Hi Ben,

> I don't understand yet the motivation here.  What does it mean "to keep
> ARP neighbor cache clean"?  Is this a bug fix, a performance fix, a
> cleanup, or something else?
> 
> What is the goal of the filtering you mention?  Is that a second change,
> that can and should be a separate patch, or is it closely coupled to the
> first change you mention?

Jan gave a very good summary in his mail:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/342662.html

> This gives me the following Clang errors:
> 
> ../ofproto/ofproto-dpif-xlate.c:3749:13: error: cast from 'const uint8_t 
> *' (aka 'const unsigned char *') to
> 'const struct in6_addr *' increases required alignment from 1 to 4 
> [-Werror,-Wcast-align]
> /usr/include/netinet/in.h:454:36: note: expanded from macro 
> 'IN6_ARE_ADDR_EQUAL'
> ../ofproto/ofproto-dpif-xlate.c:3749:13: error: cast from 'const uint8_t 
> *' (aka 'const unsigned char *') to
> 'const struct in6_addr *' increases required alignment from 1 to 4 
> [-Werror,-Wcast-align]
> /usr/include/netinet/in.h:455:36: note: expanded from macro 
> 'IN6_ARE_ADDR_EQUAL'

Thank you for the review, I'm going to fix this.

> The change to tnl-neigh-cache.c that just adds an #include directive
> seems odd to me.

Sorry, this is a leftover from a previous version. There is no need for the 
include.

> 
> The treatment of the new xbridge_addr is different from the treatment of
> everything already handled in xlate_ofproto_set().  Why does it need
> special treatment?  Is it RCU safe?  Why does it need a refcount (none
> of the other structs do)?
> 

The xbridge_addr holds the current IP addresses assigned to the bridge. These
can change by time. Xbridge_addr is passed to xlate_xbridge_set(), where it is
set as xbridge->addr. Most of the xbridge members are handled the same way
(counting references etc.) in xlate_xbridge_set().
Maybe it's disturbing to retrieve xbridge_addr in xlate_ofproto_set() and then
pass it to xlate_bridge_set().
Do you think, would it be better to store xbridge_addr in ofproto and update it
when an IP address is added or removed, then pass an ofproto member to
xlate_ofproto_set()?

Best regards,
Zoltan

> Thanks,
> 
> Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v1] ovn: Add sanity check when adding tunnel port

2018-01-08 Thread Zhenyu Gao
1. ovs cannot create two ports with same tunnel type and options:remote_ip
2. add santity check to detect if two chassises have same encap ip
3. add a sset to store tunnel ports' type & encap ip information

Signed-off-by: Zhenyu Gao 
---
 ovn/controller/encaps.c | 46 ++
 1 file changed, 46 insertions(+)

diff --git a/ovn/controller/encaps.c b/ovn/controller/encaps.c
index f187a8f..06bffe8 100644
--- a/ovn/controller/encaps.c
+++ b/ovn/controller/encaps.c
@@ -50,6 +50,10 @@ struct tunnel_ctx {
  * adding a new tunnel. */
 struct sset port_names;
 
+/* IP,type of all tunnel ports in bridge, to allow checking uniqueness
+ * when adding a new tunnel. */
+struct sset tunnel_type_ips;
+
 struct ovsdb_idl_txn *ovs_txn;
 const struct ovsrec_bridge *br_int;
 };
@@ -78,6 +82,19 @@ tunnel_create_name(struct tunnel_ctx *tc, const char 
*chassis_id)
 return NULL;
 }
 
+static char*
+create_tunnel_type_ip_key(struct tunnel_ctx *tc, const char *type,
+  const char *ip)
+{
+char *tunnel_type_ip = xasprintf("%s-%s", type, ip);
+if (!sset_contains(>tunnel_type_ips, tunnel_type_ip)) {
+return tunnel_type_ip;
+}
+
+free(tunnel_type_ip);
+return NULL;
+}
+
 static void
 tunnel_add(struct tunnel_ctx *tc, const char *new_chassis_id,
const struct sbrec_encap *encap)
@@ -116,6 +133,15 @@ tunnel_add(struct tunnel_ctx *tc, const char 
*new_chassis_id,
 goto exit;
 }
 
+char *tunnel_type_ip = create_tunnel_type_ip_key(tc, encap->type,
+ encap->ip);
+if (!tunnel_type_ip) {
+VLOG_WARN("Unable to allocate unique remote_ip for '%s' %s tunnel",
+  encap->type, new_chassis_id);
+free(port_name);
+goto exit;
+}
+
 struct ovsrec_interface *iface = ovsrec_interface_insert(tc->ovs_txn);
 ovsrec_interface_set_name(iface, port_name);
 ovsrec_interface_set_type(iface, encap->type);
@@ -130,6 +156,7 @@ tunnel_add(struct tunnel_ctx *tc, const char 
*new_chassis_id,
 ovsrec_bridge_update_ports_addvalue(tc->br_int, port);
 
 sset_add_and_free(>port_names, port_name);
+sset_add_and_free(>tunnel_type_ips, tunnel_type_ip);
 
 exit:
 smap_destroy();
@@ -166,6 +193,7 @@ encaps_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 struct tunnel_ctx tc = {
 .chassis = SHASH_INITIALIZER(),
 .port_names = SSET_INITIALIZER(_names),
+.tunnel_type_ips = SSET_INITIALIZER(_type_ips),
 .br_int = br_int
 };
 
@@ -194,6 +222,24 @@ encaps_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
  * to delete this one. */
 ovsrec_bridge_update_ports_delvalue(br, port);
 }
+for (size_t i = 0; i< port->n_interfaces; i++) {
+const struct ovsrec_interface *iface = port->interfaces[i];
+if (strcmp(iface->type, "geneve")
+&& strcmp(iface->type, "stt")
+&& strcmp(iface->type, "vxlan")) {
+continue;
+}
+const char *if_remote_ip = smap_get(>options,
+"remote_ip");
+char *tunnel_type_ip;
+tunnel_type_ip = create_tunnel_type_ip_key(,
+   iface->type,
+   if_remote_ip);
+if (!tunnel_type_ip) {
+continue;
+}
+sset_add_and_free(_type_ips, tunnel_type_ip);
+}
 }
 }
 }
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] ovs-dpdk poor scalable capacity with PMDS for dpflow limit ?

2018-01-08 Thread wenxu
Hi,


I just test ovs-dpdk with much more pmds to improve the performace.  But in 
some case the performance more poor with more PMDS


Like follow flows:
in_port=1,nw_dst=10.0.0.x, actions=output:2


Assuming there are 1 differnet flows which is active by packet.


if there are 12 PMDS.  In the worst situation, each pmd has the 1 dpflows. 
so all the dpflows is 12
if there are 24 PMDS.  so all the dpflows should be 24.   the max dpflows 
is 20, it leads dpflows up and down


So the performance of 24 PMDS is worst than 12PMDS


I think in the worst situation, the dpflows = concurrent flows * PMDS
The more PMDS the less concurrent flows


Is there any suggest to modify the following flow_limit strategy?


if (duration > 2000) {
flow_limit /= duration / 1000;
} else if (duration > 1300) {
flow_limit = flow_limit * 3 / 4;
} else if (duration < 1000 && n_flows > 2000
   && flow_limit < n_flows * 1000 / duration) {
flow_limit += 1000;
}
flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));


atomic_store_relaxed(>flow_limit, flow_limit);




BR
wenxu






___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] master now "frozen" for forking

2018-01-08 Thread Miguel Angel Ajo Pelayo
It actually makes a lot of sense.

Please ignore my other comment on master about having the IPv6 RA patch on
2.9,
doing it this way, we effectively have it already.


On Fri, Jan 5, 2018 at 5:51 PM Ben Pfaff  wrote:

> On Fri, Jan 05, 2018 at 03:58:08PM +, Stokes, Ian wrote:
> > > Hello everyone.  We are at the point in the release cycle where
> > > traditionally we would fork a branch from master for release.  We have
> > > tried a slightly different approach a few times and I'd like to propose
> > > that we do it again.  Instead of forking immediately, I propose that we
> > > "freeze" master so that only bug fixes and previously posted feature
> > > enhancements go in, until approximately January 15, and then fork
> > > branch-2.9 from master.  Unless anyone has a serious objection, let's
> do
> > > this.
> >
> > Hi Ben,
> >
> > Am I right in think that cut off for feature enhancements previously
> > discussed on the ML is now January 15th if the above proposal goes
> > ahead?
> >
> > I guess there was an assumption on the OVS DPDK side that we would
> > branch on January 1st with bug fixes and previously discussed feature
> > enhancements possible until end of January for the 2.9 release,
> > official release of 2.9 then being February 15th?
>
> That's still roughly correct.  The main difference here is that, until
> Jan. 15 or so, we're applying the rules that will be used for branch-2.9
> to master.  That's for the practical reason that, otherwise, almost
> every patch that goes to master would also go to branch-2.9.  With these
> rules, instead, patches that would only go to master will be deferred
> until after the branch.
>
> The other difference is qualitative: we want to get changes in those
> categories in as early as we can, so please don't delay until the end of
> the month if you can avoid it.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-northd: Avoid duplicate logical flows in SB db

2018-01-08 Thread Miguel Angel Ajo Pelayo
Acked-By: Miguel Angel Ajo 


On Fri, Jan 5, 2018 at 11:44 AM Daniel Alvarez  wrote:

> When there are two ACLs in a Logical Switch with same direction,
> priority, match and action fields, ovn-northd will generate the
> exact same logical flow for them into SB database. This will make
> ovn-controller log messages (INFO) saying that the duplicate flow
> is going to be dropped.
>
> This patch avoids adding duplicate lflows into SB database so that
> ovn-controller doesn't have to process them.
>
> Signed-off-by: Daniel Alvarez 
> ---
>
> This patch is needed as part of the consistency work we're doing in the
> OpenStack integration [0]. In our effort to ensure consistency across
> objects in Neutron and OVN databases we find some special cases like
> security group rules which match OVN ACLs but not in 1:1 relationship.
> Until now, two identical security group rules beloning each to a
> different security group would generate a single ACL in NB database.
> With this behavior, there's no way to map the ACL in OVN to the
> corresponding Neutron object.
>
> By implementing [0] we're trying to ensure this mapping so we make use
> of the external_ids column of every table for this purpose. It may happen
> that we'll have two identical ACLs but each referencing a different
> Neutron object in their external_ids field. However, this will make
> ovn-northd to generate two duplicate lflows into SB database which will
> make ovn-controller drop them when installing the actual flows. With this
> patch we'll avoid duplicate flows to be inserted in SB database in such
> cases.
>
> [0]
> https://docs.openstack.org/networking-ovn/latest/contributor/design/database_consistency.html
>
>  ovn/northd/ovn-northd.c | 11 +++
>  tests/ovn-northd.at | 24 
>  2 files changed, 35 insertions(+)
>
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 7e6b1d9..cc64861 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -428,6 +428,13 @@ struct macam_node {
>  struct eth_addr mac_addr; /* Allocated MAC address. */
>  };
>
> +static struct ovn_lflow *ovn_lflow_find(struct hmap *lflows,
> +struct ovn_datapath *od,
> +enum ovn_stage stage,
> +uint16_t priority,
> +const char *match,
> +const char *actions);
> +
>  static void
>  cleanup_macam(struct hmap *macam)
>  {
> @@ -2298,6 +2305,10 @@ ovn_lflow_add_at(struct hmap *lflow_map, struct
> ovn_datapath *od,
>   const char *stage_hint, const char *where)
>  {
>  ovs_assert(ovn_stage_to_datapath_type(stage) ==
> ovn_datapath_get_type(od));
> +
> +if (ovn_lflow_find(lflow_map, od, stage, priority, match, actions)) {
> +return;
> +}
>
>  struct ovn_lflow *lflow = xmalloc(sizeof *lflow);
>  ovn_lflow_init(lflow, od, stage, priority,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 954e259..ba96c81 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -152,3 +152,27 @@ ovn-nbctl lsp-set-options S1-R1 router-port=R1-S1
>  AT_CHECK([test x`ovn-nbctl lsp-get-up S1-R1` = xup])
>
>  AT_CLEANUP
> +
> +AT_SETUP([ovn -- check that duplicate acls don't generate duplicate
> lflows])
> +AT_SKIP_IF([test $HAVE_PYTHON = no])
> +ovn_start
> +
> +ovn-nbctl ls-add S1
> +
> +# Insert a duplicate ACL into NB database.
> +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
> +ovn-nbctl -- --id=@acl create acl direction=to-lport priority=1000 \
> +match='"tcp.dst == 22"' action=drop -- add logical_switch S1 acl @acl
> +
> +# Check that there are two entries in ACL table in NB database.
> +AT_CHECK([ovn-nbctl find ACL match='"tcp.dst == 22"' | \
> +grep _uuid | wc -l], [0], [2
> +])
> +
> +# Now make sure that only one logical flow is added to SB database.
> +AT_CHECK([ovn-sbctl find Logical_Flow match='"tcp.dst == 22"' | \
> +grep _uuid | wc -l], [0], [1
> +])
> +
> +AT_CLEANUP
> --
> 1.8.3.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 2/2] OVN: Add support for periodic router advertisements.

2018-01-08 Thread Miguel Angel Ajo Pelayo
Awesome!, do you believe it would be possible to have this on the 2.9
series too?

Having the periodic router advertisements on the next openstack release was
one of our items
towards parity with the reference solution in the land of IPv6.



On Fri, Jan 5, 2018 at 6:05 PM Ben Pfaff  wrote:

> On Wed, Nov 29, 2017 at 03:59:48PM -0600, Mark Michelson wrote:
> > This change adds three new options to the Northbound
> > Logical_Router_Port's ipv6_ra_configs option:
> >
> > * send_periodic: If set to "true", then OVN will send periodic router
> > advertisements out of this router port.
> > * max_interval: The maximum amount of time to wait between sending
> > periodic router advertisements.
> > * min_interval: The minimum amount of time to wait between sending
> > periodic router advertisements.
> >
> > When send_periodic is true, then IPv6 RA configs, as well as some layer
> > 2 and layer 3 information about the router port, are copied to the
> > southbound database. From there, ovn-controller can use this information
> > to know when to send periodic RAs and what to send in them.
> >
> > Because periodic RAs originate from each ovn-controller, the new
> > keep-local flag is set on the packet so that ports don't receive an
> > overabundance of RAs.
> >
> > Signed-off-by: Mark Michelson 
>
> Thanks a lot for the revised series.
>
> I folded in the following changes and applied this series to master.
>
> diff --git a/lib/packets.h b/lib/packets.h
> index 8819f829970e..395599f08c92 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -1020,7 +1020,12 @@ BUILD_ASSERT_DECL(RA_MSG_LEN == sizeof(struct
> ovs_ra_msg));
>   * 6.2.1
>   */
>  #define ND_RA_MAX_INTERVAL_DEFAULT 600
> -#define ND_RA_MIN_INTERVAL_DEFAULT(max) ((max) >= 9 ? (max) / 3 : (max) *
> 3 / 4)
> +
> +static inline int
> +nd_ra_min_interval_default(int max)
> +{
> +return max >= 9 ? max / 3 : max * 3 / 4;
> +}
>
>  /*
>   * Use the same struct for MLD and MLD2, naming members as the defined
> fields in
> @@ -1420,7 +1425,7 @@ void compose_nd_ra(struct dp_packet *,
> const struct in6_addr *ipv6_dst,
> uint8_t cur_hop_limit, uint8_t mo_flags,
> ovs_be16 router_lt, ovs_be32 reachable_time,
> -   ovs_be32 retrans_timer, ovs_be32 mtu);
> +   ovs_be32 retrans_timer, uint32_t mtu);
>  void packet_put_ra_prefix_opt(struct dp_packet *,
>uint8_t plen, uint8_t la_flags,
>ovs_be32 valid_lifetime,
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index cf414b8f229b..7542db3f4854 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -1167,7 +1167,7 @@ ipv6_ra_update_config(const struct
> sbrec_port_binding *pb)
>  config->max_interval = smap_get_int(>options,
> "ipv6_ra_max_interval",
>  ND_RA_MAX_INTERVAL_DEFAULT);
>  config->min_interval = smap_get_int(>options,
> "ipv6_ra_min_interval",
> -ND_RA_MIN_INTERVAL_DEFAULT(config->max_interval));
> +nd_ra_min_interval_default(config->max_interval));
>  config->mtu = smap_get_int(>options, "ipv6_ra_mtu",
> ND_MTU_DEFAULT);
>  config->la_flags = ND_PREFIX_ON_LINK;
>
> @@ -1194,7 +1194,7 @@ ipv6_ra_update_config(const struct
> sbrec_port_binding *pb)
>  }
>
>  /* All nodes multicast addresses */
> -config->eth_dst = ETH_ADDR_C(33,33,00,00,00,01);
> +config->eth_dst = (struct eth_addr) ETH_ADDR_C(33,33,00,00,00,01);
>  ipv6_parse("ff02::1", >ipv6_dst);
>
>  const char *eth_addr = smap_get(>options, "ipv6_ra_src_eth");
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index fc14dc8c38eb..e3ddc1fd9bc1 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -4486,7 +4486,7 @@ copy_ra_to_sb(struct ovn_port *op, const char
> *address_mode)
>  smap_add_format(, "ipv6_ra_max_interval", "%d", max_interval);
>
>  int min_interval = smap_get_int(>nbrp->ipv6_ra_configs,
> -"min_interval", ND_RA_MIN_INTERVAL_DEFAULT(max_interval));
> +"min_interval", nd_ra_min_interval_default(max_interval));
>  if (min_interval > ND_RA_MIN_INTERVAL_MAX(max_interval)) {
>  min_interval = ND_RA_MIN_INTERVAL_MAX(max_interval);
>  }
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev