Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Improve concurrency by adjust flow-limit

2021-05-11 Thread taoyunupt



At 2021-05-12 04:00:30, "Ben Pfaff"  wrote:
>On Tue, May 11, 2021 at 03:52:00PM +0800, Tao YunXiang wrote:
>> Author: Tao YunXiang 
>
>You don't need an Author: tag because Git stores the author in a
>separate field.
>
>> +/* Use duration as a reference, adjust the number of flow_limit,
>> + * when the duration is small, increase the flow-limit, and 
>> vice versa */
>> +if (duration >= 1000) {
>>  flow_limit /= duration / 1000;
>> -} else if (duration > 1300) {
>> -flow_limit = flow_limit * 3 / 4;
>> -} else if (duration < 1000 &&
>> -   flow_limit < n_flows * 1000 / duration) {
>> -flow_limit += 1000;
>> +} else {
>> +flow_limit *= 1000 / duration;
>>  }
>>  flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));
>>  atomic_store_relaxed(>flow_limit, flow_limit);
>
>The above is very abrupt.  It always tries to adjust the flow limit
>upward or downward.  I think that this is a bad idea, especially in the
>upward direction.  If there are only a few flows, which only take a few
>milliseconds to revalidate, then it will keep increasing the flow limit
>upward until it overflows the range of unsigned int.  It will happen
>very quickly, in fact: if duration is 1 ms three times in a row, then we
>will multiply flow_limit by 1,000,000,000 and overflow 32-bit UINT_MAX;
>if it happens six times in a row, we will overflow 64-bit.

>
Hi,Ben, 
Thanks for your review. 
It won't overflow, because of the following line of 
code(https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-upcall.c#L974),which
 makes it won't over   `ofproto_flow_limit` .
For example, if currently flow-limit is 200,000(the default 
ofproto_flow_limit), and duration is 1ms, it will become 200,000,000 ,which is 
much less than 32-bit UINT_MAX(4,294,967,295).  
And then,  it wil be back to 200,000,because of the limtitaion mentioned above, 
I think this is important.


>Furthermore, it doesn't work well even if we have longer durations.  If
>revalidation takes 501 ms, then we can adjust the flow_limit upward, but
>this won't do it.



emm...though,it won't change when `501ms`, but it will change when `500ms` or 
times of 2 or 5 .
If you think it's needed to make  process more linear, we can change its type 
from int to float. 
what do you think?


>On the downward direction, this new code does nothing if the duration is
>less than 2 seconds.  We want to aim for 1-second revalidation times.

>


In this code ,I changed the benchmark value from 2s and 1.3s to 1s. It aims for 
1-second revalidation times. 
Do I misunderstand your point?  please let me know , thanks. The following is 
the new code. 


 if (duration >= 1000) {
 flow_limit /= duration / 1000;
 } else {
 flow_limit *= 1000 / duration;
 }
 flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));


>I don't think that this approach has been thought through very well.


>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 0/3] dpif-netlink: Introduce per-cpu upcall dispatching

2021-05-11 Thread Flavio Leitner
Hi,

On Fri, Apr 30, 2021 at 11:31:26AM -0400, Mark Gray wrote:
> This series proposes a new method of distributing upcalls
> to user space threads attempting to resolve a number of
> issues with the current method.
> 

I ran some tests with old V10, current master and this RFC
including the kernel (based on 5.11.0) on a 28 cores system.

The old v10 had the issue of not scaling up in case of a high
load of upcalls. The test sends a burst of UDP packets which
causes upcalls. The table below shows how many packets could
be sent without increasing the upcall loss counter.
   v10   master rfc
packets2k5   >55k   10k

So, it reproduced the same old v10 value. Regarding to branch
master then it's not determined due to test limitation. It is
at least above 55k (last time I think it was 63k). The RFC patch
resulted in a better number compared with v10 though the test
should be using only one thread as v10. I think that keeping
the CPU context could explain the difference.

Running the test with 8 parallel threads sending one burst of
UDP packets each resulted in the following table:
  Branch   missed   lost   
   v10 5201850288
  master   520220
   RFC 520210

Now the wake ups, one thread:
  Branch   wakeprocessing
  master   20+   16+
   RFC 3 1

Column wake: number of different threads receiving
   sched:sched_wakeup or irq:softirq_entry.
Column processing: number of CPUs with double digits
   usage.

And 8 parallel threads:
  Branch   wakeprocessing
  master   20+   20+
   RFC 108+

The results show that this new patch-set addressed the main
thundering herd issue and the scalability issue I reported
during V10 review.

Unfortunately I can review the patches only next week.

Thanks,
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v1] Fix Segmentation Fault when LR requests duplicate requested-tnl-key

2021-05-11 Thread svc . eng . git-mail
From: Karthik Chandrashekar 

Fixes: b6c349a271363faf277041b230f2244489c7e7a3

Signed-off-by: Karthik Chandrashekar 
---
 northd/ovn-northd.c |  4 ++--
 tests/ovn-northd.at | 18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index dec6e4efb..36cec10f3 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -1281,8 +1281,8 @@ ovn_datapath_assign_requested_tnl_id(struct hmap 
*dp_tnlids,
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(, "Logical %s %s requests same tunnel key "
  "%"PRIu32" as another logical switch or router",
- od->nbs ? "switch" : "router", od->nbs->name,
- tunnel_key);
+ od->nbs ? "switch" : "router",
+ od->nbs ? od->nbs->name : od->nbr->name, tunnel_key);
 }
 }
 }
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 48de8aabe..a7eebc826 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -2223,6 +2223,24 @@ AT_CHECK([test $ls2 = 3])
 AT_CLEANUP
 ])
 
+AT_SETUP([ovn -- LR requested-tnl-key])
+ovn_start
+
+ovn-nbctl --wait=sb lr-add lr0
+AT_CHECK([test 1 = $(ovn-sbctl get datapath_binding lr0 tunnel_key)])
+
+ovn-nbctl --wait=sb lr-add lr1
+AT_CHECK([test 2 = $(ovn-sbctl get datapath_binding lr1 tunnel_key)])
+
+AT_CHECK(
+  [ovn-nbctl --wait=sb set logical-router lr0 options:requested-tnl-key=100])
+AT_CHECK([test 100 = $(ovn-sbctl get datapath_binding lr0 tunnel_key)])
+
+AT_CHECK(
+  [ovn-nbctl --wait=sb set logical-router lr1 options:requested-tnl-key=100])
+
+AT_CLEANUP
+
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([port requested-tnl-key])
 AT_KEYWORDS([requested tnl tunnel key keys])
-- 
2.22.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] bounce for Tao YunXiang

2021-05-11 Thread Ben Pfaff
It seems that email that I send to Tao YunXiang bounces; at least, this
one did.  I hope that he sees my previous email via the list.

- Forwarded message from Mail Delivery System 
 -

Date: Tue, 11 May 2021 20:00:42 + (UTC)
From: Mail Delivery System 
To: b...@ovn.org
Subject: Undelivered Mail Returned to Sender
Message-Id: <20210511200042.99d1f200...@relay12.mail.gandi.net>

This is the mail system at host relay12.mail.gandi.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

   The mail system

: host mx.cmss.chinamobile.com[221.176.66.77]
said: 550 2eec609ae26442b-87c46 Mail rejected (in reply to end of DATA
command)

Reporting-MTA: dns; relay12.mail.gandi.net
X-Postfix-Queue-ID: E4E0E26
X-Postfix-Sender: rfc822; b...@ovn.org
Arrival-Date: Tue, 11 May 2021 20:00:34 + (UTC)

Final-Recipient: rfc822; taoyunxi...@cmss.chinamobile.com
Original-Recipient: rfc822;taoyunxi...@cmss.chinamobile.com
Action: failed
Status: 5.0.0
Remote-MTA: dns; mx.cmss.chinamobile.com
Diagnostic-Code: smtp; 550 2eec609ae26442b-87c46 Mail rejected

Date: Tue, 11 May 2021 13:00:30 -0700
From: Ben Pfaff 
To: Tao YunXiang 
Cc: d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Improve concurrency by
 adjust flow-limit
Message-ID: 

On Tue, May 11, 2021 at 03:52:00PM +0800, Tao YunXiang wrote:
> Author: Tao YunXiang 

You don't need an Author: tag because Git stores the author in a
separate field.

> +/* Use duration as a reference, adjust the number of flow_limit,
> + * when the duration is small, increase the flow-limit, and vice 
> versa */
> +if (duration >= 1000) {
>  flow_limit /= duration / 1000;
> -} else if (duration > 1300) {
> -flow_limit = flow_limit * 3 / 4;
> -} else if (duration < 1000 &&
> -   flow_limit < n_flows * 1000 / duration) {
> -flow_limit += 1000;
> +} else {
> +flow_limit *= 1000 / duration;
>  }
>  flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));
>  atomic_store_relaxed(>flow_limit, flow_limit);

The above is very abrupt.  It always tries to adjust the flow limit
upward or downward.  I think that this is a bad idea, especially in the
upward direction.  If there are only a few flows, which only take a few
milliseconds to revalidate, then it will keep increasing the flow limit
upward until it overflows the range of unsigned int.  It will happen
very quickly, in fact: if duration is 1 ms three times in a row, then we
will multiply flow_limit by 1,000,000,000 and overflow 32-bit UINT_MAX;
if it happens six times in a row, we will overflow 64-bit.

Furthermore, it doesn't work well even if we have longer durations.  If
revalidation takes 501 ms, then we can adjust the flow_limit upward, but
this won't do it.

On the downward direction, this new code does nothing if the duration is
less than 2 seconds.  We want to aim for 1-second revalidation times.

I don't think that this approach has been thought through very well.


- End forwarded message -
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Improve concurrency by adjust flow-limit

2021-05-11 Thread Ben Pfaff
On Tue, May 11, 2021 at 03:52:00PM +0800, Tao YunXiang wrote:
> Author: Tao YunXiang 

You don't need an Author: tag because Git stores the author in a
separate field.

> +/* Use duration as a reference, adjust the number of flow_limit,
> + * when the duration is small, increase the flow-limit, and vice 
> versa */
> +if (duration >= 1000) {
>  flow_limit /= duration / 1000;
> -} else if (duration > 1300) {
> -flow_limit = flow_limit * 3 / 4;
> -} else if (duration < 1000 &&
> -   flow_limit < n_flows * 1000 / duration) {
> -flow_limit += 1000;
> +} else {
> +flow_limit *= 1000 / duration;
>  }
>  flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));
>  atomic_store_relaxed(>flow_limit, flow_limit);

The above is very abrupt.  It always tries to adjust the flow limit
upward or downward.  I think that this is a bad idea, especially in the
upward direction.  If there are only a few flows, which only take a few
milliseconds to revalidate, then it will keep increasing the flow limit
upward until it overflows the range of unsigned int.  It will happen
very quickly, in fact: if duration is 1 ms three times in a row, then we
will multiply flow_limit by 1,000,000,000 and overflow 32-bit UINT_MAX;
if it happens six times in a row, we will overflow 64-bit.

Furthermore, it doesn't work well even if we have longer durations.  If
revalidation takes 501 ms, then we can adjust the flow_limit upward, but
this won't do it.

On the downward direction, this new code does nothing if the duration is
less than 2 seconds.  We want to aim for 1-second revalidation times.

I don't think that this approach has been thought through very well.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] ovn-controller: Fix port group conjunction flow explosion problem.

2021-05-11 Thread Mark Michelson

On 5/6/21 6:03 PM, Han Zhou wrote:



On Thu, May 6, 2021 at 1:18 PM Mark Michelson > wrote:

 >
 > Thanks, Han!
 >
 > Acked-by: Mark Michelson >


Thanks Mark. I applied it to master branch.
I'd also like to backport it to branch-21.03. Do you have any concerns 
before I do that? I know it is not a bug fix, but it is kind of a 
blocker for larger scale with big port groups in use. (It is ok if it a 
concern, then we can have downstream maintained cherry-picking the patches)


I'm personally fine with this. This sort of thing definitely straddles 
the line between bug fix and improvement.




 >
 > On 5/3/21 9:33 PM, Han Zhou wrote:
 > > For an ACL with match: outport == @PG && ip4.src == $PG_AS, given below
 > > scale:
 > >
 > > P: PG size
 > > LP: number of local lports
 > > D: number of all datapaths (logical switches)
 > > LD: number of datapaths that contain local lports
 > >
 > > With current OVN implementation, the total number of OF flows is:
 > >      LP + (P * D) + D
 > >
 > > The reason is, firstly, datapath is not part of the conjunction, so for
 > > each datapath the lflow is reparsed.
 > >
 > > Secondly, although ovn-controller tries to filter out the flows 
that are
 > > for non-local lports, with the conjunction match, the logic that 
filters

 > > out non-local flows doesn't work for the conjunction part that doesn't
 > > have the lport in the match (the P * D part). When there is only one
 > > port on each LS it is fine, because no conjunction will be used because
 > > SB port groups are split per datapath, so each port group would have
 > > only one port. However, when more than one ports are on each LS the 
flow

 > > explosion happens.
 > >
 > > This patch deal with the second reason above, by refining the SB port
 > > groups to store only locally bound lports: empty const sets will not
 > > generate any flows. This reduces the related flow number from
 > > LP + (P * D) + D to LP + (P * LD) + LD.
 > >
 > > Since LD is expected to be small, so even if it is a multiplier, the
 > > total number is reduced significantly. In particular, in ovn-k8s use
 > > cases the LD is always 1, so the formula above becomes LP + P + LD.
 > >
 > > With a scale of 1k k8s nodes, each has 4 ports for the same PG: P = 4k,
 > > LP = 4, D = 1k, LD = 1. The current implementation generates ~4m flows.
 > > With this patch it becomes only ~4k.
 > >
 > > Reported-by: Girish Moodalbail >
 > > Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2021-March/381082.html 

 > > Reported-by: Dumitru Ceara >
 > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1944098 

 > > Acked-by: Mark D. Gray >

 > > Signed-off-by: Han Zhou mailto:hz...@ovn.org>>
 > > ---
 > > v3 -> v4: addresses Mark Michelson's comments - improved the test case.
 > >
 > >   controller/binding.c        |  20 
 > >   controller/binding.h        |   9 ++
 > >   controller/ovn-controller.c | 212 
+++-

 > >   include/ovn/expr.h          |   3 +-
 > >   lib/expr.c                  |  13 ++-
 > >   tests/ovn.at                 |  94 
 > >   tests/test-ovn.c            |   4 +-
 > >   utilities/ovn-trace.c       |   2 +-
 > >   8 files changed, 321 insertions(+), 36 deletions(-)
 > >
 > > diff --git a/controller/binding.c b/controller/binding.c
 > > index 451f00e34..4ca2b4d9a 100644
 > > --- a/controller/binding.c
 > > +++ b/controller/binding.c
 > > @@ -2988,3 +2988,23 @@ cleanup:
 > >
 > >       return b_lport;
 > >   }
 > > +
 > > +struct sset *
 > > +binding_collect_local_binding_lports(struct local_binding_data 
*lbinding_data)

 > > +{
 > > +    struct sset *lports = xzalloc(sizeof *lports);
 > > +    sset_init(lports);
 > > +    struct shash_node *shash_node;
 > > +    SHASH_FOR_EACH (shash_node, _data->lports) {
 > > +        struct binding_lport *b_lport = shash_node->data;
 > > +        sset_add(lports, b_lport->name);
 > > +    }
 > > +    return lports;
 > > +}
 > > +
 > > +void
 > > +binding_destroy_local_binding_lports(struct sset *lports)
 > > +{
 > > +    sset_destroy(lports);
 > > +    free(lports);
 > > +}
 > > diff --git a/controller/binding.h b/controller/binding.h
 > > index 4fc9ef207..cd573dbbe 100644
 > > --- a/controller/binding.h
 > > +++ b/controller/binding.h
 > > @@ -128,4 +128,13 @@ void binding_seqno_run(struct 
local_binding_data *lbinding_data);

 > >   void binding_seqno_install(struct local_binding_data *lbinding_data);
 > >   void binding_seqno_flush(void);
 > >   void binding_dump_local_bindings(struct local_binding_data *, 
struct ds *);

 > > +
 > > +/* Generates a sset of lport names from local_binding_data.
 > > + * Note: the caller is responsible for destroying and 

Re: [ovs-dev] [PATCH ovn v2] if-status: Add OVS interface status management module.

2021-05-11 Thread Mark Michelson

Hi Dumitru,

The change looks good to me. It's very easy to understand the state 
machine, and it's easy to follow how the code is structured. Yay! I 
would have liked to see some more test cases, but given the discussion 
you had on v1 of this patch, I can understand why they're not present in 
this version.


Acked-by: Mark Michelson 

On 5/6/21 11:35 AM, Dumitru Ceara wrote:

The initial implementation of the notification mechanism that indicates
if OVS flows corresponding to a logical switch port have been installed
is not resilient enough.  It didn't cover the case when transactions to
the local OVS DB or to the Southbound fail.

In order to deal with that, factor out the code that tracks the state
of the interfaces to a new module, 'if-status', and implement it with
a state machine that will retry setting Port_Binding.UP and
OVS.Interface.external_ids:ovn-installed in the Southbound and local
OVS databases.

Having a separate module makes sense because tracking the interface
state doesn't really depend on the rest of the binding module, except
for interacting with the Port_Binding and Interface IDL records.  For
this we add some additional APIs to binding.c.

Reported-at: https://bugzilla.redhat.com/1952846
Fixes: 4d3cb42b076b ("binding: Set Logical_Switch_Port.up when all OVS flows are 
installed.")
Signed-off-by: Dumitru Ceara 
---
V2:
- Address Numan's comments:
   - Update Port_Binding.up whenever an OVS interface is released.
   - Add a new test.
- Factor out functions that set Port_Binding.up in binding.c
- Remove useless hmapx.h include from binding.c.
- Fix some indentation.
---
  controller/automake.mk  |   2 +
  controller/binding.c| 338 -
  controller/binding.h|  13 +-
  controller/if-status.c  | 415 
  controller/if-status.h  |  37 
  controller/ovn-controller.c |  24 ++-
  tests/ovn.at|   8 +
  7 files changed, 625 insertions(+), 212 deletions(-)
  create mode 100644 controller/if-status.c
  create mode 100644 controller/if-status.h

diff --git a/controller/automake.mk b/controller/automake.mk
index e664f1980..2f6c50890 100644
--- a/controller/automake.mk
+++ b/controller/automake.mk
@@ -10,6 +10,8 @@ controller_ovn_controller_SOURCES = \
controller/encaps.h \
controller/ha-chassis.c \
controller/ha-chassis.h \
+   controller/if-status.c \
+   controller/if-status.h \
controller/ip-mcast.c \
controller/ip-mcast.h \
controller/lflow.c \
diff --git a/controller/binding.c b/controller/binding.c
index 451f00e34..31f3a210f 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -16,9 +16,9 @@
  #include 
  #include "binding.h"
  #include "ha-chassis.h"
+#include "if-status.h"
  #include "lflow.h"
  #include "lport.h"
-#include "ofctrl-seqno.h"
  #include "patch.h"
  
  #include "lib/bitmap.h"

@@ -41,32 +41,6 @@ VLOG_DEFINE_THIS_MODULE(binding);
   */
  #define OVN_INSTALLED_EXT_ID "ovn-installed"
  
-/* Set of OVS interface IDs that have been released in the most recent

- * processing iterations.  This gets updated in release_lport() and is
- * periodically emptied in binding_seqno_run().
- */
-static struct sset binding_iface_released_set =
-SSET_INITIALIZER(_iface_released_set);
-
-/* Set of OVS interface IDs that have been bound in the most recent
- * processing iterations.  This gets updated in release_lport() and is
- * periodically emptied in binding_seqno_run().
- */
-static struct sset binding_iface_bound_set =
-SSET_INITIALIZER(_iface_bound_set);
-
-static void
-binding_iface_released_add(const char *iface_id)
-{
-sset_add(_iface_released_set, iface_id);
-}
-
-static void
-binding_iface_bound_add(const char *iface_id)
-{
-sset_add(_iface_bound_set, iface_id);
-}
-
  #define OVN_QOS_TYPE "linux-htb"
  
  struct qos_queue {

@@ -672,7 +646,8 @@ static void local_binding_destroy(struct local_binding *,
struct shash *binding_lports);
  static void local_binding_delete(struct local_binding *,
   struct shash *local_bindings,
- struct shash *binding_lports);
+ struct shash *binding_lports,
+ struct if_status_mgr *if_mgr);
  static struct binding_lport *local_binding_add_lport(
  struct shash *binding_lports,
  struct local_binding *,
@@ -692,6 +667,8 @@ static void binding_lport_delete(struct shash 
*binding_lports,
   struct binding_lport *);
  static void binding_lport_add(struct shash *binding_lports,
struct binding_lport *);
+static void binding_lport_set_up(struct binding_lport *, bool sb_readonly);
+static void binding_lport_set_down(struct binding_lport *, bool sb_readonly);
  static struct binding_lport *binding_lport_find(
  struct shash 

Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Kevin Traynor
On 11/05/2021 14:22, Eli Britstein wrote:
> 
> On 5/11/2021 4:13 PM, Eelco Chaudron wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 11 May 2021, at 14:53, Eli Britstein wrote:
>>
>>> On 5/11/2021 2:29 PM, Eelco Chaudron wrote:
 External email: Use caution opening links or attachments


 On 2 May 2021, at 9:13, Eli Britstein wrote:

> New appctl 'dpdk/get-malloc-stats' implemented to get result of
> 'rte_malloc_dump_stats()' function.
>
> Could be used for debugging.
 This patch looks good, however, my suggestion on the first patchset was to 
 include the rte_malloc_get_socket_stats() output for all sockets in the 
 system with this command. Or was there a specific reason to abandon this?
>>> Looking again into it, following by your comments, I saw the exact same 
>>> information with the 2 commands, so I abandoned the less elaborated one.
>> The rte_malloc_get_socket_stats() shows the information per numa node, which 
>> might be of interest in a multi-node system. Copied in Kevin/David, do you 
>> have any experience needing this information per node? I guess it will not 
>> hurt putting it in.
> 
> I do run on a multi-node system.
> 
> Example run with v1, those are the outputs:
> 
> Socket = 0
> heap_totalsz_bytes = 1073741824
> heap_freesz_bytes = 1071313728
> greatest_free_size = 1071313728
> free_count = 1
> alloc_count = 227
> heap_allocsz_bytes = 2428096
> Socket = 1
> heap_totalsz_bytes = 1073741824
> heap_freesz_bytes = 277099968
> greatest_free_size = 276979584
> free_count = 34
> alloc_count = 90
> heap_allocsz_bytes = 796641856
> 
> Heap id:0
>      Heap name:socket_0
>      Heap_size:1073741824,
>      Free_size:1071313728,
>      Alloc_size:2428096,
>      Greatest_free_size:1071313728,
>      Alloc_count:227,
>      Free_count:1,
> Heap id:1
>      Heap name:socket_1
>      Heap_size:1073741824,
>      Free_size:277099968,
>      Alloc_size:796641856,
>      Greatest_free_size:276979584,
>      Alloc_count:90,
>      Free_count:34,
> Heap id:2
>      Heap name:
>      Heap_size:0,
> ... (all sizes are 0), until "Heap id:31".
> 
> 
> So the latter (e.g. rte_malloc_dump_stats()) gives the same info, and 
> more, than rte_malloc_get_socket_stats().
> 
> This is why I abandoned the socket stats, as we get them anyway, with 
> less code in OVS side.
> 

yes, I confirm this too.

Heap id:0
Heap name:socket_0
Heap_size:8589934592,
Free_size:8583704896,
Alloc_size:6229696,
Greatest_free_size:8583704896,
Alloc_count:80,
Free_count:1,
Heap id:1
Heap name:socket_1
Heap_size:8589934592,
Free_size:8589934592,
Alloc_size:0,
Greatest_free_size:8589934592,
Alloc_count:0,
Free_count:1,
Heap id:2
Heap name:
Heap_size:0,
Free_size:0,
Alloc_size:0,
Greatest_free_size:0,
Alloc_count:0,
Free_count:0,
...31

LGTM,

Acked-by: Kevin Traynor 


>>
> Signed-off-by: Eli Britstein 
> Reviewed-by: Salem Sol 
> ---
>NEWS |  2 ++
>lib/dpdk-unixctl.man |  2 ++
>lib/dpdk.c   | 30 ++
>3 files changed, 34 insertions(+)
>
> diff --git a/NEWS b/NEWS
> index 95cf922aa..705baa90d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,8 @@ Post-v2.15.0
> * New option '--no-record-hostname' to disable hostname 
> configuration
>   in ovsdb on startup.
> * New command 'record-hostname-if-not-set' to update hostname in 
> ovsdb.
> +   - DPDK:
> + * New debug appctl command 'dpdk/get-malloc-stats'.
>
>
>v2.15.0 - 15 Feb 2021
> diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
> index 2d6d576f2..a0d1fa2ea 100644
> --- a/lib/dpdk-unixctl.man
> +++ b/lib/dpdk-unixctl.man
> @@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either 
> a logging \fBlevel\fR
>\fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching 
> DPDK
>components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) 
> separated
>by a colon from the logging \fBlevel\fR to apply.
> +.IP "\fBdpdk/get-malloc-stats\fR"
> +Prints the heap information statistics about DPDK malloc.
>.RE
>.
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 319540394..a22de66eb 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -25,6 +25,7 @@
>#include 
>#include 
>#include 
> +#include 
>#include 
>#include 
>
> @@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int 
> argc, const char *argv[],
>unixctl_command_reply(conn, NULL);
>}
>
> +static void
> +dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
> + 

Re: [ovs-dev] [PATCH] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Ilya Maximets
On 5/11/21 4:52 PM, Ilya Maximets wrote:
> On 5/11/21 4:41 PM, Aaron Conole wrote:
>> Ilya Maximets  writes:
>>
>>> For some reason /etc/hosts in GHA now contains a plain text line
>>> like this:
>>>
>>>   Note: Don't Delete this file. Also, don't remove this line. ...
>>>
>>> This breaks libunbound and makes a series of unit tests to emit
>>> following warning:
>>>   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
>>>
>>> Working around this issue by removing a bad line from /etc/hosts
>>> until this fixed properly by GitHub team.  This in combination
>>> with other fixes should unblock CI.
>>>
>>> Bug for virtual-environments:
>>>   https://github.com/actions/virtual-environments/issues/3353
>>>
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>
>> Like David, I am amused by the fact that we need to delete the line for
>> gha to work.
>>
>> Acked-by: Aaron Conole 
>>
> 
> Thanks, Aaron and David.
> 
> I'm running final test on my private branch.  Once it finished, I'll
> push these changes to all active branches.
> 
> Best regards, Ilya Maximets.
> 

Applied to master and backported down to 2.12 to unblock CI.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: automake: Add support for sphinx 4.0.

2021-05-11 Thread Ilya Maximets
On 5/10/21 5:57 PM, Ilya Maximets wrote:
> File layout for man pages in sphinx 4 by default changed [1] from:
> 
>   Documentation/_ref/man/page.section
> 
>  to:
> 
>   Documentation/_ref/man/section/page.section
> 
> Ajusting our build scripts so they will be able to locate files
> in new places.  This fixes our CI build.
> 
> [1] https://github.com/sphinx-doc/sphinx/issues/7996
> 
> Signed-off-by: Ilya Maximets 
> ---

Applied to master and backported down to 2.12 to unblock CI.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] cirrus: Look up existing versions of python dependencies.

2021-05-11 Thread Ilya Maximets
On 5/6/21 2:27 PM, Ilya Maximets wrote:
> FreeBSD sometimes changes the base version of python3 that is
> used for packages.  This affects package names.  For example,
> currently CI is broken, because there is no more py37- versions
> of sphinx and openssl available, only py38- ones:
> 
>   pkg: No packages available to install matching 'py37-openssl'
>have been found in the repositories
>   pkg: No packages available to install matching 'py37-sphinx'
>have been found in the repositories
> 
> We had the same issue last year with 3.6 -> 3.7 transition:
>   dfa2e3d04948 ("cirrus: Use python 3.7 packages on FreeBSD.")
> 
> Fixing that by searching for a package instead of using a specific
> version.  This should help to avoid same issues in the future.
> 
> Signed-off-by: Ilya Maximets 
> ---

Applied to master and backported down to all 2.12.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Numan Siddique
On Tue, May 11, 2021 at 10:34 AM Dumitru Ceara  wrote:
>
> On 5/11/21 4:23 PM, Ilya Maximets wrote:
> > For some reason /etc/hosts in GHA now contains a plain text line
> > like this:
> >
> >   Note: Don't Delete this file. Also, don't remove this line. ...
> >
> > This breaks libunbound and makes a series of unit tests to emit
> > following warning:
> >   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
> >
> > Working around this issue by removing a bad line from /etc/hosts
> > until this fixed properly by GitHub team.
> >
> > Bug for virtual-environments:
> >   https://github.com/actions/virtual-environments/issues/3353
> >
> > Signed-off-by: Ilya Maximets 
> > ---
>
> Looks good to me, thanks!
>
> Acked-by: Dumitru Ceara 

Thanks Ilya and Dumitru for fixin this.

I applied this patch to the main branch.  At this point, I'm not
backporting to the branches.
Let me know if it is required.

Thanks
Numan

>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Ilya Maximets
On 5/11/21 4:41 PM, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> For some reason /etc/hosts in GHA now contains a plain text line
>> like this:
>>
>>   Note: Don't Delete this file. Also, don't remove this line. ...
>>
>> This breaks libunbound and makes a series of unit tests to emit
>> following warning:
>>   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
>>
>> Working around this issue by removing a bad line from /etc/hosts
>> until this fixed properly by GitHub team.  This in combination
>> with other fixes should unblock CI.
>>
>> Bug for virtual-environments:
>>   https://github.com/actions/virtual-environments/issues/3353
>>
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Like David, I am amused by the fact that we need to delete the line for
> gha to work.
> 
> Acked-by: Aaron Conole 
> 

Thanks, Aaron and David.

I'm running final test on my private branch.  Once it finished, I'll
push these changes to all active branches.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] conntrack: add coverage counters for L3 bad checksum

2021-05-11 Thread Aaron Conole
Paolo Valerio  writes:

> similarly to what already exists for L4, add conntrack_l3csum_err
> and ipf_l3csum_err for L3.
>
> Received packets with L3 bad checksum will increase respectively
> ipf_l3csum_err if they are fragments and conntrack_l3csum_err
> otherwise.
>
> Although the patch basically covers IPv4, the names are kept generic.
>
> Signed-off-by: Paolo Valerio 
> ---

Reviewed-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Aaron Conole
Ilya Maximets  writes:

> For some reason /etc/hosts in GHA now contains a plain text line
> like this:
>
>   Note: Don't Delete this file. Also, don't remove this line. ...
>
> This breaks libunbound and makes a series of unit tests to emit
> following warning:
>   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
>
> Working around this issue by removing a bad line from /etc/hosts
> until this fixed properly by GitHub team.  This in combination
> with other fixes should unblock CI.
>
> Bug for virtual-environments:
>   https://github.com/actions/virtual-environments/issues/3353
>
> Signed-off-by: Ilya Maximets 
> ---

Like David, I am amused by the fact that we need to delete the line for
gha to work.

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Dumitru Ceara
On 5/11/21 4:23 PM, Ilya Maximets wrote:
> For some reason /etc/hosts in GHA now contains a plain text line
> like this:
> 
>   Note: Don't Delete this file. Also, don't remove this line. ...
> 
> This breaks libunbound and makes a series of unit tests to emit
> following warning:
>   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
> 
> Working around this issue by removing a bad line from /etc/hosts
> until this fixed properly by GitHub team.
> 
> Bug for virtual-environments:
>   https://github.com/actions/virtual-environments/issues/3353
> 
> Signed-off-by: Ilya Maximets 
> ---

Looks good to me, thanks!

Acked-by: Dumitru Ceara 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Ilya Maximets
For some reason /etc/hosts in GHA now contains a plain text line
like this:

  Note: Don't Delete this file. Also, don't remove this line. ...

This breaks libunbound and makes a series of unit tests to emit
following warning:
  |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error

Working around this issue by removing a bad line from /etc/hosts
until this fixed properly by GitHub team.

Bug for virtual-environments:
  https://github.com/actions/virtual-environments/issues/3353

Signed-off-by: Ilya Maximets 
---

Almost successful build with this change applied:
  https://github.com/igsilya/ovn/actions/runs/831854402
(system tests failed because of a flaky test)

Same patch for OVS:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20210511132049.3906854-1-i.maxim...@ovn.org/

 .github/workflows/test.yml | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 4a150a312..d7bb7eecf 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -87,6 +87,12 @@ jobs:
   if:   matrix.m32 != ''
   run:  sudo apt install -y ${{ env.m32_dependecies }}
 
+- name: fix up /etc/hosts
+  # https://github.com/actions/virtual-environments/issues/3353
+  run:  |
+cat /etc/hosts
+sudo sed -i "/don't remove this line/d" /etc/hosts || true
+
 - name: update PATH
   run:  |
 echo "$HOME/bin">> $GITHUB_PATH
-- 
2.26.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: automake: Add support for sphinx 4.0.

2021-05-11 Thread Aaron Conole
Ilya Maximets  writes:

> File layout for man pages in sphinx 4 by default changed [1] from:
>
>   Documentation/_ref/man/page.section
>
>  to:
>
>   Documentation/_ref/man/section/page.section
>
> Ajusting our build scripts so they will be able to locate files
> in new places.  This fixes our CI build.
>
> [1] https://github.com/sphinx-doc/sphinx/issues/7996
>
> Signed-off-by: Ilya Maximets 
> ---

Reviewed-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] github: Fix up malformed /etc/hosts.

2021-05-11 Thread David Marchand
On Tue, May 11, 2021 at 3:21 PM Ilya Maximets  wrote:
>
> For some reason /etc/hosts in GHA now contains a plain text line
> like this:
>
>   Note: Don't Delete this file. Also, don't remove this line. ...
>
> This breaks libunbound and makes a series of unit tests to emit
> following warning:
>   |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error
>
> Working around this issue by removing a bad line from /etc/hosts
> until this fixed properly by GitHub team.  This in combination
> with other fixes should unblock CI.
>
> Bug for virtual-environments:
>   https://github.com/actions/virtual-environments/issues/3353
>
> Signed-off-by: Ilya Maximets 
> ---
>
> Succesful build with this change + previous fix for shinx 4.0:
>   https://github.com/igsilya/ovs/actions/runs/831558418
>
> OVN is also affected, so the same patch could be applied there.
>
>  .github/workflows/build-and-test.yml | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/.github/workflows/build-and-test.yml 
> b/.github/workflows/build-and-test.yml
> index ce98a9f98..e2350c6d9 100644
> --- a/.github/workflows/build-and-test.yml
> +++ b/.github/workflows/build-and-test.yml
> @@ -113,6 +113,12 @@ jobs:
>  - name: checkout
>uses: actions/checkout@v2
>
> +- name: fix up /etc/hosts
> +  # https://github.com/actions/virtual-environments/issues/3353
> +  run:  |
> +cat /etc/hosts
> +sudo sed -i "/don't remove this line/d" /etc/hosts || true

I like the irony of deleting this line :-).


This || true is probably not necessary: sed won't complain if this
line is not present in /etc/hosts.
But it won't hurt to have it.


Either way lgtm:
Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Eelco Chaudron


On 11 May 2021, at 15:22, Eli Britstein wrote:

> On 5/11/2021 4:13 PM, Eelco Chaudron wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 11 May 2021, at 14:53, Eli Britstein wrote:
>>
>>> On 5/11/2021 2:29 PM, Eelco Chaudron wrote:
 External email: Use caution opening links or attachments


 On 2 May 2021, at 9:13, Eli Britstein wrote:

> New appctl 'dpdk/get-malloc-stats' implemented to get result of
> 'rte_malloc_dump_stats()' function.
>
> Could be used for debugging.
 This patch looks good, however, my suggestion on the first patchset was to 
 include the rte_malloc_get_socket_stats() output for all sockets in the 
 system with this command. Or was there a specific reason to abandon this?
>>> Looking again into it, following by your comments, I saw the exact same 
>>> information with the 2 commands, so I abandoned the less elaborated one.
>> The rte_malloc_get_socket_stats() shows the information per numa node, which 
>> might be of interest in a multi-node system. Copied in Kevin/David, do you 
>> have any experience needing this information per node? I guess it will not 
>> hurt putting it in.
>
> I do run on a multi-node system.
>
> Example run with v1, those are the outputs:
>
> Socket = 0
> heap_totalsz_bytes = 1073741824
> heap_freesz_bytes = 1071313728
> greatest_free_size = 1071313728
> free_count = 1
> alloc_count = 227
> heap_allocsz_bytes = 2428096
> Socket = 1
> heap_totalsz_bytes = 1073741824
> heap_freesz_bytes = 277099968
> greatest_free_size = 276979584
> free_count = 34
> alloc_count = 90
> heap_allocsz_bytes = 796641856
>
> Heap id:0
>     Heap name:socket_0
>     Heap_size:1073741824,
>     Free_size:1071313728,
>     Alloc_size:2428096,
>     Greatest_free_size:1071313728,
>     Alloc_count:227,
>     Free_count:1,
> Heap id:1
>     Heap name:socket_1
>     Heap_size:1073741824,
>     Free_size:277099968,
>     Alloc_size:796641856,
>     Greatest_free_size:276979584,
>     Alloc_count:90,
>     Free_count:34,
> Heap id:2
>     Heap name:
>     Heap_size:0,
> ... (all sizes are 0), until "Heap id:31".
>
>
> So the latter (e.g. rte_malloc_dump_stats()) gives the same info, and more, 
> than rte_malloc_get_socket_stats().
>
> This is why I abandoned the socket stats, as we get them anyway, with less 
> code in OVS side.


Nice! Thanks for sharing the output, as I only had a single node system! This 
clears it up..

Acked-by: Eelco Chaudron 

>>
> Signed-off-by: Eli Britstein 
> Reviewed-by: Salem Sol 
> ---
>NEWS |  2 ++
>lib/dpdk-unixctl.man |  2 ++
>lib/dpdk.c   | 30 ++
>3 files changed, 34 insertions(+)
>
> diff --git a/NEWS b/NEWS
> index 95cf922aa..705baa90d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,8 @@ Post-v2.15.0
> * New option '--no-record-hostname' to disable hostname 
> configuration
>   in ovsdb on startup.
> * New command 'record-hostname-if-not-set' to update hostname in 
> ovsdb.
> +   - DPDK:
> + * New debug appctl command 'dpdk/get-malloc-stats'.
>
>
>v2.15.0 - 15 Feb 2021
> diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
> index 2d6d576f2..a0d1fa2ea 100644
> --- a/lib/dpdk-unixctl.man
> +++ b/lib/dpdk-unixctl.man
> @@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either 
> a logging \fBlevel\fR
>\fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching 
> DPDK
>components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) 
> separated
>by a colon from the logging \fBlevel\fR to apply.
> +.IP "\fBdpdk/get-malloc-stats\fR"
> +Prints the heap information statistics about DPDK malloc.
>.RE
>.
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 319540394..a22de66eb 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -25,6 +25,7 @@
>#include 
>#include 
>#include 
> +#include 
>#include 
>#include 
>
> @@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int 
> argc, const char *argv[],
>unixctl_command_reply(conn, NULL);
>}
>
> +static void
> +dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
> +  int argc OVS_UNUSED,
> +  const char *argv[] OVS_UNUSED,
> +  void *aux OVS_UNUSED)
> +{
> +char *response = NULL;
> +FILE *stream;
> +size_t size;
> +
> +stream = open_memstream(, );
> +if (!stream) {
> +response = xasprintf("Unable to open memstream: %s.",
> + ovs_strerror(errno));
> +unixctl_command_reply_error(conn, response);
> +goto out;
> +

Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Eli Britstein


On 5/11/2021 4:13 PM, Eelco Chaudron wrote:

External email: Use caution opening links or attachments


On 11 May 2021, at 14:53, Eli Britstein wrote:


On 5/11/2021 2:29 PM, Eelco Chaudron wrote:

External email: Use caution opening links or attachments


On 2 May 2021, at 9:13, Eli Britstein wrote:


New appctl 'dpdk/get-malloc-stats' implemented to get result of
'rte_malloc_dump_stats()' function.

Could be used for debugging.

This patch looks good, however, my suggestion on the first patchset was to 
include the rte_malloc_get_socket_stats() output for all sockets in the system 
with this command. Or was there a specific reason to abandon this?

Looking again into it, following by your comments, I saw the exact same 
information with the 2 commands, so I abandoned the less elaborated one.

The rte_malloc_get_socket_stats() shows the information per numa node, which 
might be of interest in a multi-node system. Copied in Kevin/David, do you have 
any experience needing this information per node? I guess it will not hurt 
putting it in.


I do run on a multi-node system.

Example run with v1, those are the outputs:

Socket = 0
heap_totalsz_bytes = 1073741824
heap_freesz_bytes = 1071313728
greatest_free_size = 1071313728
free_count = 1
alloc_count = 227
heap_allocsz_bytes = 2428096
Socket = 1
heap_totalsz_bytes = 1073741824
heap_freesz_bytes = 277099968
greatest_free_size = 276979584
free_count = 34
alloc_count = 90
heap_allocsz_bytes = 796641856

Heap id:0
    Heap name:socket_0
    Heap_size:1073741824,
    Free_size:1071313728,
    Alloc_size:2428096,
    Greatest_free_size:1071313728,
    Alloc_count:227,
    Free_count:1,
Heap id:1
    Heap name:socket_1
    Heap_size:1073741824,
    Free_size:277099968,
    Alloc_size:796641856,
    Greatest_free_size:276979584,
    Alloc_count:90,
    Free_count:34,
Heap id:2
    Heap name:
    Heap_size:0,
... (all sizes are 0), until "Heap id:31".


So the latter (e.g. rte_malloc_dump_stats()) gives the same info, and 
more, than rte_malloc_get_socket_stats().


This is why I abandoned the socket stats, as we get them anyway, with 
less code in OVS side.





Signed-off-by: Eli Britstein 
Reviewed-by: Salem Sol 
---
   NEWS |  2 ++
   lib/dpdk-unixctl.man |  2 ++
   lib/dpdk.c   | 30 ++
   3 files changed, 34 insertions(+)

diff --git a/NEWS b/NEWS
index 95cf922aa..705baa90d 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,8 @@ Post-v2.15.0
* New option '--no-record-hostname' to disable hostname configuration
  in ovsdb on startup.
* New command 'record-hostname-if-not-set' to update hostname in ovsdb.
+   - DPDK:
+ * New debug appctl command 'dpdk/get-malloc-stats'.


   v2.15.0 - 15 Feb 2021
diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
index 2d6d576f2..a0d1fa2ea 100644
--- a/lib/dpdk-unixctl.man
+++ b/lib/dpdk-unixctl.man
@@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either a 
logging \fBlevel\fR
   \fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching DPDK
   components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) 
separated
   by a colon from the logging \fBlevel\fR to apply.
+.IP "\fBdpdk/get-malloc-stats\fR"
+Prints the heap information statistics about DPDK malloc.
   .RE
   .
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 319540394..a22de66eb 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -25,6 +25,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 

@@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int argc, 
const char *argv[],
   unixctl_command_reply(conn, NULL);
   }

+static void
+dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
+  int argc OVS_UNUSED,
+  const char *argv[] OVS_UNUSED,
+  void *aux OVS_UNUSED)
+{
+char *response = NULL;
+FILE *stream;
+size_t size;
+
+stream = open_memstream(, );
+if (!stream) {
+response = xasprintf("Unable to open memstream: %s.",
+ ovs_strerror(errno));
+unixctl_command_reply_error(conn, response);
+goto out;
+}
+
+rte_malloc_dump_stats(stream, NULL);
+
+fclose(stream);
+
+unixctl_command_reply(conn, response);
+out:
+free(response);
+}
+
   static bool
   dpdk_init__(const struct smap *ovs_other_config)
   {
@@ -525,6 +553,8 @@ dpdk_init__(const struct smap *ovs_other_config)
dpdk_unixctl_mem_stream, rte_log_dump);
   unixctl_command_register("dpdk/log-set", "{level | pattern:level}", 0,
INT_MAX, dpdk_unixctl_log_set, NULL);
+unixctl_command_register("dpdk/get-malloc-stats", "", 0, 0,
+ dpdk_unixctl_get_malloc_stats, NULL);

   /* We are called from the main thread here */
   RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
--
2.28.0.2311.g225365fb51


[ovs-dev] [PATCH] github: Fix up malformed /etc/hosts.

2021-05-11 Thread Ilya Maximets
For some reason /etc/hosts in GHA now contains a plain text line
like this:

  Note: Don't Delete this file. Also, don't remove this line. ...

This breaks libunbound and makes a series of unit tests to emit
following warning:
  |1|dns_resolve|WARN|Failed to read etc/hosts: syntax error

Working around this issue by removing a bad line from /etc/hosts
until this fixed properly by GitHub team.  This in combination
with other fixes should unblock CI.

Bug for virtual-environments:
  https://github.com/actions/virtual-environments/issues/3353

Signed-off-by: Ilya Maximets 
---

Succesful build with this change + previous fix for shinx 4.0:
  https://github.com/igsilya/ovs/actions/runs/831558418

OVN is also affected, so the same patch could be applied there.

 .github/workflows/build-and-test.yml | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index ce98a9f98..e2350c6d9 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -113,6 +113,12 @@ jobs:
 - name: checkout
   uses: actions/checkout@v2
 
+- name: fix up /etc/hosts
+  # https://github.com/actions/virtual-environments/issues/3353
+  run:  |
+cat /etc/hosts
+sudo sed -i "/don't remove this line/d" /etc/hosts || true
+
 - name: update PATH
   run:  |
 echo "$HOME/bin">> $GITHUB_PATH
-- 
2.26.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Eelco Chaudron



On 11 May 2021, at 14:53, Eli Britstein wrote:

> On 5/11/2021 2:29 PM, Eelco Chaudron wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> On 2 May 2021, at 9:13, Eli Britstein wrote:
>>
>>> New appctl 'dpdk/get-malloc-stats' implemented to get result of
>>> 'rte_malloc_dump_stats()' function.
>>>
>>> Could be used for debugging.
>> This patch looks good, however, my suggestion on the first patchset was to 
>> include the rte_malloc_get_socket_stats() output for all sockets in the 
>> system with this command. Or was there a specific reason to abandon this?
> Looking again into it, following by your comments, I saw the exact same 
> information with the 2 commands, so I abandoned the less elaborated one.

The rte_malloc_get_socket_stats() shows the information per numa node, which 
might be of interest in a multi-node system. Copied in Kevin/David, do you have 
any experience needing this information per node? I guess it will not hurt 
putting it in.

>>
>>> Signed-off-by: Eli Britstein 
>>> Reviewed-by: Salem Sol 
>>> ---
>>>   NEWS |  2 ++
>>>   lib/dpdk-unixctl.man |  2 ++
>>>   lib/dpdk.c   | 30 ++
>>>   3 files changed, 34 insertions(+)
>>>
>>> diff --git a/NEWS b/NEWS
>>> index 95cf922aa..705baa90d 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -9,6 +9,8 @@ Post-v2.15.0
>>>* New option '--no-record-hostname' to disable hostname configuration
>>>  in ovsdb on startup.
>>>* New command 'record-hostname-if-not-set' to update hostname in 
>>> ovsdb.
>>> +   - DPDK:
>>> + * New debug appctl command 'dpdk/get-malloc-stats'.
>>>
>>>
>>>   v2.15.0 - 15 Feb 2021
>>> diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
>>> index 2d6d576f2..a0d1fa2ea 100644
>>> --- a/lib/dpdk-unixctl.man
>>> +++ b/lib/dpdk-unixctl.man
>>> @@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either a 
>>> logging \fBlevel\fR
>>>   \fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching DPDK
>>>   components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) 
>>> separated
>>>   by a colon from the logging \fBlevel\fR to apply.
>>> +.IP "\fBdpdk/get-malloc-stats\fR"
>>> +Prints the heap information statistics about DPDK malloc.
>>>   .RE
>>>   .
>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>> index 319540394..a22de66eb 100644
>>> --- a/lib/dpdk.c
>>> +++ b/lib/dpdk.c
>>> @@ -25,6 +25,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>   #include 
>>>   #include 
>>>
>>> @@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int 
>>> argc, const char *argv[],
>>>   unixctl_command_reply(conn, NULL);
>>>   }
>>>
>>> +static void
>>> +dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
>>> +  int argc OVS_UNUSED,
>>> +  const char *argv[] OVS_UNUSED,
>>> +  void *aux OVS_UNUSED)
>>> +{
>>> +char *response = NULL;
>>> +FILE *stream;
>>> +size_t size;
>>> +
>>> +stream = open_memstream(, );
>>> +if (!stream) {
>>> +response = xasprintf("Unable to open memstream: %s.",
>>> + ovs_strerror(errno));
>>> +unixctl_command_reply_error(conn, response);
>>> +goto out;
>>> +}
>>> +
>>> +rte_malloc_dump_stats(stream, NULL);
>>> +
>>> +fclose(stream);
>>> +
>>> +unixctl_command_reply(conn, response);
>>> +out:
>>> +free(response);
>>> +}
>>> +
>>>   static bool
>>>   dpdk_init__(const struct smap *ovs_other_config)
>>>   {
>>> @@ -525,6 +553,8 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>dpdk_unixctl_mem_stream, rte_log_dump);
>>>   unixctl_command_register("dpdk/log-set", "{level | pattern:level}", 0,
>>>INT_MAX, dpdk_unixctl_log_set, NULL);
>>> +unixctl_command_register("dpdk/get-malloc-stats", "", 0, 0,
>>> + dpdk_unixctl_get_malloc_stats, NULL);
>>>
>>>   /* We are called from the main thread here */
>>>   RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
>>> --
>>> 2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Eli Britstein



On 5/11/2021 2:29 PM, Eelco Chaudron wrote:

External email: Use caution opening links or attachments


On 2 May 2021, at 9:13, Eli Britstein wrote:


New appctl 'dpdk/get-malloc-stats' implemented to get result of
'rte_malloc_dump_stats()' function.

Could be used for debugging.

This patch looks good, however, my suggestion on the first patchset was to 
include the rte_malloc_get_socket_stats() output for all sockets in the system 
with this command. Or was there a specific reason to abandon this?
Looking again into it, following by your comments, I saw the exact same 
information with the 2 commands, so I abandoned the less elaborated one.



Signed-off-by: Eli Britstein 
Reviewed-by: Salem Sol 
---
  NEWS |  2 ++
  lib/dpdk-unixctl.man |  2 ++
  lib/dpdk.c   | 30 ++
  3 files changed, 34 insertions(+)

diff --git a/NEWS b/NEWS
index 95cf922aa..705baa90d 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,8 @@ Post-v2.15.0
   * New option '--no-record-hostname' to disable hostname configuration
 in ovsdb on startup.
   * New command 'record-hostname-if-not-set' to update hostname in ovsdb.
+   - DPDK:
+ * New debug appctl command 'dpdk/get-malloc-stats'.


  v2.15.0 - 15 Feb 2021
diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
index 2d6d576f2..a0d1fa2ea 100644
--- a/lib/dpdk-unixctl.man
+++ b/lib/dpdk-unixctl.man
@@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either a 
logging \fBlevel\fR
  \fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching DPDK
  components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) separated
  by a colon from the logging \fBlevel\fR to apply.
+.IP "\fBdpdk/get-malloc-stats\fR"
+Prints the heap information statistics about DPDK malloc.
  .RE
  .
diff --git a/lib/dpdk.c b/lib/dpdk.c
index 319540394..a22de66eb 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 

@@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int argc, 
const char *argv[],
  unixctl_command_reply(conn, NULL);
  }

+static void
+dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
+  int argc OVS_UNUSED,
+  const char *argv[] OVS_UNUSED,
+  void *aux OVS_UNUSED)
+{
+char *response = NULL;
+FILE *stream;
+size_t size;
+
+stream = open_memstream(, );
+if (!stream) {
+response = xasprintf("Unable to open memstream: %s.",
+ ovs_strerror(errno));
+unixctl_command_reply_error(conn, response);
+goto out;
+}
+
+rte_malloc_dump_stats(stream, NULL);
+
+fclose(stream);
+
+unixctl_command_reply(conn, response);
+out:
+free(response);
+}
+
  static bool
  dpdk_init__(const struct smap *ovs_other_config)
  {
@@ -525,6 +553,8 @@ dpdk_init__(const struct smap *ovs_other_config)
   dpdk_unixctl_mem_stream, rte_log_dump);
  unixctl_command_register("dpdk/log-set", "{level | pattern:level}", 0,
   INT_MAX, dpdk_unixctl_log_set, NULL);
+unixctl_command_register("dpdk/get-malloc-stats", "", 0, 0,
+ dpdk_unixctl_get_malloc_stats, NULL);

  /* We are called from the main thread here */
  RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
--
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] Fix redundant datapath set ethernet action with NSH Decap

2021-05-11 Thread Eelco Chaudron



On 27 Apr 2021, at 14:42, Martin Varghese wrote:

> From: Martin Varghese 
>
> When a decap action is applied on NSH header encapsulatiing a
> ethernet packet a redundant set mac address action is programmed
> to the datapath.

Patch looks good to me, small style nit below.

Acked-by: Eelco Chaudron 

> Fixes: f839892a206a ("OF support and translation of generic encap and decap")
> Signed-off-by: Martin Varghese 
> ---
>  lib/odp-util.c   | 3 ++-
>  ofproto/ofproto-dpif-xlate.c | 2 ++
>  tests/nsh.at | 8 
>  3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index e1199d1da..9d558082f 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -7830,7 +7830,8 @@ commit_set_ether_action(const struct flow *flow, struct 
> flow *base_flow,
>  struct offsetof_sizeof ovs_key_ethernet_offsetof_sizeof_arr[] =
>  OVS_KEY_ETHERNET_OFFSETOF_SIZEOF_ARR;
>
> -if (flow->packet_type != htonl(PT_ETH)) {
> +if ((flow->packet_type != htonl(PT_ETH)) ||
> +(base_flow->packet_type != htonl(PT_ETH))) {

Guess this can be simplified to the following (see also style guide):

-if ((flow->packet_type != htonl(PT_ETH)) ||
-(base_flow->packet_type != htonl(PT_ETH))) {
+if (flow->packet_type != htonl(PT_ETH) ||
+base_flow->packet_type != htonl(PT_ETH)) {
 return;

>  return;
>  }
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 7108c8a30..a6f4ea334 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -6549,6 +6549,8 @@ xlate_generic_decap_action(struct xlate_ctx *ctx,
>   * Delay generating pop_eth to the next commit. */
>  flow->packet_type = htonl(PACKET_TYPE(OFPHTN_ETHERTYPE,
>ntohs(flow->dl_type)));
> +flow->dl_src = eth_addr_zero;
> +flow->dl_dst = eth_addr_zero;
>  ctx->wc->masks.dl_type = OVS_BE16_MAX;
>  }
>  return false;
> diff --git a/tests/nsh.at b/tests/nsh.at
> index d5c772ff0..e84134e42 100644
> --- a/tests/nsh.at
> +++ b/tests/nsh.at
> @@ -105,7 +105,7 @@ bridge("br0")
>
>  Final flow: 
> in_port=1,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_dst=11:22:33:44:55:66,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=1,nsh_np=3,nsh_spi=0x1234,nsh_si=255,nsh_c1=0x11223344,nsh_c2=0x0,nsh_c3=0x0,nsh_c4=0x0,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
>  Megaflow: recirc_id=0,eth,ip,in_port=1,dl_dst=66:77:88:99:aa:bb,nw_frag=no
> -Datapath actions: 
> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x1234,si=255,c1=0x11223344,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),set(eth(dst=11:22:33:44:55:66)),recirc(0x1)
> +Datapath actions: 
> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x1234,si=255,c1=0x11223344,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),recirc(0x1)
>  ])
>
>  AT_CHECK([
> @@ -139,7 +139,7 @@ ovs-appctl time/warp 1000
>  AT_CHECK([
>  ovs-appctl dpctl/dump-flows dummy@ovs-dummy | strip_used | grep -v ipv6 
> | sort
>  ], [0], [flow-dump from the main thread:
> -recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(frag=no),
>  packets:1, bytes:98, used:0.0s, 
> actions:push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x1234,si=255,c1=0x11223344,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),set(eth(dst=11:22:33:44:55:66)),recirc(0x3)
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(frag=no),
>  packets:1, bytes:98, used:0.0s, 
> actions:push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x1234,si=255,c1=0x11223344,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),recirc(0x3)
>  
> recirc_id(0x3),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
>  packets:1, bytes:98, used:0.0s, actions:2
>  ])
>
> @@ -232,7 +232,7 @@ bridge("br0")
>
>  Final flow: 
> in_port=1,vlan_tci=0x,dl_src=00:00:00:00:00:00,dl_dst=11:22:33:44:55:66,dl_type=0x894f,nsh_flags=0,nsh_ttl=63,nsh_mdtype=2,nsh_np=3,nsh_spi=0x1234,nsh_si=255,nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
>  Megaflow: recirc_id=0,eth,ip,in_port=1,dl_dst=66:77:88:99:aa:bb,nw_frag=no
> -Datapath actions: 
> push_nsh(flags=0,ttl=63,mdtype=2,np=3,spi=0x1234,si=255,md2=0x1a041234567820001408fedcba9876543210),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),set(eth(dst=11:22:33:44:55:66)),recirc(0x1)
> +Datapath actions: 
> push_nsh(flags=0,ttl=63,mdtype=2,np=3,spi=0x1234,si=255,md2=0x1a041234567820001408fedcba9876543210),push_eth(src=00:00:00:00:00:00,dst=11:22:33:44:55:66),pop_eth,pop_nsh(),recirc(0x1)
>  ])
>
>  AT_CHECK([
> @@ -266,7 +266,7 @@ ovs-appctl time/warp 1000
>  AT_CHECK([
>  

Re: [ovs-dev] [PATCH v2] conntrack: add coverage counters for L3 bad checksum

2021-05-11 Thread Eelco Chaudron



On 30 Apr 2021, at 19:12, Paolo Valerio wrote:

> similarly to what already exists for L4, add conntrack_l3csum_err
> and ipf_l3csum_err for L3.
>
> Received packets with L3 bad checksum will increase respectively
> ipf_l3csum_err if they are fragments and conntrack_l3csum_err
> otherwise.
>
> Although the patch basically covers IPv4, the names are kept generic.

Looks good to me!

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] ovn-northd.c: Don't use DPG for MC_UNKNOWN related lflows.

2021-05-11 Thread Numan Siddique
On Tue, May 11, 2021 at 4:41 AM Dumitru Ceara  wrote:
>
> On 5/10/21 11:59 PM, Han Zhou wrote:
> > In commit b468c2c1 it avoided using DPG for multicast related lflows, but
> > commit dd94f126 update the MC_UNKNOWN related lflows with DPG used again
> > by mistake. This patch fixes it to avoid problems when a lflow using
> > MC_UNKNOWN is monitored by a ovn-controller before the related mc-group
> > is monitored, which could cause ovn-controller problem due to an
> > incomplete dependency handling in I-P.
> >
> > This change can be removed (and applying DPG for all the other mc-group
> > related lflows) when the I-P dependency handling problem is fixed.
> >
> > This change also fixes the ovn-northd-ddlog part, which missed using
> > UniqueFlow() for MC_UNKNOWN from the initial ddlog commit 0e77b3bcbf.
> >
> > Fixes: dd94f1266c ("northd: MAC learning: Add logical flows for fdb.")
> > Fixes: 0e77b3bcbf ("ovn-northd-ddlog: New implementation of ovn-northd 
> > based on ddlog.")
> > Cc: Numan Siddique 
> > Cc: Leonid Ryzhyk 
> > Co-authored-by: Dumitru Ceara 
>
> LGTM.  For the ddlog part:
>
> Signed-off-by: Dumitru Ceara 
>
> > Signed-off-by: Han Zhou 

Thanks for fixing this.

Acked-by: Numan Siddique 

Numan

> > ---
>
> Thanks!
> Dumitru
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] northd: Support flow offloading for logical switches with no ACLs.

2021-05-11 Thread Numan Siddique
On Mon, May 10, 2021 at 11:43 AM Dumitru Ceara  wrote:
>
> On 5/10/21 5:39 PM, Dumitru Ceara wrote:
> > On 5/7/21 4:42 PM, num...@ovn.org wrote:
> >> From: Numan Siddique 
> >>
> >> Some smart NICs can't offload datapath flows matching on conntrack
> >> fields.  If a deployment desires to make use of such smart NICs
> >> then it cannot configure ACLs on the logical switches.  If suppose
> >> a logical switch S1 has no ACLs configured and a logical switch S2
> >> has ACLs configured, then the CMS would expect the datapath flows
> >> belonging to S1 logical ports are offloaded since it has no ACLs.
> >> But this is not working as expected (even if S1 and S2 are
> >> not connected via a logical router).
> >>
> >> ovn-northd generates the below logical flows in ls_in_acl_hint
> >> and ls_in_acl stages for S1
> >>
> >> table=8 (ls_in_acl_hint ), priority=0, match=(1), action=(next;)
> >> table=9 (ls_in_acl  ), priority=0, match=(1), action=(next;)
> >>
> >> And the below for S2
> >>
> >> table=8 (ls_in_acl_hint ), priority=7, match=(ct.new && !ct.est), 
> >> action=(reg0[7] = 1; reg0[9] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=6, match=(!ct.new && ct.est && 
> >> !ct.rpl && ct_label.blocked == 1), action=(reg0[7] = 1; reg0[9] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=5, match=(!ct.trk), 
> >> action=(reg0[8] = 1; reg0[9] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=4, match=(!ct.new && ct.est && 
> >> !ct.rpl && ct_label.blocked == 0), action=(reg0[8] = 1; reg0[10] = 1; 
> >> next;)
> >> table=8 (ls_in_acl_hint ), priority=3, match=(!ct.est), 
> >> action=(reg0[9] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=2, match=(ct.est && 
> >> ct_label.blocked == 1), action=(reg0[9] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=1, match=(ct.est && 
> >> ct_label.blocked == 0), action=(reg0[10] = 1; next;)
> >> table=8 (ls_in_acl_hint ), priority=0, match=(1), action=(next;)
> >> table=9 (ls_in_acl  ), priority=65535, match=(!ct.est && ct.rel && 
> >> !ct.new && !ct.inv && ct_label.blocked == 0), action=(next;)
> >> table=9 (ls_in_acl  ), priority=65535, match=(ct.est && !ct.rel && 
> >> !ct.new && !ct.inv && ct.rpl && ct_label.blocked == 0), action=(next;)
> >> table=9 (ls_in_acl  ), priority=65535, match=(ct.inv || (ct.est && 
> >> ct.rpl && ct_label.blocked == 1)), action=(drop;)
> >> table=9 (ls_in_acl  ), priority=65535, match=(nd || nd_ra || nd_rs 
> >> || mldv1 || mldv2), action=(next;)
> >> table=9 (ls_in_acl  ), priority=34000, match=(eth.dst == 
> >> $svc_monitor_mac), action=(next;)
> >> table=9 (ls_in_acl  ), priority=1, match=(ip && (!ct.est || 
> >> (ct.est && ct_label.blocked == 1))), action=(reg0[1] = 1; next;)
> >> table=9 (ls_in_acl  ), priority=0, match=(1), action=(next;)
> >>
> >> Because there are higher priority flows in 'ls_in_acl_hint' and
> >> 'ls_in_acl' with the match on conntrack fields,
> >> ovs-vswitchd will generate a datapath flow with the match on ct_state 
> >> fields as -
> >> 'ct_state(-new-est-rel-rpl-inv-trk)' for the packet from S1, even though
> >> the S1 pipeline doesn't have logical flows which match on conntrack
> >> fields.  [1] has more information.
> >>
> >> Modifying the below flows if a logical switch has no ACLs solves this
> >> problem.
> >>
> >> table=8 (ls_in_acl_hint ), priority=65535, match=(1), 
> >> action=(next;)
> >> table=9 (ls_in_acl  ), priority=65535, match=(1), 
> >> action=(next;)
> >>
> >> With the above flows with higher priority, ovs-vswitchd will not
> >> consider other flows in the same table during translation.
> >>
> >> This patch addresses this issue by using higher prioriy flows (for both
> >> ls_in_acl* and ls_out_acl* stages).
> >>
> >> [1] - https://bugzilla.redhat.com/show_bug.cgi?id=1955191#c8
> >>
> >> Signed-off-by: Numan Siddique 
> >> ---
> >
> > Hi Numan,
> >
> > A couple of tests are failing after rebase:
> >
> > 789: ovn -- ct.inv usage -- ovn-northd -- dp-groups=yes FAILED
> > (ovn-northd.at:3147)
> > 790: ovn -- ct.inv usage -- ovn-northd   FAILED
> > (ovn-northd.at:3147)
> >
> >> v1 -> v2
> >> 
> >>  * Rebased to resolve conflicts.
> >>  * Addressed review comment from Dumitru. Combined ls_has_stateful_acl()
> >>and ls_has_acl() into one single function - od_ls_update_acls_flags().
> >
> > Nit: There's no other function in ovn_northd that's prefixed with
> > od_ls_.*().  Maybe it makes sense to rename this to ls_get_acl_flags()
> > to be inline with ls_has_lb_vip()?
> >
>
> I assume the test failure is just due to the rebase and can be easily
> fixed.  The function rename can be done at apply time then:
>
> Acked-by: Dumitru Ceara 

Thanks Dumitru.

I applied this patch to the main branch with the below changes.

*
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index b137cba6b6..97cdfa7d69 

Re: [ovs-dev] [v2 v2 0/6] MFEX Infrastructure + Optimizations

2021-05-11 Thread Van Haaren, Harry
> -Original Message-
> From: Timothy Redaelli 
> Sent: Monday, May 10, 2021 6:43 PM
> To: Amber, Kumar ; d...@openvswitch.org
> Cc: i.maxim...@ovn.org; jhs...@redhat.com; f...@redhat.com; Van Haaren, Harry
> 
> Subject: Re: [ovs-dev] [v2 v2 0/6] MFEX Infrastructure + Optimizations



> >
> 
> Hi,
> we (as Red Hat) did some tests with a "special" build created on top of
> master (a019868a6268 at that time) with with the 2 series ("DPIF
> Framework + Optimizations" and "MFEX Infrastructure + Optimizations")
> cherry-picked.
> The spec file was also modified in order to use add "-msse4.2 -mpopcnt"
> to OVS CFLAGS.

Hi Timothy,

Thanks for testing and reporting back your findings! Most of the configuration 
is clear to me, but I have a few open questions inline below for context.

The performance numbers reported in the email below do not show benefit when 
enabling AVX512, which contradicts our
recent whitepaper on benchmarking an Optimized Deployment of OVS, which 
includes the AVX512 patches you've benchmarked too.
Specifically Table 8. for DPIF/MFEX patches, and Table 9. for the overall 
optimizations at a platform level are relevant:
https://networkbuilders.intel.com/solutionslibrary/open-vswitch-optimized-deployment-benchmark-technology-guide

Based on the differences between these performance reports, there must be some 
discrepancy in our testing/measurements.
I hope that the questions below help us understand any differences so we can 
all measure the benefits from these optimizations.

Regards, -Harry


> RPM=openvswitch2.15-2.15.0-37.avx512.1.el8fdp (the "special" build with
> the patches backported)
> 
>* Master --- 15.2 Mpps
>* Plus "avx512_gather 3" Only --- 15.2 Mpps
>* Plus "dpif-set dpif_avx512" Only --- 10.1 Mpps
>* Plus "miniflow-parser-set study" --- Failed to converge
>* Plus all three --- 13.5 Mpps

Open questions:
1) Is CPU frequency turbo enabled in any scenario, or always pinned to the 2.6 
GHz base frequency?
   - A "perf top -C x,y"   (where x,y are datapath hyperthread ids) would be 
interesting to compare with 3) below.

2) "plus Avx512 gather 3" (aka, DPCLS in AVX512), we see same performance. Is 
DPCLS in use, or is EMC doing all the work?
   - The output of " ovs-appctl dpif-netdev/pmd-perf-show" would be interesting 
to understand where packets are classified. 

3) "dpif-set dpif_avx512" only. The performance here is very strange, with ~30% 
reduction, while our testing shows performance improvement.
   - A "perf top" here (compared vs step 1) would be helpful to see what is 
going on

4) "miniflow parser set study", I don't understand what is meant by "Failed to 
converge"?
   - Is the traffic running in your benchmark Ether()/IP()/UDP() ?
   - Note that the only traffic pattern accelerated today is Ether()/IP()/UDP() 
(see patch 
https://patchwork.ozlabs.org/project/openvswitch/patch/20210428091931.2090062-5-kumar.am...@intel.com/
 for details). The next revision of the patchset will include other traffic 
patterns, for example Ether()/Dot1Q()/IP()/UDP() and Ether()/IP()/TCP().


> RPM=openvswitch2.15-2.15.0-15.el8fdp (w/o "-msse4.2 -mpopcnt")
>* 15.2 Mpps

5) What CFLAGS "-march=" CPU ISA and "-O" optimization options are being used 
for the package?
   - It is likely that "-msse4.2 -mpopcnt" is already implied if -march=corei7 
or Nehalem for example.



> P2P benchmark
>* ovs-dpdk/25 Gb i40e <-> trex/i40e
>* single queue two pmd's --- two HT's  out of a CPU core.
> 
> Host CPU
> Model name:  Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz

Thanks for detailing the configuration, and looking forward to understanding 
the configuration/performance better.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 1/1] dpdk: Add debug appctl to get malloc statistics.

2021-05-11 Thread Eelco Chaudron



On 2 May 2021, at 9:13, Eli Britstein wrote:

> New appctl 'dpdk/get-malloc-stats' implemented to get result of
> 'rte_malloc_dump_stats()' function.
>
> Could be used for debugging.

This patch looks good, however, my suggestion on the first patchset was to 
include the rte_malloc_get_socket_stats() output for all sockets in the system 
with this command. Or was there a specific reason to abandon this?

> Signed-off-by: Eli Britstein 
> Reviewed-by: Salem Sol 
> ---
>  NEWS |  2 ++
>  lib/dpdk-unixctl.man |  2 ++
>  lib/dpdk.c   | 30 ++
>  3 files changed, 34 insertions(+)
>
> diff --git a/NEWS b/NEWS
> index 95cf922aa..705baa90d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,8 @@ Post-v2.15.0
>   * New option '--no-record-hostname' to disable hostname configuration
> in ovsdb on startup.
>   * New command 'record-hostname-if-not-set' to update hostname in ovsdb.
> +   - DPDK:
> + * New debug appctl command 'dpdk/get-malloc-stats'.
>
>
>  v2.15.0 - 15 Feb 2021
> diff --git a/lib/dpdk-unixctl.man b/lib/dpdk-unixctl.man
> index 2d6d576f2..a0d1fa2ea 100644
> --- a/lib/dpdk-unixctl.man
> +++ b/lib/dpdk-unixctl.man
> @@ -10,5 +10,7 @@ list of words separated by spaces: a word can be either a 
> logging \fBlevel\fR
>  \fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching DPDK
>  components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) 
> separated
>  by a colon from the logging \fBlevel\fR to apply.
> +.IP "\fBdpdk/get-malloc-stats\fR"
> +Prints the heap information statistics about DPDK malloc.
>  .RE
>  .
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index 319540394..a22de66eb 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -356,6 +357,33 @@ dpdk_unixctl_log_set(struct unixctl_conn *conn, int 
> argc, const char *argv[],
>  unixctl_command_reply(conn, NULL);
>  }
>
> +static void
> +dpdk_unixctl_get_malloc_stats(struct unixctl_conn *conn,
> +  int argc OVS_UNUSED,
> +  const char *argv[] OVS_UNUSED,
> +  void *aux OVS_UNUSED)
> +{
> +char *response = NULL;
> +FILE *stream;
> +size_t size;
> +
> +stream = open_memstream(, );
> +if (!stream) {
> +response = xasprintf("Unable to open memstream: %s.",
> + ovs_strerror(errno));
> +unixctl_command_reply_error(conn, response);
> +goto out;
> +}
> +
> +rte_malloc_dump_stats(stream, NULL);
> +
> +fclose(stream);
> +
> +unixctl_command_reply(conn, response);
> +out:
> +free(response);
> +}
> +
>  static bool
>  dpdk_init__(const struct smap *ovs_other_config)
>  {
> @@ -525,6 +553,8 @@ dpdk_init__(const struct smap *ovs_other_config)
>   dpdk_unixctl_mem_stream, rte_log_dump);
>  unixctl_command_register("dpdk/log-set", "{level | pattern:level}", 0,
>   INT_MAX, dpdk_unixctl_log_set, NULL);
> +unixctl_command_register("dpdk/get-malloc-stats", "", 0, 0,
> + dpdk_unixctl_get_malloc_stats, NULL);
>
>  /* We are called from the main thread here */
>  RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID;
> -- 
> 2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: automake: Add support for sphinx 4.0.

2021-05-11 Thread David Marchand
On Mon, May 10, 2021 at 5:57 PM Ilya Maximets  wrote:
>
> File layout for man pages in sphinx 4 by default changed [1] from:
>
>   Documentation/_ref/man/page.section
>
>  to:
>
>   Documentation/_ref/man/section/page.section
>
> Ajusting our build scripts so they will be able to locate files
> in new places.  This fixes our CI build.
>
> [1] https://github.com/sphinx-doc/sphinx/issues/7996
>
> Signed-off-by: Ilya Maximets 

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto-dpif-upcall: Improve concurrency by adjust flow-limit

2021-05-11 Thread 0-day Robot
Bleep bloop.  Greetings Tao YunXiang, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 85 characters long (recommended limit is 79)
#46 FILE: ofproto/ofproto-dpif-upcall.c:968:
 * when the duration is small, increase the flow-limit, and vice 
versa */

Lines checked: 61, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] ovn-northd.c: Don't use DPG for MC_UNKNOWN related lflows.

2021-05-11 Thread Dumitru Ceara
On 5/10/21 11:59 PM, Han Zhou wrote:
> In commit b468c2c1 it avoided using DPG for multicast related lflows, but
> commit dd94f126 update the MC_UNKNOWN related lflows with DPG used again
> by mistake. This patch fixes it to avoid problems when a lflow using
> MC_UNKNOWN is monitored by a ovn-controller before the related mc-group
> is monitored, which could cause ovn-controller problem due to an
> incomplete dependency handling in I-P.
> 
> This change can be removed (and applying DPG for all the other mc-group
> related lflows) when the I-P dependency handling problem is fixed.
> 
> This change also fixes the ovn-northd-ddlog part, which missed using
> UniqueFlow() for MC_UNKNOWN from the initial ddlog commit 0e77b3bcbf.
> 
> Fixes: dd94f1266c ("northd: MAC learning: Add logical flows for fdb.")
> Fixes: 0e77b3bcbf ("ovn-northd-ddlog: New implementation of ovn-northd based 
> on ddlog.")
> Cc: Numan Siddique 
> Cc: Leonid Ryzhyk 
> Co-authored-by: Dumitru Ceara 

LGTM.  For the ddlog part:

Signed-off-by: Dumitru Ceara 

> Signed-off-by: Han Zhou 
> ---

Thanks!
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] ofproto-dpif-upcall: Fix race condition while purging

2021-05-11 Thread Roi Dayan
From: Jianbo Liu 

There is a race condidtion between purger and handler. Handler may
create new ukey and install it while executing 'ovs-appctl
revalidator/purge' command. However, before handler calls
transition_ukey() in handle_upcalls(), purger may get this ukey from
umap, then evict and delete it. This will trigger ovs_abort in
transition_ukey() for handler because it is trying to set state to
EVICTED or OPERATIONAL, but ukey is already in DELETED state.
To fix this issue, purger must not delete ukey in VISIBLE state.

Fixes: 98bb4286970d ("tests: Add command to purge revalidators of flows.")
Signed-off-by: Jianbo Liu 
Reviewed-by: Roi Dayan 
---
 ofproto/ofproto-dpif-upcall.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index ccf97266c0b9..3add505ff652 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -327,6 +327,12 @@ struct ukey_op {
 struct dpif_op dop;   /* Flow operation. */
 };
 
+enum sweep_type {
+PURGE_NONE,
+PURGE_SOFT,
+PURGE_HARD,
+};
+
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 static struct ovs_list all_udpifs = OVS_LIST_INITIALIZER(_udpifs);
 
@@ -345,7 +351,7 @@ static unsigned long udpif_get_n_flows(struct udpif *);
 static void revalidate(struct revalidator *);
 static void revalidator_pause(struct revalidator *);
 static void revalidator_sweep(struct revalidator *);
-static void revalidator_purge(struct revalidator *);
+static void revalidator_purge(struct revalidator *, enum sweep_type);
 static void upcall_unixctl_show(struct unixctl_conn *conn, int argc,
 const char *argv[], void *aux);
 static void upcall_unixctl_disable_megaflows(struct unixctl_conn *, int argc,
@@ -541,7 +547,7 @@ udpif_stop_threads(struct udpif *udpif, bool delete_flows)
 
 if (delete_flows) {
 for (i = 0; i < udpif->n_revalidators; i++) {
-revalidator_purge(>revalidators[i]);
+revalidator_purge(>revalidators[i], PURGE_HARD);
 }
 }
 
@@ -2772,7 +2778,8 @@ revalidator_pause(struct revalidator *revalidator)
 }
 
 static void
-revalidator_sweep__(struct revalidator *revalidator, bool purge)
+revalidator_sweep__(struct revalidator *revalidator,
+enum sweep_type sweep_type)
 {
 struct udpif *udpif;
 uint64_t dump_seq, reval_seq;
@@ -2803,13 +2810,13 @@ revalidator_sweep__(struct revalidator *revalidator, 
bool purge)
 }
 ukey_state = ukey->state;
 if (ukey_state == UKEY_OPERATIONAL
-|| (ukey_state == UKEY_VISIBLE && purge)) {
+|| (ukey_state == UKEY_VISIBLE && sweep_type == PURGE_HARD)) {
 struct recirc_refs recircs = RECIRC_REFS_EMPTY_INITIALIZER;
 bool seq_mismatch = (ukey->dump_seq != dump_seq
  && ukey->reval_seq != reval_seq);
 enum reval_result result;
 
-if (purge) {
+if (sweep_type > PURGE_NONE) {
 result = UKEY_DELETE;
 } else if (!seq_mismatch) {
 result = UKEY_KEEP;
@@ -2856,13 +2863,13 @@ revalidator_sweep__(struct revalidator *revalidator, 
bool purge)
 static void
 revalidator_sweep(struct revalidator *revalidator)
 {
-revalidator_sweep__(revalidator, false);
+revalidator_sweep__(revalidator, PURGE_NONE);
 }
 
 static void
-revalidator_purge(struct revalidator *revalidator)
+revalidator_purge(struct revalidator *revalidator, enum sweep_type sweep_type)
 {
-revalidator_sweep__(revalidator, true);
+revalidator_sweep__(revalidator, sweep_type);
 }
 
 /* In reaction to dpif purge, purges all 'ukey's with same 'pmd_id'. */
@@ -3056,7 +3063,7 @@ upcall_unixctl_purge(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 int n;
 
 for (n = 0; n < udpif->n_revalidators; n++) {
-revalidator_purge(>revalidators[n]);
+revalidator_purge(>revalidators[n], PURGE_SOFT);
 }
 }
 unixctl_command_reply(conn, "");
-- 
2.8.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12 00/11] Add offload support for sFlow

2021-05-11 Thread Chris Mi

On 4/27/2021 9:23 AM, Chris Mi wrote:

On 3/24/2021 8:14 PM, Ilya Maximets wrote:

On 3/24/21 10:17 AM, Chris Mi wrote:

On 3/23/2021 10:24 PM, Ilya Maximets wrote:

On 3/5/21 4:27 AM, Chris Mi wrote:

Hi Ilya,

I think about your suggestion recently. But I'm still not very 
clear about the design.

Please see my reply below:

On 3/1/2021 8:48 PM, Ilya Maximets wrote:

On 3/1/21 9:30 AM, Chris Mi wrote:

Hi Simon, Ilya,

Could I know what should we do to make progress for this patch set?
It has been posted in the community for a long time 

In general, the way to get your patches reviewed is to review other
patches.  It's simply because we still have a huge review backlog
(214 patches right now in patchwork and most of them needs review)
and bugfixes usually has a bit higher priority.  By reviewing other
patches you're reducing amount of work for maintainers so they can
get to your patches faster.

OK, I see.

For the series and comments from Eelco:
I didn't read the patches carefully, only a quick glance, but I 
still
do not understand why we need a separate thread to poll psample 
events.

Why can't we just allow usual handler threads to do that?
I'm not sure if you are aware of that the psample netlink is 
different from the ovs
netlink. Without offload, kernel sends missed packet and sFlow 
packet to userspace
using the same netlink 'ovs_packet_family'. So they can use the 
same handler thread.
But in order to offload sFlow action, we use psample kernel module 
to send sampled
packets from kernel to userspace. The format for ovs netlink 
message and psample

netlink messages are different.

Hi.  Sorry for late reply.

Yes, I understand that message format is different, but it doesn't 
really

matter.  All the message-specific handling and parsing should happen
inside the netdev_offload_tc_recv().  This function should have a 
prototype
similar to dpif_netlink_recv(), i.e. it should receive a pointer to 
the

struct dpif_upcall from the caller and fill it with data. Maybe other
type of a generic data structure if it's really not possible to 
construct

struct dpif_upcall.



 From the
architecture perspective it's not a good thing to call ofproto code
from the offload-provider.  This introduces lots of complications
and might cause lots of issues in the future.

I'd say that design should look like this:

 handler thread ->
   dpif_recv() ->
 dpif_netlink_recv() ->
   netdev_offload_recv() ->
 netdev_offload_tc_recv() ->
   nl_sock_recv()
In order to use the handler thread, I'm not sure if we should add 
psample socket
fd to every handler's epoll_fd. If we should do that, we should 
consider how to
differentiate if the event comes from ovs or psample netlink. 
Maybe we should
allocate a special port number for psample and assign it to 
event.data.u32.
Anyway, that's the details. If this is the right direction, I'll 
think about it.

Last three calls should be implemented.

The better version of this will be to throw away dpif part from
above call chain and make it:

 handler thread ->
   netdev_offload_recv() ->
 netdev_offload_tc_recv() ->
   nl_sock_recv()
If we throw away dpif part, maybe we have to write some duplicate 
epoll code
for psample only. And we can't block in nl_sock_recv(). Maybe we 
have to add
psample socket fd to every handler's epoll_fd. So we have to 
change the dpif

somehow.

There is no need to implement any epoll_fd logic for psample.
You may only use handler with handler_id == 0.  Only this handler 
will receive
psample upcalls.  netdev_offload_recv_wait() should be implemented 
similar
to how dpif_netlink_recv_wait() implemented for windows case, i.e. 
it will

call nl_sock_wait(nl_sock, POLLIN) for the psample netlink socket.
There is no need to block and you're never blocking anywhere 
because your're
calling nl_sock_recv(..., wait = false) which will use MSG_DONTWAIT 
while

actually receiving a message.

Hi Ilya,

With this new design, the code will be changed greatly. I'm not 
unwilling to change it.
The effort is not small. So before I start, I want to make sure that 
it is feasible.


Yes, we won't block in nl_sock_recv(), but the handler thread will 
be blocked in poll_block().
If any of the vport netlink socket is readable , it will be waken up 
by kernel. If we don't use
epoll_ctl to add the psample netlink socket to the handler's epoll 
fd, we can't receive the
psample packet as soon as possible. That means we can only receive 
the sampled packet

when there is a miss upcall.

I added some debug message in ovs epoll code and got above 
conclusion. But I'm not

the expert of the epoll API, so I'm not sure if I missed anything.
Handler thread wakes up on POLLIN on epoll_fd itself, not on the 
event on one
of the file descriptors added to epoll.  epoll_fd is added to a 
thread's poll

loop by
   poll_fd_wait(handler->epoll_fd, POLLIN);

If you will call nl_sock_wait() on psample socket, this 

[ovs-dev] [PATCH] ofproto-dpif-upcall: Improve concurrency by adjust flow-limit

2021-05-11 Thread Tao YunXiang
OVS uses a set of gentle mechanisms in ofproto-dpif-upcall.c to control
the number of flows in the datapath. However, this mechanism will make it
difficult for the number of datapath flows to reach a high level in a short 
time.

Through some tests, I found that the problem is that the flow-limit
increases too slowly, but decreases very quickly. We need adjust the
algorithm to make it increase faster as it decreases.

Through this change, the test result of Apache benchmark can reach 12k from 5k,
the test command is as follows:

ab -n 20 -c 500 -r http://1.1.1.2/


Author: Tao YunXiang 
Signed-off-by: Tao YunXiang 
Cc: Ben Pfaff 
---
 ofproto/ofproto-dpif-upcall.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index ccf97266c..1ae2ba5e2 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -445,7 +445,7 @@ udpif_create(struct dpif_backer *backer, struct dpif *dpif)
 
 udpif->dpif = dpif;
 udpif->backer = backer;
-atomic_init(>flow_limit, MIN(ofproto_flow_limit, 1));
+atomic_init(>flow_limit, ofproto_flow_limit);
 udpif->reval_seq = seq_create();
 udpif->dump_seq = seq_create();
 latch_init(>exit_latch);
@@ -963,13 +963,13 @@ udpif_revalidator(void *arg)
 
 duration = MAX(time_msec() - start_time, 1);
 udpif->dump_duration = duration;
-if (duration > 2000) {
+
+/* Use duration as a reference, adjust the number of flow_limit,
+ * when the duration is small, increase the flow-limit, and vice 
versa */
+if (duration >= 1000) {
 flow_limit /= duration / 1000;
-} else if (duration > 1300) {
-flow_limit = flow_limit * 3 / 4;
-} else if (duration < 1000 &&
-   flow_limit < n_flows * 1000 / duration) {
-flow_limit += 1000;
+} else {
+flow_limit *= 1000 / duration;
 }
 flow_limit = MIN(ofproto_flow_limit, MAX(flow_limit, 1000));
 atomic_store_relaxed(>flow_limit, flow_limit);
-- 
2.17.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev