Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Steven Rostedt
On Fri, 23 Feb 2024 13:46:53 -0500
Steven Rostedt  wrote:

> Now one thing I could do is to not remove the parameter, but just add:
> 
>   WARN_ON_ONCE((src) != __data_offsets->item##_ptr_);
> 
> in the __assign_str() macro to make sure that it's still the same that is
> assigned. But I'm not sure how useful that is, and still causes burden to
> have it. I never really liked the passing of the string in two places to
> begin with.

Hmm, maybe I'll just add this patch for 6.9 and then in 6.10 do the
parameter removal.

-- Steve

diff --git a/include/trace/stages/stage6_event_callback.h 
b/include/trace/stages/stage6_event_callback.h
index 0c0f50bcdc56..7372e2c2a0c4 100644
--- a/include/trace/stages/stage6_event_callback.h
+++ b/include/trace/stages/stage6_event_callback.h
@@ -35,6 +35,7 @@ #define __assign_str(dst, src)
do {\
char *__str__ = __get_str(dst); \
int __len__ = __get_dynamic_array_len(dst) - 1; \
+   WARN_ON_ONCE((src) != __data_offsets.dst##_ptr_);   \
memcpy(__str__, __data_offsets.dst##_ptr_ ? :   \
   EVENT_NULL_STR, __len__);\
__str__[__len__] = '\0';\
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Steven Rostedt
On Fri, 23 Feb 2024 14:50:49 -0500
Kent Overstreet  wrote:

> Tangentially related though, what would make me really happy is if we
> could create the string with in the TP__fast_assign() section. I have to
> have a bunch of annoying wrappers right now because the string length
> has to be known when we invoke the tracepoint.

You can use __string_len() to determine the string length in the tracepoint
(which is executed in the TP_fast_assign() section).

My clean up patches will make __assign_str_len() obsolete too (I'm working
on them now), and you can just use __assign_str().

I noticed that I don't have a string_len example in the sample code and I'm
actually writing it now.

// cutting out everything else:

TRACE_EVENT(foo_bar,

TP_PROTO(const char *foo, int bar),

TP_ARGS(foo, bar),

TP_STRUCT__entry(
__string_len(   lstr,   foo,bar < strlen(foo) ? bar : 
strlen(foo) )
),

TP_fast_assign(
__assign_str(lstr, foo);

// Note, the above is with my updates, without them, you need to duplicate the 
logic

//  __assign_str_len(lstr, foo, bar < strlen(foo) ? bar : 
strlen(foo));
),

TP_printk("%s", __get_str(lstr))
);


The above will allocate "bar < strlen(foo) ? bar : strlen(foo)" size on the
ring buffer. As the size is already stored, my clean up code uses that
instead of requiring duplicating the logic again.

-- Steve
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Jeff Johnson
On 2/23/2024 9:56 AM, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> [
>This is a treewide change. I will likely re-create this patch again in
>the second week of the merge window of v6.9 and submit it then. Hoping
>to keep the conflicts that it will cause to a minimum.
> ]
> 
> With the rework of how the __string() handles dynamic strings where it
> saves off the source string in field in the helper structure[1], the
> assignment of that value to the trace event field is stored in the helper
> value and does not need to be passed in again.

Just curious if this could be done piecemeal by first changing the
macros to be variadic macros which allows you to ignore the extra
argument. The callers could then be modified in their separate trees.
And then once all the callers have be merged, the macros could be
changed to no longer be variadic.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Steven Rostedt
On Fri, 23 Feb 2024 10:30:45 -0800
Jeff Johnson  wrote:

> On 2/23/2024 9:56 AM, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" 
> > 
> > [
> >This is a treewide change. I will likely re-create this patch again in
> >the second week of the merge window of v6.9 and submit it then. Hoping
> >to keep the conflicts that it will cause to a minimum.
> > ]
> > 
> > With the rework of how the __string() handles dynamic strings where it
> > saves off the source string in field in the helper structure[1], the
> > assignment of that value to the trace event field is stored in the helper
> > value and does not need to be passed in again.  
> 
> Just curious if this could be done piecemeal by first changing the
> macros to be variadic macros which allows you to ignore the extra
> argument. The callers could then be modified in their separate trees.
> And then once all the callers have be merged, the macros could be
> changed to no longer be variadic.

I weighed doing that, but I think ripping off the band-aid is a better
approach. One thing I found is that leaving unused parameters in the macros
can cause bugs itself. I found one case doing my clean up, where an unused
parameter in one of the macros was bogus, and when I made it a used
parameter, it broke the build.

I think for tree-wide changes, the preferred approach is to do one big
patch at once. And since this only affects TRACE_EVENT() macros, it
hopefully would not be too much of a burden (although out of tree users may
suffer from this, but do we care?)

Now one thing I could do is to not remove the parameter, but just add:

WARN_ON_ONCE((src) != __data_offsets->item##_ptr_);

in the __assign_str() macro to make sure that it's still the same that is
assigned. But I'm not sure how useful that is, and still causes burden to
have it. I never really liked the passing of the string in two places to
begin with.

-- Steve
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Kent Overstreet
On Fri, Feb 23, 2024 at 01:46:53PM -0500, Steven Rostedt wrote:
> On Fri, 23 Feb 2024 10:30:45 -0800
> Jeff Johnson  wrote:
> 
> > On 2/23/2024 9:56 AM, Steven Rostedt wrote:
> > > From: "Steven Rostedt (Google)" 
> > > 
> > > [
> > >This is a treewide change. I will likely re-create this patch again in
> > >the second week of the merge window of v6.9 and submit it then. Hoping
> > >to keep the conflicts that it will cause to a minimum.
> > > ]
> > > 
> > > With the rework of how the __string() handles dynamic strings where it
> > > saves off the source string in field in the helper structure[1], the
> > > assignment of that value to the trace event field is stored in the helper
> > > value and does not need to be passed in again.  
> > 
> > Just curious if this could be done piecemeal by first changing the
> > macros to be variadic macros which allows you to ignore the extra
> > argument. The callers could then be modified in their separate trees.
> > And then once all the callers have be merged, the macros could be
> > changed to no longer be variadic.
> 
> I weighed doing that, but I think ripping off the band-aid is a better
> approach. One thing I found is that leaving unused parameters in the macros
> can cause bugs itself. I found one case doing my clean up, where an unused
> parameter in one of the macros was bogus, and when I made it a used
> parameter, it broke the build.
> 
> I think for tree-wide changes, the preferred approach is to do one big
> patch at once. And since this only affects TRACE_EVENT() macros, it
> hopefully would not be too much of a burden (although out of tree users may
> suffer from this, but do we care?)

Agreed on doing it all at once, it'll be way less spam for people to
deal with.

Tangentially related though, what would make me really happy is if we
could create the string with in the TP__fast_assign() section. I have to
have a bunch of annoying wrappers right now because the string length
has to be known when we invoke the tracepoint.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Steven Rostedt
On Fri, 23 Feb 2024 12:56:34 -0500
Steven Rostedt  wrote:

> Note, the same updates will need to be done for:
> 
>   __assign_str_len()
>   __assign_rel_str()
>   __assign_rel_str_len()

Correction: The below macros do not pass in their source to the entry
macros, so they will not need to be updated.

-- Steve

>   __assign_bitmask()
>   __assign_rel_bitmask()
>   __assign_cpumask()
>   __assign_rel_cpumask()

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/2] netlink-conntrack: Optimize flushing ct zone.

2024-02-23 Thread Felix Huettner via dev
Previously the kernel did not provide a netlink interface to flush/list
only conntrack entries matching a specific zone. With [1] and [2] it is now
possible to flush and list conntrack entries filtered by zone. Older
kernels not yet supporting this feature will ignore the filter.
For the list request that means just returning all entries (which we can
then filter in userspace as before).
For the flush request that means deleting all conntrack entries.

The implementation is now identical to the windows one, so we combine
them.

These significantly improves the performance of flushing conntrack zones
when the conntrack table is large. Since flushing a conntrack zone is
normally triggered via an openflow command it blocks the main ovs thread
and thereby also blocks new flows from being applied. Using this new
feature we can reduce the flushing time for zones by around 93%.

In combination with OVN the creation of a Logical_Router (which causes
the flushing of a ct zone) could block other operations, e.g. the
failover of Logical_Routers (as they cause new flows to be created).
This is visible from a user perspective as a ovn-controller that is idle
(as it waits for vswitchd) and vswitchd reporting:
"blocked 1000 ms waiting for main to quiesce" (potentially with ever
increasing times).

The following performance tests where run in a qemu vm with 500.000
conntrack entries distributed evenly over 500 ct zones using `ovstest
test-netlink-conntrack flush zone=`.

  |  flush zone with 1000 entries  |   flush zone with no entry |
  +-+--+-+--|
  |   with the patch| without  |   with the patch| without  |
  +--+--+--+--+--+--|
  | v6.8-rc4 |  v6.7.1  | v6.8-rc4 | v6.8-rc4 |  v6.7.1  | v6.8-rc4 |
+-+--+--+--+--+--+--|
| Min |  0.260   |  3.946   |  3.497   |  0.228   |  3.462   |  3.212   |
| Median  |  0.319   |  4.237   |  4.349   |  0.298   |  4.460   |  4.010   |
| 90%ile  |  0.335   |  4.367   |  4.522   |  0.325   |  4.662   |  4.572   |
| 99%ile  |  0.348   |  4.495   |  4.773   |  0.340   |  4.931   |  6.003   |
| Max |  0.362   |  4.543   |  5.054   |  0.348   |  5.390   |  6.396   |
| Mean|  0.320   |  4.236   |  4.331   |  0.296   |  4.430   |  4.071   |
| Total   |  80.02   |  1058|  1082|  73.93   |  1107|  1017|

[1]: 
https://github.com/torvalds/linux/commit/eff3c558bb7e61c41b53e4c8130e514a5a4df9ba
[2]: 
https://github.com/torvalds/linux/commit/fa173a1b4e3fd1ab5451cbc57de6fc624c824b0a

Co-Authored-By: Luca Czesla 
Signed-off-by: Luca Czesla 
Co-Authored-By: Max Lamprecht 
Signed-off-by: Max Lamprecht 
Signed-off-by: Felix Huettner 
---
v3->v4:
- combine the flush logic with windows implementation
v2->v3:
- update description to include upstream fix (Thanks to Ilya for finding
  that issue)
v1->v2:
- fixed wrong signed-off-by

 lib/netlink-conntrack.c | 52 -
 tests/system-traffic.at |  8 +++
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/lib/netlink-conntrack.c b/lib/netlink-conntrack.c
index 492bfcffb..d9015d9f0 100644
--- a/lib/netlink-conntrack.c
+++ b/lib/netlink-conntrack.c
@@ -141,6 +141,9 @@ nl_ct_dump_start(struct nl_ct_dump_state **statep, const 
uint16_t *zone,
 
 nl_msg_put_nfgenmsg(>buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK,
 IPCTNL_MSG_CT_GET, NLM_F_REQUEST);
+if (zone) {
+nl_msg_put_be16(>buf, CTA_ZONE, htons(*zone));
+}
 nl_dump_start(>dump, NETLINK_NETFILTER, >buf);
 ofpbuf_clear(>buf);
 
@@ -263,11 +266,9 @@ out:
 return err;
 }
 
-#ifdef _WIN32
-int
-nl_ct_flush_zone(uint16_t flush_zone)
+static int
+nl_ct_flush_zone_with_cta_zone(uint16_t flush_zone)
 {
-/* Windows can flush a specific zone */
 struct ofpbuf buf;
 int err;
 
@@ -282,24 +283,63 @@ nl_ct_flush_zone(uint16_t flush_zone)
 
 return err;
 }
+
+#ifdef _WIN32
+int
+nl_ct_flush_zone(uint16_t flush_zone)
+{
+return nl_ct_flush_zone_with_cta_zone(flush_zone);
+}
 #else
+
+static bool
+netlink_flush_supports_zone(void)
+{
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+static bool supported = false;
+
+if (ovsthread_once_start()) {
+if (ovs_kernel_is_version_or_newer(6, 8)) {
+supported = true;
+} else {
+VLOG_INFO("disabling conntrack flush by zone. "
+  "Not supported in Linux kernel");
+}
+ovsthread_once_done();
+}
+return supported;
+}
+
 int
 nl_ct_flush_zone(uint16_t flush_zone)
 {
-/* Apparently, there's no netlink interface to flush a specific zone.
+/* In older kernels, there was no netlink interface to flush a specific
+ * conntrack zone.
  * This code dumps every connection, checks the zone and eventually
  * delete 

[ovs-dev] [PATCH v4 1/2] util: Support checking for kernel versions.

2024-02-23 Thread Felix Huettner via dev
Extract checking for a given kernel version to a separate function.
It will be used also in the next patch.

Signed-off-by: Felix Huettner 
---
v4:
- extract function to check kernel version

 lib/netdev-linux.c | 14 +++---
 lib/util.c | 24 
 lib/util.h |  4 
 3 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index bf91ef462..51bd71ae3 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -6427,18 +6427,10 @@ getqdisc_is_safe(void)
 static bool safe = false;
 
 if (ovsthread_once_start()) {
-struct utsname utsname;
-int major, minor;
-
-if (uname() == -1) {
-VLOG_WARN("uname failed (%s)", ovs_strerror(errno));
-} else if (!ovs_scan(utsname.release, "%d.%d", , )) {
-VLOG_WARN("uname reported bad OS release (%s)", utsname.release);
-} else if (major < 2 || (major == 2 && minor < 35)) {
-VLOG_INFO("disabling unsafe RTM_GETQDISC in Linux kernel %s",
-  utsname.release);
-} else {
+if (ovs_kernel_is_version_or_newer(2, 35)) {
 safe = true;
+} else {
+VLOG_INFO("disabling unsafe RTM_GETQDISC in Linux kernel");
 }
 ovsthread_once_done();
 }
diff --git a/lib/util.c b/lib/util.c
index 3fb3a4b40..95c605af8 100644
--- a/lib/util.c
+++ b/lib/util.c
@@ -27,6 +27,7 @@
 #include 
 #ifdef __linux__
 #include 
+#include 
 #endif
 #include 
 #include 
@@ -2500,3 +2501,26 @@ OVS_CONSTRUCTOR(winsock_start) {
}
 }
 #endif
+
+#ifndef _WIN32
+bool
+ovs_kernel_is_version_or_newer(int target_major, int target_minor)
+{
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+static int current_major, current_minor = -1;
+
+if (ovsthread_once_start()) {
+struct utsname utsname;
+
+if (uname() == -1) {
+VLOG_WARN("uname failed (%s)", ovs_strerror(errno));
+} else if (!ovs_scan(utsname.release, "%d.%d",
+_major, _minor)) {
+VLOG_WARN("uname reported bad OS release (%s)", utsname.release);
+}
+ovsthread_once_done();
+}
+return current_major < target_major || (
+current_major == target_major && current_minor < target_minor);
+}
+#endif
diff --git a/lib/util.h b/lib/util.h
index f2d45bcac..57d8a2072 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -611,4 +611,8 @@ int ftruncate(int fd, off_t length);
 }
 #endif
 
+#ifndef _WIN32
+bool ovs_kernel_is_version_or_newer(int target_major, int target_minor);
+#endif
+
 #endif /* util.h */

base-commit: 5f2af0b7a30e7de84de97556223f892ef63ec14b
-- 
2.43.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev