Re: [ovs-dev] [PATCH v7 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-01-30 Thread Simon Horman
On Fri, Jan 26, 2024 at 10:55:48AM -0500, Aaron Conole wrote:
> Simon Horman  writes:
> 
> > On Thu, Jan 25, 2024 at 12:05:29PM -0500, Aaron Conole wrote:
> >> From: Kevin Sprague 
> >> 
> >> During normal operations, it is useful to understand when a particular flow
> >> gets removed from the system. This can be useful when debugging performance
> >> issues tied to ofproto flow changes, trying to determine deployed traffic
> >> patterns, or while debugging dynamic systems where ports come and go.
> >> 
> >> Prior to this change, there was a lack of visibility around flow 
> >> expiration.
> >> The existing debugging infrastructure could tell us when a flow was added 
> >> to
> >> the datapath, but not when it was removed or why.
> >> 
> >> This change introduces a USDT probe at the point where the revalidator
> >> determines that the flow should be removed.  Additionally, we track the
> >> reason for the flow eviction and provide that information as well.  With
> >> this change, we can track the complete flow lifecycle for the netlink
> >> datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
> >> the revaldiator USDT, letting us watch as flows are added and removed from
> >> the kernel datapath.
> >> 
> >> This change only enables this information via USDT probe, so it won't be
> >> possible to access this information any other way (see:
> >> Documentation/topics/usdt-probes.rst).
> >> 
> >> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
> >> which serves as a demonstration of how the new USDT probe might be used
> >> going forward.
> >> 
> >> Signed-off-by: Kevin Sprague 
> >> Co-authored-by: Aaron Conole 
> >> Signed-off-by: Aaron Conole 
> >
> > Hi Aaron,
> >
> > has any consideration been given to adding tests for this feature?
> 
> I don't think so.  Actually, we don't have any tests for any of the USDT
> probes / scripts.  At the very least, we could probably update some of
> the existing tests to take advantage of these tracepoints to verify the
> information being captured.  That's a much bigger task, though.

Thanks Aaron,

I understand your point that this is a bigger task.
And I have no objections to treating it as a follow-up.

> Maybe Adrian has opinions on it as well, since Retis may use this
> information to gather insights.
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-01-26 Thread Aaron Conole
Simon Horman  writes:

> On Thu, Jan 25, 2024 at 12:05:29PM -0500, Aaron Conole wrote:
>> From: Kevin Sprague 
>> 
>> During normal operations, it is useful to understand when a particular flow
>> gets removed from the system. This can be useful when debugging performance
>> issues tied to ofproto flow changes, trying to determine deployed traffic
>> patterns, or while debugging dynamic systems where ports come and go.
>> 
>> Prior to this change, there was a lack of visibility around flow expiration.
>> The existing debugging infrastructure could tell us when a flow was added to
>> the datapath, but not when it was removed or why.
>> 
>> This change introduces a USDT probe at the point where the revalidator
>> determines that the flow should be removed.  Additionally, we track the
>> reason for the flow eviction and provide that information as well.  With
>> this change, we can track the complete flow lifecycle for the netlink
>> datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
>> the revaldiator USDT, letting us watch as flows are added and removed from
>> the kernel datapath.
>> 
>> This change only enables this information via USDT probe, so it won't be
>> possible to access this information any other way (see:
>> Documentation/topics/usdt-probes.rst).
>> 
>> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
>> which serves as a demonstration of how the new USDT probe might be used
>> going forward.
>> 
>> Signed-off-by: Kevin Sprague 
>> Co-authored-by: Aaron Conole 
>> Signed-off-by: Aaron Conole 
>
> Hi Aaron,
>
> has any consideration been given to adding tests for this feature?

I don't think so.  Actually, we don't have any tests for any of the USDT
probes / scripts.  At the very least, we could probably update some of
the existing tests to take advantage of these tracepoints to verify the
information being captured.  That's a much bigger task, though.

Maybe Adrian has opinions on it as well, since Retis may use this
information to gather insights.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-01-26 Thread Simon Horman
On Thu, Jan 25, 2024 at 12:05:29PM -0500, Aaron Conole wrote:
> From: Kevin Sprague 
> 
> During normal operations, it is useful to understand when a particular flow
> gets removed from the system. This can be useful when debugging performance
> issues tied to ofproto flow changes, trying to determine deployed traffic
> patterns, or while debugging dynamic systems where ports come and go.
> 
> Prior to this change, there was a lack of visibility around flow expiration.
> The existing debugging infrastructure could tell us when a flow was added to
> the datapath, but not when it was removed or why.
> 
> This change introduces a USDT probe at the point where the revalidator
> determines that the flow should be removed.  Additionally, we track the
> reason for the flow eviction and provide that information as well.  With
> this change, we can track the complete flow lifecycle for the netlink
> datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
> the revaldiator USDT, letting us watch as flows are added and removed from
> the kernel datapath.
> 
> This change only enables this information via USDT probe, so it won't be
> possible to access this information any other way (see:
> Documentation/topics/usdt-probes.rst).
> 
> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
> which serves as a demonstration of how the new USDT probe might be used
> going forward.
> 
> Signed-off-by: Kevin Sprague 
> Co-authored-by: Aaron Conole 
> Signed-off-by: Aaron Conole 

Hi Aaron,

has any consideration been given to adding tests for this feature?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-01-25 Thread Aaron Conole
From: Kevin Sprague 

During normal operations, it is useful to understand when a particular flow
gets removed from the system. This can be useful when debugging performance
issues tied to ofproto flow changes, trying to determine deployed traffic
patterns, or while debugging dynamic systems where ports come and go.

Prior to this change, there was a lack of visibility around flow expiration.
The existing debugging infrastructure could tell us when a flow was added to
the datapath, but not when it was removed or why.

This change introduces a USDT probe at the point where the revalidator
determines that the flow should be removed.  Additionally, we track the
reason for the flow eviction and provide that information as well.  With
this change, we can track the complete flow lifecycle for the netlink
datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
the revaldiator USDT, letting us watch as flows are added and removed from
the kernel datapath.

This change only enables this information via USDT probe, so it won't be
possible to access this information any other way (see:
Documentation/topics/usdt-probes.rst).

Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
which serves as a demonstration of how the new USDT probe might be used
going forward.

Signed-off-by: Kevin Sprague 
Co-authored-by: Aaron Conole 
Signed-off-by: Aaron Conole 
---
 Documentation/topics/usdt-probes.rst |   1 +
 ofproto/ofproto-dpif-upcall.c|  42 +-
 utilities/automake.mk|   3 +
 utilities/usdt-scripts/flow_reval_monitor.py | 653 +++
 4 files changed, 693 insertions(+), 6 deletions(-)
 create mode 100755 utilities/usdt-scripts/flow_reval_monitor.py

diff --git a/Documentation/topics/usdt-probes.rst 
b/Documentation/topics/usdt-probes.rst
index e527f43bab..a8da9bb1f7 100644
--- a/Documentation/topics/usdt-probes.rst
+++ b/Documentation/topics/usdt-probes.rst
@@ -214,6 +214,7 @@ Available probes in ``ovs_vswitchd``:
 - dpif_recv:recv_upcall
 - main:poll_block
 - main:run_start
+- revalidate:flow_result
 - revalidate_ukey\_\_:entry
 - revalidate_ukey\_\_:exit
 - udpif_revalidator:start_dump
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index b5cbeed878..97d75833f7 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -269,6 +269,18 @@ enum ukey_state {
 };
 #define N_UKEY_STATES (UKEY_DELETED + 1)
 
+enum flow_del_reason {
+FDR_REVALIDATE = 0, /* The flow was revalidated. */
+FDR_FLOW_IDLE,  /* The flow went unused and was deleted. */
+FDR_TOO_EXPENSIVE,  /* The flow was too expensive to revalidate. */
+FDR_FLOW_WILDCARDED,/* The flow needed a narrower wildcard mask. */
+FDR_BAD_ODP_FIT,/* The flow had a bad ODP flow fit. */
+FDR_NO_OFPROTO, /* The flow didn't have an associated ofproto. */
+FDR_XLATION_ERROR,  /* There was an error translating the flow. */
+FDR_AVOID_CACHING,  /* Flow deleted to avoid caching. */
+FDR_FLOW_LIMIT, /* All flows being killed. */
+};
+
 /* 'udpif_key's are responsible for tracking the little bit of state udpif
  * needs to do flow expiration which can't be pulled directly from the
  * datapath.  They may be created by any handler or revalidator thread at any
@@ -2272,7 +2284,8 @@ populate_xcache(struct udpif *udpif, struct udpif_key 
*ukey,
 static enum reval_result
 revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
   uint16_t tcp_flags, struct ofpbuf *odp_actions,
-  struct recirc_refs *recircs, struct xlate_cache *xcache)
+  struct recirc_refs *recircs, struct xlate_cache *xcache,
+  enum flow_del_reason *reason)
 {
 struct xlate_out *xoutp;
 struct netflow *netflow;
@@ -2293,11 +2306,13 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,
 netflow = NULL;
 
 if (xlate_ukey(udpif, ukey, tcp_flags, )) {
+*reason = FDR_XLATION_ERROR;
 goto exit;
 }
 xoutp = 
 
 if (xoutp->avoid_caching) {
+*reason = FDR_AVOID_CACHING;
 goto exit;
 }
 
@@ -2311,6 +2326,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,
 ofpbuf_clear(odp_actions);
 
 if (!ofproto) {
+*reason = FDR_NO_OFPROTO;
 goto exit;
 }
 
@@ -2322,6 +2338,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,
 if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, _mask, ,
  NULL)
 == ODP_FIT_ERROR) {
+*reason = FDR_BAD_ODP_FIT;
 goto exit;
 }
 
@@ -2331,6 +2348,7 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,
  * down.  Note that we do not know if the datapath has ignored any of the
  * wildcarded bits, so we may be overly conservative here. */
 if