On Tue, Aug 27, 2024 at 11:00:10AM +0200, Eelco Chaudron wrote:
> It is observed in some environments that there are much more ukeys than
> actual DP flows. For example:
> 
> $ ovs-appctl upcall/show
> system@ovs-system:
> flows : (current 7) (avg 6) (max 117) (limit 2125)
> offloaded flows : 525
> dump duration : 1063ms
> ufid enabled : true
> 
> 23: (keys 3612)
> 24: (keys 3625)
> 25: (keys 3485)
> 
> The revalidator threads are busy revalidating the stale ukeys leading to
> high CPU and long dump duration.
> 
> This patch tracks the number of consecutive missed dumps. If four dumps
> are missed in a row, it is assumed that the datapath flow no longer
> exists, and the ukey can be deleted.
> 
> Reported-by: Roi Dayan <[email protected]>
> Co-authored-by: Han Zhou <[email protected]>
> Co-authored-by: Roi Dayan <[email protected]>
> Signed-off-by: Han Zhou <[email protected]>
> Signed-off-by: Roi Dayan <[email protected]>
> Signed-off-by: Eelco Chaudron <[email protected]>
> ---
> 
> v3: - Rewrote fix to use actual dump state, and added a tests case.
> v4: - Added reason to end of flow_del_reason.
>     - Rather than time based, make it missed dumps based.
>     - Changed it from a TC specific test to a generic unit test.
> ---
>  ofproto/ofproto-dpif-upcall.c                | 18 +++++++++
>  tests/ofproto-dpif.at                        | 39 ++++++++++++++++++++
>  utilities/usdt-scripts/flow_reval_monitor.py |  4 +-
>  3 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 4d39bc5a7..053cfd863 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
>  COVERAGE_DEFINE(dumped_new_flow);
>  COVERAGE_DEFINE(handler_duplicate_upcall);
>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
> +COVERAGE_DEFINE(revalidate_missing_dp_flow);
>  COVERAGE_DEFINE(ukey_dp_change);
>  COVERAGE_DEFINE(ukey_invalid_stat_reset);
>  COVERAGE_DEFINE(ukey_replace_contention);
> @@ -284,6 +285,7 @@ enum flow_del_reason {
>      FDR_TOO_EXPENSIVE,      /* Too expensive to revalidate. */
>      FDR_UPDATE_FAIL,        /* Datapath update failed. */
>      FDR_XLATION_ERROR,      /* Flow translation error. */
> +    FDR_FLOW_MISSING_DP,    /* Flow is missing from the datapath. */
>  };
>  
>  /* 'udpif_key's are responsible for tracking the little bit of state udpif
> @@ -318,6 +320,7 @@ struct udpif_key {
>      uint64_t dump_seq OVS_GUARDED;            /* Tracks udpif->dump_seq. */
>      uint64_t reval_seq OVS_GUARDED;           /* Tracks udpif->reval_seq. */
>      enum ukey_state state OVS_GUARDED;        /* Tracks ukey lifetime. */
> +    uint32_t missed_dumps OVS_GUARDED;        /* Missed consecutive dumps. */
>  
>      /* 'state' debug information. */
>      unsigned int state_thread OVS_GUARDED;    /* Thread that transitions. */
> @@ -3040,6 +3043,21 @@ revalidator_sweep__(struct revalidator *revalidator, 
> bool purge)
>                      result = revalidate_ukey(udpif, ukey, &stats, 
> &odp_actions,
>                                               reval_seq, &recircs, 
> &del_reason);
>                  }

Hi Eelco,

In practice this probably doesn't make much difference, but it caught my eye.

Is it intentional that the logic below may override
del_reason and result even in cases where result is
set to a value other than UKEY_KEEP above (not visible in diff)?

IOW, is it intentional that this reason overrides the others,
even after work has been done determining the others?

> +
> +                if (ukey->dump_seq != dump_seq) {
> +                    ukey->missed_dumps++;
> +                    if (ukey->missed_dumps >= 4) {
> +                        /* If the flow was not dumped for 4 revalidator 
> rounds,
> +                         * we can assume the datapath flow now longer exists
> +                         * and the ukey should be deleted. */
> +                        COVERAGE_INC(revalidate_missing_dp_flow);
> +                        del_reason = FDR_FLOW_MISSING_DP;
> +                        result = UKEY_DELETE;
> +                    }
> +                } else {
> +                    ukey->missed_dumps = 0;
> +                }
> +
>                  if (result != UKEY_KEEP) {
>                      /* Clears 'recircs' if filled by revalidate_ukey(). */
>                      reval_op_init(&ops[n_ops++], result, udpif, ukey, 
> &recircs,

...
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to