On 29/05/2024 17:35, Mike Pattrick wrote:
> On Thu, May 23, 2024 at 6:47 AM Roi Dayan via dev
> <[email protected]> wrote:
>>
>> It is observed in some environments that there are much more ukeys than
>> actual DP flows. For example:
>>
>> $ ovs-appctl upcall/show
>> system@ovs-system:
>> flows : (current 7) (avg 6) (max 117) (limit 2125)
>> offloaded flows : 525
>> dump duration : 1063ms
>> ufid enabled : true
>>
>> 23: (keys 3612)
>> 24: (keys 3625)
>> 25: (keys 3485)
>>
>> The revalidator threads are busy revalidating the stale ukeys leading to
>> high CPU and long dump duration.
>>
>> There are some possible situations that may result in stale ukeys that
>> have no corresponding DP flows.
>>
>> In revalidator, push_dp_ops() doesn't check error if the op type is not
>> DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload
>> case, where the old flow is deleted first and then the new one is
>> created. If the creation fails, the ukey will be stale (no corresponding
>> DP flow). This patch adds a warning in such case.
>>
>> Another possible scenario is in handle_upcalls() if a PUT operation did
>> not succeed and op->error attribute was not set correctly it can lead to
>> stale ukey in operational state.
>>
>> This patch adds checks in the sweep phase for such ukeys and move them
>> to DELETE so that they can be cleared eventually.
>>
>> Co-authored-by: Han Zhou <[email protected]>
>> Signed-off-by: Han Zhou <[email protected]>
>> Signed-off-by: Roi Dayan <[email protected]>
>> ---
>>  ofproto/ofproto-dpif-upcall.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index 83609ec62b63..e9520ebdf910 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
>>  COVERAGE_DEFINE(dumped_new_flow);
>>  COVERAGE_DEFINE(handler_duplicate_upcall);
>>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
>> +COVERAGE_DEFINE(revalidate_missed_dp_flow_del);
>>  COVERAGE_DEFINE(ukey_dp_change);
>>  COVERAGE_DEFINE(ukey_invalid_stat_reset);
>>  COVERAGE_DEFINE(ukey_replace_contention);
>> @@ -278,6 +279,7 @@ enum flow_del_reason {
>>      FDR_BAD_ODP_FIT,        /* Bad ODP flow fit. */
>>      FDR_FLOW_IDLE,          /* Flow idle timeout. */
>>      FDR_FLOW_LIMIT,         /* Kill all flows condition reached. */
>> +    FDR_FLOW_STALE,         /* Flow stale detected. */
>>      FDR_FLOW_WILDCARDED,    /* Flow needs a narrower wildcard mask. */
>>      FDR_NO_OFPROTO,         /* Bridge not found. */
>>      FDR_PURGE,              /* User requested flow deletion. */
>> @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, 
>> size_t n_ops)
>>
>>          if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>              /* Only deleted flows need their stats pushed. */
>> +            if (op->dop.error) {
>> +                VLOG_WARN_RL(&rl, "push_dp_ops: error %d in op type %d, 
>> ukey "
>> +                             "%p", op->dop.error, op->dop.type, op->ukey);
>> +            }
>>              continue;
>>          }
>>
>> @@ -3027,6 +3033,15 @@ revalidator_sweep__(struct revalidator *revalidator, 
>> bool purge)
>>                      del_reason = purge ? FDR_PURGE : FDR_UPDATE_FAIL;
>>                  } else if (!seq_mismatch) {
>>                      result = UKEY_KEEP;
>> +                } else if (!ukey->stats.used &&
> 
> Would it be possible for stats.used to be set but the dp flow to be
> deleted? For example, if a flow is offloaded to TC, but something
> external to OVS clears it?
> 
> 
> Thanks,
> Mike

Hi Mike, 

Yes it could be and in this situation the revalidator checking
the tc dump to check the rules will cause a readd of that rule.

Thanks,
Roi


> 
>> +                           udpif_flow_time_delta(udpif, ukey) * 1000 >
>> +                           ofproto_max_idle) {
>> +                    COVERAGE_INC(revalidate_missed_dp_flow_del);
>> +                    VLOG_WARN_RL(&rl, "revalidator_sweep__: Remove stale 
>> ukey "
>> +                                 "%p delta %llus", ukey,
>> +                                 udpif_flow_time_delta(udpif, ukey));
>> +                    result = UKEY_DELETE;
>> +                    del_reason = FDR_FLOW_STALE;
>>                  } else {
>>                      struct dpif_flow_stats stats;
>>                      COVERAGE_INC(revalidate_missed_dp_flow);
>> --
>> 2.21.0
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to