On 18 Nov 2022, at 2:53, Peng He wrote:

> Eelco Chaudron <[email protected]> 于2022年11月16日周三 18:14写道:
>
>>
>>
>> On 6 Nov 2022, at 8:12, Peng He wrote:
>>
>>> push_dp_ops only handles delete ops errors but ignores (忽略) the modify
>> (修改)
>>> ops results. It's better to handle all the dp operation errors in
>>> a consistent way.
>>>
>>> We observe in the production environment that sometimes a megaflow
>>> with wrong actions keep staying in datapath. The coverage command shows
>>> revalidators have dumped several times, however the correct
>>> actions are not set. This implies (暗示) that the ukey's action does not
>>> equal to the meagaflow's, i.e. revalidators think the underlying (基础)
>>> megaflow's actions are correct however they are not.
>>>
>>> We also check the megaflow using the ofproto/trace command, and the
>>> actions are not matched with the ones in the actual magaflow. By
>>> performing a revalidator/purge command, the right actions are set.
>>>
>>> This patch (补丁) prevents the inconsistency by considering modify (修改)
>> failure
>>> in revalidators.
>>>
>>> To note, we cannot perform two state transitions and change ukey_state
>>> into UKEY_EVICTED directly here, because, if we do so, the
>>> sweep (扫) will remove the ukey alone and leave dp flow alive. Later, the
>>> dump will retrieve (检索) the dp flow and might even recover it. This will
>>> contribute the stats of this dp flow twice.
>>>
>>> Signed-off-by: Peng He <[email protected]>
>>
>> Hi Peng,
>>
>> Thanks for looking at the statistics part, see some comments inline!
>>
>> In addition, I already acked patch (补丁) 2 out of this series, but it
>> mentions patch x/3, but I do not see patch 3 in this series. Is this
>> missing? Or are there only two patches (补丁) left?
>
>
> there are only two patches. the third one is about the race comments, which
> is not in this patchset.
> I guess I made some mistake.
>
>
>>
>>
>> Cheers,
>>
>> Eelco
>>
>>
>>> ---
>>>  ofproto/ofproto-dpif-upcall.c | 39 ++++++++++++++++++++++-------------
>>>  1 file (文件) changed, 25 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/ofproto/ofproto-dpif-upcall.c
>> b/ofproto/ofproto-dpif-upcall.c
>>> index 7ad728adf..a7970fa9b 100644
>>> --- a/ofproto/ofproto-dpif-upcall.c
>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>> @@ -254,6 +254,7 @@ enum ukey_state {
>>>      UKEY_CREATED = 0,
>>>      UKEY_VISIBLE,       /* Ukey is in umap, datapath flow install (安装)
>> is queued (队列) . */
>>>      UKEY_OPERATIONAL,   /* Ukey is in umap, datapath flow is installed
>> (安装) . */
>>> +    UKEY_INCONSISTENT,  /* Ukey is in umap, datapath flow is modified
>> (修改) but failed */
>>>      UKEY_EVICTING,      /* Ukey is in umap, datapath flow delete is
>> queued (队列) . */
>>>      UKEY_EVICTED,       /* Ukey is in umap, datapath flow is deleted. */
>>>      UKEY_DELETED,       /* Ukey removed from umap, ukey free is
>> deferred (推迟) . */
>>> @@ -1966,6 +1967,10 @@ transition_ukey_at(struct udpif_key *ukey, enum
>> ukey_state dst,
>>>       * UKEY_VISIBLE -> UKEY_EVICTED
>>>       *  A handler attempts to install (安装) the flow, but the datapath
>> rejects it.
>>>       *  Consider that the datapath has already destroyed it.
>>> +     * UKEY_OPERATIONAL -> UKEY_INCONSISTENT
>>> +     *  A revalidator modifies (修改) the flow with error returns (返回) .
>>> +     * UKEY_INCONSISTENT -> UKEY_EVICTING
>>> +     *  A revalidator decides to evict (驱逐) the datapath flow.
>>>       * UKEY_OPERATIONAL -> UKEY_EVICTING
>>>       *  A revalidator decides to evict (驱逐) the datapath flow.
>>>       * UKEY_EVICTING    -> UKEY_EVICTED
>>> @@ -1974,7 +1979,8 @@ transition_ukey_at(struct udpif_key *ukey, enum
>> ukey_state dst,
>>>       *  A revalidator has removed the ukey from the umap and is
>> deleting it.
>>>       */
>>>      if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE &&
>>> -                                   dst < UKEY_DELETED)) {
>>> +                                   dst < UKEY_DELETED) ||
>>> +            (ukey->state == UKEY_OPERATIONAL && dst == UKEY_EVICTING)) {
>>>          ukey->state = dst;
>>>      } else {
>>>          struct ds ds = DS_EMPTY_INITIALIZER;
>>> @@ -2416,26 +2422,31 @@ push_dp_ops(struct udpif *udpif, struct ukey_op
>> *ops, size_t n_ops)
>>>
>>>      for (i = 0; i < n_ops; i++) {
>>>          struct ukey_op *op = &ops[i];
>>> -        struct dpif_flow_stats *push, *stats, push_buf;
>>> -
>>> -        stats = op->dop.flow_del.stats;
>>> -        push = &push_buf;
>>> -
>>> -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>> -            /* Only deleted flows need their stats pushed. */
>>> -            continue;
>>> -        }
>>>
>>>          if (op->dop.error) {
>>> -            /* flow_del error, 'stats' is unusable. */
>>>              if (op->ukey) {
>>>                  ovs_mutex_lock(&op->ukey->mutex);
>>> -                transition_ukey(op->ukey, UKEY_EVICTED);
>>> +                if (op->dop.type == DPIF_OP_FLOW_DEL)
>>
>> We should {} to the if branch.
>>
>
> OK
>
>
>>
>>> +                    transition_ukey(op->ukey, UKEY_EVICTED);
>>> +                else {
>>> +                    transition_ukey(op->ukey, UKEY_INCONSISTENT);
>>
>> Is there a specific (特定) reason to have delete go through a different
>> state on failure? If not, it might be more (更多) consistent especially when
>> debugging.
>
>
> As is stated in the commit comments,
> "
>> To note, we cannot perform two state transitions and change ukey_state
>> into UKEY_EVICTED directly here, because, if we do so, the
>> sweep will remove the ukey alone and leave dp flow alive. Later, the
>> dump will retrieve the dp flow and might even recover it. This will
>> contribute the stats of this dp flow twice.
> "
> So I have to introduce another state here.

I meant chaining the UKEY_EVICTED also to UKEY_INCONSISTENT.

>
>>
>>> +                }
>>>                  ovs_mutex_unlock(&op->ukey->mutex);
>>>              }
>>> +            /* if modify (修改) or delete fails, there is no need to push
>> stats */
>>
>> Captial for “If” and ending with a dot.
>>
>
> Sure, will do that.
>
>
>>
>>>              continue;
>>>          }
>>>
>>> +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>> +            /* Only deleted flows need their stats pushed. */
>>> +            continue;
>>> +        }
>>> +
>>> +        struct dpif_flow_stats *push, *stats, push_buf;
>>> +
>>> +        stats = op->dop.flow_del.stats;
>>> +        push = &push_buf;
>>> +
>>>          if (op->ukey) {
>>>              ovs_mutex_lock(&op->ukey->mutex);
>>>              transition_ukey(op->ukey, UKEY_EVICTED);
>>> @@ -2848,14 +2859,14 @@ revalidator_sweep__(struct revalidator
>> *revalidator, bool purge (吹扫) )
>>>                  continue;
>>>              }
>>>              ukey_state = ukey->state;
>>> -            if (ukey_state == UKEY_OPERATIONAL
>>> +            if (ukey_state == UKEY_OPERATIONAL || ukey_state ==
>> UKEY_INCONSISTENT
>>>                  || (ukey_state == UKEY_VISIBLE && purge (吹扫) )) {
>>>                  struct recirc_refs recircs =
>> RECIRC_REFS_EMPTY_INITIALIZER;
>>>                  bool seq_mismatch = (ukey->dump_seq != dump_seq
>>>                                       && ukey->reval_seq != reval_seq);
>>>                  enum reval_result result;
>>>
>>> -                if ( purge (吹扫) ) {
>>> +                if ( purge (吹扫) || ukey_state == UKEY_INCONSISTENT) {
>>>                      result = UKEY_DELETE;
>>>                  } else if (!seq_mismatch) {
>>>                      result = UKEY_KEEP;
>>> --
>>> 2.25.1
>>>
>>> _______________________________________________
>>> dev mailing list (列表)
>>> [email protected]
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
>
> -- 
> hepeng

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to