On 24 Nov 2022, at 1:41, Peng He wrote:

> Eelco Chaudron <[email protected]> 于2022年11月24日周四 00:08写道:
>
>>
>>
>> On 22 Nov 2022, at 2:44, Peng He wrote:
>>
>>> Hi,
>>>
>>> After a second thought, I think maybe keeping INCONSISTENT just for the
>>> modify (修改) error is a better option.
>>>
>>> With current patch (补丁) :
>>> 1.
>>> the modify (修改) error case:
>>> OPERATIONAL -> INCONSISTENT ->  EVICTING (驱逐) -> EVICTED (驱逐)
>>> 2.
>>> the delete error case:
>>> EVICTING (驱逐) -> EVICTED (驱逐)
>>>
>>> Change both to INCONSISTENT:
>>>
>>> the modify (修改) error case:
>>> did not change.
>>>
>>> the delete error case:
>>> EVICTING (驱逐) -> INCONSISTENT -> EVICTED (驱逐) ?
>>> this will make the state machine allows both INCONSISTENT -> EVICTING
>> (驱逐) and
>>> EVICTING (驱逐) -> INCONSISTENT transitions.
>>> which I guess it's more (更多) confusing ...
>>>
>>>  Another solution is that, drop INCONSISTENT state, if modify (修改)
>> fails, just
>>> changes to EVICTING (驱逐) .
>>> and let the revalidate or sweep (扫) to take (采取) care of EVICTING (驱逐)
>> state ukey and
>>> initial (初始) another dp_ops to remove it.
>>>
>>> I now prefer the second solution, what do you think?
>>
>> Yes, the second one sounds more (更多) straightforward (直截了当) , but would it
>> not cause issues with the statistics? If not we should probably go with
>> this one.
>
>
> The second one will also have no statistics issues.
>
> As long as the megaflow and ukey have the same lifetime, i.e. it's
> impossible that ukey is dead but megaflow is alive,
> or reverse (逆向) , there are no statistics issues.  So I think the second
> solution (补丁) does not introduce statistics issues.
>
> will submit a v6.

Thanks!

//Eelco

>>
>>
>> If we run into statistics issues, which I think we will, let’s keep the
>> code (代码) as in v5, but add some code comments on why the paths are
>> different, i.e. to keep stats happy.
>>
>>
>> If stats are inconsistent, we should also add a test case so further
>> fixes/changes will not mess this up.
>>
>>> Peng He <[email protected]> 于2022年11月22日周二 09:01写道:
>>>
>>>>
>>>> Eelco Chaudron <[email protected]> 于2022年11月18日周五 15:35写道:
>>>>
>>>>>
>>>>>
>>>>> On 18 Nov 2022, at 2:53, Peng He wrote:
>>>>>
>>>>>> Eelco Chaudron <[email protected]> 于2022年11月16日周三 18:14写道:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 6 Nov 2022, at 8:12, Peng He wrote:
>>>>>>>
>>>>>>>> push_dp_ops only handles delete ops errors but ignores (忽略) (忽略)
>> the modify (修改)
>>>>>>> (修改)
>>>>>>>> ops results. It's better to handle all the dp operation errors in
>>>>>>>> a consistent way.
>>>>>>>>
>>>>>>>> We observe in the production environment that sometimes a megaflow
>>>>>>>> with wrong actions keep staying in datapath. The coverage command
>>>>> shows
>>>>>>>> revalidators have dumped several times, however the correct
>>>>>>>> actions are not set. This implies (暗示) (暗示) that the ukey's action
>> does not
>>>>>>>> equal to the meagaflow's, i.e. revalidators think the underlying
>> (基础) (基础)
>>>>>>>> megaflow's actions are correct however they are not.
>>>>>>>>
>>>>>>>> We also check the megaflow using the ofproto/trace command, and the
>>>>>>>> actions are not matched with the ones in the actual magaflow. By
>>>>>>>> performing a revalidator/purge command, the right actions are set.
>>>>>>>>
>>>>>>>> This patch (补丁) (补丁) prevents the inconsistency by considering
>> modify (修改) (修改)
>>>>>>> failure
>>>>>>>> in revalidators.
>>>>>>>>
>>>>>>>> To note, we cannot perform two state transitions and change
>> ukey_state
>>>>>>>> into UKEY_EVICTED directly here, because, if we do so, the
>>>>>>>> sweep (扫) (扫) will remove the ukey alone and leave dp flow alive.
>> Later,
>>>>> the
>>>>>>>> dump will retrieve (检索) (检索) the dp flow and might even recover it.
>> This
>>>>> will
>>>>>>>> contribute the stats of this dp flow twice.
>>>>>>>>
>>>>>>>> Signed-off-by: Peng He <[email protected]>
>>>>>>>
>>>>>>> Hi Peng,
>>>>>>>
>>>>>>> Thanks for looking at the statistics part, see some comments inline!
>>>>>>>
>>>>>>> In addition, I already acked patch (补丁) (补丁) 2 out of this series,
>> but it
>>>>>>> mentions patch (补丁) x/3, but I do not see patch 3 in this series. Is
>> this
>>>>>>> missing? Or are there only two patches (补丁) (补丁) left?
>>>>>>
>>>>>>
>>>>>> there are only two patches (补丁) . the third one is about the race
>> comments,
>>>>> which
>>>>>> is not in this patchset.
>>>>>> I guess I made some mistake.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Eelco
>>>>>>>
>>>>>>>
>>>>>>>> ---
>>>>>>>>  ofproto/ofproto-dpif-upcall.c | 39
>>>>> ++++++++++++++++++++++-------------
>>>>>>>>  1 file (文件) (文件) changed, 25 insertions(+), 14 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/ofproto/ofproto-dpif-upcall.c
>>>>>>> b/ofproto/ofproto-dpif-upcall.c
>>>>>>>> index 7ad728adf..a7970fa9b 100644
>>>>>>>> --- a/ofproto/ofproto-dpif-upcall.c
>>>>>>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>>>>>>> @@ -254,6 +254,7 @@ enum ukey_state {
>>>>>>>>      UKEY_CREATED = 0,
>>>>>>>>      UKEY_VISIBLE,       /* Ukey is in umap, datapath flow install
>> (安装)
>>>>> (安装)
>>>>>>> is queued (队列) (队列) . */
>>>>>>>>      UKEY_OPERATIONAL,   /* Ukey is in umap, datapath flow is
>>>>> installed (安装)
>>>>>>> (安装) . */
>>>>>>>> +    UKEY_INCONSISTENT,  /* Ukey is in umap, datapath flow is
>> modified (修改)
>>>>>>> (修改) but failed */
>>>>>>>>      UKEY_EVICTING,      /* Ukey is in umap, datapath flow delete is
>>>>>>> queued (队列) (队列) . */
>>>>>>>>      UKEY_EVICTED,       /* Ukey is in umap, datapath flow is
>>>>> deleted. */
>>>>>>>>      UKEY_DELETED,       /* Ukey removed from umap, ukey free is
>>>>>>> deferred (推迟) (推迟) . */
>>>>>>>> @@ -1966,6 +1967,10 @@ transition_ukey_at(struct udpif_key *ukey,
>> enum
>>>>>>> ukey_state dst,
>>>>>>>>       * UKEY_VISIBLE -> UKEY_EVICTED
>>>>>>>>       *  A handler attempts to install (安装) (安装) the flow, but the
>> datapath
>>>>>>> rejects it.
>>>>>>>>       *  Consider that the datapath has already destroyed it.
>>>>>>>> +     * UKEY_OPERATIONAL -> UKEY_INCONSISTENT
>>>>>>>> +     *  A revalidator modifies (修改) (修改) the flow with error
>> returns (返回) (返回)
>>>>> .
>>>>>>>> +     * UKEY_INCONSISTENT -> UKEY_EVICTING
>>>>>>>> +     *  A revalidator decides to evict (驱逐) (驱逐) the datapath flow.
>>>>>>>>       * UKEY_OPERATIONAL -> UKEY_EVICTING
>>>>>>>>       *  A revalidator decides to evict (驱逐) (驱逐) the datapath flow.
>>>>>>>>       * UKEY_EVICTING    -> UKEY_EVICTED
>>>>>>>> @@ -1974,7 +1979,8 @@ transition_ukey_at(struct udpif_key *ukey,
>> enum
>>>>>>> ukey_state dst,
>>>>>>>>       *  A revalidator has removed the ukey from the umap and is
>>>>>>> deleting it.
>>>>>>>>       */
>>>>>>>>      if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE &&
>>>>>>>> -                                   dst < UKEY_DELETED)) {
>>>>>>>> +                                   dst < UKEY_DELETED) ||
>>>>>>>> +            (ukey->state == UKEY_OPERATIONAL && dst ==
>>>>> UKEY_EVICTING)) {
>>>>>>>>          ukey->state = dst;
>>>>>>>>      } else {
>>>>>>>>          struct ds ds = DS_EMPTY_INITIALIZER;
>>>>>>>> @@ -2416,26 +2422,31 @@ push_dp_ops(struct udpif *udpif, struct
>>>>> ukey_op
>>>>>>> *ops, size_t n_ops)
>>>>>>>>
>>>>>>>>      for (i = 0; i < n_ops; i++) {
>>>>>>>>          struct ukey_op *op = &ops[i];
>>>>>>>> -        struct dpif_flow_stats *push, *stats, push_buf;
>>>>>>>> -
>>>>>>>> -        stats = op->dop.flow_del.stats;
>>>>>>>> -        push = &push_buf;
>>>>>>>> -
>>>>>>>> -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>>>>>>> -            /* Only deleted flows need their stats pushed. */
>>>>>>>> -            continue;
>>>>>>>> -        }
>>>>>>>>
>>>>>>>>          if (op->dop.error) {
>>>>>>>> -            /* flow_del error, 'stats' is unusable. */
>>>>>>>>              if (op->ukey) {
>>>>>>>>                  ovs_mutex_lock(&op->ukey->mutex);
>>>>>>>> -                transition_ukey(op->ukey, UKEY_EVICTED);
>>>>>>>> +                if (op->dop.type == DPIF_OP_FLOW_DEL)
>>>>>>>
>>>>>>> We should {} to the if branch.
>>>>>>>
>>>>>>
>>>>>> OK
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> +                    transition_ukey(op->ukey, UKEY_EVICTED);
>>>>>>>> +                else {
>>>>>>>> +                    transition_ukey(op->ukey, UKEY_INCONSISTENT);
>>>>>>>
>>>>>>> Is there a specific (特定) (特定) reason to have delete go through a
>> different
>>>>>>> state on failure? If not, it might be more (更多) (更多) consistent
>> especially
>>>>> when
>>>>>>> debugging.
>>>>>>
>>>>>>
>>>>>> As is stated in the commit (提交) comments,
>>>>>> "
>>>>>>> To note, we cannot perform two state transitions and change
>> ukey_state
>>>>>>> into UKEY_EVICTED directly here, because, if we do so, the
>>>>>>> sweep (扫) will remove the ukey alone and leave dp flow alive. Later,
>> the
>>>>>>> dump will retrieve (检索) the dp flow and might even recover it. This
>> will
>>>>>>> contribute the stats of this dp flow twice.
>>>>>> "
>>>>>> So I have to introduce another state here.
>>>>>
>>>>> I meant (意味着) chaining the UKEY_EVICTED also to UKEY_INCONSISTENT.
>>>>>
>>>>
>>>> By chaining, you mean (意味着) changing, correct?
>>>> I got it. So if error, we should all go to the UKEY_INCONSISTENT.
>>>> yes, it makes sense, will do that in the next version.
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> +                }
>>>>>>>>                  ovs_mutex_unlock(&op->ukey->mutex);
>>>>>>>>              }
>>>>>>>> +            /* if modify (修改) (修改) or delete fails, there is no
>> need to
>>>>> push
>>>>>>> stats */
>>>>>>>
>>>>>>> Captial for “If” and ending with a dot.
>>>>>>>
>>>>>>
>>>>>> Sure, will do that.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>              continue;
>>>>>>>>          }
>>>>>>>>
>>>>>>>> +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
>>>>>>>> +            /* Only deleted flows need their stats pushed. */
>>>>>>>> +            continue;
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        struct dpif_flow_stats *push, *stats, push_buf;
>>>>>>>> +
>>>>>>>> +        stats = op->dop.flow_del.stats;
>>>>>>>> +        push = &push_buf;
>>>>>>>> +
>>>>>>>>          if (op->ukey) {
>>>>>>>>              ovs_mutex_lock(&op->ukey->mutex);
>>>>>>>>              transition_ukey(op->ukey, UKEY_EVICTED);
>>>>>>>> @@ -2848,14 +2859,14 @@ revalidator_sweep__(struct revalidator
>>>>>>> *revalidator, bool purge (吹扫) (吹扫) )
>>>>>>>>                  continue;
>>>>>>>>              }
>>>>>>>>              ukey_state = ukey->state;
>>>>>>>> -            if (ukey_state == UKEY_OPERATIONAL
>>>>>>>> +            if (ukey_state == UKEY_OPERATIONAL || ukey_state ==
>>>>>>> UKEY_INCONSISTENT
>>>>>>>>                  || (ukey_state == UKEY_VISIBLE && purge (吹扫) (吹扫)
>> )) {
>>>>>>>>                  struct recirc_refs recircs =
>>>>>>> RECIRC_REFS_EMPTY_INITIALIZER;
>>>>>>>>                  bool seq_mismatch = (ukey->dump_seq != dump_seq
>>>>>>>>                                       && ukey->reval_seq !=
>>>>> reval_seq);
>>>>>>>>                  enum reval_result result;
>>>>>>>>
>>>>>>>> -                if ( purge (吹扫) (吹扫) ) {
>>>>>>>> +                if ( purge (吹扫) (吹扫) || ukey_state ==
>> UKEY_INCONSISTENT) {
>>>>>>>>                      result = UKEY_DELETE;
>>>>>>>>                  } else if (!seq_mismatch) {
>>>>>>>>                      result = UKEY_KEEP;
>>>>>>>> --
>>>>>>>> 2.25.1
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> dev mailing list (列表) (列表)
>>>>>>>> [email protected]
>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> hepeng
>>>>>
>>>>>
>>>>
>>>> --
>>>> hepeng
>>>>
>>>
>>>
>>> --
>>> hepeng
>>
>>
>
> -- 
> hepeng

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to