Eelco Chaudron <echau...@redhat.com> 于2022年12月13日周二 20:36写道:

>
>
> On 10 Dec 2022, at 1:37, Peng He wrote:
>
> > Patch v5 has statistics issues.
> >
> > In order to solve this issue, we had a discussion.
> >
> > below is the quote of the email.
> >
> > ”
> > After a second thought, I think maybe keeping INCONSISTENT just for the
> > modify error is a better option.
> >
> > With current patch:
> > 1.
> > the modify error case:
> > OPERATIONAL -> INCONSISTENT ->  EVICTING -> EVICTED
> > 2.
> > the delete error case:
> > EVICTING -> EVICTED
> >
> > Change both to INCONSISTENT:
> >
> > the modify error case:
> > did not change.
> >
> > the delete error case:
> > EVICTING -> INCONSISTENT -> EVICTED?
> >
> > “
> >
> > And we agree to take the second solution.
>
> I know, but going over the state meanings again, UKEY_EVICTING means the
> following:
>
>  /* Ukey is in umap, datapath flow delete is queued. */
>
> Which now no longer is the case, so should a new state not make more sense?
>

Why it's no longer valid?

In the patch, only modify failed ukey will be set to EVICTING, is it just
right fit the meaning of
EVICTING? (ukey in the umap, but delete operation is queued?)



>
> Any one else has some input on this??
>
> > Eelco Chaudron <echau...@redhat.com> 于2022年12月8日周四 18:54写道:
> >
> >>
> >>
> >> On 27 Nov 2022, at 8:28, Peng He wrote:
> >>
> >>> push_dp_ops only handles delete ops errors but ignores the modify
> >>> ops results. It's better to handle all the dp operation errors in
> >>> a consistent way.
> >>>
> >>> We observe in the production environment that sometimes a megaflow
> >>> with wrong actions keep staying in datapath. The coverage command shows
> >>> revalidators have dumped several times, however the correct
> >>> actions are not set. This implies that the ukey's action does not
> >>> equal to the meagaflow's, i.e. revalidators think the underlying
> >>> megaflow's actions are correct however they are not.
> >>>
> >>> We also check the megaflow using the ofproto/trace command, and the
> >>> actions are not matched with the ones in the actual magaflow. By
> >>> performing a revalidator/purge command, the right actions are set.
> >>>
> >>> This patch prevents the inconsistency by considering modify failure
> >>> in revalidators.
> >>>
> >>> To note, we cannot perform two state transitions and change ukey_state
> >>> into UKEY_EVICTED directly here, because, if we do so, the
> >>> sweep will remove the ukey alone and leave dp flow alive. Later, the
> >>> dump will retrieve the dp flow and might even recover it. This will
> >>> contribute the stats of this dp flow twice.
> >>>
> >>> Signed-off-by: Peng He <hepeng.0...@bytedance.com>
> >>> ---
> >>>  ofproto/ofproto-dpif-upcall.c | 34 +++++++++++++++++++++++-----------
> >>>  1 file changed, 23 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/ofproto/ofproto-dpif-upcall.c
> >> b/ofproto/ofproto-dpif-upcall.c
> >>> index 7ad728adf..c2cefbeb8 100644
> >>> --- a/ofproto/ofproto-dpif-upcall.c
> >>> +++ b/ofproto/ofproto-dpif-upcall.c
> >>> @@ -2416,26 +2416,30 @@ push_dp_ops(struct udpif *udpif, struct ukey_op
> >> *ops, size_t n_ops)
> >>>
> >>>      for (i = 0; i < n_ops; i++) {
> >>>          struct ukey_op *op = &ops[i];
> >>> -        struct dpif_flow_stats *push, *stats, push_buf;
> >>> -
> >>> -        stats = op->dop.flow_del.stats;
> >>> -        push = &push_buf;
> >>> -
> >>> -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> >>> -            /* Only deleted flows need their stats pushed. */
> >>> -            continue;
> >>> -        }
> >>>
> >>>          if (op->dop.error) {
> >>> -            /* flow_del error, 'stats' is unusable. */
> >>>              if (op->ukey) {
> >>>                  ovs_mutex_lock(&op->ukey->mutex);
> >>> -                transition_ukey(op->ukey, UKEY_EVICTED);
> >>> +                if (op->dop.type == DPIF_OP_FLOW_DEL) {
> >>> +                    transition_ukey(op->ukey, UKEY_EVICTED);
> >>> +                } else {
>
> I think we could use a comment here to make sure why we set it to
> evicting. Maybe just a reference to the comment in revalidator_sweep__()
> might be enough.
>
> >>> +                    transition_ukey(op->ukey, UKEY_EVICTING);
> >>> +                }
> >>>                  ovs_mutex_unlock(&op->ukey->mutex);
> >>>              }
> >>>              continue;
> >>>          }
> >>>
> >>> +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> >>> +            /* Only deleted flows need their stats pushed. */
> >>> +            continue;
> >>> +        }
> >>> +
> >>> +        struct dpif_flow_stats *push, *stats, push_buf;
> >>> +
> >>> +        stats = op->dop.flow_del.stats;
> >>> +        push = &push_buf;
> >>> +
> >>>          if (op->ukey) {
> >>>              ovs_mutex_lock(&op->ukey->mutex);
> >>>              transition_ukey(op->ukey, UKEY_EVICTED);
> >>> @@ -2848,6 +2852,14 @@ revalidator_sweep__(struct revalidator
> >> *revalidator, bool purge)
> >>>                  continue;
> >>>              }
> >>>              ukey_state = ukey->state;
> >>> +
> >>> +            if (ukey_state == UKEY_EVICTING) {
> >>> +                /* previous modify operation fails on this ukey and
> >> ukey_state
> >>> +                 * is set to UKEY_EVICTING, issue a delete operation
> on
> >> this
> >>> +                 * ukey.
> >>> +                 */
> >>> +                delete_op_init(udpif, &ops[n_ops++], ukey);
>
> This would cause the later push_ukey_ops() to call ukey_delete() which
> will call transition_ukey(ukey, UKEY_DELETED).
> Which I think will result in an invalid state transaction from
> UKEY_EVICTING to UKEY_DELETED.
>

No, push_ukey_ops calls  push_dp_ops, and push_dp_ops will delete the
datapath key first and then set the state to UKEY_EVICTED.


> >>
> >> How can we be sure this state here is only for failing to modify
> >> operations?
> >> This state is set in other places in the code also.
> >>
> >
> > because revalidate_sweep is called after revaldiate. all the datapath ops
> > have already been
> > pushed, and if there are ukeys at UKEY_EVICTING, it's because previously
> > datapath ops
> > have errors. and if del ops fail, ukey is deleted. so only ukeys at
> > UKEY_EVICTING must
> > come from modify fail.
>
> First of all, I missed that p_purge_cb() was pausing the revalidators, so
> you are right that there is no other way to set UKEY_EVICTING other than
> the sweet phase.
>
> I also missed the part that the ukey_delete() will do the actual removal,
> and not the if (ukey_state == UKEY_EVICTED) statement lower.
>
> Maybe we should reconsider having a special state? Anyhow, I do think the
> comment in the code needs some changing to make this clear:
>

Yes the comments are clearer, and I can add them in the next version.



>
>
>             if (ukey_state == UKEY_EVICTING) {
>                 /* A previous modify operation in push_dp_ops() failed and
>                  * caused the state to be UKEY_EVICTING. We need to
> manually
>                  * push a delete operation on this ukey to get both the
>                  * datapath flow and ukey removed. The later
> push_ukey_ops()
>                  * will call  push_dp_ops() and ukey_delete() to accomplish
>                  * this. */
>                 delete_op_init(udpif, &ops[n_ops++], ukey);
>             }
>
> >> Maybe we should keep the INCONSISTENT state to avoid this (see v5)?
> >>
>
> What about the case the push_dp_ops()/push_ukey_ops() fails in
> revalidator_sweep__() for a none DELETE action, who will take care of this?
> The alternative fix might be better in this case.
>

DELETE failure in the original code is taken cared. It will go to EVICTED
directly.
This leaves a possibility that a key is in the dp but not in the ukey,
which is fine, because we have the mecahnism already:

check ukey_create_from_dpif_flow and ukey_acquire fail path.



>
> >>> +            }
> >>>              if (ukey_state == UKEY_OPERATIONAL
> >>>                  || (ukey_state == UKEY_VISIBLE && purge)) {
> >>>                  struct recirc_refs recircs =
> >> RECIRC_REFS_EMPTY_INITIALIZER;
> >>> --
> >>> 2.25.1
> >>
> >>
> >
> > --
> > hepeng
>
>

-- 
hepeng
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to