Patch v5 has statistics issues.

In order to solve this issue, we had a discussion.

below is the quote of the email.

”
After a second thought, I think maybe keeping INCONSISTENT just for the
modify error is a better option.

With current patch:
1.
the modify error case:
OPERATIONAL -> INCONSISTENT ->  EVICTING -> EVICTED
2.
the delete error case:
EVICTING -> EVICTED

Change both to INCONSISTENT:

the modify error case:
did not change.

the delete error case:
EVICTING -> INCONSISTENT -> EVICTED?

“

And we agree to take the second solution.



Eelco Chaudron <[email protected]> 于2022年12月8日周四 18:54写道:

>
>
> On 27 Nov 2022, at 8:28, Peng He wrote:
>
> > push_dp_ops only handles delete ops errors but ignores the modify
> > ops results. It's better to handle all the dp operation errors in
> > a consistent way.
> >
> > We observe in the production environment that sometimes a megaflow
> > with wrong actions keep staying in datapath. The coverage command shows
> > revalidators have dumped several times, however the correct
> > actions are not set. This implies that the ukey's action does not
> > equal to the meagaflow's, i.e. revalidators think the underlying
> > megaflow's actions are correct however they are not.
> >
> > We also check the megaflow using the ofproto/trace command, and the
> > actions are not matched with the ones in the actual magaflow. By
> > performing a revalidator/purge command, the right actions are set.
> >
> > This patch prevents the inconsistency by considering modify failure
> > in revalidators.
> >
> > To note, we cannot perform two state transitions and change ukey_state
> > into UKEY_EVICTED directly here, because, if we do so, the
> > sweep will remove the ukey alone and leave dp flow alive. Later, the
> > dump will retrieve the dp flow and might even recover it. This will
> > contribute the stats of this dp flow twice.
> >
> > Signed-off-by: Peng He <[email protected]>
> > ---
> >  ofproto/ofproto-dpif-upcall.c | 34 +++++++++++++++++++++++-----------
> >  1 file changed, 23 insertions(+), 11 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-upcall.c
> b/ofproto/ofproto-dpif-upcall.c
> > index 7ad728adf..c2cefbeb8 100644
> > --- a/ofproto/ofproto-dpif-upcall.c
> > +++ b/ofproto/ofproto-dpif-upcall.c
> > @@ -2416,26 +2416,30 @@ push_dp_ops(struct udpif *udpif, struct ukey_op
> *ops, size_t n_ops)
> >
> >      for (i = 0; i < n_ops; i++) {
> >          struct ukey_op *op = &ops[i];
> > -        struct dpif_flow_stats *push, *stats, push_buf;
> > -
> > -        stats = op->dop.flow_del.stats;
> > -        push = &push_buf;
> > -
> > -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> > -            /* Only deleted flows need their stats pushed. */
> > -            continue;
> > -        }
> >
> >          if (op->dop.error) {
> > -            /* flow_del error, 'stats' is unusable. */
> >              if (op->ukey) {
> >                  ovs_mutex_lock(&op->ukey->mutex);
> > -                transition_ukey(op->ukey, UKEY_EVICTED);
> > +                if (op->dop.type == DPIF_OP_FLOW_DEL) {
> > +                    transition_ukey(op->ukey, UKEY_EVICTED);
> > +                } else {
> > +                    transition_ukey(op->ukey, UKEY_EVICTING);
> > +                }
> >                  ovs_mutex_unlock(&op->ukey->mutex);
> >              }
> >              continue;
> >          }
> >
> > +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> > +            /* Only deleted flows need their stats pushed. */
> > +            continue;
> > +        }
> > +
> > +        struct dpif_flow_stats *push, *stats, push_buf;
> > +
> > +        stats = op->dop.flow_del.stats;
> > +        push = &push_buf;
> > +
> >          if (op->ukey) {
> >              ovs_mutex_lock(&op->ukey->mutex);
> >              transition_ukey(op->ukey, UKEY_EVICTED);
> > @@ -2848,6 +2852,14 @@ revalidator_sweep__(struct revalidator
> *revalidator, bool purge)
> >                  continue;
> >              }
> >              ukey_state = ukey->state;
> > +
> > +            if (ukey_state == UKEY_EVICTING) {
> > +                /* previous modify operation fails on this ukey and
> ukey_state
> > +                 * is set to UKEY_EVICTING, issue a delete operation on
> this
> > +                 * ukey.
> > +                 */
> > +                delete_op_init(udpif, &ops[n_ops++], ukey);
>
> How can we be sure this state here is only for failing to modify
> operations?
> This state is set in other places in the code also.
>

because revalidate_sweep is called after revaldiate. all the datapath ops
have already been
pushed, and if there are ukeys at UKEY_EVICTING, it's because previously
datapath ops
have errors. and if del ops fail, ukey is deleted. so only ukeys at
UKEY_EVICTING must
come from modify fail.


> Maybe we should keep the INCONSISTENT state to avoid this (see v5)?
>
> > +            }
> >              if (ukey_state == UKEY_OPERATIONAL
> >                  || (ukey_state == UKEY_VISIBLE && purge)) {
> >                  struct recirc_refs recircs =
> RECIRC_REFS_EMPTY_INITIALIZER;
> > --
> > 2.25.1
>
>

-- 
hepeng
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to