On 3 Jun 2023, at 2:01, Peng He wrote:

> push_dp_ops only handles delete ops errors but ignores the modify
> ops results. It's better to handle all the dp operation errors in
> a consistent way.
>
> This patch prevents the inconsistency by considering modify failure
> in revalidators.
>
> To note, we cannot perform two state transitions and change ukey_state
> into UKEY_EVICTED directly here, because, if we do so, the
> sweep will remove the ukey alone and leave dp flow alive. Later, the
> dump will retrieve the dp flow and might even recover it. This will
> contribute the stats of this dp flow twice.
>
> v8->v9: add testsuite and delete INCONSISTENT ukey at revalidate_sweep.
> v9->v10: change the commit message and refine the test case.

Hi Peng,

Thanks for keeping this patch updated :) Some more comments below, but I think 
the next rev should be all fine.

Cheers,

Eelco

> Signed-off-by: Peng He <[email protected]>
> ---
>  ofproto/ofproto-dpif-upcall.c | 48 ++++++++++++++++++++++++----------
>  tests/dpif-netdev.at          | 49 +++++++++++++++++++++++++++++++++++
>  2 files changed, 84 insertions(+), 13 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cd57fdbd9..5a75d9444 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -258,6 +258,7 @@ enum ukey_state {
>      UKEY_CREATED = 0,
>      UKEY_VISIBLE,       /* Ukey is in umap, datapath flow install is queued. 
> */
>      UKEY_OPERATIONAL,   /* Ukey is in umap, datapath flow is installed. */
> +    UKEY_INCONSISTENT,  /* Ukey is in umap, datapath flow is inconsistent. */
>      UKEY_EVICTING,      /* Ukey is in umap, datapath flow delete is queued. 
> */
>      UKEY_EVICTED,       /* Ukey is in umap, datapath flow is deleted. */
>      UKEY_DELETED,       /* Ukey removed from umap, ukey free is deferred. */
> @@ -1999,6 +2000,10 @@ transition_ukey_at(struct udpif_key *ukey, enum 
> ukey_state dst,
>       * UKEY_VISIBLE -> UKEY_EVICTED
>       *  A handler attempts to install the flow, but the datapath rejects it.
>       *  Consider that the datapath has already destroyed it.
> +     * UKEY_OPERATIONAL -> UKEY_INCONSISTENT
> +     *  A revalidator modifies the flow with error returns.
> +     * UKEY_INCONSISTENT -> UKEY_EVICTING
> +     *  A revalidator decides to evict the datapath flow.
>       * UKEY_OPERATIONAL -> UKEY_EVICTING
>       *  A revalidator decides to evict the datapath flow.
>       * UKEY_EVICTING    -> UKEY_EVICTED
> @@ -2007,7 +2012,9 @@ transition_ukey_at(struct udpif_key *ukey, enum 
> ukey_state dst,
>       *  A revalidator has removed the ukey from the umap and is deleting it.
>       */
>      if (ukey->state == dst - 1 || (ukey->state == UKEY_VISIBLE &&
> -                                   dst < UKEY_DELETED)) {
> +                                   dst < UKEY_DELETED) ||
> +                                   (ukey->state == UKEY_OPERATIONAL &&

Indentation is wrong here, but maybe we should fix both lines, so it’s easier 
to read:

    if (ukey->state == dst - 1 ||
        (ukey->state == UKEY_VISIBLE && dst < UKEY_DELETED) ||
        (ukey->state == UKEY_OPERATIONAL && dst == UKEY_EVICTING)) {



> +                                   dst == UKEY_EVICTING)) {
>          ukey->state = dst;
>      } else {
>          struct ds ds = DS_EMPTY_INITIALIZER;
> @@ -2472,26 +2479,31 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, 
> size_t n_ops)
>
>      for (i = 0; i < n_ops; i++) {
>          struct ukey_op *op = &ops[i];
> -        struct dpif_flow_stats *push, *stats, push_buf;
> -
> -        stats = op->dop.flow_del.stats;
> -        push = &push_buf;
> -
> -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> -            /* Only deleted flows need their stats pushed. */
> -            continue;
> -        }
>
>          if (op->dop.error) {
> -            /* flow_del error, 'stats' is unusable. */
>              if (op->ukey) {
>                  ovs_mutex_lock(&op->ukey->mutex);
> -                transition_ukey(op->ukey, UKEY_EVICTED);
> +                if (op->ukey->state == UKEY_EVICTING) {

How about not checking the UKEY_STATE’s as they are internal, but using the 
operation type instead?
This will still result in errors in transition_ukey() for any corner cases we 
might have missed (if some new operation gets added and fails).

                 ovs_mutex_lock(&op->ukey->mutex);
-                if (op->ukey->state == UKEY_EVICTING) {
+                if (op->dop.type == DPIF_OP_FLOW_DEL) {
                     transition_ukey(op->ukey, UKEY_EVICTED);
-                } else if (op->ukey->state == UKEY_OPERATIONAL) {
-                    /* Modify failed, ukey's state was UKEY_OPERATIONAL */
+                } else {
+                    /* Modification of the flow failed. */
                     transition_ukey(op->ukey, UKEY_INCONSISTENT);
                 }
                 ovs_mutex_unlock(&op->ukey->mutex);


> +                    transition_ukey(op->ukey, UKEY_EVICTED);
> +                } else if (op->ukey->state == UKEY_OPERATIONAL) {
> +                    /* Modify failed, ukey's state was UKEY_OPERATIONAL */
> +                    transition_ukey(op->ukey, UKEY_INCONSISTENT);
> +                }
>                  ovs_mutex_unlock(&op->ukey->mutex);
>              }
>              continue;
>          }
>
> +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> +            /* Only deleted flows need their stats pushed. */
> +            continue;
> +        }
> +
> +        struct dpif_flow_stats *push, *stats, push_buf;
> +
> +        stats = op->dop.flow_del.stats;
> +        push = &push_buf;
> +
>          if (op->ukey) {
>              ovs_mutex_lock(&op->ukey->mutex);
>              transition_ukey(op->ukey, UKEY_EVICTED);
> @@ -2839,6 +2851,15 @@ revalidate(struct revalidator *revalidator)
>                  continue;
>              }
>
> +            if (ukey->state == UKEY_INCONSISTENT) {
> +                ukey->dump_seq = dump_seq;
> +                reval_op_init(&ops[n_ops++], UKEY_DELETE, udpif, ukey,
> +                              &recircs, &odp_actions);
> +                ovs_mutex_unlock(&ukey->mutex);

I think it would be nice to have a coverage counter here for this situation.

                    COVERAGE_INC(dumped_inconsisten_flow);

> +                continue;
> +            }
> +
> +
>              if (ukey->state <= UKEY_OPERATIONAL) {
>                  /* The flow is now confirmed to be in the datapath. */
>                  transition_ukey(ukey, UKEY_OPERATIONAL);
> @@ -2927,13 +2948,14 @@ revalidator_sweep__(struct revalidator *revalidator, 
> bool purge)
>              }
>              ukey_state = ukey->state;
>              if (ukey_state == UKEY_OPERATIONAL
> +                || (ukey_state == UKEY_INCONSISTENT)
>                  || (ukey_state == UKEY_VISIBLE && purge)) {
>                  struct recirc_refs recircs = RECIRC_REFS_EMPTY_INITIALIZER;
>                  bool seq_mismatch = (ukey->dump_seq != dump_seq
>                                       && ukey->reval_seq != reval_seq);
>                  enum reval_result result;
>
> -                if (purge) {
> +                if (purge || ukey_state == UKEY_INCONSISTENT) {
>                      result = UKEY_DELETE;
>                  } else if (!seq_mismatch) {
>                      result = UKEY_KEEP;
> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
> index baab60a22..5ee8e6f7e 100644
> --- a/tests/dpif-netdev.at
> +++ b/tests/dpif-netdev.at
> @@ -716,3 +716,52 @@ AT_CHECK([test `ovs-vsctl get Interface p2 
> statistics:tx_q0_packets` -gt 0 -a dn
>
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([dpif-netdev - revalidators handle dp modification fail correctly])
> +OVS_VSWITCHD_START(
> +  [add-port br0 p1 \
> +   -- set interface p1 type=dummy \
> +   -- set bridge br0 datapath-type=dummy \
> +   -- add-port br0 p2 \
> +   -- set interface p2 type=dummy --
> +   ])
> +
> +AT_DATA([flows.txt], [dnl
> +table=0,in_port=p1,actions=p2
> +])
> +
> +ovs-ofctl add-flows br0 flows.txt

As this is only a single flow, I would change it to something like:

-AT_DATA([flows.txt], [dnl
-table=0,in_port=p1,actions=p2
-])
-
-ovs-ofctl add-flows br0 flows.txt
+ovs-ofctl add-flow br0 table=0,in_port=p1,actions=p2
> +ovs-appctl netdev-dummy/receive p1 
> 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)'
> +ovs-appctl netdev-dummy/receive p1 
> 'ipv4(src=10.0.0.1,dst=10.0.0.2),tcp(src=1,dst=2)'
> +
> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*thread://' | 
> strip_xout_keep_actions ], [0], [
> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
>  packets:0, bytes:0, used:0.0s, actions:2
> +])
> +
> +dnl wait the dp flow enter OPERATIONAL

Start with capital W for wait, and end with a dot. Some more re-writing:

  -dnl wait the dp flow enter OPERATIONAL
  +dnl Wait for the dp flow to enter OPERATIONAL state.

> +ovs-appctl revalidator/wait
> +
> +AT_CHECK([ovs-appctl revalidator/pause])
> +
> +dnl delete all dp flows, so flow modify will fail.

dnl Delete all dp flows, so flow modification will fail.


> +AT_CHECK([ovs-appctl dpctl/del-flows])
> +
> +dnl replace openflow rules, let revalidator work

 +dnl Replace OpenFlow rules and let the revalidator work.

> +AT_DATA([flows2.txt], [dnl
> +table=0,in_port=p1,ip actions=ct(commit)
> +])

Do this in one command

> +AT_CHECK([ovs-appctl revalidator/resume])
> +
> +dnl trigger revalidating and wait

Trigger revalidation and wait for it to complete.

> +AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt])

-AT_CHECK([ovs-ofctl --bundle replace-flows br0 flows2.txt])

> +ovs-appctl revalidator/wait
> +
> +dnl inconsistent ukey should be deleted.
> +AT_CHECK([ovs-appctl upcall/show | grep keys | grep -q -v 0], [1])

If we add the new coverage counter we should add the check here also.

> +dnl delete revalidator warnings log to let testsuite pass.
> +sed -i 's/.*failed to put.*$//' ovs-vswitchd.log
> +sed -i 's/.*failed to flow_del.*$//' ovs-vswitchd.log

Should we not explicitly check for the flow(_del) error, as this will tell us 
the test introduced the right problem to begin with?


The above we can do through the OVS_VSWITCHD_STOP(["......"] command.

> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> -- 
> 2.25.1

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to