On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]> wrote:

>
>
> On 5 Sep 2022, at 12:45, Ales Musil wrote:
>
> > On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]>
> wrote:
> >
> >>
> >>
> >> On 9 Aug 2022, at 9:17, Ales Musil wrote:
> >>
> >>> When the packet was traveling through patch port boundary
> >>> OvS would check if any of the actions is reversible,
> >>> if not it would clone the packet. However, the check
> >>> was only at the first level of the second bridge.
> >>> That caused some issues when the packet had gone
> >>> through more actions, some of them might have been
> >>> irreversible.
> >>>
> >>> In order to keep the semantics the same we might
> >>> need to run the actions twice in the worst case
> >>> scenario. During the clone there are 4 scenarios
> >>> that can happen.
> >>>
> >>> 1) The action is last one for that flow,
> >>> in that case we just continue without clone.
> >>>
> >>> 2) There is irreversible action in the action
> >>> set (first level). In this case we know that
> >>> there is at leas one irreversible action which
> >>> is enough to do clone.
> >>>
> >>> 3) All actions in first level are reversible,
> >>> we can try to run all actions as if we don't
> >>> need any clone and inspect the ofpbuf at the
> >>> end. In positive case there are no irreversible
> >>> actions so we will just submit the buffer and continue.
> >>>
> >>> 4) This is same as 3) with the difference that
> >>> there is irreversible action in the ofpbuf.
> >>> To keep the semantics we need to re-run the actions
> >>> and treat it as clone. This requires resotration
> >>> of the xlate_ctx.
> >>>
> >>> Add test cases for all irreversible actions
> >>> to see if they are properly cloned.
> >>>
> >>> Signed-off-by: Ales Musil <[email protected]>
>
> <SNIP>
>
> >>> +    /* In any other case we cannot be certain if there is any
> >> irreversible
> >>> +     * action down the action stack. Let's run the actions as if they
> >> are
> >>> +     * reversible, if we find out that there was any irreversible
> >> action we
> >>> +     * can restore the xlate_ctx and run the do_clone instead. */
> >>> +    struct ofpbuf old_stack = ctx->stack;
> >>> +    union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)];
> >>> +
> >>> +    ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack);
> >>> +    ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size);
> >>> +
> >>> +    struct ofpbuf old_action_set = ctx->action_set;
> >>> +    uint64_t actset_stub[1024 / 8];
> >>> +
> >>> +    ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof
> actset_stub);
> >>> +    ofpbuf_put(&ctx->action_set, old_action_set.data,
> >> old_action_set.size);
> >>> +
> >>> +    struct flow old_flow = ctx->xin->flow;
> >>> +    bool old_was_mpls = ctx->was_mpls;
> >>> +    bool old_conntracked = ctx->conntracked;
> >>> +    struct flow old_base = ctx->base_flow;
> >>
> >> This is the part that scares me. Are these the only value we need to
> cache
> >> and restore? Are we sure none of the other CTX fields arale potentially
> >> adjusted?
> >> What about fields like sflow_n_outputs, pending_encap, resubmits, etc.
> etc.
> >>
> >
> > I did some testing and it seems that what is stored is enough, however
> that
> > does not mean that this is 100% correct.
> > IMO it comes down to the else below. This is the same procedure that was
> > done by the clone before, however the else
> > part of "if (!odp_contains_irreversible_action(odp_actions.data,
> > odp_actions.size))"  does need to do the heavy lifting
> > of restoring everything back so we can clone again.
>
> Ok, so assuming it was working fine before with the same local ctx
> changes, we can ignore it for now ;)
> But it might make sense to copy the same code comments as in the existing
> function to make clear why we restore them.
>
> >>> +    size_t old_recirc_size = ctx->xin->recirc_queue
> >>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
> >>> +                             : 0;
> >>> +
> >>> +    uint64_t odp_actions_stub[1024 / 8];
> >>
> >> All these mf_subvalue/uint64_t arrays add quite some stack space, isn't
> >> there a smarter way we can deal with this?
> >> Maybe we can store the size of the ofpbuf's and reset them to that size?
> >>
> >
> > Seems like a good idea, adjusted, thanks.
> >
> >
> >>
> >>> +    struct ofpbuf odp_actions =
> >> OFPBUF_STUB_INITIALIZER(odp_actions_stub);
> >>> +    struct ofpbuf *old_odp_actions = ctx->odp_actions;
> >>> +
> >>> +    ctx->odp_actions = &odp_actions;
> >>> +
> >>> +    do_xlate_actions(actions, actions_len, ctx, false, false);
> >>
> >> For consistency should we use "...ctx, do_xlate_actions, false);"?
> >> Or maybe change the above also to "false, false", probably the better
> >> option.
> >>
> >
> > Not sure I follow, can you please elaborate a bit?
> > Did you mean "...ctx, is_last_action, false);"?
> > If so, at this point "is_last_action" is always false, but I agree that
> in
> > that if above it can be changed to " ...ctx, true, false);"
>
> Yes, this is the diff, way less complex than me trying to explain what I
> meant ;)
>
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 5a196ec5e..b0c328bf0 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions,
> size_t actions_len,
>  {
>      /* We will never clone on last action, just proceed. */
>      if (is_last_action) {
> -        do_xlate_actions(actions, actions_len, ctx, is_last_action,
> false);
> +        do_xlate_actions(actions, actions_len, ctx, true, false);
>          if (!ctx->freezing) {
>              xlate_action_set(ctx);
>          }
>
> >>
> >>> +
> >>> +    /* Restore ctx parts that are common even if don't need to clone,
> */
> >>
> >> , instead of . at the end of the sentence.
> >>
> >
> > Done in v4.
> >
> >
> >>
> >>> +    ctx->odp_actions = old_odp_actions;
> >>> +    ofpbuf_uninit(&ctx->action_set);
> >>> +    ctx->action_set = old_action_set;
> >>> +    ofpbuf_uninit(&ctx->stack);
> >>> +    ctx->stack = old_stack;
> >>> +    ctx->xin->flow = old_flow;
> >>> +
> >>> +    if (!odp_contains_irreversible_action(odp_actions.data,
> >> odp_actions.size)) {
> >>
> >> The line is too long, see robot email.
> >>
> >
> > Done in v4.
> >
> >
> >>
> >>> +        nl_msg_put(ctx->odp_actions, odp_actions.data,
> >> odp_actions.size);
> >>> +        if (!ctx->freezing) {
> >>> +            xlate_action_set(ctx);
> >>> +        }
> >>> +        if (ctx->freezing) {
> >>> +            finish_freezing(ctx);
> >>> +        }
> >>> +    } else {
> >>> +        ctx->was_mpls = old_was_mpls;
> >>> +        ctx->conntracked = old_conntracked;
> >>> +        ctx->base_flow = old_base;
> >>> +        /* Clear recirculations that were added by non-clone path so
> >> that
> >>> +         * packets are not recirculated twice. */
> >>> +        size_t recirc_size = ctx->xin->recirc_queue
> >>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
> >>> +                             : 0;
> >>> +        for (; old_recirc_size < recirc_size; old_recirc_size++) {
> >>> +            struct oftrace_recirc_node *node =
> >>> +
> CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue),
> >>> +                             struct oftrace_recirc_node, node);
> >>> +            oftrace_recirc_node_destroy(node);
> >>> +        }
>
> I’am also a bit confused here (maybe I need more coffee), but you reuse
> the actions buffer you just filled with actions
>

I think the v3 and v4 got mixed up in the email, v4 is doing
"nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);".


>
> >>> +        do_clone(actions, actions_len, ctx);
> >>
> >> Here we will do the entire translation again, can we not just take the
> >> newly added
> >> datapath action stream and just encapsulate it in OVS_ACTION_ATTR_CLONE?
> >>
> >
> > I have tried this approach numerous times, but as explained in commit
> > message in order to
> > keep the semantics we need to rerun it again, maybe I have missed
> something
> > how to do it
> > without breaking semantics. The main issue I had was that if we don't
> clone
> > we should not
> > "xlate_commit_actions", but that does not seem to be possible to know
> > without running all
> > actions in the chain first. If we want to do it in one pass we would
> > probably need to ensure
> > that "xlate_commit_actions" can be reverted if needed.
>
> Sorry, I was miss-reading the code, I guess I needed more coffee last
> Friday :)
>
> I guess the process is ok, however, might it be possible to make the code
> easier to read?  There is also some code duplication, which would be nice
> to get rid of.
>

Do you have something specific in mind? If not I have some things in mind
that could improve that (I hope).


>
> >>> +    }
> >>> +}
> >>> +
> >>>  static void
> >>>  compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc,
> >>>                bool is_last_action)
> >>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> >>> index 2e91ae1a1..7bcb2272b 100644
> >>> --- a/tests/ofproto-dpif.at
> >>> +++ b/tests/ofproto-dpif.at
> >>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap
> >> p2-tx.pcap | wc -l`])
> >>>
> >>>  OVS_VSWITCHD_STOP
> >>>  AT_CLEANUP
> >>> +
> >>> +AT_SETUP([ofproto-dpif - nested patch ports clone])
> >>> +
> >>> +OVS_VSWITCHD_START(
> >>> +  [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy
> >> ofport_request=1 -- \
> >>> +   add-port br0 br0-p1 -- set Interface br0-p1 type=patch
> >> options:peer=br1-p0 -- \
> >>> +   add-br br1 -- \
> >>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \
> >>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1111
> >> fail-mode=secure -- \
> >>> +   add-port br1 br1-p0 -- set Interface br1-p0 type=patch
> >> options:peer=br0-p1 -- \
> >>> +   add-port br1 br1-p1 -- set Interface br1-p1 type=patch
> >> options:peer=br2-p0 -- \
> >>> +   add-port br1 br1-p2 -- set Interface br1-p2 type=dummy
> >> ofport_request=2 --\
> >>> +   add-br br2 -- \
> >>> +   set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \
> >>> +   set bridge br2 datapath-type=dummy other-config:datapath-id=2222
> >> fail-mode=secure -- \
> >>> +   add-port br2 br2-p0 -- set Interface br2-p0 type=patch
> >> options:peer=br1-p1 -- \
> >>> +   add-port br2 br2-p1 -- set Interface br2-p1 type=dummy
> >> ofport_request=3])
> >>> +
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats
> >> bands=type=drop rate=2'])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats
> >> bands=type=drop rate=2'])
> >>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >> in_port=local,ip,actions=br0-p1,br0-p0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2
> >> in_port=br2-p0,ip,actions=meter:1,br2-p1])
> >>> +
> >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
> >> "Datapath actions:"], [0], [dnl
> >>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100
> >>> +])
> >>> +
> >>> +OVS_VSWITCHD_STOP
> >>> +AT_CLEANUP
> >>> +
> >>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)])
> >>> +
> >>> +OVS_VSWITCHD_START(
> >>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
> \
> >>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >> ofport_request=2 -- \
> >>> +   add-br br1 -- \
> >>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >> fail-mode=secure -- \
> >>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
> >>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>> +
> >>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >> in_port=local,ip,actions=p1,p0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "in_port=1,ip,actions=resubmit(,3)"])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"])
> >>> +
> >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
> >>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
> >> actions:"], [0], [dnl
> >>> +Datapath actions: clone(ct,recirc(X)),1
> >>> +Datapath actions: 3
> >>> +])
> >>> +
> >>> +OVS_VSWITCHD_STOP
> >>> +AT_CLEANUP
> >>> +
> >>> +
> >>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)])
> >>> +
> >>> +OVS_VSWITCHD_START(
> >>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
> \
> >>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >> ofport_request=2 -- \
> >>> +   add-br br1 -- \
> >>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >> fail-mode=secure -- \
> >>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
> >>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>> +
> >>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >> in_port=local,ip,actions=p1,p0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "in_port=1,ip,actions=resubmit(,3)"])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"])
> >>> +
> >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
> >> "Datapath actions:"], [0], [dnl
> >>> +Datapath actions: clone(trunc(20),3),1
> >>> +])
> >>> +
> >>> +OVS_VSWITCHD_STOP
> >>> +AT_CLEANUP
> >>> +
> >>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)])
> >>> +
> >>> +OVS_VSWITCHD_START(
> >>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
> \
> >>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >> ofport_request=2 -- \
> >>> +   add-br br1 -- \
> >>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >> fail-mode=secure -- \
> >>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
> >>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>> +
> >>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"])
> >>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"])
> >>
> >> Forgot to remove, or uncomment?
> >>
> >
> > To uncomment :) Done in v4, thanks.
> >
> >
> >>
> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"])
> >>> +
> >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
> >>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
> >> actions:"], [0], [dnl
> >>> +Datapath actions:
> >>
> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1
> >>> +])
> >>> +
> >>> +OVS_VSWITCHD_STOP
> >>> +AT_CLEANUP
> >>> --
> >>> 2.37.1
> >>>
> >>> _______________________________________________
> >>> dev mailing list
> >>> [email protected]
> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >>
> > Thanks,
> > Ales
> > --
> >
> > Ales Musil
> >
> > Senior Software Engineer - OVN Core
> >
> > Red Hat EMEA <https://www.redhat.com>
> >
> > [email protected]    IM: amusil
> > <https://red.ht/sig>
>
>
Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to