On Mon, Sep 12, 2022 at 2:38 PM Eelco Chaudron <[email protected]> wrote:

>
>
> On 6 Sep 2022, at 7:49, Ales Musil wrote:
>
> > On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]>
> wrote:
> >
> >>
> >>
> >> On 5 Sep 2022, at 12:45, Ales Musil wrote:
> >>
> >>> On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]>
> >> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 9 Aug 2022, at 9:17, Ales Musil wrote:
> >>>>
> >>>>> When the packet was traveling through patch port boundary
> >>>>> OvS would check if any of the actions is reversible,
> >>>>> if not it would clone the packet. However, the check
> >>>>> was only at the first level of the second bridge.
> >>>>> That caused some issues when the packet had gone
> >>>>> through more actions, some of them might have been
> >>>>> irreversible.
> >>>>>
> >>>>> In order to keep the semantics the same we might
> >>>>> need to run the actions twice in the worst case
> >>>>> scenario. During the clone there are 4 scenarios
> >>>>> that can happen.
> >>>>>
> >>>>> 1) The action is last one for that flow,
> >>>>> in that case we just continue without clone.
> >>>>>
> >>>>> 2) There is irreversible action in the action
> >>>>> set (first level). In this case we know that
> >>>>> there is at leas one irreversible action which
> >>>>> is enough to do clone.
> >>>>>
> >>>>> 3) All actions in first level are reversible,
> >>>>> we can try to run all actions as if we don't
> >>>>> need any clone and inspect the ofpbuf at the
> >>>>> end. In positive case there are no irreversible
> >>>>> actions so we will just submit the buffer and continue.
> >>>>>
> >>>>> 4) This is same as 3) with the difference that
> >>>>> there is irreversible action in the ofpbuf.
> >>>>> To keep the semantics we need to re-run the actions
> >>>>> and treat it as clone. This requires resotration
> >>>>> of the xlate_ctx.
> >>>>>
> >>>>> Add test cases for all irreversible actions
> >>>>> to see if they are properly cloned.
> >>>>>
> >>>>> Signed-off-by: Ales Musil <[email protected]>
> >>
> >> <SNIP>
> >>
> >>>>> +    /* In any other case we cannot be certain if there is any
> >>>> irreversible
> >>>>> +     * action down the action stack. Let's run the actions as if
> they
> >>>> are
> >>>>> +     * reversible, if we find out that there was any irreversible
> >>>> action we
> >>>>> +     * can restore the xlate_ctx and run the do_clone instead. */
> >>>>> +    struct ofpbuf old_stack = ctx->stack;
> >>>>> +    union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)];
> >>>>> +
> >>>>> +    ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack);
> >>>>> +    ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size);
> >>>>> +
> >>>>> +    struct ofpbuf old_action_set = ctx->action_set;
> >>>>> +    uint64_t actset_stub[1024 / 8];
> >>>>> +
> >>>>> +    ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof
> >> actset_stub);
> >>>>> +    ofpbuf_put(&ctx->action_set, old_action_set.data,
> >>>> old_action_set.size);
> >>>>> +
> >>>>> +    struct flow old_flow = ctx->xin->flow;
> >>>>> +    bool old_was_mpls = ctx->was_mpls;
> >>>>> +    bool old_conntracked = ctx->conntracked;
> >>>>> +    struct flow old_base = ctx->base_flow;
> >>>>
> >>>> This is the part that scares me. Are these the only value we need to
> >> cache
> >>>> and restore? Are we sure none of the other CTX fields arale
> potentially
> >>>> adjusted?
> >>>> What about fields like sflow_n_outputs, pending_encap, resubmits, etc.
> >> etc.
> >>>>
> >>>
> >>> I did some testing and it seems that what is stored is enough, however
> >> that
> >>> does not mean that this is 100% correct.
> >>> IMO it comes down to the else below. This is the same procedure that
> was
> >>> done by the clone before, however the else
> >>> part of "if (!odp_contains_irreversible_action(odp_actions.data,
> >>> odp_actions.size))"  does need to do the heavy lifting
> >>> of restoring everything back so we can clone again.
> >>
> >> Ok, so assuming it was working fine before with the same local ctx
> >> changes, we can ignore it for now ;)
> >> But it might make sense to copy the same code comments as in the
> existing
> >> function to make clear why we restore them.
> >>
> >>>>> +    size_t old_recirc_size = ctx->xin->recirc_queue
> >>>>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
> >>>>> +                             : 0;
> >>>>> +
> >>>>> +    uint64_t odp_actions_stub[1024 / 8];
> >>>>
> >>>> All these mf_subvalue/uint64_t arrays add quite some stack space,
> isn't
> >>>> there a smarter way we can deal with this?
> >>>> Maybe we can store the size of the ofpbuf's and reset them to that
> size?
> >>>>
> >>>
> >>> Seems like a good idea, adjusted, thanks.
> >>>
> >>>
> >>>>
> >>>>> +    struct ofpbuf odp_actions =
> >>>> OFPBUF_STUB_INITIALIZER(odp_actions_stub);
> >>>>> +    struct ofpbuf *old_odp_actions = ctx->odp_actions;
> >>>>> +
> >>>>> +    ctx->odp_actions = &odp_actions;
> >>>>> +
> >>>>> +    do_xlate_actions(actions, actions_len, ctx, false, false);
> >>>>
> >>>> For consistency should we use "...ctx, do_xlate_actions, false);"?
> >>>> Or maybe change the above also to "false, false", probably the better
> >>>> option.
> >>>>
> >>>
> >>> Not sure I follow, can you please elaborate a bit?
> >>> Did you mean "...ctx, is_last_action, false);"?
> >>> If so, at this point "is_last_action" is always false, but I agree that
> >> in
> >>> that if above it can be changed to " ...ctx, true, false);"
> >>
> >> Yes, this is the diff, way less complex than me trying to explain what I
> >> meant ;)
> >>
> >>
> >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> >> index 5a196ec5e..b0c328bf0 100644
> >> --- a/ofproto/ofproto-dpif-xlate.c
> >> +++ b/ofproto/ofproto-dpif-xlate.c
> >> @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions,
> >> size_t actions_len,
> >>  {
> >>      /* We will never clone on last action, just proceed. */
> >>      if (is_last_action) {
> >> -        do_xlate_actions(actions, actions_len, ctx, is_last_action,
> >> false);
> >> +        do_xlate_actions(actions, actions_len, ctx, true, false);
> >>          if (!ctx->freezing) {
> >>              xlate_action_set(ctx);
> >>          }
> >>
> >>>>
> >>>>> +
> >>>>> +    /* Restore ctx parts that are common even if don't need to
> clone,
> >> */
> >>>>
> >>>> , instead of . at the end of the sentence.
> >>>>
> >>>
> >>> Done in v4.
> >>>
> >>>
> >>>>
> >>>>> +    ctx->odp_actions = old_odp_actions;
> >>>>> +    ofpbuf_uninit(&ctx->action_set);
> >>>>> +    ctx->action_set = old_action_set;
> >>>>> +    ofpbuf_uninit(&ctx->stack);
> >>>>> +    ctx->stack = old_stack;
> >>>>> +    ctx->xin->flow = old_flow;
> >>>>> +
> >>>>> +    if (!odp_contains_irreversible_action(odp_actions.data,
> >>>> odp_actions.size)) {
> >>>>
> >>>> The line is too long, see robot email.
> >>>>
> >>>
> >>> Done in v4.
> >>>
> >>>
> >>>>
> >>>>> +        nl_msg_put(ctx->odp_actions, odp_actions.data,
> >>>> odp_actions.size);
> >>>>> +        if (!ctx->freezing) {
> >>>>> +            xlate_action_set(ctx);
> >>>>> +        }
> >>>>> +        if (ctx->freezing) {
> >>>>> +            finish_freezing(ctx);
> >>>>> +        }
> >>>>> +    } else {
> >>>>> +        ctx->was_mpls = old_was_mpls;
> >>>>> +        ctx->conntracked = old_conntracked;
> >>>>> +        ctx->base_flow = old_base;
> >>>>> +        /* Clear recirculations that were added by non-clone path so
> >>>> that
> >>>>> +         * packets are not recirculated twice. */
> >>>>> +        size_t recirc_size = ctx->xin->recirc_queue
> >>>>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
> >>>>> +                             : 0;
> >>>>> +        for (; old_recirc_size < recirc_size; old_recirc_size++) {
> >>>>> +            struct oftrace_recirc_node *node =
> >>>>> +
> >> CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue),
> >>>>> +                             struct oftrace_recirc_node, node);
> >>>>> +            oftrace_recirc_node_destroy(node);
> >>>>> +        }
> >>
> >> I’am also a bit confused here (maybe I need more coffee), but you reuse
> >> the actions buffer you just filled with actions
> >>
> >
> > I think the v3 and v4 got mixed up in the email, v4 is doing
> > "nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);".
> >
>
> Sorry for the late response, as I was out of the office last week.
>
> I did not see the v4 until you reminded me in this email ;)
> Can you CC me on the v5, so I know a new version is present?
>
> >>>>> +        do_clone(actions, actions_len, ctx);
> >>>>
> >>>> Here we will do the entire translation again, can we not just take the
> >>>> newly added
> >>>> datapath action stream and just encapsulate it in
> OVS_ACTION_ATTR_CLONE?
> >>>>
> >>>
> >>> I have tried this approach numerous times, but as explained in commit
> >>> message in order to
> >>> keep the semantics we need to rerun it again, maybe I have missed
> >> something
> >>> how to do it
> >>> without breaking semantics. The main issue I had was that if we don't
> >> clone
> >>> we should not
> >>> "xlate_commit_actions", but that does not seem to be possible to know
> >>> without running all
> >>> actions in the chain first. If we want to do it in one pass we would
> >>> probably need to ensure
> >>> that "xlate_commit_actions" can be reverted if needed.
> >>
> >> Sorry, I was miss-reading the code, I guess I needed more coffee last
> >> Friday :)
> >>
> >> I guess the process is ok, however, might it be possible to make the
> code
> >> easier to read?  There is also some code duplication, which would be
> nice
> >> to get rid of.
> >>
> >
> > Do you have something specific in mind? If not I have some things in mind
> > that could improve that (I hope).
>
> I have nothing in mind, but I’ll wait for your v5 and see.
>

I have actually pushed v5 faster.
https://patchwork.ozlabs.org/project/openvswitch/list/?series=317208
I'll add you to the cc list so you'll get any further update.

Thanks,
Ales


>
> >>>>> +    }
> >>>>> +}
> >>>>> +
> >>>>>  static void
> >>>>>  compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc,
> >>>>>                bool is_last_action)
> >>>>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> >>>>> index 2e91ae1a1..7bcb2272b 100644
> >>>>> --- a/tests/ofproto-dpif.at
> >>>>> +++ b/tests/ofproto-dpif.at
> >>>>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap
> >>>> p2-tx.pcap | wc -l`])
> >>>>>
> >>>>>  OVS_VSWITCHD_STOP
> >>>>>  AT_CLEANUP
> >>>>> +
> >>>>> +AT_SETUP([ofproto-dpif - nested patch ports clone])
> >>>>> +
> >>>>> +OVS_VSWITCHD_START(
> >>>>> +  [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy
> >>>> ofport_request=1 -- \
> >>>>> +   add-port br0 br0-p1 -- set Interface br0-p1 type=patch
> >>>> options:peer=br1-p0 -- \
> >>>>> +   add-br br1 -- \
> >>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \
> >>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1111
> >>>> fail-mode=secure -- \
> >>>>> +   add-port br1 br1-p0 -- set Interface br1-p0 type=patch
> >>>> options:peer=br0-p1 -- \
> >>>>> +   add-port br1 br1-p1 -- set Interface br1-p1 type=patch
> >>>> options:peer=br2-p0 -- \
> >>>>> +   add-port br1 br1-p2 -- set Interface br1-p2 type=dummy
> >>>> ofport_request=2 --\
> >>>>> +   add-br br2 -- \
> >>>>> +   set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \
> >>>>> +   set bridge br2 datapath-type=dummy other-config:datapath-id=2222
> >>>> fail-mode=secure -- \
> >>>>> +   add-port br2 br2-p0 -- set Interface br2-p0 type=patch
> >>>> options:peer=br1-p1 -- \
> >>>>> +   add-port br2 br2-p1 -- set Interface br2-p1 type=dummy
> >>>> ofport_request=3])
> >>>>> +
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats
> >>>> bands=type=drop rate=2'])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats
> >>>> bands=type=drop rate=2'])
> >>>>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >>>> in_port=local,ip,actions=br0-p1,br0-p0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2
> >>>> in_port=br2-p0,ip,actions=meter:1,br2-p1])
> >>>>> +
> >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
> >>>> "Datapath actions:"], [0], [dnl
> >>>>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100
> >>>>> +])
> >>>>> +
> >>>>> +OVS_VSWITCHD_STOP
> >>>>> +AT_CLEANUP
> >>>>> +
> >>>>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)])
> >>>>> +
> >>>>> +OVS_VSWITCHD_START(
> >>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1
> --
> >> \
> >>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >>>> ofport_request=2 -- \
> >>>>> +   add-br br1 -- \
> >>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >>>> fail-mode=secure -- \
> >>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1
> -- \
> >>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>>>> +
> >>>>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >>>> in_port=local,ip,actions=p1,p0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "in_port=1,ip,actions=resubmit(,3)"])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"])
> >>>>> +
> >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
> >>>>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
> >>>> actions:"], [0], [dnl
> >>>>> +Datapath actions: clone(ct,recirc(X)),1
> >>>>> +Datapath actions: 3
> >>>>> +])
> >>>>> +
> >>>>> +OVS_VSWITCHD_STOP
> >>>>> +AT_CLEANUP
> >>>>> +
> >>>>> +
> >>>>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)])
> >>>>> +
> >>>>> +OVS_VSWITCHD_START(
> >>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1
> --
> >> \
> >>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >>>> ofport_request=2 -- \
> >>>>> +   add-br br1 -- \
> >>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >>>> fail-mode=secure -- \
> >>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1
> -- \
> >>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>>>> +
> >>>>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >>>> in_port=local,ip,actions=p1,p0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "in_port=1,ip,actions=resubmit(,3)"])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"])
> >>>>> +
> >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
> >>>> "Datapath actions:"], [0], [dnl
> >>>>> +Datapath actions: clone(trunc(20),3),1
> >>>>> +])
> >>>>> +
> >>>>> +OVS_VSWITCHD_STOP
> >>>>> +AT_CLEANUP
> >>>>> +
> >>>>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)])
> >>>>> +
> >>>>> +OVS_VSWITCHD_START(
> >>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1
> --
> >> \
> >>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
> >>>> ofport_request=2 -- \
> >>>>> +   add-br br1 -- \
> >>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
> >>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
> >>>> fail-mode=secure -- \
> >>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1
> -- \
> >>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
> >>>>> +
> >>>>> +AT_CHECK([ovs-ofctl del-flows br0])
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
> >>>> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"])
> >>>>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"])
> >>>>
> >>>> Forgot to remove, or uncomment?
> >>>>
> >>>
> >>> To uncomment :) Done in v4, thanks.
> >>>
> >>>
> >>>>
> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
> >>>> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"])
> >>>>> +
> >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
> >>>>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
> >>>> actions:"], [0], [dnl
> >>>>> +Datapath actions:
> >>>>
> >>
> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1
> >>>>> +])
> >>>>> +
> >>>>> +OVS_VSWITCHD_STOP
> >>>>> +AT_CLEANUP
> >>>>> --
> >>>>> 2.37.1
> >>>>>
> >>>>> _______________________________________________
> >>>>> dev mailing list
> >>>>> [email protected]
> >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>>>
> >>>>
> >>> Thanks,
> >>> Ales
> >>> --
> >>>
> >>> Ales Musil
> >>>
> >>> Senior Software Engineer - OVN Core
> >>>
> >>> Red Hat EMEA <https://www.redhat.com>
> >>>
> >>> [email protected]    IM: amusil
> >>> <https://red.ht/sig>
> >>
> >>
> > Thanks,
> > Ales
> >
> > --
> >
> > Ales Musil
> >
> > Senior Software Engineer - OVN Core
> >
> > Red Hat EMEA <https://www.redhat.com>
> >
> > [email protected]    IM: amusil
> > <https://red.ht/sig>
>
>

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to