On 6 Sep 2022, at 7:49, Ales Musil wrote:

> On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]> wrote:
>
>>
>>
>> On 5 Sep 2022, at 12:45, Ales Musil wrote:
>>
>>> On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]>
>> wrote:
>>>
>>>>
>>>>
>>>> On 9 Aug 2022, at 9:17, Ales Musil wrote:
>>>>
>>>>> When the packet was traveling through patch port boundary
>>>>> OvS would check if any of the actions is reversible,
>>>>> if not it would clone the packet. However, the check
>>>>> was only at the first level of the second bridge.
>>>>> That caused some issues when the packet had gone
>>>>> through more actions, some of them might have been
>>>>> irreversible.
>>>>>
>>>>> In order to keep the semantics the same we might
>>>>> need to run the actions twice in the worst case
>>>>> scenario. During the clone there are 4 scenarios
>>>>> that can happen.
>>>>>
>>>>> 1) The action is last one for that flow,
>>>>> in that case we just continue without clone.
>>>>>
>>>>> 2) There is irreversible action in the action
>>>>> set (first level). In this case we know that
>>>>> there is at leas one irreversible action which
>>>>> is enough to do clone.
>>>>>
>>>>> 3) All actions in first level are reversible,
>>>>> we can try to run all actions as if we don't
>>>>> need any clone and inspect the ofpbuf at the
>>>>> end. In positive case there are no irreversible
>>>>> actions so we will just submit the buffer and continue.
>>>>>
>>>>> 4) This is same as 3) with the difference that
>>>>> there is irreversible action in the ofpbuf.
>>>>> To keep the semantics we need to re-run the actions
>>>>> and treat it as clone. This requires resotration
>>>>> of the xlate_ctx.
>>>>>
>>>>> Add test cases for all irreversible actions
>>>>> to see if they are properly cloned.
>>>>>
>>>>> Signed-off-by: Ales Musil <[email protected]>
>>
>> <SNIP>
>>
>>>>> +    /* In any other case we cannot be certain if there is any
>>>> irreversible
>>>>> +     * action down the action stack. Let's run the actions as if they
>>>> are
>>>>> +     * reversible, if we find out that there was any irreversible
>>>> action we
>>>>> +     * can restore the xlate_ctx and run the do_clone instead. */
>>>>> +    struct ofpbuf old_stack = ctx->stack;
>>>>> +    union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)];
>>>>> +
>>>>> +    ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack);
>>>>> +    ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size);
>>>>> +
>>>>> +    struct ofpbuf old_action_set = ctx->action_set;
>>>>> +    uint64_t actset_stub[1024 / 8];
>>>>> +
>>>>> +    ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof
>> actset_stub);
>>>>> +    ofpbuf_put(&ctx->action_set, old_action_set.data,
>>>> old_action_set.size);
>>>>> +
>>>>> +    struct flow old_flow = ctx->xin->flow;
>>>>> +    bool old_was_mpls = ctx->was_mpls;
>>>>> +    bool old_conntracked = ctx->conntracked;
>>>>> +    struct flow old_base = ctx->base_flow;
>>>>
>>>> This is the part that scares me. Are these the only value we need to
>> cache
>>>> and restore? Are we sure none of the other CTX fields arale potentially
>>>> adjusted?
>>>> What about fields like sflow_n_outputs, pending_encap, resubmits, etc.
>> etc.
>>>>
>>>
>>> I did some testing and it seems that what is stored is enough, however
>> that
>>> does not mean that this is 100% correct.
>>> IMO it comes down to the else below. This is the same procedure that was
>>> done by the clone before, however the else
>>> part of "if (!odp_contains_irreversible_action(odp_actions.data,
>>> odp_actions.size))"  does need to do the heavy lifting
>>> of restoring everything back so we can clone again.
>>
>> Ok, so assuming it was working fine before with the same local ctx
>> changes, we can ignore it for now ;)
>> But it might make sense to copy the same code comments as in the existing
>> function to make clear why we restore them.
>>
>>>>> +    size_t old_recirc_size = ctx->xin->recirc_queue
>>>>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
>>>>> +                             : 0;
>>>>> +
>>>>> +    uint64_t odp_actions_stub[1024 / 8];
>>>>
>>>> All these mf_subvalue/uint64_t arrays add quite some stack space, isn't
>>>> there a smarter way we can deal with this?
>>>> Maybe we can store the size of the ofpbuf's and reset them to that size?
>>>>
>>>
>>> Seems like a good idea, adjusted, thanks.
>>>
>>>
>>>>
>>>>> +    struct ofpbuf odp_actions =
>>>> OFPBUF_STUB_INITIALIZER(odp_actions_stub);
>>>>> +    struct ofpbuf *old_odp_actions = ctx->odp_actions;
>>>>> +
>>>>> +    ctx->odp_actions = &odp_actions;
>>>>> +
>>>>> +    do_xlate_actions(actions, actions_len, ctx, false, false);
>>>>
>>>> For consistency should we use "...ctx, do_xlate_actions, false);"?
>>>> Or maybe change the above also to "false, false", probably the better
>>>> option.
>>>>
>>>
>>> Not sure I follow, can you please elaborate a bit?
>>> Did you mean "...ctx, is_last_action, false);"?
>>> If so, at this point "is_last_action" is always false, but I agree that
>> in
>>> that if above it can be changed to " ...ctx, true, false);"
>>
>> Yes, this is the diff, way less complex than me trying to explain what I
>> meant ;)
>>
>>
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 5a196ec5e..b0c328bf0 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions,
>> size_t actions_len,
>>  {
>>      /* We will never clone on last action, just proceed. */
>>      if (is_last_action) {
>> -        do_xlate_actions(actions, actions_len, ctx, is_last_action,
>> false);
>> +        do_xlate_actions(actions, actions_len, ctx, true, false);
>>          if (!ctx->freezing) {
>>              xlate_action_set(ctx);
>>          }
>>
>>>>
>>>>> +
>>>>> +    /* Restore ctx parts that are common even if don't need to clone,
>> */
>>>>
>>>> , instead of . at the end of the sentence.
>>>>
>>>
>>> Done in v4.
>>>
>>>
>>>>
>>>>> +    ctx->odp_actions = old_odp_actions;
>>>>> +    ofpbuf_uninit(&ctx->action_set);
>>>>> +    ctx->action_set = old_action_set;
>>>>> +    ofpbuf_uninit(&ctx->stack);
>>>>> +    ctx->stack = old_stack;
>>>>> +    ctx->xin->flow = old_flow;
>>>>> +
>>>>> +    if (!odp_contains_irreversible_action(odp_actions.data,
>>>> odp_actions.size)) {
>>>>
>>>> The line is too long, see robot email.
>>>>
>>>
>>> Done in v4.
>>>
>>>
>>>>
>>>>> +        nl_msg_put(ctx->odp_actions, odp_actions.data,
>>>> odp_actions.size);
>>>>> +        if (!ctx->freezing) {
>>>>> +            xlate_action_set(ctx);
>>>>> +        }
>>>>> +        if (ctx->freezing) {
>>>>> +            finish_freezing(ctx);
>>>>> +        }
>>>>> +    } else {
>>>>> +        ctx->was_mpls = old_was_mpls;
>>>>> +        ctx->conntracked = old_conntracked;
>>>>> +        ctx->base_flow = old_base;
>>>>> +        /* Clear recirculations that were added by non-clone path so
>>>> that
>>>>> +         * packets are not recirculated twice. */
>>>>> +        size_t recirc_size = ctx->xin->recirc_queue
>>>>> +                             ? ovs_list_size(ctx->xin->recirc_queue)
>>>>> +                             : 0;
>>>>> +        for (; old_recirc_size < recirc_size; old_recirc_size++) {
>>>>> +            struct oftrace_recirc_node *node =
>>>>> +
>> CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue),
>>>>> +                             struct oftrace_recirc_node, node);
>>>>> +            oftrace_recirc_node_destroy(node);
>>>>> +        }
>>
>> I’am also a bit confused here (maybe I need more coffee), but you reuse
>> the actions buffer you just filled with actions
>>
>
> I think the v3 and v4 got mixed up in the email, v4 is doing
> "nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);".
>

Sorry for the late response, as I was out of the office last week.

I did not see the v4 until you reminded me in this email ;)
Can you CC me on the v5, so I know a new version is present?

>>>>> +        do_clone(actions, actions_len, ctx);
>>>>
>>>> Here we will do the entire translation again, can we not just take the
>>>> newly added
>>>> datapath action stream and just encapsulate it in OVS_ACTION_ATTR_CLONE?
>>>>
>>>
>>> I have tried this approach numerous times, but as explained in commit
>>> message in order to
>>> keep the semantics we need to rerun it again, maybe I have missed
>> something
>>> how to do it
>>> without breaking semantics. The main issue I had was that if we don't
>> clone
>>> we should not
>>> "xlate_commit_actions", but that does not seem to be possible to know
>>> without running all
>>> actions in the chain first. If we want to do it in one pass we would
>>> probably need to ensure
>>> that "xlate_commit_actions" can be reverted if needed.
>>
>> Sorry, I was miss-reading the code, I guess I needed more coffee last
>> Friday :)
>>
>> I guess the process is ok, however, might it be possible to make the code
>> easier to read?  There is also some code duplication, which would be nice
>> to get rid of.
>>
>
> Do you have something specific in mind? If not I have some things in mind
> that could improve that (I hope).

I have nothing in mind, but I’ll wait for your v5 and see.

>>>>> +    }
>>>>> +}
>>>>> +
>>>>>  static void
>>>>>  compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc,
>>>>>                bool is_last_action)
>>>>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
>>>>> index 2e91ae1a1..7bcb2272b 100644
>>>>> --- a/tests/ofproto-dpif.at
>>>>> +++ b/tests/ofproto-dpif.at
>>>>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap
>>>> p2-tx.pcap | wc -l`])
>>>>>
>>>>>  OVS_VSWITCHD_STOP
>>>>>  AT_CLEANUP
>>>>> +
>>>>> +AT_SETUP([ofproto-dpif - nested patch ports clone])
>>>>> +
>>>>> +OVS_VSWITCHD_START(
>>>>> +  [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy
>>>> ofport_request=1 -- \
>>>>> +   add-port br0 br0-p1 -- set Interface br0-p1 type=patch
>>>> options:peer=br1-p0 -- \
>>>>> +   add-br br1 -- \
>>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \
>>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1111
>>>> fail-mode=secure -- \
>>>>> +   add-port br1 br1-p0 -- set Interface br1-p0 type=patch
>>>> options:peer=br0-p1 -- \
>>>>> +   add-port br1 br1-p1 -- set Interface br1-p1 type=patch
>>>> options:peer=br2-p0 -- \
>>>>> +   add-port br1 br1-p2 -- set Interface br1-p2 type=dummy
>>>> ofport_request=2 --\
>>>>> +   add-br br2 -- \
>>>>> +   set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \
>>>>> +   set bridge br2 datapath-type=dummy other-config:datapath-id=2222
>>>> fail-mode=secure -- \
>>>>> +   add-port br2 br2-p0 -- set Interface br2-p0 type=patch
>>>> options:peer=br1-p1 -- \
>>>>> +   add-port br2 br2-p1 -- set Interface br2-p1 type=dummy
>>>> ofport_request=3])
>>>>> +
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats
>>>> bands=type=drop rate=2'])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats
>>>> bands=type=drop rate=2'])
>>>>> +AT_CHECK([ovs-ofctl del-flows br0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>>> in_port=local,ip,actions=br0-p1,br0-p0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2
>>>> in_port=br2-p0,ip,actions=meter:1,br2-p1])
>>>>> +
>>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
>>>> "Datapath actions:"], [0], [dnl
>>>>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100
>>>>> +])
>>>>> +
>>>>> +OVS_VSWITCHD_STOP
>>>>> +AT_CLEANUP
>>>>> +
>>>>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)])
>>>>> +
>>>>> +OVS_VSWITCHD_START(
>>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
>> \
>>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
>>>> ofport_request=2 -- \
>>>>> +   add-br br1 -- \
>>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
>>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
>>>> fail-mode=secure -- \
>>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
>>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
>>>>> +
>>>>> +AT_CHECK([ovs-ofctl del-flows br0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>>> in_port=local,ip,actions=p1,p0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "in_port=1,ip,actions=resubmit(,3)"])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"])
>>>>> +
>>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
>>>>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
>>>> actions:"], [0], [dnl
>>>>> +Datapath actions: clone(ct,recirc(X)),1
>>>>> +Datapath actions: 3
>>>>> +])
>>>>> +
>>>>> +OVS_VSWITCHD_STOP
>>>>> +AT_CLEANUP
>>>>> +
>>>>> +
>>>>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)])
>>>>> +
>>>>> +OVS_VSWITCHD_START(
>>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
>> \
>>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
>>>> ofport_request=2 -- \
>>>>> +   add-br br1 -- \
>>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
>>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
>>>> fail-mode=secure -- \
>>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
>>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
>>>>> +
>>>>> +AT_CHECK([ovs-ofctl del-flows br0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>>> in_port=local,ip,actions=p1,p0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "in_port=1,ip,actions=resubmit(,3)"])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"])
>>>>> +
>>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep
>>>> "Datapath actions:"], [0], [dnl
>>>>> +Datapath actions: clone(trunc(20),3),1
>>>>> +])
>>>>> +
>>>>> +OVS_VSWITCHD_STOP
>>>>> +AT_CLEANUP
>>>>> +
>>>>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)])
>>>>> +
>>>>> +OVS_VSWITCHD_START(
>>>>> +  [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 --
>> \
>>>>> +   add-port br0 p1 -- set Interface p1 type=patch options:peer=p2
>>>> ofport_request=2 -- \
>>>>> +   add-br br1 -- \
>>>>> +   set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \
>>>>> +   set bridge br1 datapath-type=dummy other-config:datapath-id=1234
>>>> fail-mode=secure -- \
>>>>> +   add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \
>>>>> +   add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3])
>>>>> +
>>>>> +AT_CHECK([ovs-ofctl del-flows br0])
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0
>>>> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"])
>>>>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"])
>>>>
>>>> Forgot to remove, or uncomment?
>>>>
>>>
>>> To uncomment :) Done in v4, thanks.
>>>
>>>
>>>>
>>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1
>>>> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"])
>>>>> +
>>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \
>>>>> +          sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath
>>>> actions:"], [0], [dnl
>>>>> +Datapath actions:
>>>>
>> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1
>>>>> +])
>>>>> +
>>>>> +OVS_VSWITCHD_STOP
>>>>> +AT_CLEANUP
>>>>> --
>>>>> 2.37.1
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> [email protected]
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>
>>>>
>>> Thanks,
>>> Ales
>>> --
>>>
>>> Ales Musil
>>>
>>> Senior Software Engineer - OVN Core
>>>
>>> Red Hat EMEA <https://www.redhat.com>
>>>
>>> [email protected]    IM: amusil
>>> <https://red.ht/sig>
>>
>>
> Thanks,
> Ales
>
> -- 
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA <https://www.redhat.com>
>
> [email protected]    IM: amusil
> <https://red.ht/sig>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to