On 6 Sep 2022, at 7:49, Ales Musil wrote:
> On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]> wrote: > >> >> >> On 5 Sep 2022, at 12:45, Ales Musil wrote: >> >>> On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]> >> wrote: >>> >>>> >>>> >>>> On 9 Aug 2022, at 9:17, Ales Musil wrote: >>>> >>>>> When the packet was traveling through patch port boundary >>>>> OvS would check if any of the actions is reversible, >>>>> if not it would clone the packet. However, the check >>>>> was only at the first level of the second bridge. >>>>> That caused some issues when the packet had gone >>>>> through more actions, some of them might have been >>>>> irreversible. >>>>> >>>>> In order to keep the semantics the same we might >>>>> need to run the actions twice in the worst case >>>>> scenario. During the clone there are 4 scenarios >>>>> that can happen. >>>>> >>>>> 1) The action is last one for that flow, >>>>> in that case we just continue without clone. >>>>> >>>>> 2) There is irreversible action in the action >>>>> set (first level). In this case we know that >>>>> there is at leas one irreversible action which >>>>> is enough to do clone. >>>>> >>>>> 3) All actions in first level are reversible, >>>>> we can try to run all actions as if we don't >>>>> need any clone and inspect the ofpbuf at the >>>>> end. In positive case there are no irreversible >>>>> actions so we will just submit the buffer and continue. >>>>> >>>>> 4) This is same as 3) with the difference that >>>>> there is irreversible action in the ofpbuf. >>>>> To keep the semantics we need to re-run the actions >>>>> and treat it as clone. This requires resotration >>>>> of the xlate_ctx. >>>>> >>>>> Add test cases for all irreversible actions >>>>> to see if they are properly cloned. >>>>> >>>>> Signed-off-by: Ales Musil <[email protected]> >> >> <SNIP> >> >>>>> + /* In any other case we cannot be certain if there is any >>>> irreversible >>>>> + * action down the action stack. Let's run the actions as if they >>>> are >>>>> + * reversible, if we find out that there was any irreversible >>>> action we >>>>> + * can restore the xlate_ctx and run the do_clone instead. */ >>>>> + struct ofpbuf old_stack = ctx->stack; >>>>> + union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)]; >>>>> + >>>>> + ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); >>>>> + ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size); >>>>> + >>>>> + struct ofpbuf old_action_set = ctx->action_set; >>>>> + uint64_t actset_stub[1024 / 8]; >>>>> + >>>>> + ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof >> actset_stub); >>>>> + ofpbuf_put(&ctx->action_set, old_action_set.data, >>>> old_action_set.size); >>>>> + >>>>> + struct flow old_flow = ctx->xin->flow; >>>>> + bool old_was_mpls = ctx->was_mpls; >>>>> + bool old_conntracked = ctx->conntracked; >>>>> + struct flow old_base = ctx->base_flow; >>>> >>>> This is the part that scares me. Are these the only value we need to >> cache >>>> and restore? Are we sure none of the other CTX fields arale potentially >>>> adjusted? >>>> What about fields like sflow_n_outputs, pending_encap, resubmits, etc. >> etc. >>>> >>> >>> I did some testing and it seems that what is stored is enough, however >> that >>> does not mean that this is 100% correct. >>> IMO it comes down to the else below. This is the same procedure that was >>> done by the clone before, however the else >>> part of "if (!odp_contains_irreversible_action(odp_actions.data, >>> odp_actions.size))" does need to do the heavy lifting >>> of restoring everything back so we can clone again. >> >> Ok, so assuming it was working fine before with the same local ctx >> changes, we can ignore it for now ;) >> But it might make sense to copy the same code comments as in the existing >> function to make clear why we restore them. >> >>>>> + size_t old_recirc_size = ctx->xin->recirc_queue >>>>> + ? ovs_list_size(ctx->xin->recirc_queue) >>>>> + : 0; >>>>> + >>>>> + uint64_t odp_actions_stub[1024 / 8]; >>>> >>>> All these mf_subvalue/uint64_t arrays add quite some stack space, isn't >>>> there a smarter way we can deal with this? >>>> Maybe we can store the size of the ofpbuf's and reset them to that size? >>>> >>> >>> Seems like a good idea, adjusted, thanks. >>> >>> >>>> >>>>> + struct ofpbuf odp_actions = >>>> OFPBUF_STUB_INITIALIZER(odp_actions_stub); >>>>> + struct ofpbuf *old_odp_actions = ctx->odp_actions; >>>>> + >>>>> + ctx->odp_actions = &odp_actions; >>>>> + >>>>> + do_xlate_actions(actions, actions_len, ctx, false, false); >>>> >>>> For consistency should we use "...ctx, do_xlate_actions, false);"? >>>> Or maybe change the above also to "false, false", probably the better >>>> option. >>>> >>> >>> Not sure I follow, can you please elaborate a bit? >>> Did you mean "...ctx, is_last_action, false);"? >>> If so, at this point "is_last_action" is always false, but I agree that >> in >>> that if above it can be changed to " ...ctx, true, false);" >> >> Yes, this is the diff, way less complex than me trying to explain what I >> meant ;) >> >> >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c >> index 5a196ec5e..b0c328bf0 100644 >> --- a/ofproto/ofproto-dpif-xlate.c >> +++ b/ofproto/ofproto-dpif-xlate.c >> @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions, >> size_t actions_len, >> { >> /* We will never clone on last action, just proceed. */ >> if (is_last_action) { >> - do_xlate_actions(actions, actions_len, ctx, is_last_action, >> false); >> + do_xlate_actions(actions, actions_len, ctx, true, false); >> if (!ctx->freezing) { >> xlate_action_set(ctx); >> } >> >>>> >>>>> + >>>>> + /* Restore ctx parts that are common even if don't need to clone, >> */ >>>> >>>> , instead of . at the end of the sentence. >>>> >>> >>> Done in v4. >>> >>> >>>> >>>>> + ctx->odp_actions = old_odp_actions; >>>>> + ofpbuf_uninit(&ctx->action_set); >>>>> + ctx->action_set = old_action_set; >>>>> + ofpbuf_uninit(&ctx->stack); >>>>> + ctx->stack = old_stack; >>>>> + ctx->xin->flow = old_flow; >>>>> + >>>>> + if (!odp_contains_irreversible_action(odp_actions.data, >>>> odp_actions.size)) { >>>> >>>> The line is too long, see robot email. >>>> >>> >>> Done in v4. >>> >>> >>>> >>>>> + nl_msg_put(ctx->odp_actions, odp_actions.data, >>>> odp_actions.size); >>>>> + if (!ctx->freezing) { >>>>> + xlate_action_set(ctx); >>>>> + } >>>>> + if (ctx->freezing) { >>>>> + finish_freezing(ctx); >>>>> + } >>>>> + } else { >>>>> + ctx->was_mpls = old_was_mpls; >>>>> + ctx->conntracked = old_conntracked; >>>>> + ctx->base_flow = old_base; >>>>> + /* Clear recirculations that were added by non-clone path so >>>> that >>>>> + * packets are not recirculated twice. */ >>>>> + size_t recirc_size = ctx->xin->recirc_queue >>>>> + ? ovs_list_size(ctx->xin->recirc_queue) >>>>> + : 0; >>>>> + for (; old_recirc_size < recirc_size; old_recirc_size++) { >>>>> + struct oftrace_recirc_node *node = >>>>> + >> CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue), >>>>> + struct oftrace_recirc_node, node); >>>>> + oftrace_recirc_node_destroy(node); >>>>> + } >> >> I’am also a bit confused here (maybe I need more coffee), but you reuse >> the actions buffer you just filled with actions >> > > I think the v3 and v4 got mixed up in the email, v4 is doing > "nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);". > Sorry for the late response, as I was out of the office last week. I did not see the v4 until you reminded me in this email ;) Can you CC me on the v5, so I know a new version is present? >>>>> + do_clone(actions, actions_len, ctx); >>>> >>>> Here we will do the entire translation again, can we not just take the >>>> newly added >>>> datapath action stream and just encapsulate it in OVS_ACTION_ATTR_CLONE? >>>> >>> >>> I have tried this approach numerous times, but as explained in commit >>> message in order to >>> keep the semantics we need to rerun it again, maybe I have missed >> something >>> how to do it >>> without breaking semantics. The main issue I had was that if we don't >> clone >>> we should not >>> "xlate_commit_actions", but that does not seem to be possible to know >>> without running all >>> actions in the chain first. If we want to do it in one pass we would >>> probably need to ensure >>> that "xlate_commit_actions" can be reverted if needed. >> >> Sorry, I was miss-reading the code, I guess I needed more coffee last >> Friday :) >> >> I guess the process is ok, however, might it be possible to make the code >> easier to read? There is also some code duplication, which would be nice >> to get rid of. >> > > Do you have something specific in mind? If not I have some things in mind > that could improve that (I hope). I have nothing in mind, but I’ll wait for your v5 and see. >>>>> + } >>>>> +} >>>>> + >>>>> static void >>>>> compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc, >>>>> bool is_last_action) >>>>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at >>>>> index 2e91ae1a1..7bcb2272b 100644 >>>>> --- a/tests/ofproto-dpif.at >>>>> +++ b/tests/ofproto-dpif.at >>>>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap >>>> p2-tx.pcap | wc -l`]) >>>>> >>>>> OVS_VSWITCHD_STOP >>>>> AT_CLEANUP >>>>> + >>>>> +AT_SETUP([ofproto-dpif - nested patch ports clone]) >>>>> + >>>>> +OVS_VSWITCHD_START( >>>>> + [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy >>>> ofport_request=1 -- \ >>>>> + add-port br0 br0-p1 -- set Interface br0-p1 type=patch >>>> options:peer=br1-p0 -- \ >>>>> + add-br br1 -- \ >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \ >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1111 >>>> fail-mode=secure -- \ >>>>> + add-port br1 br1-p0 -- set Interface br1-p0 type=patch >>>> options:peer=br0-p1 -- \ >>>>> + add-port br1 br1-p1 -- set Interface br1-p1 type=patch >>>> options:peer=br2-p0 -- \ >>>>> + add-port br1 br1-p2 -- set Interface br1-p2 type=dummy >>>> ofport_request=2 --\ >>>>> + add-br br2 -- \ >>>>> + set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \ >>>>> + set bridge br2 datapath-type=dummy other-config:datapath-id=2222 >>>> fail-mode=secure -- \ >>>>> + add-port br2 br2-p0 -- set Interface br2-p0 type=patch >>>> options:peer=br1-p1 -- \ >>>>> + add-port br2 br2-p1 -- set Interface br2-p1 type=dummy >>>> ofport_request=3]) >>>>> + >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats >>>> bands=type=drop rate=2']) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats >>>> bands=type=drop rate=2']) >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >>>> in_port=local,ip,actions=br0-p1,br0-p0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2 >>>> in_port=br2-p0,ip,actions=meter:1,br2-p1]) >>>>> + >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep >>>> "Datapath actions:"], [0], [dnl >>>>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100 >>>>> +]) >>>>> + >>>>> +OVS_VSWITCHD_STOP >>>>> +AT_CLEANUP >>>>> + >>>>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)]) >>>>> + >>>>> +OVS_VSWITCHD_START( >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- >> \ >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >>>> ofport_request=2 -- \ >>>>> + add-br br1 -- \ >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >>>> fail-mode=secure -- \ >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>>>> + >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >>>> in_port=local,ip,actions=p1,p0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "in_port=1,ip,actions=resubmit(,3)"]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"]) >>>>> + >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ >>>>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath >>>> actions:"], [0], [dnl >>>>> +Datapath actions: clone(ct,recirc(X)),1 >>>>> +Datapath actions: 3 >>>>> +]) >>>>> + >>>>> +OVS_VSWITCHD_STOP >>>>> +AT_CLEANUP >>>>> + >>>>> + >>>>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)]) >>>>> + >>>>> +OVS_VSWITCHD_START( >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- >> \ >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >>>> ofport_request=2 -- \ >>>>> + add-br br1 -- \ >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >>>> fail-mode=secure -- \ >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>>>> + >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >>>> in_port=local,ip,actions=p1,p0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "in_port=1,ip,actions=resubmit(,3)"]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"]) >>>>> + >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep >>>> "Datapath actions:"], [0], [dnl >>>>> +Datapath actions: clone(trunc(20),3),1 >>>>> +]) >>>>> + >>>>> +OVS_VSWITCHD_STOP >>>>> +AT_CLEANUP >>>>> + >>>>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)]) >>>>> + >>>>> +OVS_VSWITCHD_START( >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- >> \ >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >>>> ofport_request=2 -- \ >>>>> + add-br br1 -- \ >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >>>> fail-mode=secure -- \ >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>>>> + >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >>>> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"]) >>>>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"]) >>>> >>>> Forgot to remove, or uncomment? >>>> >>> >>> To uncomment :) Done in v4, thanks. >>> >>> >>>> >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >>>> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"]) >>>>> + >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ >>>>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath >>>> actions:"], [0], [dnl >>>>> +Datapath actions: >>>> >> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1 >>>>> +]) >>>>> + >>>>> +OVS_VSWITCHD_STOP >>>>> +AT_CLEANUP >>>>> -- >>>>> 2.37.1 >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> [email protected] >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>> >>>> >>> Thanks, >>> Ales >>> -- >>> >>> Ales Musil >>> >>> Senior Software Engineer - OVN Core >>> >>> Red Hat EMEA <https://www.redhat.com> >>> >>> [email protected] IM: amusil >>> <https://red.ht/sig> >> >> > Thanks, > Ales > > -- > > Ales Musil > > Senior Software Engineer - OVN Core > > Red Hat EMEA <https://www.redhat.com> > > [email protected] IM: amusil > <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
