On Mon, Sep 12, 2022 at 2:38 PM Eelco Chaudron <[email protected]> wrote:
> > > On 6 Sep 2022, at 7:49, Ales Musil wrote: > > > On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]> > wrote: > > > >> > >> > >> On 5 Sep 2022, at 12:45, Ales Musil wrote: > >> > >>> On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]> > >> wrote: > >>> > >>>> > >>>> > >>>> On 9 Aug 2022, at 9:17, Ales Musil wrote: > >>>> > >>>>> When the packet was traveling through patch port boundary > >>>>> OvS would check if any of the actions is reversible, > >>>>> if not it would clone the packet. However, the check > >>>>> was only at the first level of the second bridge. > >>>>> That caused some issues when the packet had gone > >>>>> through more actions, some of them might have been > >>>>> irreversible. > >>>>> > >>>>> In order to keep the semantics the same we might > >>>>> need to run the actions twice in the worst case > >>>>> scenario. During the clone there are 4 scenarios > >>>>> that can happen. > >>>>> > >>>>> 1) The action is last one for that flow, > >>>>> in that case we just continue without clone. > >>>>> > >>>>> 2) There is irreversible action in the action > >>>>> set (first level). In this case we know that > >>>>> there is at leas one irreversible action which > >>>>> is enough to do clone. > >>>>> > >>>>> 3) All actions in first level are reversible, > >>>>> we can try to run all actions as if we don't > >>>>> need any clone and inspect the ofpbuf at the > >>>>> end. In positive case there are no irreversible > >>>>> actions so we will just submit the buffer and continue. > >>>>> > >>>>> 4) This is same as 3) with the difference that > >>>>> there is irreversible action in the ofpbuf. > >>>>> To keep the semantics we need to re-run the actions > >>>>> and treat it as clone. This requires resotration > >>>>> of the xlate_ctx. > >>>>> > >>>>> Add test cases for all irreversible actions > >>>>> to see if they are properly cloned. > >>>>> > >>>>> Signed-off-by: Ales Musil <[email protected]> > >> > >> <SNIP> > >> > >>>>> + /* In any other case we cannot be certain if there is any > >>>> irreversible > >>>>> + * action down the action stack. Let's run the actions as if > they > >>>> are > >>>>> + * reversible, if we find out that there was any irreversible > >>>> action we > >>>>> + * can restore the xlate_ctx and run the do_clone instead. */ > >>>>> + struct ofpbuf old_stack = ctx->stack; > >>>>> + union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)]; > >>>>> + > >>>>> + ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); > >>>>> + ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size); > >>>>> + > >>>>> + struct ofpbuf old_action_set = ctx->action_set; > >>>>> + uint64_t actset_stub[1024 / 8]; > >>>>> + > >>>>> + ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof > >> actset_stub); > >>>>> + ofpbuf_put(&ctx->action_set, old_action_set.data, > >>>> old_action_set.size); > >>>>> + > >>>>> + struct flow old_flow = ctx->xin->flow; > >>>>> + bool old_was_mpls = ctx->was_mpls; > >>>>> + bool old_conntracked = ctx->conntracked; > >>>>> + struct flow old_base = ctx->base_flow; > >>>> > >>>> This is the part that scares me. Are these the only value we need to > >> cache > >>>> and restore? Are we sure none of the other CTX fields arale > potentially > >>>> adjusted? > >>>> What about fields like sflow_n_outputs, pending_encap, resubmits, etc. > >> etc. > >>>> > >>> > >>> I did some testing and it seems that what is stored is enough, however > >> that > >>> does not mean that this is 100% correct. > >>> IMO it comes down to the else below. This is the same procedure that > was > >>> done by the clone before, however the else > >>> part of "if (!odp_contains_irreversible_action(odp_actions.data, > >>> odp_actions.size))" does need to do the heavy lifting > >>> of restoring everything back so we can clone again. > >> > >> Ok, so assuming it was working fine before with the same local ctx > >> changes, we can ignore it for now ;) > >> But it might make sense to copy the same code comments as in the > existing > >> function to make clear why we restore them. > >> > >>>>> + size_t old_recirc_size = ctx->xin->recirc_queue > >>>>> + ? ovs_list_size(ctx->xin->recirc_queue) > >>>>> + : 0; > >>>>> + > >>>>> + uint64_t odp_actions_stub[1024 / 8]; > >>>> > >>>> All these mf_subvalue/uint64_t arrays add quite some stack space, > isn't > >>>> there a smarter way we can deal with this? > >>>> Maybe we can store the size of the ofpbuf's and reset them to that > size? > >>>> > >>> > >>> Seems like a good idea, adjusted, thanks. > >>> > >>> > >>>> > >>>>> + struct ofpbuf odp_actions = > >>>> OFPBUF_STUB_INITIALIZER(odp_actions_stub); > >>>>> + struct ofpbuf *old_odp_actions = ctx->odp_actions; > >>>>> + > >>>>> + ctx->odp_actions = &odp_actions; > >>>>> + > >>>>> + do_xlate_actions(actions, actions_len, ctx, false, false); > >>>> > >>>> For consistency should we use "...ctx, do_xlate_actions, false);"? > >>>> Or maybe change the above also to "false, false", probably the better > >>>> option. > >>>> > >>> > >>> Not sure I follow, can you please elaborate a bit? > >>> Did you mean "...ctx, is_last_action, false);"? > >>> If so, at this point "is_last_action" is always false, but I agree that > >> in > >>> that if above it can be changed to " ...ctx, true, false);" > >> > >> Yes, this is the diff, way less complex than me trying to explain what I > >> meant ;) > >> > >> > >> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > >> index 5a196ec5e..b0c328bf0 100644 > >> --- a/ofproto/ofproto-dpif-xlate.c > >> +++ b/ofproto/ofproto-dpif-xlate.c > >> @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions, > >> size_t actions_len, > >> { > >> /* We will never clone on last action, just proceed. */ > >> if (is_last_action) { > >> - do_xlate_actions(actions, actions_len, ctx, is_last_action, > >> false); > >> + do_xlate_actions(actions, actions_len, ctx, true, false); > >> if (!ctx->freezing) { > >> xlate_action_set(ctx); > >> } > >> > >>>> > >>>>> + > >>>>> + /* Restore ctx parts that are common even if don't need to > clone, > >> */ > >>>> > >>>> , instead of . at the end of the sentence. > >>>> > >>> > >>> Done in v4. > >>> > >>> > >>>> > >>>>> + ctx->odp_actions = old_odp_actions; > >>>>> + ofpbuf_uninit(&ctx->action_set); > >>>>> + ctx->action_set = old_action_set; > >>>>> + ofpbuf_uninit(&ctx->stack); > >>>>> + ctx->stack = old_stack; > >>>>> + ctx->xin->flow = old_flow; > >>>>> + > >>>>> + if (!odp_contains_irreversible_action(odp_actions.data, > >>>> odp_actions.size)) { > >>>> > >>>> The line is too long, see robot email. > >>>> > >>> > >>> Done in v4. > >>> > >>> > >>>> > >>>>> + nl_msg_put(ctx->odp_actions, odp_actions.data, > >>>> odp_actions.size); > >>>>> + if (!ctx->freezing) { > >>>>> + xlate_action_set(ctx); > >>>>> + } > >>>>> + if (ctx->freezing) { > >>>>> + finish_freezing(ctx); > >>>>> + } > >>>>> + } else { > >>>>> + ctx->was_mpls = old_was_mpls; > >>>>> + ctx->conntracked = old_conntracked; > >>>>> + ctx->base_flow = old_base; > >>>>> + /* Clear recirculations that were added by non-clone path so > >>>> that > >>>>> + * packets are not recirculated twice. */ > >>>>> + size_t recirc_size = ctx->xin->recirc_queue > >>>>> + ? ovs_list_size(ctx->xin->recirc_queue) > >>>>> + : 0; > >>>>> + for (; old_recirc_size < recirc_size; old_recirc_size++) { > >>>>> + struct oftrace_recirc_node *node = > >>>>> + > >> CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue), > >>>>> + struct oftrace_recirc_node, node); > >>>>> + oftrace_recirc_node_destroy(node); > >>>>> + } > >> > >> I’am also a bit confused here (maybe I need more coffee), but you reuse > >> the actions buffer you just filled with actions > >> > > > > I think the v3 and v4 got mixed up in the email, v4 is doing > > "nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);". > > > > Sorry for the late response, as I was out of the office last week. > > I did not see the v4 until you reminded me in this email ;) > Can you CC me on the v5, so I know a new version is present? > > >>>>> + do_clone(actions, actions_len, ctx); > >>>> > >>>> Here we will do the entire translation again, can we not just take the > >>>> newly added > >>>> datapath action stream and just encapsulate it in > OVS_ACTION_ATTR_CLONE? > >>>> > >>> > >>> I have tried this approach numerous times, but as explained in commit > >>> message in order to > >>> keep the semantics we need to rerun it again, maybe I have missed > >> something > >>> how to do it > >>> without breaking semantics. The main issue I had was that if we don't > >> clone > >>> we should not > >>> "xlate_commit_actions", but that does not seem to be possible to know > >>> without running all > >>> actions in the chain first. If we want to do it in one pass we would > >>> probably need to ensure > >>> that "xlate_commit_actions" can be reverted if needed. > >> > >> Sorry, I was miss-reading the code, I guess I needed more coffee last > >> Friday :) > >> > >> I guess the process is ok, however, might it be possible to make the > code > >> easier to read? There is also some code duplication, which would be > nice > >> to get rid of. > >> > > > > Do you have something specific in mind? If not I have some things in mind > > that could improve that (I hope). > > I have nothing in mind, but I’ll wait for your v5 and see. > I have actually pushed v5 faster. https://patchwork.ozlabs.org/project/openvswitch/list/?series=317208 I'll add you to the cc list so you'll get any further update. Thanks, Ales > > >>>>> + } > >>>>> +} > >>>>> + > >>>>> static void > >>>>> compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc, > >>>>> bool is_last_action) > >>>>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > >>>>> index 2e91ae1a1..7bcb2272b 100644 > >>>>> --- a/tests/ofproto-dpif.at > >>>>> +++ b/tests/ofproto-dpif.at > >>>>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap > >>>> p2-tx.pcap | wc -l`]) > >>>>> > >>>>> OVS_VSWITCHD_STOP > >>>>> AT_CLEANUP > >>>>> + > >>>>> +AT_SETUP([ofproto-dpif - nested patch ports clone]) > >>>>> + > >>>>> +OVS_VSWITCHD_START( > >>>>> + [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy > >>>> ofport_request=1 -- \ > >>>>> + add-port br0 br0-p1 -- set Interface br0-p1 type=patch > >>>> options:peer=br1-p0 -- \ > >>>>> + add-br br1 -- \ > >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \ > >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1111 > >>>> fail-mode=secure -- \ > >>>>> + add-port br1 br1-p0 -- set Interface br1-p0 type=patch > >>>> options:peer=br0-p1 -- \ > >>>>> + add-port br1 br1-p1 -- set Interface br1-p1 type=patch > >>>> options:peer=br2-p0 -- \ > >>>>> + add-port br1 br1-p2 -- set Interface br1-p2 type=dummy > >>>> ofport_request=2 --\ > >>>>> + add-br br2 -- \ > >>>>> + set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \ > >>>>> + set bridge br2 datapath-type=dummy other-config:datapath-id=2222 > >>>> fail-mode=secure -- \ > >>>>> + add-port br2 br2-p0 -- set Interface br2-p0 type=patch > >>>> options:peer=br1-p1 -- \ > >>>>> + add-port br2 br2-p1 -- set Interface br2-p1 type=dummy > >>>> ofport_request=3]) > >>>>> + > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats > >>>> bands=type=drop rate=2']) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats > >>>> bands=type=drop rate=2']) > >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >>>> in_port=local,ip,actions=br0-p1,br0-p0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2 > >>>> in_port=br2-p0,ip,actions=meter:1,br2-p1]) > >>>>> + > >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep > >>>> "Datapath actions:"], [0], [dnl > >>>>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100 > >>>>> +]) > >>>>> + > >>>>> +OVS_VSWITCHD_STOP > >>>>> +AT_CLEANUP > >>>>> + > >>>>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)]) > >>>>> + > >>>>> +OVS_VSWITCHD_START( > >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 > -- > >> \ > >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >>>> ofport_request=2 -- \ > >>>>> + add-br br1 -- \ > >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >>>> fail-mode=secure -- \ > >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 > -- \ > >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>>>> + > >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >>>> in_port=local,ip,actions=p1,p0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "in_port=1,ip,actions=resubmit(,3)"]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"]) > >>>>> + > >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ > >>>>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath > >>>> actions:"], [0], [dnl > >>>>> +Datapath actions: clone(ct,recirc(X)),1 > >>>>> +Datapath actions: 3 > >>>>> +]) > >>>>> + > >>>>> +OVS_VSWITCHD_STOP > >>>>> +AT_CLEANUP > >>>>> + > >>>>> + > >>>>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)]) > >>>>> + > >>>>> +OVS_VSWITCHD_START( > >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 > -- > >> \ > >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >>>> ofport_request=2 -- \ > >>>>> + add-br br1 -- \ > >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >>>> fail-mode=secure -- \ > >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 > -- \ > >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>>>> + > >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >>>> in_port=local,ip,actions=p1,p0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "in_port=1,ip,actions=resubmit(,3)"]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"]) > >>>>> + > >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep > >>>> "Datapath actions:"], [0], [dnl > >>>>> +Datapath actions: clone(trunc(20),3),1 > >>>>> +]) > >>>>> + > >>>>> +OVS_VSWITCHD_STOP > >>>>> +AT_CLEANUP > >>>>> + > >>>>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)]) > >>>>> + > >>>>> +OVS_VSWITCHD_START( > >>>>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 > -- > >> \ > >>>>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >>>> ofport_request=2 -- \ > >>>>> + add-br br1 -- \ > >>>>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>>>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >>>> fail-mode=secure -- \ > >>>>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 > -- \ > >>>>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>>>> + > >>>>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >>>> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"]) > >>>>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"]) > >>>> > >>>> Forgot to remove, or uncomment? > >>>> > >>> > >>> To uncomment :) Done in v4, thanks. > >>> > >>> > >>>> > >>>>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >>>> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"]) > >>>>> + > >>>>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ > >>>>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath > >>>> actions:"], [0], [dnl > >>>>> +Datapath actions: > >>>> > >> > push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1 > >>>>> +]) > >>>>> + > >>>>> +OVS_VSWITCHD_STOP > >>>>> +AT_CLEANUP > >>>>> -- > >>>>> 2.37.1 > >>>>> > >>>>> _______________________________________________ > >>>>> dev mailing list > >>>>> [email protected] > >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>> > >>>> > >>> Thanks, > >>> Ales > >>> -- > >>> > >>> Ales Musil > >>> > >>> Senior Software Engineer - OVN Core > >>> > >>> Red Hat EMEA <https://www.redhat.com> > >>> > >>> [email protected] IM: amusil > >>> <https://red.ht/sig> > >> > >> > > Thanks, > > Ales > > > > -- > > > > Ales Musil > > > > Senior Software Engineer - OVN Core > > > > Red Hat EMEA <https://www.redhat.com> > > > > [email protected] IM: amusil > > <https://red.ht/sig> > > -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
