On Mon, Sep 5, 2022 at 4:31 PM Eelco Chaudron <[email protected]> wrote:
> > > On 5 Sep 2022, at 12:45, Ales Musil wrote: > > > On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]> > wrote: > > > >> > >> > >> On 9 Aug 2022, at 9:17, Ales Musil wrote: > >> > >>> When the packet was traveling through patch port boundary > >>> OvS would check if any of the actions is reversible, > >>> if not it would clone the packet. However, the check > >>> was only at the first level of the second bridge. > >>> That caused some issues when the packet had gone > >>> through more actions, some of them might have been > >>> irreversible. > >>> > >>> In order to keep the semantics the same we might > >>> need to run the actions twice in the worst case > >>> scenario. During the clone there are 4 scenarios > >>> that can happen. > >>> > >>> 1) The action is last one for that flow, > >>> in that case we just continue without clone. > >>> > >>> 2) There is irreversible action in the action > >>> set (first level). In this case we know that > >>> there is at leas one irreversible action which > >>> is enough to do clone. > >>> > >>> 3) All actions in first level are reversible, > >>> we can try to run all actions as if we don't > >>> need any clone and inspect the ofpbuf at the > >>> end. In positive case there are no irreversible > >>> actions so we will just submit the buffer and continue. > >>> > >>> 4) This is same as 3) with the difference that > >>> there is irreversible action in the ofpbuf. > >>> To keep the semantics we need to re-run the actions > >>> and treat it as clone. This requires resotration > >>> of the xlate_ctx. > >>> > >>> Add test cases for all irreversible actions > >>> to see if they are properly cloned. > >>> > >>> Signed-off-by: Ales Musil <[email protected]> > > <SNIP> > > >>> + /* In any other case we cannot be certain if there is any > >> irreversible > >>> + * action down the action stack. Let's run the actions as if they > >> are > >>> + * reversible, if we find out that there was any irreversible > >> action we > >>> + * can restore the xlate_ctx and run the do_clone instead. */ > >>> + struct ofpbuf old_stack = ctx->stack; > >>> + union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)]; > >>> + > >>> + ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); > >>> + ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size); > >>> + > >>> + struct ofpbuf old_action_set = ctx->action_set; > >>> + uint64_t actset_stub[1024 / 8]; > >>> + > >>> + ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof > actset_stub); > >>> + ofpbuf_put(&ctx->action_set, old_action_set.data, > >> old_action_set.size); > >>> + > >>> + struct flow old_flow = ctx->xin->flow; > >>> + bool old_was_mpls = ctx->was_mpls; > >>> + bool old_conntracked = ctx->conntracked; > >>> + struct flow old_base = ctx->base_flow; > >> > >> This is the part that scares me. Are these the only value we need to > cache > >> and restore? Are we sure none of the other CTX fields arale potentially > >> adjusted? > >> What about fields like sflow_n_outputs, pending_encap, resubmits, etc. > etc. > >> > > > > I did some testing and it seems that what is stored is enough, however > that > > does not mean that this is 100% correct. > > IMO it comes down to the else below. This is the same procedure that was > > done by the clone before, however the else > > part of "if (!odp_contains_irreversible_action(odp_actions.data, > > odp_actions.size))" does need to do the heavy lifting > > of restoring everything back so we can clone again. > > Ok, so assuming it was working fine before with the same local ctx > changes, we can ignore it for now ;) > But it might make sense to copy the same code comments as in the existing > function to make clear why we restore them. > > >>> + size_t old_recirc_size = ctx->xin->recirc_queue > >>> + ? ovs_list_size(ctx->xin->recirc_queue) > >>> + : 0; > >>> + > >>> + uint64_t odp_actions_stub[1024 / 8]; > >> > >> All these mf_subvalue/uint64_t arrays add quite some stack space, isn't > >> there a smarter way we can deal with this? > >> Maybe we can store the size of the ofpbuf's and reset them to that size? > >> > > > > Seems like a good idea, adjusted, thanks. > > > > > >> > >>> + struct ofpbuf odp_actions = > >> OFPBUF_STUB_INITIALIZER(odp_actions_stub); > >>> + struct ofpbuf *old_odp_actions = ctx->odp_actions; > >>> + > >>> + ctx->odp_actions = &odp_actions; > >>> + > >>> + do_xlate_actions(actions, actions_len, ctx, false, false); > >> > >> For consistency should we use "...ctx, do_xlate_actions, false);"? > >> Or maybe change the above also to "false, false", probably the better > >> option. > >> > > > > Not sure I follow, can you please elaborate a bit? > > Did you mean "...ctx, is_last_action, false);"? > > If so, at this point "is_last_action" is always false, but I agree that > in > > that if above it can be changed to " ...ctx, true, false);" > > Yes, this is the diff, way less complex than me trying to explain what I > meant ;) > > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > index 5a196ec5e..b0c328bf0 100644 > --- a/ofproto/ofproto-dpif-xlate.c > +++ b/ofproto/ofproto-dpif-xlate.c > @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions, > size_t actions_len, > { > /* We will never clone on last action, just proceed. */ > if (is_last_action) { > - do_xlate_actions(actions, actions_len, ctx, is_last_action, > false); > + do_xlate_actions(actions, actions_len, ctx, true, false); > if (!ctx->freezing) { > xlate_action_set(ctx); > } > > >> > >>> + > >>> + /* Restore ctx parts that are common even if don't need to clone, > */ > >> > >> , instead of . at the end of the sentence. > >> > > > > Done in v4. > > > > > >> > >>> + ctx->odp_actions = old_odp_actions; > >>> + ofpbuf_uninit(&ctx->action_set); > >>> + ctx->action_set = old_action_set; > >>> + ofpbuf_uninit(&ctx->stack); > >>> + ctx->stack = old_stack; > >>> + ctx->xin->flow = old_flow; > >>> + > >>> + if (!odp_contains_irreversible_action(odp_actions.data, > >> odp_actions.size)) { > >> > >> The line is too long, see robot email. > >> > > > > Done in v4. > > > > > >> > >>> + nl_msg_put(ctx->odp_actions, odp_actions.data, > >> odp_actions.size); > >>> + if (!ctx->freezing) { > >>> + xlate_action_set(ctx); > >>> + } > >>> + if (ctx->freezing) { > >>> + finish_freezing(ctx); > >>> + } > >>> + } else { > >>> + ctx->was_mpls = old_was_mpls; > >>> + ctx->conntracked = old_conntracked; > >>> + ctx->base_flow = old_base; > >>> + /* Clear recirculations that were added by non-clone path so > >> that > >>> + * packets are not recirculated twice. */ > >>> + size_t recirc_size = ctx->xin->recirc_queue > >>> + ? ovs_list_size(ctx->xin->recirc_queue) > >>> + : 0; > >>> + for (; old_recirc_size < recirc_size; old_recirc_size++) { > >>> + struct oftrace_recirc_node *node = > >>> + > CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue), > >>> + struct oftrace_recirc_node, node); > >>> + oftrace_recirc_node_destroy(node); > >>> + } > > I’am also a bit confused here (maybe I need more coffee), but you reuse > the actions buffer you just filled with actions > I think the v3 and v4 got mixed up in the email, v4 is doing "nl_msg_reset_size(ctx->odp_actions, old_odp_actions_size);". > > >>> + do_clone(actions, actions_len, ctx); > >> > >> Here we will do the entire translation again, can we not just take the > >> newly added > >> datapath action stream and just encapsulate it in OVS_ACTION_ATTR_CLONE? > >> > > > > I have tried this approach numerous times, but as explained in commit > > message in order to > > keep the semantics we need to rerun it again, maybe I have missed > something > > how to do it > > without breaking semantics. The main issue I had was that if we don't > clone > > we should not > > "xlate_commit_actions", but that does not seem to be possible to know > > without running all > > actions in the chain first. If we want to do it in one pass we would > > probably need to ensure > > that "xlate_commit_actions" can be reverted if needed. > > Sorry, I was miss-reading the code, I guess I needed more coffee last > Friday :) > > I guess the process is ok, however, might it be possible to make the code > easier to read? There is also some code duplication, which would be nice > to get rid of. > Do you have something specific in mind? If not I have some things in mind that could improve that (I hope). > > >>> + } > >>> +} > >>> + > >>> static void > >>> compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc, > >>> bool is_last_action) > >>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at > >>> index 2e91ae1a1..7bcb2272b 100644 > >>> --- a/tests/ofproto-dpif.at > >>> +++ b/tests/ofproto-dpif.at > >>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap > >> p2-tx.pcap | wc -l`]) > >>> > >>> OVS_VSWITCHD_STOP > >>> AT_CLEANUP > >>> + > >>> +AT_SETUP([ofproto-dpif - nested patch ports clone]) > >>> + > >>> +OVS_VSWITCHD_START( > >>> + [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy > >> ofport_request=1 -- \ > >>> + add-port br0 br0-p1 -- set Interface br0-p1 type=patch > >> options:peer=br1-p0 -- \ > >>> + add-br br1 -- \ > >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \ > >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1111 > >> fail-mode=secure -- \ > >>> + add-port br1 br1-p0 -- set Interface br1-p0 type=patch > >> options:peer=br0-p1 -- \ > >>> + add-port br1 br1-p1 -- set Interface br1-p1 type=patch > >> options:peer=br2-p0 -- \ > >>> + add-port br1 br1-p2 -- set Interface br1-p2 type=dummy > >> ofport_request=2 --\ > >>> + add-br br2 -- \ > >>> + set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \ > >>> + set bridge br2 datapath-type=dummy other-config:datapath-id=2222 > >> fail-mode=secure -- \ > >>> + add-port br2 br2-p0 -- set Interface br2-p0 type=patch > >> options:peer=br1-p1 -- \ > >>> + add-port br2 br2-p1 -- set Interface br2-p1 type=dummy > >> ofport_request=3]) > >>> + > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats > >> bands=type=drop rate=2']) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats > >> bands=type=drop rate=2']) > >>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >> in_port=local,ip,actions=br0-p1,br0-p0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2 > >> in_port=br2-p0,ip,actions=meter:1,br2-p1]) > >>> + > >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep > >> "Datapath actions:"], [0], [dnl > >>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100 > >>> +]) > >>> + > >>> +OVS_VSWITCHD_STOP > >>> +AT_CLEANUP > >>> + > >>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)]) > >>> + > >>> +OVS_VSWITCHD_START( > >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- > \ > >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >> ofport_request=2 -- \ > >>> + add-br br1 -- \ > >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >> fail-mode=secure -- \ > >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ > >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>> + > >>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >> in_port=local,ip,actions=p1,p0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "in_port=1,ip,actions=resubmit(,3)"]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"]) > >>> + > >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ > >>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath > >> actions:"], [0], [dnl > >>> +Datapath actions: clone(ct,recirc(X)),1 > >>> +Datapath actions: 3 > >>> +]) > >>> + > >>> +OVS_VSWITCHD_STOP > >>> +AT_CLEANUP > >>> + > >>> + > >>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)]) > >>> + > >>> +OVS_VSWITCHD_START( > >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- > \ > >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >> ofport_request=2 -- \ > >>> + add-br br1 -- \ > >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >> fail-mode=secure -- \ > >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ > >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>> + > >>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >> in_port=local,ip,actions=p1,p0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "in_port=1,ip,actions=resubmit(,3)"]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"]) > >>> + > >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep > >> "Datapath actions:"], [0], [dnl > >>> +Datapath actions: clone(trunc(20),3),1 > >>> +]) > >>> + > >>> +OVS_VSWITCHD_STOP > >>> +AT_CLEANUP > >>> + > >>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)]) > >>> + > >>> +OVS_VSWITCHD_START( > >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- > \ > >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 > >> ofport_request=2 -- \ > >>> + add-br br1 -- \ > >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ > >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 > >> fail-mode=secure -- \ > >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ > >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) > >>> + > >>> +AT_CHECK([ovs-ofctl del-flows br0]) > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 > >> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"]) > >>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"]) > >> > >> Forgot to remove, or uncomment? > >> > > > > To uncomment :) Done in v4, thanks. > > > > > >> > >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 > >> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"]) > >>> + > >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ > >>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath > >> actions:"], [0], [dnl > >>> +Datapath actions: > >> > push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1 > >>> +]) > >>> + > >>> +OVS_VSWITCHD_STOP > >>> +AT_CLEANUP > >>> -- > >>> 2.37.1 > >>> > >>> _______________________________________________ > >>> dev mailing list > >>> [email protected] > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > >> > > Thanks, > > Ales > > -- > > > > Ales Musil > > > > Senior Software Engineer - OVN Core > > > > Red Hat EMEA <https://www.redhat.com> > > > > [email protected] IM: amusil > > <https://red.ht/sig> > > Thanks, Ales -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
