On 5 Sep 2022, at 12:45, Ales Musil wrote:
> On Fri, Sep 2, 2022 at 4:19 PM Eelco Chaudron <[email protected]> wrote: > >> >> >> On 9 Aug 2022, at 9:17, Ales Musil wrote: >> >>> When the packet was traveling through patch port boundary >>> OvS would check if any of the actions is reversible, >>> if not it would clone the packet. However, the check >>> was only at the first level of the second bridge. >>> That caused some issues when the packet had gone >>> through more actions, some of them might have been >>> irreversible. >>> >>> In order to keep the semantics the same we might >>> need to run the actions twice in the worst case >>> scenario. During the clone there are 4 scenarios >>> that can happen. >>> >>> 1) The action is last one for that flow, >>> in that case we just continue without clone. >>> >>> 2) There is irreversible action in the action >>> set (first level). In this case we know that >>> there is at leas one irreversible action which >>> is enough to do clone. >>> >>> 3) All actions in first level are reversible, >>> we can try to run all actions as if we don't >>> need any clone and inspect the ofpbuf at the >>> end. In positive case there are no irreversible >>> actions so we will just submit the buffer and continue. >>> >>> 4) This is same as 3) with the difference that >>> there is irreversible action in the ofpbuf. >>> To keep the semantics we need to re-run the actions >>> and treat it as clone. This requires resotration >>> of the xlate_ctx. >>> >>> Add test cases for all irreversible actions >>> to see if they are properly cloned. >>> >>> Signed-off-by: Ales Musil <[email protected]> <SNIP> >>> + /* In any other case we cannot be certain if there is any >> irreversible >>> + * action down the action stack. Let's run the actions as if they >> are >>> + * reversible, if we find out that there was any irreversible >> action we >>> + * can restore the xlate_ctx and run the do_clone instead. */ >>> + struct ofpbuf old_stack = ctx->stack; >>> + union mf_subvalue new_stack[1024 / sizeof (union mf_subvalue)]; >>> + >>> + ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); >>> + ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size); >>> + >>> + struct ofpbuf old_action_set = ctx->action_set; >>> + uint64_t actset_stub[1024 / 8]; >>> + >>> + ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof actset_stub); >>> + ofpbuf_put(&ctx->action_set, old_action_set.data, >> old_action_set.size); >>> + >>> + struct flow old_flow = ctx->xin->flow; >>> + bool old_was_mpls = ctx->was_mpls; >>> + bool old_conntracked = ctx->conntracked; >>> + struct flow old_base = ctx->base_flow; >> >> This is the part that scares me. Are these the only value we need to cache >> and restore? Are we sure none of the other CTX fields arale potentially >> adjusted? >> What about fields like sflow_n_outputs, pending_encap, resubmits, etc. etc. >> > > I did some testing and it seems that what is stored is enough, however that > does not mean that this is 100% correct. > IMO it comes down to the else below. This is the same procedure that was > done by the clone before, however the else > part of "if (!odp_contains_irreversible_action(odp_actions.data, > odp_actions.size))" does need to do the heavy lifting > of restoring everything back so we can clone again. Ok, so assuming it was working fine before with the same local ctx changes, we can ignore it for now ;) But it might make sense to copy the same code comments as in the existing function to make clear why we restore them. >>> + size_t old_recirc_size = ctx->xin->recirc_queue >>> + ? ovs_list_size(ctx->xin->recirc_queue) >>> + : 0; >>> + >>> + uint64_t odp_actions_stub[1024 / 8]; >> >> All these mf_subvalue/uint64_t arrays add quite some stack space, isn't >> there a smarter way we can deal with this? >> Maybe we can store the size of the ofpbuf's and reset them to that size? >> > > Seems like a good idea, adjusted, thanks. > > >> >>> + struct ofpbuf odp_actions = >> OFPBUF_STUB_INITIALIZER(odp_actions_stub); >>> + struct ofpbuf *old_odp_actions = ctx->odp_actions; >>> + >>> + ctx->odp_actions = &odp_actions; >>> + >>> + do_xlate_actions(actions, actions_len, ctx, false, false); >> >> For consistency should we use "...ctx, do_xlate_actions, false);"? >> Or maybe change the above also to "false, false", probably the better >> option. >> > > Not sure I follow, can you please elaborate a bit? > Did you mean "...ctx, is_last_action, false);"? > If so, at this point "is_last_action" is always false, but I agree that in > that if above it can be changed to " ...ctx, true, false);" Yes, this is the diff, way less complex than me trying to explain what I meant ;) diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 5a196ec5e..b0c328bf0 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -5911,7 +5911,7 @@ clone_xlate_actions(const struct ofpact *actions, size_t actions_len, { /* We will never clone on last action, just proceed. */ if (is_last_action) { - do_xlate_actions(actions, actions_len, ctx, is_last_action, false); + do_xlate_actions(actions, actions_len, ctx, true, false); if (!ctx->freezing) { xlate_action_set(ctx); } >> >>> + >>> + /* Restore ctx parts that are common even if don't need to clone, */ >> >> , instead of . at the end of the sentence. >> > > Done in v4. > > >> >>> + ctx->odp_actions = old_odp_actions; >>> + ofpbuf_uninit(&ctx->action_set); >>> + ctx->action_set = old_action_set; >>> + ofpbuf_uninit(&ctx->stack); >>> + ctx->stack = old_stack; >>> + ctx->xin->flow = old_flow; >>> + >>> + if (!odp_contains_irreversible_action(odp_actions.data, >> odp_actions.size)) { >> >> The line is too long, see robot email. >> > > Done in v4. > > >> >>> + nl_msg_put(ctx->odp_actions, odp_actions.data, >> odp_actions.size); >>> + if (!ctx->freezing) { >>> + xlate_action_set(ctx); >>> + } >>> + if (ctx->freezing) { >>> + finish_freezing(ctx); >>> + } >>> + } else { >>> + ctx->was_mpls = old_was_mpls; >>> + ctx->conntracked = old_conntracked; >>> + ctx->base_flow = old_base; >>> + /* Clear recirculations that were added by non-clone path so >> that >>> + * packets are not recirculated twice. */ >>> + size_t recirc_size = ctx->xin->recirc_queue >>> + ? ovs_list_size(ctx->xin->recirc_queue) >>> + : 0; >>> + for (; old_recirc_size < recirc_size; old_recirc_size++) { >>> + struct oftrace_recirc_node *node = >>> + CONTAINER_OF(ovs_list_pop_back(ctx->xin->recirc_queue), >>> + struct oftrace_recirc_node, node); >>> + oftrace_recirc_node_destroy(node); >>> + } I’am also a bit confused here (maybe I need more coffee), but you reuse the actions buffer you just filled with actions >>> + do_clone(actions, actions_len, ctx); >> >> Here we will do the entire translation again, can we not just take the >> newly added >> datapath action stream and just encapsulate it in OVS_ACTION_ATTR_CLONE? >> > > I have tried this approach numerous times, but as explained in commit > message in order to > keep the semantics we need to rerun it again, maybe I have missed something > how to do it > without breaking semantics. The main issue I had was that if we don't clone > we should not > "xlate_commit_actions", but that does not seem to be possible to know > without running all > actions in the chain first. If we want to do it in one pass we would > probably need to ensure > that "xlate_commit_actions" can be reverted if needed. Sorry, I was miss-reading the code, I guess I needed more coffee last Friday :) I guess the process is ok, however, might it be possible to make the code easier to read? There is also some code duplication, which would be nice to get rid of. >>> + } >>> +} >>> + >>> static void >>> compose_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc, >>> bool is_last_action) >>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at >>> index 2e91ae1a1..7bcb2272b 100644 >>> --- a/tests/ofproto-dpif.at >>> +++ b/tests/ofproto-dpif.at >>> @@ -11817,3 +11817,108 @@ AT_CHECK([test 1 = `ovs-ofctl parse-pcap >> p2-tx.pcap | wc -l`]) >>> >>> OVS_VSWITCHD_STOP >>> AT_CLEANUP >>> + >>> +AT_SETUP([ofproto-dpif - nested patch ports clone]) >>> + >>> +OVS_VSWITCHD_START( >>> + [add-port br0 br0-p0 -- set Interface br0-p0 type=dummy >> ofport_request=1 -- \ >>> + add-port br0 br0-p1 -- set Interface br0-p1 type=patch >> options:peer=br1-p0 -- \ >>> + add-br br1 -- \ >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:01 -- \ >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1111 >> fail-mode=secure -- \ >>> + add-port br1 br1-p0 -- set Interface br1-p0 type=patch >> options:peer=br0-p1 -- \ >>> + add-port br1 br1-p1 -- set Interface br1-p1 type=patch >> options:peer=br2-p0 -- \ >>> + add-port br1 br1-p2 -- set Interface br1-p2 type=dummy >> ofport_request=2 --\ >>> + add-br br2 -- \ >>> + set bridge br2 other-config:hwaddr=aa:66:aa:66:00:02 -- \ >>> + set bridge br2 datapath-type=dummy other-config:datapath-id=2222 >> fail-mode=secure -- \ >>> + add-port br2 br2-p0 -- set Interface br2-p0 type=patch >> options:peer=br1-p1 -- \ >>> + add-port br2 br2-p1 -- set Interface br2-p1 type=dummy >> ofport_request=3]) >>> + >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats >> bands=type=drop rate=2']) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br2 'meter=1 pktps stats >> bands=type=drop rate=2']) >>> +AT_CHECK([ovs-ofctl del-flows br0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >> in_port=local,ip,actions=br0-p1,br0-p0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> in_port=br1-p0,ip,actions=meter:1,br1-p1,br1-p2]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br2 >> in_port=br2-p0,ip,actions=meter:1,br2-p1]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep >> "Datapath actions:"], [0], [dnl >>> +Datapath actions: clone(meter(0),clone(meter(1),102),101),100 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP >>> +AT_CLEANUP >>> + >>> +AT_SETUP([ofproto-dpif - patch ports - ct (clone)]) >>> + >>> +OVS_VSWITCHD_START( >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- \ >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >> ofport_request=2 -- \ >>> + add-br br1 -- \ >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >> fail-mode=secure -- \ >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>> + >>> +AT_CHECK([ovs-ofctl del-flows br0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >> in_port=local,ip,actions=p1,p0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "in_port=1,ip,actions=resubmit(,3)"]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "table=3,in_port=1,ip,ct_state=-trk,actions=ct(table=0)"]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "table=3,in_port=1,ip,ct_state=+trk,actions=p3"]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ >>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath >> actions:"], [0], [dnl >>> +Datapath actions: clone(ct,recirc(X)),1 >>> +Datapath actions: 3 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP >>> +AT_CLEANUP >>> + >>> + >>> +AT_SETUP([ofproto-dpif - patch ports - trunc (clone)]) >>> + >>> +OVS_VSWITCHD_START( >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- \ >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >> ofport_request=2 -- \ >>> + add-br br1 -- \ >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >> fail-mode=secure -- \ >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>> + >>> +AT_CHECK([ovs-ofctl del-flows br0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >> in_port=local,ip,actions=p1,p0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "in_port=1,ip,actions=resubmit(,3)"]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "table=3,in_port=1,ip,actions=output(port=p3, max_len=20)"]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | grep >> "Datapath actions:"], [0], [dnl >>> +Datapath actions: clone(trunc(20),3),1 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP >>> +AT_CLEANUP >>> + >>> +AT_SETUP([ofproto-dpif - patch ports - encap (clone)]) >>> + >>> +OVS_VSWITCHD_START( >>> + [add-port br0 p0 -- set Interface p0 type=dummy ofport_request=1 -- \ >>> + add-port br0 p1 -- set Interface p1 type=patch options:peer=p2 >> ofport_request=2 -- \ >>> + add-br br1 -- \ >>> + set bridge br1 other-config:hwaddr=aa:66:aa:66:00:00 -- \ >>> + set bridge br1 datapath-type=dummy other-config:datapath-id=1234 >> fail-mode=secure -- \ >>> + add-port br1 p2 -- set Interface p2 type=patch options:peer=p1 -- \ >>> + add-port br1 p3 -- set Interface p3 type=dummy ofport_request=3]) >>> + >>> +AT_CHECK([ovs-ofctl del-flows br0]) >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 >> "in_port=local,ip,actions=encap(nsh()),encap(ethernet),p1,p0"]) >>> +#AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "in_port=1,dl_type=0x894f,actions=resubmit(,3)"]) >> >> Forgot to remove, or uncomment? >> > > To uncomment :) Done in v4, thanks. > > >> >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 >> "in_port=1,dl_type=0x894f,actions=decap(),decap(),p3"]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port(local),ip" | \ >>> + sed 's/recirc(0x[[0-9]]\+)/recirc(X)/' | grep "Datapath >> actions:"], [0], [dnl >>> +Datapath actions: >> push_nsh(flags=0,ttl=63,mdtype=1,np=3,spi=0x0,si=255,c1=0x0,c2=0x0,c3=0x0,c4=0x0),push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),clone(pop_eth,pop_nsh(),recirc(X)),1 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP >>> +AT_CLEANUP >>> -- >>> 2.37.1 >>> >>> _______________________________________________ >>> dev mailing list >>> [email protected] >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> > Thanks, > Ales > -- > > Ales Musil > > Senior Software Engineer - OVN Core > > Red Hat EMEA <https://www.redhat.com> > > [email protected] IM: amusil > <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
