On Fri, Sep 30, 2022 at 11:50 AM Eelco Chaudron <[email protected]> wrote:

>
>
> On 29 Sep 2022, at 18:01, Ales Musil wrote:
>
> > On Fri, Sep 23, 2022 at 1:18 PM Eelco Chaudron <[email protected]>
> wrote:
> >
> >>
> >>
> >> On 7 Sep 2022, at 8:54, Ales Musil wrote:
> >>
> >>> When the packet was traveling through patch port boundary
> >>> OvS would check if any of the actions is reversible,
> >>> if not it would clone the packet. However, the check
> >>> was only at the first level of the second bridge.
> >>> That caused some issues when the packet had gone
> >>> through more actions, some of them might have been
> >>> irreversible.
> >>>
> >>> In order to keep the semantics the same we might
> >>> need to run the actions twice in the worst case
> >>> scenario. During the clone there are 4 scenarios
> >>> that can happen.
> >>>
> >>> 1) The action is last one for that flow,
> >>> in that case we just continue without clone.
> >>>
> >>> 2) There is irreversible action in the action
> >>> set (first level). In this case we know that
> >>> there is at leas one irreversible action which
> >>> is enough to do clone.
> >>>
> >>> 3) All actions in first level are reversible,
> >>> we can try to run all actions as if we don't
> >>> need any clone and inspect the ofpbuf at the
> >>> end. In positive case there are no irreversible
> >>> actions so we will just submit the buffer and continue.
> >>>
> >>> 4) This is same as 3) with the difference that
> >>> there is irreversible action in the ofpbuf.
> >>> To keep the semantics we need to re-run the actions
> >>> and treat it as clone. This requires resotration
> >>> of the xlate_ctx.
> >>>
> >>> Add test cases for all irreversible actions
> >>> to see if they are properly cloned.
> >>>
> >>> Signed-off-by: Ales Musil <[email protected]>
> >>
> >> Some more comments on this v5, see inline below.
> >>
> >
> > Thank you for the review. I will address those points in v6.
> >
>
> Thank you!
>
> <SNIP>
>
> >>
> >> This is the same as an invalid action, so I think we should return
> false.
> >> We should probably return false on each unknown option.
> >> How about the following change?
> >>
> >
> > Applied, but I think you have meant to return true, as any unknown action
> > being irreversible by default?
>
> Yes, you are right they should be irreversible by default.
>
> <SNIP>
>
> >>>  static void
> >>> -clone_xlate_actions(const struct ofpact *actions, size_t actions_len,
> >>> -                    struct xlate_ctx *ctx, bool is_last_action,
> >>> -                    bool group_bucket_action OVS_UNUSED)
> >>> +do_clone_xlate_actions(const struct ofpact *actions, size_t
> actions_len,
> >>> +                       struct xlate_ctx *ctx)
> >>>  {
> >>> -    struct ofpbuf old_stack = ctx->stack;
> >>> -    union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
> >>
> >> For the stack and action_set you no longer reserve a clean array.
> >> I think this was needed to stop it from leaking, see commit “b827b231”.
> >>
> >
> > The commit states that it is a design decision, I don't mind having it as
> > before.
> > It just seemed to me that this won't harm, also before that commit it was
> > not "clearing"
> > the stack or action_set  at all, which I do by moving the size back. This
> > avoids
> > taking up stack space.
>
> But with the old implementation, the stack size is 0, so you can not
> retrieve anything from the stack. With your changes, the stack has all
> entries on it as before, so in theory, you could get them from the stack.
> However, I do not know the details on the use case/implementation for this,
> so I have no way how this will affect OpenFlow. I’ve copied on Benn who
> might still know this ;) Or maybe Ilya knows…
>
>
> Let’s wait with sending out a v6 until this is resolved, so we can do one
> final review.
>
> <SNIP>
>
> >>> -dp_clone_done:
> >>> -    /* The clone's conntrack execution should have no effect on the
> >> original
> >>> -     * packet. */
> >>> -    ctx->conntracked = old_conntracked;
> >>> +    /* Let's run the actions as if they are reversible, if we find out
> >> that
> >>> +     * there was any irreversible action we can restore the xlate_ctx
> >> and run
> >>> +     * the do_clone instead. If the action is last we don't have to
> >> check and
> >>> +     * can just proceed with the actions. */
> >>
> >> Any stats to suggest that this is the best approach? I’m wondering if
> this
> >> is the common case, or if the real clone is more common. If it's the
> >> latter, we might want to swap the order.
> >>
> >> Maybe you can do some tests with OVN and add two coverage counters, one
> >> for total invocation of this function and one for real clones, and see
> the
> >> results while running through a real-user OVN use case?
> >>
> >
> > I will try to come up with some OVN tests that might give a better
> > perspective which is more common. For start I'll run the counters over
> our
> > tests in OVN.
>
> Thanks!
>
>
Hi,

I got the counters from OVN tests, it should represent some pieces of
normal workflows.

On average the actual clone is happening in 0.229%. Which suggests that the
order is correct.

The raw data are available at https://pastebin.mozilla.org/VqKTaO2e for 21
days. First number are calls to clone_xlate_actions
second number is how much of those were actual clones.

Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to