On Thu, Jan 5, 2017 at 4:28 PM, Ben Pfaff <b...@ovn.org> wrote:

> On Tue, Jan 03, 2017 at 02:55:19AM -0800, Mickey Spiegel wrote:
> > One of the motivations for clone is to use it as a lightweight way to
> > resubmit to an earlier table at the beginning of the pipeline, without
> > incurring all of the overhead associated with openflow patch ports.
> > One such usage is in OVN, where a recent patch set replaced the
> > use of openflow patch ports with clone, for OVN patch ports within
> > the same bridge (br-int).
> >
> > Over the holidays, some issues arose related to this usage of clone
> > (see the thread ovn ping from VM to external gateway IP failed).
>
> Thanks for bringing this up.  I had overlooked these questions and this
> issue.
>

Thanks for addressing all of this. The patch set should clear up any
existing and potential future problems with the use of clone as a
lightweight way to resubmit to the pipeline.

>
> > While investigating these issues, I noticed some significant
> > differences between the way flow and other context information
> > is saved and restored when using openflow patch ports, versus
> > the way it is done with clone. At least one such difference, with
> > regard to conntrack, is causing failures.
>
> What's the difference that is causing failures?  I do not know about it,
> so it is a high priority to fix it.  Is it already fixed, or do we need
> to do something about it?
>

There are two issues causing failures.

One is http://patchwork.ozlabs.org/patch/709981/, which needs to be
reviewed. This is not an issue with the usage of clone, it is an
underlying issue that was revealed by the removal of ofports for
OVN patch ports.

We need to call your new ct_clear action inside the clone in
ovn/controller/physical.c. Numan tried a diff
http://patchwork.ozlabs.org/patch/710368/ that cleared flow
conntrack state as part of the clone action, and that fixed the
problem that he was seeing. Since Numan is able to reproduce the
issue, he should probably be the one to submit this OVN fix.

>
> > I would classify the differences (those for which there is no current
> > workaround by specifying nested actions within the clone) into two
> > categories:
> > 1. State that is kept outside of flow, which is currently saved,
> >    cleared, and restored when using openflow patch ports.
> >    This includes:
> >    ctx->wc->masks.tunnel
>
> I think that clone should not save and restore this.  The situation is
> different from a patch port because clone does not change the ingress
> port.  The patch port code has the following comment:
>
>         /* Since this packet came in on a patch port (from the perspective
> of
>          * the peer bridge), it cannot have useful tunnel information. As a
>          * result, any wildcards generated on that tunnel also cannot be
> valid.
>          * The tunnel wildcards must be restored to their original version
> since
>          * the peer bridge uses a separate tunnel metadata table and
> therefore
>          * any generated wildcards will be garbage in the context of our
>          * metadata table. */
>         ctx->wc->masks.tunnel = old_flow_tnl_wc;
>

I am still a little unclear on the implications of clearing or restoring
this.
If this is like flow->tunnel.metadata.tab and is only derived from the
bridge, then there would be no need to save and restore.

>
> >    ctx->conntracked
>
> I guess that we should save and restore this since we're saving and
> restoring the conntrack metadata.  I've written up a patch.
>

Agreed.

>
> >    ctx->was_mpls
>
> I do not think that that this is a correctness issue, but it's a nice
> optimization to save and restore this.  I've written up a patch.
>
> >    ctx->xin->tables_version (not an issue if bridge does not change)
>
> clone doesn't change the bridge, so this shouldn't matter.
>

Agreed

>
> >    ctx->stack
> >    ctx->action_set
>
> I think it's cleanest if a clone starts off with both of these empty and
> saves and restores them.  I've written up a patch.
>
> >    ctx->xbridge (not an issue if bridge does not change)
> >    ctx->mirrors (not an issue if bridge does not change)
>
> clone doesn't change the bridge, so these shouldn't matter.
>

Agreed.

>
> > 2. State that resides inside the flow, but for which there is no
> >    workaround to clear or reset the fields. This includes:
> >    flow->tunnel ?
>
> Like the tunnel mask, this doesn't matter for clone, because clone
> doesn't change the ingress port.
>

I am still a little unclear on the implications of clearing or restoring
this.
The ingress port is cleared in ovn/controller/physical.c.
Does that matter?

>
> >    flow->tunnel.metadata.tab (not an issue if bridge does not change)
>
> clone doesn't change the bridge, so this shouldn't matter.
>

Agreed.

>
> >    flow->ct_state (read only)
> >    flow->ct_mark (restricted so can only be set in ct action)
> >    flow->ct_label (restricted so can only be set in ct action)
>
> It's not obvious to me that clone should always clear these.  However,
> if it saves and restores them, then we can support clearing them by
> adding a ct_clear action.  I've written up a patch.
>
> I sent out a series that addresses these issues.  Probably, we also need
> to modify OVN to insert a "ct_clear" action into its use of "clone", but
> my series doesn't do that.
>

We need to do that to fix problems that Numan is seeing. I suggested
above that Numan should write up that patch and check that it
actually fixes the problems.

>
> The series starts here:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327204.html
>

I am working my way through the series now.

Mickey
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to