Thanks - I was worried about unforeseen side effects, especially given
my unfamiliarity with this part of the tree and with OVS.

After looking at the tree, I see uses of dev_net(dev) for this kind of
test in the IP tunneling code, which also handles the CONFIG_NET_NS=n case.
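
Concretely, this is the kind of change I'm now leaning towards (an
untested sketch, mirroring the net_eq()/dev_net() idiom from the IP
tunnel code; the check has to sit before the skb->dev reassignment so
we can still see the source device's netns):

--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -503,6 +503,11 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
        }

+       /* Untested sketch: clear the timestamp only when crossing netns,
+        * matching what skb_scrub_packet(skb, xnet) does.  net_eq() is
+        * always true when CONFIG_NET_NS=n, so that case falls out for
+        * free. */
+       if (skb->dev && !net_eq(dev_net(skb->dev), dev_net(vport->dev)))
+               skb->tstamp = 0;
        skb->dev = vport->dev;
        vport->ops->send(skb);
        return;

If skb->dev can never be NULL at this point, the guard could be dropped;
I still need to confirm that.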

Thanks for your time,
Tyler

On Tue, Feb 16, 2021 at 3:03 PM Gregory Rose <[email protected]> wrote:

>
>
> On 2/13/2021 8:01 AM, Tyler Stachecki wrote:
> > I've fixed the issue in such a way that it works for me (TM), but would
> > appreciate confirmation from an OVS expert that I'm not overlooking
> > something here:
> >
> > Based on my last post, we need:
> > --- a/net/openvswitch/vport.c
> > +++ b/net/openvswitch/vport.c
> > @@ -503,6 +503,7 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
> >          }
> >
> >          skb->dev = vport->dev;
> > +       skb->tstamp = 0;
> >          vport->ops->send(skb);
> >          return;
>
> Hmm... I'm not so sure about this.  The skb_scrub_packet() function only
> clears skb->tstamp if the @xnet boolean parameter is true.  In this case
> you are doing it unconditionally, which very well might have unforeseen
> side effects.
>
> Maybe test skb->dev->nd_net and if it isn't NULL then clear the
> tstamp?
>
> What do you think?
>
> - Greg
>
> >
> > As the timestamp must be cleared when forwarding packets to a different
> > namespace; ref:
> > https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/#1871003
> >
> > Cheers,
> > Tyler
> >
> > On Sat, Feb 13, 2021 at 12:04 AM Tyler Stachecki <[email protected]> wrote:
> >
> >> Here's the offender:
> >>
> >> commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
> >> Author: Eric Dumazet <[email protected]>
> >> Date:   Fri Sep 28 10:28:44 2018 -0700
> >>
> >>      tcp/fq: move back to CLOCK_MONOTONIC
> >>
> >> Without this, I wasn't able to make it past the 4.20 series.  I
> >> forward-ported a reversion to 5.4 LTS for fun and things still work
> >> great.  Though it sounds like simply reverting this is not the right
> >> fix -- some interesting discussion on this commit's impact on others:
> >> https://lists.openwall.net/netdev/2019/01/10/36
> >>
> >>> Then, we probably need to clear skb->tstamp in more paths (you are
> >>> mentioning bridge ...)
> >>
> >> I will try to take a peek sometime this weekend to see if I can spot
> >> where in OVS, assuming it is there.
> >>
> >> On Tue, Feb 9, 2021 at 4:22 PM Gregory Rose <[email protected]> wrote:
> >>
> >>>
> >>>
> >>> On 2/8/2021 4:19 PM, Tyler Stachecki wrote:
> >>>> Thanks for the reply.  This is a router, so it is using conntrack;
> >>>> unsure if there is additional connection tracking in OVS.  `ovs-ofctl
> >>>> dump-flows br-util` shows exactly one flow: the default one.
> >>>>
> >>>> Here's my approximate /etc/network/interfaces.  I just attach VMs to
> >>>> this with libvirt and have nothing else added at this point:
> >>>> allow-ovs br-util
> >>>> iface br-util inet manual
> >>>>           ovs_type OVSBridge
> >>>>           ovs_ports enp0s20f1.102 vrf-util
> >>>>
> >>>> allow-br-util enp0s20f1.102
> >>>> auto enp0s20f1.102
> >>>> iface enp0s20f1.102 inet manual
> >>>>           ovs_bridge br-util
> >>>>           ovs_type OVSPort
> >>>>           mtu 9000
> >>>>
> >>>> allow-br-util vrf-util
> >>>> iface vrf-util inet static
> >>>>           ovs_bridge br-util
> >>>>           ovs_type OVSIntPort
> >>>>           address 10.10.2.1/24
> >>>>           mtu 9000
> >>>>
> >>>> I roughly transcribed what I was doing into a Linux bridge, and it
> >>>> works as expected in 5.10... e.g. this in my /etc/network/interfaces:
> >>>> auto enp0s20f1.102
> >>>> iface enp0s20f1.102 inet manual
> >>>>           mtu 9000
> >>>>
> >>>> auto vrf-util
> >>>> iface vrf-util inet static
> >>>>           bridge_ports enp0s20f1.102
> >>>>           bridge-vlan-aware no
> >>>>           address 10.10.2.1/24
> >>>>           mtu 9000
> >>>>
> >>>> I'm having a bit of a tough time following the dataflow code, and
> >>>> the ~1 commit or so I was missing from the kernel staging tree does
> >>>> not seem to have fixed the issue.
> >>>
> >>> Hi Tyler,
> >>>
> >>> This does not sound like the previous issue I mentioned because that
> >>> one was caused by flow programming for dropping packets.
> >>>
> >>> I hate to say it but you're probably going to have to resort to a
> >>> bisect to find this one.
> >>>
> >>> - Greg
> >>>
> >>>>
> >>>> On Mon, Feb 8, 2021 at 6:21 PM Gregory Rose <[email protected]> wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>> On 2/6/2021 9:50 AM, Tyler Stachecki wrote:
> >>>>>> I have simple forwarding issues when running the Debian stable
> >>>>>> backports kernel (5.9) that I don't see with the stable,
> >>>>>> non-backported 4.19 kernel.  Big fat disclaimer: I compiled my OVS
> >>>>>> (2.14.1) from source, but given it works with the 4.19 kernel I
> >>>>>> doubt it has anything to do with it.  For good measure, I also
> >>>>>> compiled 5.10.8 from source and see the same issue I do in 5.9.
> >>>>>>
> >>>>>> The issue I see on 5.x (config snippets below):
> >>>>>> My VM (vnet0 - 10.10.0.16/24) can ARP/ping for other physical
> >>>>>> hosts on its subnet (e.g. 00:07:32:4d:2f:71 = 10.10.0.23/24
> >>>>>> below), but only the first echo request in a sequence is seen by
> >>>>>> the destination host.  I then have to wait about 10 seconds before
> >>>>>> pinging the destination host from the VM again, but again only the
> >>>>>> first echo in a sequence gets a reply.
> >>>>>>
> >>>>>> I've tried tcpdump'ing enp0s20f1.102 (the external interface on
> >>>>>> the hypervisor) and see the pings going out that interface at the
> >>>>>> rate I would expect.  OTOH, when I tcpdump on the destination
> >>>>>> host, I only see the first of the ICMP echo requests in a sequence
> >>>>>> (for which an echo reply is sent).
> >>>>>>
> >>>>>> I then added an OVS internal port on the hypervisor (i.e., on
> >>>>>> br-util) and gave it an IP address (10.10.2.1/24).  It is able to
> >>>>>> ping that same external host just fine.  Likewise, I am able to
> >>>>>> ping between the VM and the OVS internal port just fine.
> >>>>>>
> >>>>>> When I roll back to 4.19, this weirdness about traffic going out
> >>>>>> of enp0s20f1.102 *for the VM* goes away and everything just works.
> >>>>>> Any clues while I start ripping into code?
> >>>>>
> >>>>> Are you using any of the connection tracking capabilities?  I
> >>>>> vaguely recall some issue that sounds a lot like what you're seeing
> >>>>> but do not see anything in the git log to stir my memory.  IIRC
> >>>>> though it was a similar problem.
> >>>>>
> >>>>> Maybe provide a dump of your flows.
> >>>>>
> >>>>> - Greg
> >>>>>
> >>>>
> >>>
> >>
> >
>