On 2/13/2021 8:01 AM, Tyler Stachecki wrote:
I've fixed the issue in such a way that it works for me (TM), but would
appreciate confirmation from an OVS expert that I'm not overlooking
something here:
Based on my last post, we need:
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -503,6 +503,7 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
 	}
 	skb->dev = vport->dev;
+	skb->tstamp = 0;
 	vport->ops->send(skb);
 	return;
Hmm... I'm not so sure about this. The skb_scrub_packet() function only
clears skb->tstamp if the @xnet boolean parameter is true. In this case
you are doing it unconditionally which very well might have unforeseen
side effects.
Maybe test skb->dev->nd_net and if it isn't NULL then clear the
tstamp?
What do you think?
- Greg
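For reference, the conditional variant discussed above can be modelled in
userspace. This is only a sketch: model_net, model_dev, model_skb and
model_vport_send() are hypothetical stand-ins, not kernel types or functions,
and comparing the namespace of the packet's current device with that of the
output device is an assumed way of deriving the equivalent of @xnet (the
thread does not settle on the exact test). Note that the comparison has to
happen before skb->dev is overwritten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel structures; NOT the real definitions. */
struct model_net { int id; };
struct model_dev { const struct model_net *net; };
struct model_skb {
	int64_t tstamp;              /* stands in for skb->tstamp */
	const struct model_dev *dev; /* stands in for skb->dev */
};

/*
 * Hypothetical helper: retarget the packet at the output device and clear
 * the timestamp only when the packet changes network namespace.
 */
static void model_vport_send(struct model_skb *skb, const struct model_dev *out)
{
	/* Decide before skb->dev is replaced; unknown origin counts as crossing. */
	bool xnet = skb->dev == NULL || skb->dev->net != out->net;

	skb->dev = out;
	if (xnet)
		skb->tstamp = 0;
}
```

In this model a packet sent to a device in the same namespace keeps its
timestamp, and one crossing into another namespace has it zeroed.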
As the timestamp must be cleared when forwarding packets to a different
namespace ref:
https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/#1871003
Cheers,
Tyler
On Sat, Feb 13, 2021 at 12:04 AM Tyler Stachecki <[email protected]> wrote:
Here's the offender:
commit fb420d5d91c1274d5966917725e71f27ed092a85 (refs/bisect/bad)
Author: Eric Dumazet <[email protected]>
Date: Fri Sep 28 10:28:44 2018 -0700
tcp/fq: move back to CLOCK_MONOTONIC
Without this, I wasn't able to make it past the 4.20 series. I
forward-ported a reversion to 5.4 LTS for fun and things still work great.
Though it sounds like simply reverting this is not the right fix -- there is
some interesting discussion of other impacts of this commit:
https://lists.openwall.net/netdev/2019/01/10/36
Then, we probably need to clear skb->tstamp in more paths (you are
mentioning bridge ...)
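A simplified illustration of why a stale skb->tstamp could matter after that
commit. The assumption here, which this thread does not spell out, is that the
timestamp left on a forwarded packet is a wall-clock receive time, while the
egress scheduler reads the field as an earliest departure time on the
monotonic clock. model_departure_delay_ns() is a hypothetical helper, not code
from sch_fq.

```c
#include <stdint.h>

/*
 * Simplified model (assumption, not the sch_fq implementation): how long a
 * scheduler would hold a packet if it treated tstamp_ns as the earliest
 * departure time on its own clock, whose current value is now_ns.
 * A zero timestamp means "no departure time set".
 */
static uint64_t model_departure_delay_ns(uint64_t tstamp_ns, uint64_t now_ns)
{
	if (tstamp_ns == 0 || tstamp_ns <= now_ns)
		return 0; /* send immediately */
	return tstamp_ns - now_ns;
}
```

Under that assumption, a wall-clock stamp from 2021 compared against an hour
of uptime yields a delay of decades, while a cleared (zero) stamp yields none.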
I will try to take a peek sometime this weekend to see if I can spot where
in OVS that needs to happen, assuming it is there.
On Tue, Feb 9, 2021 at 4:22 PM Gregory Rose <[email protected]> wrote:
On 2/8/2021 4:19 PM, Tyler Stachecki wrote:
Thanks for the reply. This is a router, so it is using conntrack; I am unsure
whether there is additional connection tracking in OVS. `ovs-ofctl dump-flows
br-util` shows exactly one flow: the default one.
Here's my approximate /etc/network/interfaces. I just attach VMs to this with
libvirt and have nothing else added at this point:
allow-ovs br-util
iface br-util inet manual
    ovs_type OVSBridge
    ovs_ports enp0s20f1.102 vrf-util

allow-br-util enp0s20f1.102
auto enp0s20f1.102
iface enp0s20f1.102 inet manual
    ovs_bridge br-util
    ovs_type OVSPort
    mtu 9000

allow-br-util vrf-util
iface vrf-util inet static
    ovs_bridge br-util
    ovs_type OVSIntPort
    address 10.10.2.1/24
    mtu 9000
I roughly transcribed what I was doing into a Linux bridge, and it works as
expected in 5.10, e.g. with this in my /etc/network/interfaces:
auto enp0s20f1.102
iface enp0s20f1.102 inet manual
    mtu 9000

auto vrf-util
iface vrf-util inet static
    bridge_ports enp0s20f1.102
    bridge-vlan-aware no
    address 10.10.2.1/24
    mtu 9000
I'm having a bit of a tough time following the dataflow code, and the ~1
commit or so I was missing from the kernel staging tree does not seem to
have fixed the issue.
Hi Tyler,
this does not sound like the previous issue I mentioned because that one
was caused by flow programming for dropping packets.
I hate to say it but you're probably going to have to resort to a
bisect to find this one.
- Greg
On Mon, Feb 8, 2021 at 6:21 PM Gregory Rose <[email protected]> wrote:
On 2/6/2021 9:50 AM, Tyler Stachecki wrote:
I have simple forwarding issues when running the Debian stable backports
kernel (5.9) that I don't see with the stable, non-backported 4.19 kernel.
Big fat disclaimer: I compiled my OVS (2.14.1) from source, but given it
works with the 4.19 kernel I doubt it has anything to do with it. For good
measure, I also compiled 5.10.8 from source and see the same issue I do in
5.9.
The issue I see on 5.x (config snippets below):
My VM (vnet0 - 10.10.0.16/24) can ARP/ping for other physical hosts on its
subnet (e.g. 00:07:32:4d:2f:71 = 10.10.0.23/24 below), but only the first
echo request in a sequence is seen by the destination host. I then have to
wait about 10 seconds before pinging the destination host from the VM
again, but again only the first echo in a sequence gets a reply.
I've tried tcpdump'ing enp0s20f1.102 (the external interface on the
hypervisor) and see the pings going out that interface at the rate I would
expect. OTOH, when I tcpdump on the destination host, I only see the first
of the ICMP echo requests in a sequence (for which an echo reply is sent).
I then added an OVS internal port on the hypervisor (i.e., on br-util) and
gave it an IP address (10.10.2.1/24). It is able to ping that same
external host just fine. Likewise, I am able to ping between the VM and
the OVS internal port just fine.
When I roll back to 4.19, this weirdness about traffic going out of
enp0s20f1.102 *for the VM* goes away and everything just works. Any clues
while I start ripping into code?
Are you using any of the connection tracking capabilities? I vaguely
recall some issue that sounds a lot like what you're seeing but do not
see anything in the git log to stir my memory. IIRC though it was a
similar problem.
Maybe provide a dump of your flows.
- Greg