Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-26 Thread Ilya Maximets via discuss
On 4/26/24 22:05, Gavin McKee wrote: > Thanks again for coming back on this Ilya, > > Another option I am looking at here is to switch the kernel path (Open > vSwitch kernel module) with OVS-DOCA as we are using the CX6/7 card > https://docs.nvidia.com/doca/archive/doca-v2.0.2/ovs-doca/index.html

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-26 Thread Gavin McKee via discuss
Thanks again for coming back on this Ilya, Another option I am looking at here is to switch the kernel path (Open vSwitch kernel module) with OVS-DOCA as we are using the CX6/7 card https://docs.nvidia.com/doca/archive/doca-v2.0.2/ovs-doca/index.html I'm trying to wrangle the documented Known

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-26 Thread Ilya Maximets via discuss
On 4/26/24 20:12, Gavin McKee wrote: > Thanks for coming back to me on this. > > Moving kernel versions around is not a straightforward option here - > especially when you are using hardware offload. The OFED driver > version is coupled to the kernel so if we move from that we are out of >

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-26 Thread Gavin McKee via discuss
Adrian, Yes, we are using tc-offload using the Nvidia CX6/7 cards (OFED driver aligned to the support kernel matrix). When I do a dump of the tc filter rules when the issue is occurring, I can't see any rule at all relating to the TCP connection that's breaking. Is this because the TC_INGRESS

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-26 Thread Gavin McKee via discuss
Thanks for coming back to me on this. Moving kernel versions around is not a straightforward option here - especially when you are using hardware offload. The OFED driver version is coupled to the kernel so if we move from that we are out of support coverage. Doing an ovn-appctl -t

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-25 Thread Adrian Moreno via discuss
On 4/23/24 17:39, Gavin McKee wrote: If you look at both traces (non working and working) the thing that stands out to me is this At line 10 in the working file the following entry exists ct_state NEW tcp (SYN_SENT) orig [172.27.16.11.38793 > 172.27.31.189.9100] reply [172.27.31.189.9100

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-24 Thread Ilya Maximets via discuss
On 4/23/24 17:39, Gavin McKee wrote: > If you look at both traces (non working and working) the thing that > stands out to me is this > > At line 10 in the working file the following entry exists > ct_state NEW tcp (SYN_SENT) orig [172.27.16.11.38793 > > 172.27.31.189.9100] reply

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-23 Thread Gavin McKee via discuss
If you look at both traces (non working and working), the thing that stands out to me is this: at line 10 in the working file the following entry exists: ct_state NEW tcp (SYN_SENT) orig [172.27.16.11.38793 > 172.27.31.189.9100] reply [172.27.31.189.9100 > 172.27.16.11.38793] zone 195 his

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-22 Thread Gavin McKee via discuss
Ok @Adrian Moreno @Flavio Leitner Two more detailed Retis traces attached. One is not working - the same session that I can't establish a TCP session to on port 9100 172.27.16.11.42303 > 172.27.31.189.9100 Then I restart Open vSwitch and try again 172.27.16.11.38793 > 172.27.31.189.9100 (this

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-22 Thread Gavin McKee via discuss
Hi Guys, We have had another occurrence of the issue today. In short - we try to open a connection from 172.27.22.90 -> 172.27.31.189 on port 9100. We see the SYN being received and the ACK being sent from the server back to the client 172.27.22.90. The server retransmits the ACK as it's

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-18 Thread Gavin McKee via discuss
I thought ct was in that retis trace. I’ll need to capture these events when they occur again. I’m also having some issues with huge amounts of loss in the kernel pipeline that I can’t explain. When sending traffic VM to VM within a logical switch but going across GENEVE tunnel I get 30 - 40

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-18 Thread Paolo Valerio via discuss
Paolo Valerio writes: > Adrian Moreno via discuss writes: > >> Hi Gavin >> >> On 4/18/24 02:38, Gavin McKee via discuss wrote: >>> This is an example. >>> >>> Again the TCP 3-way handshake completes, but the next packet fails to NAT >>> and goes out onto the physical network using the private

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-18 Thread Paolo Valerio via discuss
Adrian Moreno via discuss writes: > Hi Gavin > > On 4/18/24 02:38, Gavin McKee via discuss wrote: >> This is an example. >> >> Again the TCP 3-way handshake completes, but the next packet fails to NAT >> and goes out onto the physical network using the private address. An >> example of this is

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-18 Thread Adrian Moreno via discuss
Hi Gavin On 4/18/24 02:38, Gavin McKee via discuss wrote: This is an example. Again the TCP 3-way handshake completes, but the next packet fails to NAT and goes out onto the physical network using the private address. An example of this is in the packet trace I provided. Given you were using

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-17 Thread Gavin McKee via discuss
This is an example. Again the TCP 3-way handshake completes, but the next packet fails to NAT and goes out onto the physical network using the private address. An example of this is in the packet trace I provided. ovs-appctl ofproto/trace br-int

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-17 Thread Gavin McKee via discuss
That information is all in the email. The OpenFlow trace is showing that the pipeline is fine. This is why I’m worried about a deeper issue with the kernel / openvswitch kernel module / connection tracking. On Wed, Apr 17, 2024 at 16:33 Flavio Leitner wrote: > On Wed, 17 Apr 2024 12:26:27

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-17 Thread Flavio Leitner via discuss
On Wed, 17 Apr 2024 12:26:27 -0700 Gavin McKee wrote: > Hi Flavio, > > I had to restart the Open vSwitch across 16 machines to resolve the > issue for a customer. I think it will occur again and when it does > I'll use that command to gather the tc information. > > Until then I think I have

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-17 Thread Gavin McKee via discuss
Hi Flavio, I had to restart the Open vSwitch across 16 machines to resolve the issue for a customer. I think it will occur again and when it does I'll use that command to gather the tc information. Until then I think I have found why the issue is occurring. Take a look at the output below

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-17 Thread Flavio Leitner via discuss
Hi Gavin, It would be helpful if you can provide some TC dumps from the "good" state to the "bad" state to see how it was and what changes. Something like: # tc -s filter show dev enp148s0f0_1 ingress I haven't checked the attached files, but one suggestion is to check if this is not a
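The good-versus-bad comparison suggested above could be done with a small helper along these lines. This is only a sketch: it assumes the output of the tc command quoted above has been saved to text once in each state, and the function name and the sample strings are hypothetical placeholders, not real tc output. Note that counters printed by -s will differ between any two dumps, so some reported differences will be statistics rather than rule changes.

```python
import difflib


def changed_lines(good_dump: str, bad_dump: str) -> list[str]:
    """Return lines that appear in only one of the two saved dumps.

    Lines prefixed "- " exist only in the good dump,
    lines prefixed "+ " exist only in the bad dump.
    """
    diff = difflib.ndiff(good_dump.splitlines(), bad_dump.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop them.
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Placeholder text only; substitute the contents of the two saved dumps.
    good = "rule A\nrule B\n"
    bad = "rule A\n"
    for line in changed_lines(good, bad):
        print(line)
```

A rule that is present in the good state but missing in the bad state shows up with a "- " prefix, which is the case of interest when a connection stops being offloaded.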

Re: [ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-16 Thread Gavin McKee via discuss
Adding information relating to the Open vSwitch kernel module @Ilya Maximets @Numan Siddique Can either of you help out here? modinfo openvswitch filename: /lib/modules/5.14.0-362.8.1.el9_3.x86_64/kernel/net/openvswitch/openvswitch.ko.xz alias: net-pf-16-proto-16-family-ovs_ct_limit

[ovs-discuss] Urgent Help needed: OVS 3.2.2 Strange TC DROPs

2024-04-16 Thread Gavin McKee via discuss
Hi, I need some help with strange OVS behaviours. ovs-vsctl (Open vSwitch) 3.2.2 ovn-controller 23.09.1 Open vSwitch Library 3.2.2 TLDR: We need to restart Open vSwitch in order for TLS traffic to work between a VM and Cloudflare R2. After restarting Open vSwitch the TLS connection works fine.