Hi, Max

> effect: from time to time all threads are blocked simultaneously for over
> 100µs, whether they are interacting with the NIC or not.

From our experience I would recommend checking:
- whether some process with a higher priority preempts your application
- that the NUMA balancer is disabled. This kernel feature periodically unmaps
  the whole process memory and uses the resulting page faults to verify that
  the memory belongs to the correct NUMA node, which might cause hiccups (see
  the check sketched below).
- SMI (System Management Interrupt): all CPU caches are flushed, all cores are
  stalled, and the CPU enters a special mode to handle HW events. SMI
  statistics can be checked with the turbostat utility.
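A minimal sketch of a user-space check for the NUMA balancer item, assuming
only the standard Linux sysctl knob /proc/sys/kernel/numa_balancing (0 means
disabled); it is an illustration, not part of the original application:

#include <stdio.h>

/* Report whether automatic NUMA balancing is enabled. The feature
 * periodically unmaps process memory and relies on the resulting faults
 * to migrate pages, which can stall all threads at once.
 * 0 = disabled (desired for latency-sensitive apps), nonzero = enabled. */
int main(void)
{
	FILE *f = fopen("/proc/sys/kernel/numa_balancing", "r");
	int val = -1;

	if (f == NULL) {
		perror("kernel.numa_balancing not available");
		return 1;
	}
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);

	printf("kernel.numa_balancing = %d (%s)\n", val,
	       val == 0 ? "disabled, OK" : "enabled, may cause hiccups");
	return 0;
}

The same knob can be switched off at runtime with sysctl
kernel.numa_balancing=0.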
> How can I enable "DMA to LLC"? If I see correctly, "Direct Cache Access" is
> an Intel-exclusive feature not available on the AMD EPYC CPUs we are using.

Does your EPYC have no DDIO or something similar? ☹

With best regards,
Slava

> -----Original Message-----
> From: Engelhardt, Maximilian <[email protected]>
> Sent: Tuesday, January 16, 2024 3:57 PM
> To: Slava Ovsiienko <[email protected]>; [email protected]
> Cc: Maayan Kashani <[email protected]>; Carsten Andrich <[email protected]>
> Subject: AW: [mlx5] Loss of packet pacing precision under high Tx loads
>
> Hi Slava,
>
> I'm using a 100 Gbit link and want to transfer 10 GByte (80 Gbit) per
> second. I tested it with different numbers of queues (1, 2, 4, 8) without
> any change to the result. In our application, the other end (FPGA) does not
> support L2 flow control.
>
> As you suspected, the problem does not seem to be in the actual NIC
> timestamping, as I guessed at first, but in the interaction of host and
> NIC: I have inserted another thread into my application that does nothing
> but repeatedly call rte_delay_us_block(1) and measure the elapsed time.
> This shows the same effect: from time to time all threads are blocked
> simultaneously for over 100µs, whether they are interacting with the NIC
> or not.
>
> I seem to have the same problem as described here:
> https://www.mail-archive.com/[email protected]/msg07437.html
>
> Investigating further, I discovered strange behavior: In my main
> application (not the MWE posted here), the problem also occurs when
> receiving data, at the moments when the packet load changes (start and end
> of the data stream). Normally, the received data is copied into a large
> buffer - if I comment out this memcpy, i.e. *reduce* the workload, these
> stalls occur *more* often. It also seems to depend on the software
> environment: on Debian the stalls are less frequent than on NixOS (same
> hardware and the same isolation features).
>
> How can I enable "DMA to LLC"? If I see correctly, "Direct Cache Access" is
> an Intel-exclusive feature not available on the AMD EPYC CPUs we are using.
>
> I would be grateful for any advice on how I could solve the problem.
>
> Thank you and best regards,
> Max
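A minimal sketch of the stall-detector thread described in the message above,
assuming DPDK's rte_cycles.h and rte_lcore.h APIs; the 100 µs reporting
threshold and the launch call are illustrative, not the code actually used:

#include <stdio.h>
#include <stdint.h>

#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_lcore.h>

/* Busy-loop on a dedicated lcore: block for ~1 µs and report whenever the
 * measured elapsed time is far larger, i.e. the whole thread was stalled. */
static int
stall_detector(__rte_unused void *arg)
{
	const uint64_t hz = rte_get_tsc_hz();
	const uint64_t limit = hz / 10000;	/* 100 µs in TSC cycles */

	for (;;) {
		uint64_t start = rte_rdtsc();

		rte_delay_us_block(1);
		uint64_t elapsed = rte_rdtsc() - start;

		if (elapsed > limit)
			printf("lcore %u stalled for %.1f us\n",
			       rte_lcore_id(), elapsed * 1e6 / hz);
	}
	return 0;
}

/* Launched e.g. with: rte_eal_remote_launch(stall_detector, NULL, lcore_id); */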
> >-----Original Message-----
> >From: Slava Ovsiienko <[email protected]>
> >Sent: Sunday, 14 January 2024 12:09
> >To: [email protected]
> >Cc: Engelhardt, Maximilian <[email protected]>;
> >Maayan Kashani <[email protected]>
> >Subject: RE: [mlx5] Loss of packet pacing precision under high Tx loads
> >
> >Hi, Max
> >
> >As far as I understand, some packets are delayed.
> >What is the data rate? 10 GigaBytes (not 10 Gbits)?
> >What is the connection rate? 100 Gbps?
> >It is not trivial to guarantee correct packet delivery for high-load
> >(> 50% of line rate) connections; a lot of aspects are involved.
> >Sometimes the traffic schedules of neighboring queues simply overlap.
> >
> >I have some extra questions:
> >How many Tx queues do you use? (8 is optimal; more than 32 on CX6 might
> >induce a performance penalty.)
> >Does your traffic contain VLAN headers?
> >Did you disable L2 flow control?
> >A high wander value rather indicates an issue with an overloaded PCIe bus
> >or host memory.
> >Did you enable the "DMA to LLC (last level cache)" option on the host?
> >
> >With best regards,
> >Slava
> >
> >>From: Engelhardt, Maximilian <mailto:[email protected]>
> >>Sent: Wednesday, 8 November 2023 17:41
> >>To: mailto:[email protected]
> >>Cc: Andrich, Carsten <mailto:[email protected]>
> >>Subject: [mlx5] Loss of packet pacing precision under high Tx loads
> >>
> >>Hi,
> >>I am currently working on a system in which a high-rate data stream is to
> >>be transmitted to an FPGA. As the FPGA only has small buffers available,
> >>I am using the packet pacing function of the Mellanox ConnectX-6
> >>MCX623106AN NIC to send the packets at uniform intervals. This works if I
> >>only transfer 5 GB per second, but as soon as I step up to 10 GB/s, after
> >>a few seconds errors begin to occur: the tx_pp_wander value increases
> >>significantly (>80000 ns) and there are large gaps in the packet stream
> >>(>100 µs; the affected packets are not lost, but arrive later).
> >>To demonstrate this, I connected my host to another computer with the
> >>same type of NIC via a DAC cable, enabling Rx hardware timestamping on
> >>the second device and observing the timing difference between adjacent
> >>packets. The code for this minimum working example is attached to this
> >>message. It includes an assertion to ensure that every packet is enqueued
> >>well before its Tx time comes, so software timing should not influence
> >>the issue.
> >>I tested different packet pacing granularity settings (tx_pp) in the
> >>range of 500 ns-4 µs, which did not change the outcome. Also, enabling Tx
> >>timestamping only for every 16th packet did not have the desired effect.
> >>Distributing the workload over multiple threads and Tx queues also has no
> >>effect. The NIC is connected via PCIe 4.0 x16 and runs firmware version
> >>22.38.1002; the DPDK version is 22.11.3-2.
> >>To be able to use packet pacing, the configuration
> >>REAL_TIME_CLOCK_ENABLE=1 must be set for this NIC. Is it possible that
> >>the large gaps are caused by the NIC and host clock synchronization
> >>mechanism not working correctly under the high packet load? In my
> >>specific application I do not need a real-time NIC clock - the
> >>synchronization between the devices is done via feedback from the FPGA.
> >>Is there any way to eliminate these jumps in the NIC clock?
> >>Thank you and best regards
> >>Max
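For context, scheduled send on mlx5 is driven by a per-packet timestamp
carried in a dynamic mbuf field plus a dynamic flag, with the feature enabled
through the tx_pp devarg. Below is a minimal sketch of attaching such a
timestamp to an mbuf, assuming the standard rte_mbuf_dyn.h names and that the
field and flag were registered beforehand (e.g. with
rte_mbuf_dyn_tx_timestamp_register()); this is not the attached MWE itself:

#include <stdint.h>

#include <rte_mbuf.h>
#include <rte_mbuf_dyn.h>

static int tx_ts_offset = -1;	/* offset of the timestamp dynamic field */
static uint64_t tx_ts_flag;	/* ol_flags bit: "schedule this packet"  */

/* Look up the dynamic timestamp field and flag; both must have been
 * registered already and the mlx5 port started with the tx_pp devarg. */
static int
tx_timestamp_init(void)
{
	int bit;

	tx_ts_offset = rte_mbuf_dynfield_lookup(
			RTE_MBUF_DYNFIELD_TIMESTAMP_NAME, NULL);
	bit = rte_mbuf_dynflag_lookup(
			RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME, NULL);
	if (tx_ts_offset < 0 || bit < 0)
		return -1;
	tx_ts_flag = 1ULL << bit;
	return 0;
}

/* Request transmission of 'm' at absolute NIC time 'send_time'
 * (nanoseconds when the device clock runs in REAL_TIME mode). */
static void
tx_schedule(struct rte_mbuf *m, uint64_t send_time)
{
	*RTE_MBUF_DYNFIELD(m, tx_ts_offset, uint64_t *) = send_time;
	m->ol_flags |= tx_ts_flag;
}

tx_schedule() would be called per packet (or, as tested above, only for every
16th packet) before handing the burst to rte_eth_tx_burst().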
