Hi, Hari,

I think modern Linux network drivers use a "polling" approach rather than an interrupt-driven approach, so I've found IRQ affinity to be less important than it used to be. This can be observed as relatively low interrupt counts in /proc/interrupts. The main things that I've found beneficial are:
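As a rough illustration of that "low interrupt counts" check, here's a minimal sketch assuming the usual /proc/interrupts layout (a header row of per-CPU labels, then one row per IRQ source). The interface name "eth4" and the sample counts are made up for the example:

```python
def interrupt_counts(proc_interrupts_text, device):
    """Sum the per-CPU interrupt counts for IRQ lines whose
    description mentions `device`. With a polling driver, these
    counts should stay low relative to the packet rate."""
    lines = proc_interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: one label per CPU
    total = 0
    for line in lines[1:]:
        fields = line.split()
        # IRQ rows look like: " 24:  <count0>  <count1>  ...  eth4-TxRx-0"
        if device in line and len(fields) > ncpus:
            total += sum(int(f) for f in fields[1:1 + ncpus])
    return total

# Hypothetical two-CPU sample; on a live system, read /proc/interrupts instead.
SAMPLE = ("           CPU0       CPU1\n"
          " 24:       100        200   PCI-MSI 524288-edge   eth4-TxRx-0\n"
          " 25:        50         60   PCI-MSI 524289-edge   eth4-TxRx-1\n")
print(interrupt_counts(SAMPLE, "eth4"))  # 410
```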
1. Ensuring that the processing code runs on CPU cores in the same socket that the NIC's PCIe slot is connected to. If you have a multi-socket NUMA system you will want to become familiar with its NUMA topology. The "hwloc" package includes the cool "lstopo" utility that will show you a lot about your system's topology. Even on a single-socket system it can help to stay away from core 0, where many OS tasks tend to run.

2. Ensuring that memory allocations happen after your processes/threads have had their CPU affinity set, either by "taskset" or "numactl" or your program's own built-in CPU affinity setting code. This is mostly for NUMA systems.

3. Ensuring that various buffers are sized appropriately. There are a number of settings that can be tweaked in this category, most via "sysctl". I won't dare to make any specific recommendations here; everybody seems to have their own set of "these are the settings I used last time".

One of the most important things you can do in your packet receiving code is to keep track of how many packets you receive over a certain time interval. If this value does not match the expected number of packets then you have a problem. Usually the received packet count will be lower than the expected packet count. Some people call these dropped packets, but I prefer to call them "missed packets" at this point because all we can say is that we didn't get them. We don't yet know what happened to them (maybe they were dropped, maybe they were misdirected, maybe they were never sent), but it helps to know where to look to find out.

4. Places to check for missing packets getting "dropped":

4.1 If you are using "normal" (aka SOCK_DGRAM) sockets to receive UDP packets, you will see a line in /proc/net/udp for your socket. The last number on that line is the count of packets that the kernel wanted to deliver to your socket but couldn't because the socket's receive buffer was full, so the kernel had to drop them.
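To make that /proc/net/udp check concrete, here's a sketch of pulling out that last number. It assumes the usual column layout: the local port is the hex value after the ":" in the local_address field, and the drop count is the final field. The port 8000 and the sample line are made up for the example:

```python
def udp_socket_drops(proc_net_udp_text, local_port):
    """Return the kernel's drop count for the UDP socket bound to
    local_port (the last field of its /proc/net/udp line), or None
    if no such socket is listed."""
    for line in proc_net_udp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if not fields:
            continue
        port = int(fields[1].split(":")[1], 16)  # local_address is addr:port in hex
        if port == local_port:
            return int(fields[-1])  # the 'drops' column
    return None

# Hypothetical sample line for a socket bound to port 8000 (0x1F40);
# on a live system, read /proc/net/udp instead.
SAMPLE = (
    "  sl  local_address rem_address st tx_queue rx_queue tr tm->when "
    "retrnsmt uid timeout inode ref pointer drops\n"
    " 123: 00000000:1F40 00000000:0000 07 00000000:00000000 00:00000000 "
    "00000000 1000 0 12345 2 ffff880012345678 42\n")
print(udp_socket_drops(SAMPLE, 8000))  # 42
```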
4.2 If you are using "packet" (aka SOCK_RAW) sockets to receive UDP packets, there are ways to get the total number of packets the kernel has handled for that socket and the number it had to drop for lack of kernel/application buffer space. I forget the details, but I'm sure you can google for it. If you're using Hashpipe's packet socket support, it has a function that will fetch these values for you.

4.3 The ifconfig utility will give you a count of "RX errors". This is a generic category and I don't know all possible contributions to it, but one is that the NIC couldn't pass packets to the kernel.

4.4 Running "ethtool -S IFACE" (e.g. "ethtool -S eth4") will show you loads of stats. These values all come from counters on the NIC. Two interesting ones are called something like "rx_dropped" and "rx_fifo_errors". A non-zero rx_fifo_errors value means that the kernel was not keeping up with the packet rate for long enough that the NIC/kernel buffers filled up and packets had to be dropped.

4.5 If you're using a lower-level kernel-bypass approach (e.g. IBVerbs or DPDK), then you may have to dig a little harder to find the packet drop counters, as the kernel is no longer involved and all the previously mentioned counters will be useless (with the possible exception of the NIC counters).

4.6 You may be able to log in to your switch and query it for interface statistics. That can show various data and packet rates as well as bytes sent, packets sent, and various error counters.

One thing to remember about buffer sizes is that if your average processing rate isn't keeping up with the data rate, larger buffers won't solve your problem. Larger buffers only allow the system to withstand slightly longer temporary lulls in throughput ("hiccups"), and only if the overall throughput of the system (including the lulls/hiccups) is as fast as or (ideally) faster than the incoming data rate.
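For what it's worth on 4.2, I believe the details are: the counters come from getsockopt(SOL_PACKET, PACKET_STATISTICS), which returns a struct tpacket_stats holding tp_packets (packets the kernel handled for the socket) and tp_drops (packets dropped for lack of buffer space), and the kernel zeroes both counters on each read. A sketch in Python; the two constants come from <linux/if_packet.h> since Python's socket module doesn't export them, and actually opening an AF_PACKET socket needs CAP_NET_RAW (i.e. root):

```python
import socket
import struct

# Constants from <linux/if_packet.h>; not exported by Python's socket module.
SOL_PACKET = 263
PACKET_STATISTICS = 6

def decode_tpacket_stats(raw):
    """Decode struct tpacket_stats: two unsigned 32-bit counters,
    tp_packets (packets handled for this socket) and tp_drops
    (packets dropped for lack of buffer space)."""
    tp_packets, tp_drops = struct.unpack("=II", raw[:8])
    return tp_packets, tp_drops

def packet_socket_stats(sock):
    """Query an AF_PACKET socket for its stats. The kernel resets
    both counters on every read, so these are 'since last asked'
    values."""
    return decode_tpacket_stats(
        sock.getsockopt(SOL_PACKET, PACKET_STATISTICS, 8))

# Usage on a live system (needs CAP_NET_RAW / root):
#   sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(0x0800))
#   packets, drops = packet_socket_stats(sock)
```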
Hope this helps,
Dave

> On Sep 9, 2020, at 22:15, Hariharan Krishnan <vasanthikrishh...@gmail.com> wrote:
>
> Hello Everyone,
>
> I'm trying to tune the NIC on a server with Ubuntu 18.04 OS to listen to a multicast network and optimize it for throughput through IRQ affinity binding. It is a Mellanox card and I tried using the "mlnx_tune" for doing this, but haven't been successful.
> I would really appreciate any help in this regard.
>
> Looking forward to responses from the group.
>
> Thank you.
>
> Regards,
>
> Hari
>
> --
> You received this message because you are subscribed to the Google Groups "casper@lists.berkeley.edu" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to casper+unsubscr...@lists.berkeley.edu <mailto:casper+unsubscr...@lists.berkeley.edu>.
> To view this discussion on the web visit https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/CAHNYk1yn5xkdjfDVMm0UMO%3DQ-vjfm4nmVQbf-Jt1b4kGjB9VUQ%40mail.gmail.com <https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/CAHNYk1yn5xkdjfDVMm0UMO%3DQ-vjfm4nmVQbf-Jt1b4kGjB9VUQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.