On 12/07/2016 10:23 AM, Ben Greear wrote:
> On 12/07/2016 10:17 AM, Rustad, Mark D wrote:
>> Ben Greear <gree...@candelatech.com> wrote:
>>
>>> I forgot to mention earlier, we have 4 x 4-port NICs in here too, so
>>> likely we are pushing the IRQ resources. Any way to configure the
>>> NICs to use fewer IRQ resources and still perform at least
>>> moderately well?
>>
>> And two 1G ports on the mobo as well! If you look at the igb
>> initializations, you will see that it stopped getting MSI-X interrupts
>> even before ixgbe failed to get them. See below:
>>
>> [ 6.876620] igb 0000:07:00.0: Intel(R) Gigabit Ethernet Network Connection
>> [ 6.876621] igb 0000:07:00.0: eth16: (PCIe:5.0Gb/s:Width x4) 00:e0:ed:2e:b8:78
>> [ 6.876703] igb 0000:07:00.0: eth16: PBA No: 104900-000
>> [ 6.876704] igb 0000:07:00.0: Using MSI-X interrupts. 8 rx queue(s), 8 tx queue(s)
>> ^^^^^^^^^ Above, igb got the 9 MSI-X interrupts I think it would have requested.
>>
>> [ 6.933202] igb 0000:07:00.1: added PHC on eth17
>> [ 6.933204] igb 0000:07:00.1: Intel(R) Gigabit Ethernet Network Connection
>> [ 6.933205] igb 0000:07:00.1: eth17: (PCIe:5.0Gb/s:Width x4) 00:e0:ed:2e:b8:79
>> [ 6.933286] igb 0000:07:00.1: eth17: PBA No: 104900-000
>> [ 6.933287] igb 0000:07:00.1: Using MSI interrupts. 1 rx queue(s), 1 tx queue(s)
>> ^^^^^^^^^ The one above did not, and so is using an MSI interrupt.
>>
>> [ 6.989791] igb 0000:07:00.2: added PHC on eth18
>> [ 6.989792] igb 0000:07:00.2: Intel(R) Gigabit Ethernet Network Connection
>> [ 6.989793] igb 0000:07:00.2: eth18: (PCIe:5.0Gb/s:Width x4) 00:e0:ed:2e:b8:7a
>> [ 6.989874] igb 0000:07:00.2: eth18: PBA No: 104900-000
>> [ 6.989875] igb 0000:07:00.2: Using MSI interrupts. 1 rx queue(s), 1 tx queue(s)
>> ^^^^^^^^^ Again, igb did not get MSI-X and is using an MSI interrupt.
>>
>> [ 7.003771] ixgbe 0000:06:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8
>> [ 7.003900] ixgbe 0000:06:00.0: PCI Express bandwidth of 32GT/s available
>> [ 7.003902] ixgbe 0000:06:00.0: (Speed:5.0GT/s, Width: x8, Encoding Loss:20%)
>> [ 7.003986] ixgbe 0000:06:00.0: MAC: 2, PHY: 15, SFP+: 5, PBA No: FFFFFF-0FF
>> [ 7.003988] ixgbe 0000:06:00.0: 00:e0:ed:77:09:28
>> [ 7.005730] ixgbe 0000:06:00.0: Intel(R) 10 Gigabit Network Connection
>> [ 7.005854] ixgbe 0000:06:00.1: PCI INT B: failed to register GSI
>> [ 7.005859] ixgbe: probe of 0000:06:00.1 failed with error -28
>> ^^^^^^^^^ ixgbe failed to get even a legacy interrupt.
>>
>> <snip>
>>
>> [ 7.044932] igb 0000:07:00.3: added PHC on eth20
>> [ 7.044935] igb 0000:07:00.3: Intel(R) Gigabit Ethernet Network Connection
>> [ 7.044937] igb 0000:07:00.3: eth20: (PCIe:5.0Gb/s:Width x4) 00:e0:ed:2e:b8:7b
>> [ 7.045011] igb 0000:07:00.3: eth20: PBA No: 104900-000
>> [ 7.045014] igb 0000:07:00.3: Using MSI interrupts. 1 rx queue(s), 1 tx queue(s)
>> ^^^^^^^^^ Again, igb got only an MSI interrupt.
>>
>> and so forth.
>>
>> It is looking like ixgbe may have a bug in that it fails when falling
>> back to a legacy interrupt; it looks like it should have fallen back
>> to an MSI interrupt instead. Rather than trying to fix ixgbe, I
>> recommend loading ixgbe first, so that ixgbe can get all of the MSI-X
>> vectors that it wants. I am assuming that ixgbe at 10G will benefit
>> more from having those resources than all those igb devices will. My
>> assumption could be wrong depending on your traffic patterns, but I
>> think it is likely to be true.
>>
>> To be sure igb loads later, you could blacklist igb and modprobe it
>> explicitly after ixgbe has initialized. These methods would allow igb
>> to get as many MSI-X interrupts as possible to exploit multiqueue.
>> On Fedora you can blacklist igb with the kernel command line
>> parameter rdblacklist=igb.
>>
>> Have you considered the memory bandwidth in this system? I really
>> doubt that it can support full bandwidth on all those ports, assuming
>> that they are all on maximum-speed links.
>
> Thanks for the diagnosis.
>
> The purpose of this system is to act as a network traffic generation
> test system, and we do not expect full line-rate throughput on all
> ports concurrently. We do see nice overall throughput on this system
> (in previous boots we tested all NICs concurrently; perhaps we got
> lucky and ixgbe loaded first?).
>
> We can test this blacklist option, but is there also a way to tell igb
> to just not grab so many MSI-X interrupts? In my particular use case,
> I don't think they will add a lot of benefit.
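[Editor's note: the blacklist approach Mark describes can be sketched as a modprobe configuration fragment. The file name below is arbitrary, and the exact mechanism is distribution-dependent; this is a sketch, not a tested recipe for this system.]

```conf
# /etc/modprobe.d/blacklist-igb.conf  (file name is arbitrary)
# Keep igb from being auto-loaded at boot so that ixgbe can claim
# MSI-X vectors first. "blacklist" only blocks alias-based autoloading;
# an explicit "modprobe igb" later still works, which is the point.
blacklist igb
```

On Fedora, the rdblacklist=igb kernel command line parameter Mark mentions keeps the dracut initramfs from loading igb as well; once ixgbe has finished probing, `modprobe igb` loads it by hand.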
The blacklist option worked. I blacklisted both ixgbe and igb, and then added a pre_network init.d script that runs before the 'network' target. The script loads ixgbe, waits 2 seconds, then loads igb. I now see all network devices registered.

This is still quite a hack, though, so I plan to also try a module parameter to limit the number of tx queues for igb and ixgbe. It will probably be a while before I get to that....

Thanks,
Ben

--
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
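[Editor's note: a minimal sketch of the pre_network script Ben describes might look like the following. The script name, sleep interval, and service-ordering details are assumptions based on his description; it must run as root, before the 'network' service, on a machine with these NICs.]

```shell
#!/bin/sh
# pre_network: load NIC drivers in a fixed order so that ixgbe gets
# first claim on the limited pool of MSI-X vectors.
# Hook it in before the 'network' service (e.g. via LSB init headers
# or chkconfig priorities); details vary by distribution.

set -e

modprobe ixgbe   # 10G driver first: it benefits most from MSI-X multiqueue
sleep 2          # give ixgbe time to finish probing all of its ports
modprobe igb     # 1G driver last: it falls back to MSI if vectors run out
```

As for limiting igb's queue count without a module parameter: on in-tree drivers, `ethtool -L eth16 combined 1` reduces the channel count at runtime, but whether that releases already-allocated MSI-X vectors depends on the driver version, so the load-order trick above remains the more reliable fix here.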