Rajagopal Kunhappan writes:

> Nicolas Michael wrote:
>> Hi all,
>>
>> unfortunately it took us quite a while to finish the new setup of our
>> system, but for a couple of days we have now been running tests again.
>> The results are good -- except for one major problem (see below).
>>
>> For the first tests, we used 4 quad-port nxge 1 GBit NICs with one port
>> of each used for the cluster interconnect, and set ddi_msix_alloc_limit
>> to 8, the largest allowed value (this is the discussed workaround until
>> Crossbow becomes available). 16 DMA channels on that card allow for a
>> fanout over 4 cpus per port, so with 4 ports we have a fanout over 16
>> cpus. This is working pretty well and solves our current problems with
>> the interrupt load on our 128-way server. (We were also able to reduce
>> the packet load by some further optimizations.)
>>
>> In the tests now we have the target configuration with 2 dual-port nxge
>> 10 GBit NICs instead, again with one port of each used, giving us once
>> more a fanout over 16 cpus (8 cpus per port). Basically, this
>> configuration works very well too, except for one major problem:
>>
>> The nxge driver happens to pick cpu 0 as one of the cpus (among 7
>> others) for its interrupt handlers. Since this cpu always handles the
>> clock interrupts, it is now overloaded with interrupt processing.
>> Although we already put cpu 0 into a processor set (so that nothing but
>> interrupts runs on that cpu), it now reaches 100% sys load (mpstat).
>> From the kstat interrupt statistics I calculated the per-level
>> interrupt time over a period of 5 minutes:
>>
>>   CPU 0 - overall:  97.7%
>>   CPU 0 - level  1: 15.9%
>>   CPU 0 - level  6:  8.9%
>>   CPU 0 - level 10: 71.9%
>>
>> Cpu 0 is already 71.9% busy just with the clock interrupts. On top of
>> that come some level 1 interrupts (the PCIe-to-PCI bridge driver?) and
>> some level 6 interrupts -- this is our nxge NIC. Since the clock
>> interrupt has the highest priority, nxge performance suffers -- we see
>> bad packet latencies on the interconnect when cpu 0 becomes overloaded.
>>
>> So my question is: How can I prevent nxge (and the PCIe-to-PCI bridge)
>> from choosing cpu 0 for their interrupts? "psradm -i 0" won't help,
>> since it would also affect the clock interrupt (I want to make sure
>> that no HW interrupts run on the cpu that is handling the clock
>> interrupt). So what I would need is a way to tell the driver not to use
>> cpu 0 (or better: not to use the cpu that the clock interrupt is
>> using).
>>
>> Is there a solution for this problem in S10U4?
>> Will there be a solution in Crossbow?
>>
>> Thanks a lot,
>> Nick.
>
> Hi Nick,
>
> I don't know if there is a way to do this in S10U4, but with Crossbow it
> will be possible. In Crossbow you will be able to use the dladm command
> to specifically re-target the interrupt(s) from a NIC to the CPU(s) of
> your choice.
>
> -krgopi
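> PS: The exact dladm syntax isn't final yet, so please take the following
> only as a sketch of the idea -- option and property names may still
> change before the bits ship:
>
>     # bind nxge0's interrupts/packet processing to cpus 1-8,
>     # keeping cpu 0 free for the clock interrupt
>     dladm set-linkprop -p cpus=1,2,3,4,5,6,7,8 nxge0
>
>     # verify the current binding
>     dladm show-linkprop -p cpus nxge0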
With Crossbow the problem should go away by itself, since the NIC should
be placed into polling mode anyway.

-r
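PS: In the meantime, intrstat(1M) is a convenient way to watch which cpus
the nxge interrupts actually land on and how much cpu time they consume,
e.g.

    # sample per-device interrupt activity every 5 seconds
    intrstat 5

and the intr/ithr columns of mpstat(1M) give the per-cpu view. These are
observation tools only -- they won't re-target anything, but they make it
easy to verify whether cpu 0 is still being picked.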