I am working on a multi-core system with three e1000 NICs. The following is
the output of lspci -v:

00:02.0 Class 0200: Unknown device 8086:1079 (rev 03)
        Subsystem: Unknown device 8086:1011
        Flags: bus master, 66MHz, medium devsel, latency 48, IRQ 44
        Memory at 11b00f8000000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at 4000 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-

00:02.1 Class 0200: Unknown device 8086:1079 (rev 03)
        Subsystem: Unknown device 8086:1011
        Flags: bus master, 66MHz, medium devsel, latency 48, IRQ 45
        Memory at 11b00f8020000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at 4040 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-

00:03.0 Class 0200: Unknown device 8086:1079 (rev 03)
        Subsystem: Unknown device 8086:1011
        Flags: bus master, 66MHz, medium devsel, latency 48, IRQ 47
        Memory at 11b00f8040000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at 4080 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-


The above interfaces are eth0, eth1 and eth2 respectively. I send
moderate and then heavy traffic to the eth0 interface. With moderate traffic
the driver works fine. The following are some statistics that I print to
observe the traffic:

cpu      intr  schedule    action      poll   squeeze
  0    708571     29380     29379     72280         0
  1    712430     28054     28055     68762         1
  2    711130     26940     26940     66425         0
  3    714811     25866     25866     63728         0
  4    713158     24553     24553     60311         0
  5    715464     23105     23105     57058         0
  6    712195     21397     21397     52448         0
  7    708600     20042     20042     49540         0
  8    703734     18469     18469     45554         0
  9    696629     17620     17620     43374         0
 10    677145     16843     16842     41779         0
 11    662875     16158     16158     40283         0
 12    644950     15568     15568     38483         0
 13    626776     15015     15015     37353         0
 14         0         0         0         0         0
 15         0         0         0         0         0

The columns mean the following:
intr - number of times e1000_intr is invoked for irq 44
schedule - number of times __netif_rx_schedule2 is invoked for irq 44, 45, 47
action - number of times net_rx_action2 is invoked for irq 44, 45, 47
poll - number of times dev->poll is invoked for irq 44, 45, 47
squeeze - number of times 'goto softnet_break' is executed
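
For anyone who wants to post-process these counters, here is a minimal
sketch (plain Python, not part of the kernel changes) that parses a table in
the layout shown above and derives how many dev->poll calls happen per
net_rx_action2 call. The names CpuStats, parse_stats and polls_per_action are
made up for this illustration; the sample rows are copied from the
moderate-traffic table.

```python
from dataclasses import dataclass


@dataclass
class CpuStats:
    """One row of the per-CPU counter table (hypothetical container)."""
    cpu: int
    intr: int
    schedule: int
    action: int
    poll: int
    squeeze: int


def parse_stats(text):
    """Parse whitespace-separated counter rows, skipping the header line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has exactly six fields and starts with the CPU number.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        rows.append(CpuStats(*map(int, fields)))
    return rows


def polls_per_action(row):
    """Average dev->poll calls per net_rx_action2 call (0.0 if never run)."""
    return row.poll / row.action if row.action else 0.0


# Three rows taken from the moderate-traffic table.
SAMPLE = """\
cpu       intr        schedule     action       poll        squeeze
   0     708571      29380      29379      72280          0
   1     712430      28054      28055      68762          1
  14          0             0              0              0                0
"""

if __name__ == "__main__":
    for row in parse_stats(SAMPLE):
        ratio = polls_per_action(row)
        print(f"cpu{row.cpu}: {ratio:.2f} polls per net_rx_action2 call")
```

The parser only relies on whitespace splitting, so it accepts the ragged
column spacing of the raw output as well.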

__netif_rx_schedule2 and net_rx_action2 are modified versions of the functions
__netif_rx_schedule and net_rx_action, where I call tasklet_schedule in place
of __raise_softirq_irqoff(NET_RX_SOFTIRQ).
I have created a separate tasklet for each cpu, each of which invokes
net_rx_action2. There is no other change in these functions.
(I made the above change only to check that the situation is the same with
both NET_RX_SOFTIRQ and separate tasklets.)

When eth0 is exposed to heavy traffic, one of the cpu cores becomes 100% busy
while the others remain almost idle. In that condition, even though eth0 is
receiving heavy traffic, e1000_intr is not invoked for irq 44 at all.
The traffic statistics are as follows:

cpu      intr  schedule    action      poll   squeeze
  0         0         0       660      3299       660
  1         0        12        12        23         0
  2         0         9         9        17         0
  3         0         5         5        10         0
  4         0         1         1         2         0
  5         0         7         7        14         0
  6         0         1         1         2         0
  7         0         0         0         0         0
  8         0         1         1         2         0
  9         0         2         2         4         0
 10         0         0         0         0         0
 11         0         1         1         2         0
 12         0         0         0         0         0
 13         0         0         0         0         0
 14         0         0         0         0         0
 15         0         0         0         0         0

In the above case, cpu0 is 100% busy. Which cpu is busy changes over time,
i.e. sometimes cpu0 goes 100% busy and sometimes another one does. But at any
given point only one cpu is 100% busy while the rest remain idle.
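
One possible way to read the cpu0 row (3299 polls over 660 net_rx_action2
calls, every call ending in a squeeze) is through a toy model of a budgeted
poll loop. The sketch below is plain Python, not kernel code. It assumes a
per-call budget of 300 packets and a per-poll weight of 64, which are the
commonly cited defaults but are assumptions here, and it ignores the time
limit and the presence of other devices. net_rx_round is a made-up name.

```python
def net_rx_round(backlog, budget=300, weight=64):
    """Model one net_rx_action-style pass over a single device.

    backlog: packets waiting in the receive ring (hypothetical input).
    Returns (polls, squeezed, remaining_backlog, irq_reenabled).
    """
    polls = 0
    while True:
        if budget <= 0:
            # Budget used up: corresponds to 'goto softnet_break'.
            # The device stays on the poll list with its interrupt masked.
            return polls, True, backlog, False
        work_to_do = min(budget, weight)
        work = min(backlog, work_to_do)
        backlog -= work
        budget -= work
        polls += 1
        if work < work_to_do:
            # Ring drained: polling completes and the interrupt is
            # re-enabled, so the next packet raises e1000_intr again.
            return polls, False, backlog, True


if __name__ == "__main__":
    for backlog in (10, 64, 1_000_000):
        polls, squeezed, left, irq_on = net_rx_round(backlog)
        print(f"backlog={backlog}: polls={polls} squeezed={squeezed} "
              f"left={left} irq_reenabled={irq_on}")
```

Under these assumptions a saturated ring yields five polls per call, a
squeeze on every call, and no point at which the interrupt is re-enabled,
which has the same shape as the cpu0 row. The model says nothing about why
only one cpu ends up doing the work.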


The following is a `top` snapshot taken when cpu9 was 100% busy.

 Cpu 0 : 1.0%us, 1.7%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 7 : 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 8 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
 Cpu 9 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi,100.0%si, 0.0%st
Cpu10 : 0.0%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.3%hi, 0.3%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Could someone shed some light on why this behavior is happening?

-Mohan



_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
