Dear John,
Ronciak, John wrote:
> Thanks, it's a bit hard to try and translate this into something we can
> understand. :-(
Let me know if there is anything specific you'd like to know, and I'll
try to translate those bits.
>> I am getting dropped packets on the 82572EI interfaces as well.
> That's not good. This means that the interrupts are not being
> serviced fast enough to keep up with the traffic. With 5 networking
> ports it doesn't surprise me. What kind of tests are you running to
> cause this? It's unclear if this system can withstand the traffic
> from these ports. Have you tried to run the test on a single port to
> see if the drops happen then as well? Try to see where the problem
> starts to happen. Are interrupts being shared between the devices?
> What OS are you running?
We are running Debian Lenny with different kernel versions. At the
moment we are testing with the 2.6.30 bpo kernel. After reading through
the archives, I tried the module options
TxDescriptorStep=4 TxDescriptors=1024
with the e1000 module.
This changes the behaviour. We no longer see Tx unit hang messages in
the log, but the link still goes down sporadically and comes back after
a few seconds. There is no link-down message in the logs, only the
link-up message:
syslog:Mar 22 17:34:09 gw kernel: [765361.074217] e1000: aur-mgt: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
syslog:Mar 22 22:28:21 gw kernel: [783013.698259] e1000: aur-mgt: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
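To put a number on how often the link comes back, the kernel timestamps of the link-up messages can be compared. The following is only a rough sketch: the function names are made up for illustration, and the pattern assumes the exact message wording shown in the two log lines above.

```python
import re

# Matches the kernel uptime stamp and the interface name in an e1000 link-up line.
LINK_UP = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+e1000:\s+(?P<iface>[^:\s]+):\s+"
    r"e1000_watchdog_task:\s+NIC Link is Up"
)

def link_up_events(log_text):
    """Return (uptime_seconds, interface) for every link-up message found."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(float(m["uptime"]), m["iface"]) for m in LINK_UP.finditer(flat)]

def gaps(events):
    """Seconds between consecutive link-up events."""
    times = [t for t, _ in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]

SAMPLE = """
syslog:Mar 22 17:34:09 gw kernel: [765361.074217] e1000: aur-mgt:
e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control:
None
syslog:Mar 22 22:28:21 gw kernel: [783013.698259] e1000: aur-mgt:
e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control:
None
"""

events = link_up_events(SAMPLE)
for seconds in gaps(events):
    print(f"{seconds / 3600:.2f} hours between link-up messages")
```

For the two messages above this gives a gap of roughly 4.9 hours, which agrees with the wall-clock times in the syslog prefix.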
The interrupts of the 82541 ports are shared with USB; the interrupts of
the 82572 ports are not shared. The error rate seems to track the load
on each interface: interfaces with little traffic have lower
rx_no_buffer_count and rx_missed_errors values than interfaces with
lots of traffic.
           CPU0       CPU1
  0:       2676          0   IO-APIC-edge      timer
  1:          2          0   IO-APIC-edge      i8042
  3:     896339          0   IO-APIC-edge      serial
  4:         11          0   IO-APIC-edge
  7:          0          0   IO-APIC-edge      parport0
  8:         56          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 14:          0          0   IO-APIC-edge      ide0
 16:      14766          0   IO-APIC-fasteoi   uhci_hcd:usb4, ath
 18:  193566354          0   IO-APIC-fasteoi   uhci_hcd:usb3, aur-mgt
 19:    1904270          0   IO-APIC-fasteoi   uhci_hcd:usb2, gst
 23:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb5
 28:    6776261          0   PCI-MSI-edge      pbr-Q0
 29:          2          0   PCI-MSI-edge      pbr
 30:   66797611          0   PCI-MSI-edge      dmz-Q0
 31:        871          0   PCI-MSI-edge      dmz
 32:  218588672          0   PCI-MSI-edge      inet-Q0
 33:     475455          0   PCI-MSI-edge      inet
 35:    5566356          0   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
LOC:   70790723   55485197   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:     303286     277829   Rescheduling interrupts
CAL:        127        293   Function call interrupts
TLB:     237854      61406   TLB shootdowns
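The shared lines can also be picked out of this listing mechanically. Below is a small sketch, not a definitive tool: `shared_irqs` is a hypothetical helper, and it assumes the layout above, where the interrupt type is a single column and handlers sharing a line are separated by commas.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [handler, ...]} for IRQ lines that list more than one handler."""
    shared = {}
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skip the CPU header and named rows such as NMI or LOC
        fields = rest.split()
        # Drop the per-CPU counters; the next field is the interrupt type.
        while fields and fields[0].isdigit():
            fields.pop(0)
        names = " ".join(fields[1:])
        handlers = [h.strip() for h in names.split(",") if h.strip()]
        if len(handlers) > 1:
            shared[int(head)] = handlers
    return shared

# A few rows copied from the listing above.
SAMPLE = """\
CPU0 CPU1
4: 11 0 IO-APIC-edge
16: 14766 0 IO-APIC-fasteoi uhci_hcd:usb4, ath
18: 193566354 0 IO-APIC-fasteoi uhci_hcd:usb3, aur-mgt
19: 1904270 0 IO-APIC-fasteoi uhci_hcd:usb2, gst
28: 6776261 0 PCI-MSI-edge pbr-Q0
32: 218588672 0 PCI-MSI-edge inet-Q0
LOC: 70790723 55485197 Local timer interrupts
"""

for irq, handlers in sorted(shared_irqs(SAMPLE).items()):
    print(f"IRQ {irq}: {', '.join(handlers)}")
```

On the sample rows this reports IRQs 16, 18 and 19 as shared with a USB controller, while the MSI lines (pbr-Q0, inet-Q0) are not reported.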
The strange thing is that we have five of these devices: two show this
behaviour and three don't. I might be able to dedicate a single device
to further testing. Would it help the diagnosis if you had shell access
to one of these devices?
Best regards,
Lars
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired