David,

Based on the information you provided, it appears that the link
interrupt for the ports isn't getting through. If you have ethtool 6
installed, I would appreciate it if you could send me an "ethtool -d"
dump for one of the non-functional ports so I can verify that nothing
in our hardware is masking the interrupt off.
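In the meantime, one quick way to see whether any interrupts are being
counted for a port is to look at its rows in /proc/interrupts. The
following is only a minimal sketch in Python, not something shipped
with the driver: the function name, the sample table, and the interface
name "eth2" are all made up for illustration, and the way vectors are
labeled in /proc/interrupts varies by driver and kernel version.

```python
# Hypothetical sample in the general layout of /proc/interrupts.
# The IRQ numbers, counts, and vector labels are invented for illustration.
SAMPLE = """\
           CPU0       CPU1
 16:       1200        300   IO-APIC-fasteoi   eth0
 58:          0          0   PCI-MSI-edge      eth2
 59:         10          5   PCI-MSI-edge      eth2-rx-0
"""


def interrupt_counts(text, ifname):
    """Sum per-CPU interrupt counts for every row labeled with ifname.

    Returns a dict mapping each matching label (ifname itself or
    "ifname-<suffix>") to its total count across all CPUs.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields[0].endswith(":"):
            continue
        per_cpu = fields[1:1 + ncpus]
        if not per_cpu or not all(f.isdigit() for f in per_cpu):
            continue
        total = sum(int(f) for f in per_cpu)
        # Shared IRQs list several names separated by commas.
        for token in fields[1 + ncpus:]:
            token = token.rstrip(",")
            if token == ifname or token.startswith(ifname + "-"):
                counts[token] = counts.get(token, 0) + total
    return counts


if __name__ == "__main__":
    # On a live system, read the text from /proc/interrupts instead of SAMPLE.
    print(interrupt_counts(SAMPLE, "eth2"))
```

A row that stays at zero after the interface has been brought up would
suggest that vector is never firing, though the dump itself is still
the more useful thing to send.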
Another thing to try would be calling "ethtool -r" on the port to see
whether it triggers a link interrupt. The fact that we see no link
interrupts would seem to indicate that the link interrupt wasn't
generated when the interface was opened, as it is supposed to be. It is
possible that "ethtool -r" might trigger one, as it uses a slightly
different initialization path.

Thanks,

Alex

David Lawless wrote:
> At 11:00 10/13/2008 -0700, Alexander Duyck wrote:
>> David Lawless wrote:
>>> It only works with the pure Fedora kernel, which we did not
>>> compile. Note that it fails to work with the RHEL 5 kernel,
>>> which we also did not compile. A noticeable difference is that
>>> the RHEL 5 kernel configures the 82575's with MSI-X interrupts.
>>
>> If the only difference is MSI-X interrupts there are actually a few
>> things you can try to verify if this is in fact the issue.
>>
>> First try loading the Fedora kernel with the option "pci=msi". It
>> seems that on some systems the Fedora kernels disable MSI/MSI-X by
>> default and setting this option may enable MSI/MSI-X interrupts.
>
> Alex,
>
> This caused it to fail on Fedora, so something MSI related
> seems to be the prime suspect.
>
>> Second, on any of the other kernels you should be able to compile our
>> OEM driver version 1.2.44.9 and then install the module with the
>> option "IntMode=0,0,0,0". This should force all of the igb ports to
>> use legacy interrupts.
>
> This helped greatly. Was able to get it working on RHEL 5.2,
> so I'm hopefully not going back to Fedora--it was only desperation
> and the release notes indicating 2.6.21 and higher were necessary
> that took me there.
>
> Found that it works with MSI (IntMode=1,1,1,1) but not with
> MSI-X. Did work with PCI-APIC (IntMode=0,0,0,0) as well.
>
>>> As you see if you read the original message, we have compiled
>>> quite a sample of Kernel.org kernels and tried them.
>>> In every case the instructions that came with the driver were
>>> carefully observed. The 'igb' driver was built both with and
>>> without DCA support--this makes no difference. You have assumed
>>> we did not work this carefully. Let me be absolutely clear that
>>> we have. 'e1000e' drivers compiled by us on a similar system
>>> work perfectly.
>>
>> DCA shouldn't make any difference in this situation. If I am
>> understanding you correctly the issue is that the ethtool shows a
>> link of 1000/Full but states no link is detected. An issue like
>> this would typically be due to the link interrupt not being handled
>> or being masked off. If you could include a dump of
>> /proc/interrupts for the affected system this may also prove useful
>> as it would give us some information on the interrupt configuration
>> of the device.
>
> It's working fine now with Intel OEM 'dca.ko', 'ioatdma.ko',
> 'ixgbe.ko' and 'igb.ko' drivers.
>
> I've attached output for /proc/interrupts, 'ethtool', and 'dmesg'
> for both a good boot and a non-functioning IGB boot.
>
>>> This is a HP DL160 G5, which is a 1U rack mount server. The two
>>> x16 ports are intended for network and storage devices, not for
>>> graphics. Nobody puts fancy GPUs in 1U rack systems unless they
>>> are to be used as numeric co-processors. The BIOS has no
>>> setting other than a PCI 1 versus PCI 2 indication. It has been
>>> tried both ways--makes no difference.
>>
>> Could you try swapping the dual CX4 and Quad-VT to see if the x1 bus
>> width issue follows the slot or the card? This should tell us where
>> the problem lies. If it is a BIOS or hardware issue with the slot
>> then anything plugged into the slot will only report a x1 with the
>> possible exception of graphics cards.
>
> It started showing x4 when the BIOS "Video Card Support" setting
> was turned on. I had disabled it after foolishly believing the
> description to be accurate. Under Fedora the CX4 comes up x4
> when PCI is set to v1 or v2.
> Under RHEL it comes up x4 only when PCI is set to v1, so that's
> where it's set forever more, or at least until I get my hands on
> an ET/82576 quad. Want to see the PTP support in action and
> whether PCIe 2 makes much difference.
>
> Your point that it could be the BIOS seems correct, especially
> since all earlier DL1xx systems appear to have x4 instead of x16
> signaling in the #2 slot. Might be a hang-over from the ancestor
> BIOS. So I'm going to work on having the cards swapped.
> Unfortunately the supplier did not include the 1/2-height
> bracket so I'll have to hunt one down.
>
> Will also open an issue with HP on this, though I won't be
> holding my breath for a BIOS fix.
>
>
>>> Everything compiles fine except for 'ioatdma.ko' under
>>> 2.6.26 and 2.6.27 kernels. It's obvious that the internal
>>> APIs have changed, since the error is for a structure member
>>> 'ack' that no longer exists.
>>
>> I don't believe the ioatdma provided on our website is currently
>> compatible with 2.6.26/27 kernel api. This is why it is recommended
>> you use the ioatdma provided with these kernels instead of trying to
>> compile the ioatdma driver separately.
>>
>>> We've tried every conceivable combination (in the true
>>> mathematical sense) of hand-compiled versus Fedora 9 and RHEL
>>> 5.2 distribution drivers. Nothing works except the pure
>>> Fedora 9 drivers, the one case where MSI interrupts are disabled.
>>>
>>> It is clear that the driver is immature and has flaws.
>>> This Bugzilla reinforces the perception:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=452289
>>>
>>> David
>>
>> If you could try the recommendations above it should bring us much
>> closer to understanding the root cause of the issues you are seeing.
>> Also if you could include a dmesg log from one of the non-working
>> OSes instead of Fedora it might help us to resolve the link issue.
>
> Attached.
>
> Regards,
>
> David

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel