I too had this problem a few months ago when we put our first R410 into production. The resolution was to upgrade the ethernet driver. I got it from Dell's website then installed it on the CentOS system:
gtar xvzf Bcom_LAN_14.1.0_LinuxR5S10_DKMS_A01.tar.gz

Find the appropriate version:

Bcom_LAN_14.1.0_LinuxR5S10_DKMS_A01/NetXtremeII

Had to install some dependencies:

yum -y install rpm-build dkms

Then install the driver:

rpm -ihv netxtreme2-5.0.17-1.dkms.noarch.rpm

On 2/9/2010 3:25 PM, Big Wave Dave wrote:
> We had to do the same thing on R410's. We ran into a situation where
> one of the interfaces would stop working, and ethtool would show
> "no link". Restarting network wouldn't fix anything, only a reboot
> would resolve the problem. Moving to the Broadcom-provided driver
> has solved this for us.
>
> Dave
>
> On Wed, Feb 3, 2010 at 6:30 AM, Carlson, Timothy S
> <[email protected]> wrote:
>
>> I've moved away from the RHEL/CentOS driver and have gone directly to the
>> bnx2 driver from Broadcom.
>>
>> dmesg | grep bnx
>> Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.9.20b (July 9, 2009)
>> bnx2: eth0: using MSI
>>
>> That driver seems stable for me. I was seeing things similar to your
>> problem and this driver fixed things right up for me.
>>
>> http://www.broadcom.com/support/ethernet_nic/driver-sla.php?driver=NX2-Linux
>>
>> You'll need to download that driver and rebuild it from the SRPM. You'll
>> also need to rebuild the driver for each kernel update which is a pain.
>>
>> Tim
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of James Sparenberg
>> Sent: Wednesday, February 03, 2010 1:08 AM
>> To: [email protected]
>> Subject: Returning Network stability problems on R710 servers and BCM5709
>>
>> All,
>>
>> I'm referencing an earlier thread from last Sept.
>>
>> http://lists.us.dell.com/pipermail/linux-poweredge/2009-September/040252.html
>>
>> In it there was a discussion related to stability problems with the
>> Broadcom BCM5709 on a Dell R610, where there would be a loss of connectivity
>> for new connections while existing connections, or all connections of a
>> different protocol, still passed.
>>
>> For example, just now I lost the ability to ping eth0 or get NIS
>> authentication on that IP, and I also lost the ability to get TFTP connections
>> via the eth1 address. However, at the same time DHCP is running against
>> eth1, and SNMP, NTP and HTTP over port 10000 (webmin) were merrily working
>> quite well on eth0.
>>
>> OS CentOS 5.4
>>
>> kernel 2.6.18-164.10.1.el5 SMP x86_64
>> Kernel module bnx2
>>
>> modinfo output
>>
>> filename:
>> /lib/modules/2.6.18-164.10.1.el5.centos.plus/kernel/drivers/net/bnx2.ko
>> version: 1.9.3
>> license: GPL
>> description: Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver
>> author: Michael Chan<[email protected]>
>> srcversion: 1040A42F87B8BE8A019736C
>> alias: pci:v000014E4d0000163Csv*sd*bc*sc*i*
>> alias: pci:v000014E4d0000163Bsv*sd*bc*sc*i*
>> alias: pci:v000014E4d0000163Asv*sd*bc*sc*i*
>> alias: pci:v000014E4d00001639sv*sd*bc*sc*i*
>> alias: pci:v000014E4d000016ACsv*sd*bc*sc*i*
>> alias: pci:v000014E4d000016AAsv*sd*bc*sc*i*
>> alias: pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
>> alias: pci:v000014E4d0000164Csv*sd*bc*sc*i*
>> alias: pci:v000014E4d0000164Asv*sd*bc*sc*i*
>> alias: pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
>> alias: pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
>> depends:
>> vermagic: 2.6.18-164.10.1.el5.centos.plus SMP mod_unload gcc-4.1
>> parm: disable_msi:Disable Message Signaled Interrupt (MSI) (int)
>> parm: enable_entropy:Allow bnx2 to populate the /dev/random
>> entropy pool (int)
>> module_sig:
>> 883f3504b47af9bd3b84a368dd51f2112b6b90a0ed1bac15e1b94720602336594dc65775db83c460991575cc8694cf9c03aca6e623e0950281e5094
>>
>> So you can see
>> that the version I have exceeds the version said to be stable
>> in the prior thread. BTW this chassis is about 1 month old so it should
>> (but unverified) have the latest BIOS.
>>
>> Ironic part: the same model running the same version/kernel of CentOS (kick
>> start install so all my boxes are the same) is running some load testing
>> pushing millions of sessions and billions (soaking 4 1G nics) of packets
>> without a hitch in our LAB, testing out equipment, yet this box, which has a
>> relatively low throughput, is the one that locks up.
>>
>> Any thoughts or suggestions would be appreciated. So far nothing in normal
>> logs so I'm going to turn some additional logging on.
>>
>> James Sparenberg
>>
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> [email protected]
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>> Please read the FAQ at http://lists.us.dell.com/faq
>>
>
--
Greg Gulik http://www.gulik.com/greg/
greg @ gulik.com
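
For anyone comparing driver versions as discussed in this thread (1.9.3 from the stock CentOS kernel versus 1.9.20b from Broadcom), here is a rough sketch of how the check could be scripted. The function names (version_older, link_detected, report_nic) are made up for this example, and it assumes a GNU sort that supports -V, which the coreutils on an older release may not provide. It only reads state with modinfo and ethtool and does not change anything.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of any package.

# Succeeds if version $1 is strictly older than version $2.
# Assumes GNU sort with -V (version sort) is available.
version_older() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Reads `ethtool <iface>` output on stdin and succeeds only if the
# link is reported as up ("Link detected: yes").
link_detected() {
    grep -q 'Link detected: yes'
}

# Prints the bnx2 module version, driver info and link state for one
# interface (default eth0). Needs modinfo and ethtool on the system.
report_nic() {
    iface="${1:-eth0}"
    echo "bnx2 module version: $(modinfo -F version bnx2 2>/dev/null || echo unknown)"
    ethtool -i "$iface"
    if ethtool "$iface" | link_detected; then
        echo "$iface: link up"
    else
        echo "$iface: no link"
    fi
}

# Versions quoted in this thread: stock CentOS module vs. Broadcom's build.
if version_older 1.9.3 1.9.20b; then
    echo "1.9.3 is older than 1.9.20b"
fi
```

Running report_nic eth0 (probably as root) before and after the rpm install should make it easy to see whether the loaded driver actually changed, and running it from cron might help catch the moment an interface drops to "no link".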
