On Thu, Nov 12, 2009 at 10:36:16AM -0900, Royce Williams wrote:
> We have servers with dual 82573 NICs that work well during low-throughput
> activity, but during high-volume activity, they pause shortly after transfers
> start and do not recover. Other sessions to the system are not affected.
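(For anyone wanting to watch such a stall from the FreeBSD side as it happens, a rough sketch follows; the interface name em0 is an assumption -- substitute whichever em(4) device carries the transfer. Writing to the em debug/stats sysctls shown later in this thread makes the driver dump its state to the console/dmesg, which is useful to capture once the pause starts.)

```shell
# Per-second interface counters while reproducing the transfer
# (interface name em0 is an assumption; adjust for your box):
netstat -w 1 -I em0

# Once the pause begins, ask the em(4) driver to dump debug state
# and hardware statistics to the kernel message buffer:
sysctl dev.em.0.debug=1
sysctl dev.em.0.stats=1
dmesg | tail -n 40
```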
Please define "low-throughput" and "high-volume" if you could; it might
help folks determine where the threshold is for problems.

> These systems are being repurposed, jumping from 6.3 to 7.2. The same system
> and its kin do not exhibit the symptom under 6.3-RELEASE-p13. The symptoms
> appear under a freebsd-updated 7.2-RELEASE GENERIC kernel with no tuning.
>
> Previously, we've been using DCGDIS.EXE (from Jack Vogel) for this symptom.
> The first system to be repurposed accepts DCGDIS with 'Updated' and a
> subsequent 'update not needed', with no relief.
>
> Notably, there are no watchdog timeout errors -- unlike our various Supermicro
> models still running FreeBSD 6.x. All of our other 7.x Supermicro flavors
> had already received the flash update and haven't shown the symptom.
>
> Details follow.
>
> Kernel:
>
> rand# uname -a
> FreeBSD rand.acsalaska.net 7.2-RELEASE-p4 FreeBSD 7.2-RELEASE-p4 #0: Fri Oct
> 2 12:21:39 UTC 2009
> [email protected]:/usr/obj/usr/src/sys/GENERIC i386
>
> sysctls:
>
> rand# sysctl dev.em
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> dev.em.0.%driver: em
> dev.em.0.%location: slot=0 function=0
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9
>   subdevice=0x108c class=0x020000
> dev.em.0.%parent: pci13
> dev.em.0.debug: -1
> dev.em.0.stats: -1
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 66
> dev.em.0.tx_abs_int_delay: 66
> dev.em.0.rx_processing_limit: 100
> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> dev.em.1.%driver: em
> dev.em.1.%location: slot=0 function=0
> dev.em.1.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9
>   subdevice=0x108c class=0x020000
> dev.em.1.%parent: pci14
> dev.em.1.debug: -1
> dev.em.1.stats: -1
> dev.em.1.rx_int_delay: 0
> dev.em.1.tx_int_delay: 66
> dev.em.1.rx_abs_int_delay: 66
> dev.em.1.tx_abs_int_delay: 66
> dev.em.1.rx_processing_limit: 100
>
> kenv:
>
> rand# kenv | grep smbios | egrep -v 'socket|serial|uuid|tag|0123456789'
> smbios.bios.reldate="03/05/2008"
> smbios.bios.vendor="Phoenix Technologies LTD"
> smbios.bios.version="6.00"
> smbios.chassis.maker="Supermicro"
> smbios.planar.maker="Supermicro"
> smbios.planar.product="PDSMi "
> smbios.planar.version="PCB Version"
> smbios.system.maker="Supermicro"
> smbios.system.product="PDSMi"
>
> The system is not yet in production, so I can invasively abuse it if needed.
> The other systems are in production under 6.3-RELEASE-p13 and can also be
> inspected.
>
> Any pointers appreciated.
>
> Royce

For what it's worth, as a comparison base: we use the following Supermicro
SuperServers, and can confirm that no such issues occur for us using
RELENG_6 or RELENG_7 on this hardware:

Supermicro SuperServer 5015B-MTB  - amd64 - Intel 82573V + Intel 82573L
Supermicro SuperServer 5015M-T+B  - amd64 - Intel 82573V + Intel 82573L
Supermicro SuperServer 5015M-T+B  - amd64 - Intel 82573V + Intel 82573L
Supermicro SuperServer 5015M-T+B  - i386  - Intel 82573V + Intel 82573L
Supermicro SuperServer 5015M-T+B  - i386  - Intel 82573V + Intel 82573L

The 5015B-MTB system presently runs RELENG_8 -- no issues there either.

Relevant server configuration and network setup details:

- All machines use pf(4).
- All emX devices are configured for autoneg.
- All emX devices use RXCSUM, TXCSUM, and TSO4.
- We do not use polling.
- All machines use both NICs simultaneously at all times.
- All machines are connected to an HP ProCurve 2626 switch (100mbit,
  full-duplex ports, all autoneg).
- We do not use jumbo frames.
- No add-in cards (PCI, PCI-X, or PCIe) are used in the systems.
- All of the systems had DCGDIS.EXE run on them; no EEPROM settings were
  changed, indicating the from-the-Intel-factory MANC register in question
  was set properly.

Relevant throughput details per box:

- em0 pushes ~600-1000kbit/sec at all times.
- em1 pushes ~100-200kbit/sec at all times.
- During nightly maintenance (backups), em1 pushes ~2-3mbit/sec for a
  variable amount of time.
- For a full level 0 backup (which I've done numerous times), em1 pushes
  60-70mbit/sec without issues.

I've compared your "sysctl dev.em" output to that of our 5015M-T+B systems
(which use the PDSMi+, not the PDSMi, but close enough), and ours is 100%
identical.

All of our 5015M-T+B systems are using BIOS 1.3, and the 5015B-MTB system
is using BIOS 1.30.

If you'd like, I can provide the exact BIOS settings we use on the machines
in question. They do deviate slightly from the factory defaults, but none
of the adjustments are "tweaks" for performance or otherwise (we just
disable things we don't use, etc.).

-- 
| Jeremy Chadwick                                [email protected] |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
