Jeff Garzik wrote:
> Brice Goglin wrote:
>> Limit the number of recoveries from a NIC hw watchdog reset to 1 by
>> default.
>> It enables detection of defective NICs immediately since these memory
>> parity errors are expected to happen very rarely (less than once per
>> century*NIC).
>> However, a defective NIC (very rare, fortunately) can see such an error
>> quite often, ie. every few minutes under high load.
>>
>> Make the limit tunable to allow people with mission critical
>> installations to crank up the tunable and recover an INTMAX number of
>> times while waiting for a downtime window to replace the NIC. The
>> performance won't be optimal, but at least, it will still work.
>>
>> Signed-off-by: Brice Goglin <[EMAIL PROTECTED]>
>> ---
>>  drivers/net/myri10ge/myri10ge.c |   15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> NAK.

OK. Then please apply the following patch, which limits the number of recoveries to 1 without making it tunable. It will at least enable detection of bad NICs.

Brice