Re: watchdog timeout panic in e1000 driver

Kenzo Iwami Wed, 25 Oct 2006 06:42:51 -0700

Hi,

>> This problem originally occurred in a very large cluster system using snmp
>> for server management. About two servers panicked each day. The program I
>> sent is to reproduce this problem in a very short time. It does occur under
>> normal load when there is a lot of servers.
> 
> hmm, not good - does your snmp daemon use ethtool excessively? That would
> certainly be painful to the driver (any driver!).


I only looked at the panic message after the problem occurred.
From it I could tell that the panic happened while the snmp daemon was
processing an ethtool ioctl, but I don't know how often the ioctl was called.
It should not have been called excessively, though, because the panic occurred
on a production system while it was idle.

> Anyway as I said in the same e-mail, we're working on reducing the lock
> timeout to a reasonable time. This will unfortunately take some time, as we
> need to change some major components in the driver to make sure this doesn't
> happen.

How about the following approach?
If acquiring the semaphore fails inside the interrupt handler, the handler
gives up immediately instead of waiting for the timeout.
However, I don't know whether this would affect other processes.
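
To make the idea concrete, here is a rough userspace sketch of what I have in
mind. It is not taken from the e1000 source: the names hw_sem, sem_acquire,
sem_release and SEM_POLL_ATTEMPTS are made up for illustration, the in_irq
flag stands in for however the driver would detect interrupt context, and a
C11 atomic_flag stands in for the hardware semaphore.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hardware semaphore (not an e1000 register). */
static atomic_flag hw_sem = ATOMIC_FLAG_INIT;

/* Assumed polling budget for process context; the value is illustrative. */
#define SEM_POLL_ATTEMPTS 1000

/* Single attempt: returns true if the semaphore was free and is now ours. */
static bool sem_try_acquire(void)
{
    return !atomic_flag_test_and_set(&hw_sem);
}

/* Give the semaphore back. */
static void sem_release(void)
{
    atomic_flag_clear(&hw_sem);
}

/*
 * Returns 0 on success, -1 on failure.
 * in_irq is a hypothetical parameter: when true, only one attempt is made,
 * so the interrupt path never sits in the polling loop.
 */
static int sem_acquire(bool in_irq)
{
    int attempts = in_irq ? 1 : SEM_POLL_ATTEMPTS;

    for (int i = 0; i < attempts; i++) {
        if (sem_try_acquire())
            return 0;
        /* A real driver would delay briefly here between polls. */
    }
    return -1;
}
```

With this shape, a caller in the interrupt path has to treat -1 as "skip the
work this time" and rely on a later invocation, which is the part I am not
sure is safe for other users of the semaphore.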

-- 
  Kenzo Iwami ([EMAIL PROTECTED])
