No success or failure reports on this one??
Remember, without a response, this issue will never be fixed.


On Thursday 19 October 2006 21:38, Michael Buesch wrote:
> Hi,
> 
> I think I have found out why we get a watchdog timeout now
> and then.
> After a lot of thinking and testing, I found that this bug
> only triggers when there is no network traffic (at least for me).
> That is kind of strange, and I think a possible reason for it is
> the following race.
> 
> 
> |---5secs - ~10 jiffies time---|---|OOPS
> ^                              ^
> last real TX                   periodic work stops netif
> 
> At OOPS, the following happens:
> The watchdog timer triggers, because the timeout of 5secs
> is over. The watchdog first checks for stopped TX.
> _Usually_ TX is only stopped from the TX handler to indicate
> a full TX queue. But this case is different: the periodic work
> needs to stop TX regardless of the TX queue state. So the watchdog
> sees the stopped device and assumes it is stopped due to full
> TX queues (which is a _wrong_ assumption in this case). It then
> checks how long ago the last TX was. If that is more than
> 5secs (which is the case for low or no traffic), it will fire
> a TX timeout.
> 
> I think the correct solution for this is to fake a TX start
> on every periodic work execution. This fake is harmless and
> prevents the watchdog from triggering. At least here in my testsuite. :)
> 
> Please test this, guys.
> 
> This patch is against 2.6.18.1 (and not 2.6.18, as the diff prolog suggests)
> 
> 
> Index: linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c
> ===================================================================
> --- linux-2.6.18.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c     2006-10-19 21:30:42.000000000 +0200
> +++ linux-2.6.18/drivers/net/wireless/bcm43xx/bcm43xx_main.c  2006-10-19 21:33:28.000000000 +0200
> @@ -3165,7 +3165,15 @@ static void bcm43xx_periodic_work_handle
>  
>       badness = estimate_periodic_work_badness(bcm->periodic_state);
>       mutex_lock(&bcm->mutex);
> +
> +     /* We must fake a started transmission here, as we are going to
> +      * disable TX. If we wouldn't fake a TX, it would be possible to
> +      * trigger the netdev watchdog, if the last real TX is already
> +      * some time in the past (slightly less than 5secs)
> +      */
> +     bcm->net_dev->trans_start = jiffies;
>       netif_tx_disable(bcm->net_dev);
> +
>       spin_lock_irqsave(&bcm->irq_lock, flags);
>       if (badness > BADNESS_LIMIT) {
>               /* Periodic work will take a long time, so we want it to
> 
> 
> 
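For anyone who wants to reason about the race without the hardware, here
is a minimal user-space sketch of it. Everything in it is an assumption
made for illustration: the struct and the model_* names are hypothetical
and not kernel API, and the check only approximates what the netdev
watchdog is described as doing above (fire when TX is stopped and the
last TX start is older than the timeout).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the few net_device fields involved in the race. */
struct model_dev {
	bool tx_stopped;              /* stands in for a stopped TX queue */
	unsigned long trans_start;    /* tick of the last (real or faked) TX start */
	unsigned long watchdog_timeo; /* watchdog timeout, in ticks */
};

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* The watchdog assumes a stopped queue means "TX queue full" and only
 * looks at how old the last TX start is. */
static bool model_watchdog_fires(const struct model_dev *dev, unsigned long now)
{
	assert(dev != NULL);
	return dev->tx_stopped &&
	       model_time_after(now, dev->trans_start + dev->watchdog_timeo);
}

/* Start of periodic work: TX is stopped regardless of queue state.
 * With fake_tx_start set, trans_start is refreshed first (the proposed fix). */
static void model_periodic_work_begin(struct model_dev *dev, unsigned long now,
				      bool fake_tx_start)
{
	assert(dev != NULL);
	if (fake_tx_start)
		dev->trans_start = now;
	dev->tx_stopped = true;
}

/* End of periodic work: TX is allowed again. */
static void model_periodic_work_end(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->tx_stopped = false;
}
```

With a timeout of 5000 ticks, a last real TX at tick 0 and periodic work
starting at tick 4990, the model watchdog fires at tick 5001 when
fake_tx_start is false and stays quiet when it is true.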

-- 
Greetings Michael.