Ray Lee wrote:
First off, thanks for all your help.

Second off,

On 11/16/06, Larry Finger <[EMAIL PROTECTED]> wrote:
Ray Lee wrote:
>
> If I could figure out a way to make it repeatable, I'd happily do a blind
> bisect.
[...]
> I'm open to suggestions on how to make the problem trigger more than once
> every two days...

I don't know what might be causing the lock problems. I'm more concerned with the NETDEV WATCHDOG timeouts. AFAIK, you are the only one still reporting this error. On my system, I get an occasional
MAC suspend failure, sometimes followed by a BCM43xx_IRQ_XMIT_ERROR.

Last time I had trouble with 2.6.18-rcX, I wasn't the only one, just
the only one reporting it. Can you tell me why reverting the likely
culprit isn't an option? rc6 is out, and Linus is really pushing to
finalize 2.6.19 here soon.

From what I read in your post, the timeouts happen a lot more often than once every two days. Once
we get those fixed, then we can concentrate on the locking.

It's becoming clear that I wasn't so clear :-). No, it doesn't happen
more than once every two (three, now) days. I'm saying that it's only
happened twice, as once the first timeout message starts, the timeouts
don't stop short of a reboot.

Or, in other words, it happened occasionally under 2.6.19-rc3, but
fixed itself. Under 2.6.19-rc5, it's happened less frequently (maybe),
but once it starts, it goes on solid until I reboot the computer.
Until I reboot, the laptop is fully unusable as things start hanging
on the rtnl_lock (X, apparently).

Please see http://madrabbit.org/~ray/messages.gz for the
/var/log/messages to understand what I mean by that. (Though, that was
captured before I'd rebuilt the module with debugging, unfortunately.
Regardless, it may help clarify what I mean here.)

So all the NETDEV WATCHDOG timeouts other than the first (of each of
the two events) appear to be bogus, or side effects of rtnl_lock being
held after the first time, and not clearing out.

<thinks...> Maybe I've got the culprit backward here. Perhaps
something else in my system is locking on rtnl_lock, and bcm43xx can't
acquire it? Could the NETDEV WATCHDOG timeouts be a side effect of
someone acquiring and not releasing the rtnl_lock()? Is that possible?
(ie, would it cause the effect I'm seeing?)
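
To make the hypothesis concrete, here is a rough userspace analogy, not kernel code: it does not model how rtnl_lock or the NETDEV WATCHDOG are actually implemented. The names big_lock, lock_with_timeout and count_timeouts are made up for illustration. It only shows the general pattern I'm asking about: once one party takes a shared lock and never lets go, every later attempt by anyone else times out, over and over, until the holder goes away.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a global lock such as rtnl_lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

struct waiter_args {
    int attempts;   /* how many times the waiter tries to take the lock */
    int timeouts;   /* how many of those attempts timed out */
};

/* Try to take big_lock, giving up after timeout_ms milliseconds.
 * Returns 0 on success or ETIMEDOUT if the lock stayed held. */
static int lock_with_timeout(long timeout_ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(&big_lock, &deadline);
}

/* The "victim": repeatedly needs the lock and counts each timeout. */
static void *waiter(void *p)
{
    struct waiter_args *a = p;

    for (int i = 0; i < a->attempts; i++) {
        int rc = lock_with_timeout(20);

        if (rc == ETIMEDOUT)
            a->timeouts++;
        else if (rc == 0)
            pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* If holder_stuck is nonzero, the calling thread plays the culprit and
 * keeps big_lock held for the waiter's whole lifetime. Returns the
 * number of timeouts the waiter observed. */
int count_timeouts(int attempts, int holder_stuck)
{
    struct waiter_args a = { attempts, 0 };
    pthread_t t;
    int rc;

    if (holder_stuck) {
        rc = pthread_mutex_lock(&big_lock);
        assert(rc == 0);
    }
    rc = pthread_create(&t, NULL, waiter, &a);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
    if (holder_stuck)
        pthread_mutex_unlock(&big_lock);
    return a.timeouts;
}
```

Calling count_timeouts(3, 1) makes every one of the three attempts time out, while count_timeouts(3, 0) sees none. That matches the shape of what I see: one initial event, then a solid stream of timeouts that only a reboot clears.
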

It certainly could. Please remove the new line in the hunk below for drivers/net/wireless/bcm43xx/bcm43xx_main.c:

@@ -3569,6 +3586,7 @@ int bcm43xx_select_wireless_core(struct
        bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
        bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr));
        bcm43xx_security_init(bcm);
+       drain_txstatus_queue(bcm);
        ieee80211softmac_start(bcm->net_dev);

This will effectively remove _ALL_ bcm43xx patches between 2.6.19-rc3 and -rc6. If the rtnl_lock hangs still occur, bcm43xx is not causing them. The other patches are not involved for your system.

Larry