Ray Lee wrote:
> First off, thanks for all your help.
>
> Second off,
>
> On 11/16/06, Larry Finger <[EMAIL PROTECTED]> wrote:
>> Ray Lee wrote:
>> >
>> > If I could figure out a way to make it repeatable, I'd happily do a
>> > blind bisect.
> [...]
>> > I'm open to suggestions on how to make the problem trigger more than
>> > once every two days...
>>
>> I don't know what might be causing the lock problems. I'm more
>> concerned with the NETDEV WATCHDOG timeouts. AFAIK, you are the only
>> one still reporting this error. On my system, I get an occasional MAC
>> suspend failure, sometimes followed by a BCM43xx_IRQ_XMIT_ERROR.
>
> Last time I had trouble with 2.6.18-rcX, I wasn't the only one, just
> the only one reporting it. Can you tell me why reverting the likely
> culprit isn't an option? rc6 is out, and Linus is really pushing to
> finalize 2.6.19 here soon.
>
>> From what I read in your post, the timeouts happen a lot more often
>> than once every two days. Once we get those fixed, then we can
>> concentrate on the locking.
>
> It's becoming clear that I wasn't so clear :-). No, it doesn't happen
> more than once every two (three, now) days. I'm saying that it's only
> happened twice, as once the first timeout message starts, the timeouts
> don't stop short of a reboot.
>
> Or, in other words, it happened occasionally under 2.6.19-rc3, but
> fixed itself. Under 2.6.19-rc5, it's happened less frequently (maybe),
> but once it starts, it goes on solid until I reboot the computer.
> Until I reboot, the laptop is fully unusable as things start hanging
> on the rtnl_lock (X, apparently).
>
> Please see http://madrabbit.org/~ray/messages.gz for the
> /var/log/messages to understand what I mean by that. (Though, that was
> captured before I'd rebuilt the module with debugging, unfortunately.
> Regardless, it may help clarify what I mean here.)
>
> So all the NETDEV WATCHDOG timeouts other than the first (of each of
> the two events) appear to be bogus, or side effects of rtnl_lock being
> held after the first time, and not clearing out.
>
> <thinks...> Maybe I've got the culprit backward here. Perhaps
> something else in my system is locking on rtnl_lock, and bcm43xx can't
> acquire it? Could the NETDEV WATCHDOG timeouts be a side effect of
> someone acquiring and not releasing the rtnl_lock()? Is that possible?
> (ie, would it cause the effect I'm seeing?)

It certainly could. Please remove the new line in the hunk below for
drivers/net/wireless/bcm43xx/bcm43xx_main.c:

@@ -3569,6 +3586,7 @@ int bcm43xx_select_wireless_core(struct
 	bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
 	bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr));
 	bcm43xx_security_init(bcm);
+	drain_txstatus_queue(bcm);
 	ieee80211softmac_start(bcm->net_dev);

This will effectively remove _ALL_ bcm43xx patches between 2.6.19-rc3 and
-rc6. If the rtnl_lock hangs still occur, bcm43xx is not causing them. The
other patches do not come into play on your system.

Larry