b43/BCM4312 fails with DMA errors
Hi,

I have a Lenovo S-12 netbook, which has the Atom N270 processor and a Broadcom 14e4:4315 wireless chip with low-power PHY. 'lspci -vnn | grep 14e4' gives:

02:00.0 Ethernet controller [0200]: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express [14e4:1713] (rev 02)
03:00.0 Network controller [0280]: Broadcom Corporation BCM4312 802.11b/g [14e4:4315] (rev 01)
Subsystem: Broadcom Corporation Unknown device [14e4:04b5]

As suggested for this chip on the linux wireless b43 howto web page, I am using firmware extracted from broadcom-wl-4.178.10.4.tar.bz2 using the current b43-fwcutter in git.

Wireless works with the wl driver provided by Broadcom in their hybrid-portsrc-x86_32-v5.10.91.9.3.tar.gz package up to and including kernel 2.6.28 (but not afterwards, presumably because of the replacement of ieee80211 by lib80211). However, the b43 driver in both 2.6.32-rc4 and current compat-wireless fails on a cold boot with DMA errors (I use "cold boot" advisedly - see working case 1 for b43 mentioned at the end).

Prior to failure, 'modprobe b43' followed by 'dmesg | egrep "ssb|b43"' gives:

b43-pci-bridge :03:00.0: PCI INT A - GSI 17 (level, low) - IRQ 17
b43-pci-bridge :03:00.0: setting latency timer to 64
ssb: Sonics Silicon Backplane found on PCI device :03:00.0
b43-phy0: Broadcom 4312 WLAN found (core revision 15)
b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
Registered led device: b43-phy0::tx
Registered led device: b43-phy0::rx
Registered led device: b43-phy0::radio

So far it looks normal. However, bringing up the wlan0 interface and attempting to associate shortly afterwards triggers DMA errors, and any further use of the interface then fails.

Sometimes the failure happens as soon as the interface is brought up. Usually I can get as far as successfully scanning for APs with 'iwlist scan wlan0', and sometimes it gets as far as negotiating association with the AP. But it always ends at some point with logging output such as this in /var/log/messages (with b43 debugging switched on):

b43-phy0: Broadcom 4312 WLAN found (core revision 15)
b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
phy0: Selected rate control algorithm 'minstrel'
Registered led device: b43-phy0::tx
Registered led device: b43-phy0::rx
Registered led device: b43-phy0::radio
Broadcom 43xx driver loaded [ Features: PMLS, Firmware-ID: FW13 ]
b43 ssb0:0: firmware: requesting b43/ucode15.fw
b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
b43-phy0 debug: RC calib: Failed to switch to channel 7, error = -5
b43-phy0 debug: Chip initialized
b43-phy0 debug: 64-bit DMA initialized
b43-phy0 debug: QoS enabled
b43-phy0 debug: Wireless interface started
b43-phy0 debug: Adding Interface type 2
ADDRCONF(NETDEV_UP): wlan0: link is not ready
b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x
b43-phy0: Controller RESET (DMA error) ...
b43-phy0 debug: Wireless interface stopped
b43-phy0 debug: DMA-64 rx_ring: Used slots 1/64, Failed frames 0/0 = 0.0%, Average tries 0.00
b43-phy0 debug: DMA-64 tx_ring_AC_BK: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
b43-phy0 debug: DMA-64 tx_ring_AC_BE: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
b43-phy0 debug: DMA-64 tx_ring_AC_VI: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
b43-phy0 debug: DMA-64 tx_ring_AC_VO: Used slots 2/256, Failed frames 0/11 = 0.0%, Average tries 1.00
b43-phy0 debug: DMA-64 tx_ring_mcast: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
b43-phy0 debug: Chip initialized
[ ... and so on ... ]

The DMA error message repeats at approximately 5 second intervals (so filling up /var/log/messages quite nicely after a while).

Two other points which may help locate the problem, describing when b43 WILL work:

1. If I boot up Ubuntu kernel-2.6.27 with the proprietary wl driver, and then do a warm reboot to kernel 2.6.32-rc4, b43 works normally. No DMA errors are reported. This may point to a firmware loading issue, but see 2 below.

2. If I choose the force PIO debugging option, b43 works OK (albeit no doubt not very efficiently).

Chris

___
Bcm43xx-dev mailing list
Bcm43xx-dev@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev
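To put a number on how often the reset loop described above fires, the log can be scanned for the "Fatal DMA error" line. The following is a minimal Python sketch and not part of the original report; the function name summarize_dma_errors and the SAMPLE_LOG constant are hypothetical, and the pattern is only assumed to match the message format as it appears in the excerpt above (the hex fields look truncated in this archived copy, so the pattern accepts any number of hex digits).

```python
import re

# Matches the fatal-DMA line as quoted in the report, for example:
#   b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x
# Only the first status word is captured; digits are optional because
# archived copies of the log may have lost some of them.
FATAL_DMA = re.compile(r"(b43-phy\d+) ERROR: Fatal DMA error: (0x[0-9A-Fa-f]*)")

# A few lines copied from the log excerpt in the report (illustrative input).
SAMPLE_LOG = [
    "b43-phy0 debug: Wireless interface started",
    "b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x",
    "b43-phy0: Controller RESET (DMA error) ...",
    "b43-phy0 debug: Wireless interface stopped",
]


def summarize_dma_errors(lines):
    """Count fatal DMA errors per PHY and list the first status word of each."""
    counts = {}
    first_words = []
    for line in lines:
        match = FATAL_DMA.search(line)
        if match is None:
            continue  # not a fatal-DMA line
        phy, word = match.groups()
        counts[phy] = counts.get(phy, 0) + 1
        first_words.append(word)
    return counts, first_words


if __name__ == "__main__":
    counts, words = summarize_dma_errors(SAMPLE_LOG)
    print("errors per PHY:", counts)
    print("first status words:", words)
```

Because the function accepts any iterable of lines, an open file object for /var/log/messages can be passed in directly instead of SAMPLE_LOG.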
Re: b43/BCM4312 fails with DMA errors
Hi Chris --

Chris Vine ch...@cvine.freeserve.co.uk wrote:
> [...]
> the b43 driver in both 2.6.32-rc4 and current compat-wireless fails on a cold boot with dma errors
> [...]
> ADDRCONF(NETDEV_UP): wlan0: link is not ready
> b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x

I had 0x0400, rather than 0x0800.

> b43-phy0: Controller RESET (DMA error) ...
> [...]
> Two other points which may help locate the problem, when b43 WILL work:
> 1. If I boot up ubuntu kernel-2.6.27 with the proprietary wl driver, and then do a warm reboot to kernel 2.6.32-rc4, b43 works normally. No DMA errors are reported. This may point to a firmware loading issue, but see 2 below.

I had similar problems, and was advised to try the full wireless-testing kernel, which worked OK for me (compat-wireless with Fedora 2.6.31 kernel did *not* work).
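The two status words mentioned in this exchange, 0x0800 and 0x0400, each have exactly one bit set, and the bits are adjacent. The sketch below (an illustration added here, not something posted in the thread) prints the bit positions; set_bits and REPORTED are hypothetical names. The values are taken as they appear in the archived messages, which may have lost leading zeros, but leading zeros do not change bit positions. What each bit signifies is not stated in the thread and would have to be checked against the b43 driver source.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# First status word of the fatal DMA error, as quoted by each poster.
REPORTED = {
    "original report": 0x0800,
    "reply": 0x0400,
}

if __name__ == "__main__":
    for label, word in REPORTED.items():
        print(f"{label}: {word:#06x} has bits {set_bits(word)} set")
```

Since the two reports differ only in which single bit is raised, they may be two variants of the same class of failure rather than the same event, which is worth keeping in mind when comparing logs from different machines.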
Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem
On 10/14/2009 03:06 AM, Michael Buesch wrote:
> On Wednesday 14 October 2009 03:25:30 Larry Finger wrote:
> > Commit 93bad2b757586fb153ef73b028953a8dcaccde77 entitled "b43: Fix PPC crash in rfkill polling on unload" fixed the bug reported in Bugzilla No. 14181; however, it introduced a new bug. Whenever the radio switch was turned off, it was necessary to unload and reload the driver for it to recognize the switch again. I believe this patch fixes the original problem without introducing any new problems.
> >
> > Signed-off-by: Larry Finger larry.fin...@lwfinger.net
> > ---

John,

As Michael correctly points out, this patch substitutes one bug for another. The current bug affects every bcm43xx device with an rfkill switch except BCM4306/3, and the new bug only affects BCM4306/3 users with a kill switch. As the latter group may be the empty set, I think the trade-off is worth it.

An additional complication is that I do not have the hardware to test the PPC faults. The OP of Bugzilla #14181 has been helpful; however, if it takes several tries to get a fix, we might miss the 2.6.32 release, which would introduce a significant regression.

For the above reasons, I am suggesting that this patch be accepted and pushed to mainline even though it has faults.

Thanks,

Larry
Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem
On Fri, Oct 16, 2009 at 08:42:08AM -0500, Larry Finger wrote:
> On 10/14/2009 03:06 AM, Michael Buesch wrote:
> > On Wednesday 14 October 2009 03:25:30 Larry Finger wrote:
> > > [...]
>
> As Michael correctly points out, this patch substitutes one bug for another. The current bug affects every bcm43xx device with an rfkill switch except BCM4306/3, and the new bug only affects BCM4306/3 users with a kill switch. As the latter group may be the empty set, I think the trade-off is worth it.
>
> An additional complication is that I do not have the hardware to test the PPC faults. The OP of Bugzilla #14181 has been helpful; however, if it takes several tries to get a fix, we might miss the 2.6.32 release, which would introduce a significant regression.
>
> For the above reasons, I am suggesting that this patch be accepted and pushed to mainline even though it has faults.

Well, hmmm...ok, we have two or three problems here... :-)

One is whether or not to take this patch. Normally it is against policy or whatnot to trade one bug for another. In this case, it seems we would fix a real bug in exchange for a theoretical bug that we believe no one actually has. Is that the case? If so, that might be acceptable.

The other problem is a work/patch flow issue. I have occasionally (some would say too often) snagged a patch directly from this list. But in general I have waited for Michael to repost the patches to linux-wireless before merging them. As such, I'm unaccustomed to collecting patches from here.

In any event, most patches should be posted to linux-wireless for wider review before merging. That would normally be the maintainer's job, but we are effectively without one for b43 now. I don't suppose anyone wants to stand up?

The third problem (related to the second) is that I missed the original post, so if you don't mind I'd like you to resend it (to linux-wireless)! :-)

Thanks,

John
--
John W. Linville                Someday the world will need a hero, and you
linvi...@tuxdriver.com          might be all we have. Be ready.
Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem
On Friday 16 October 2009 16:44:14 John W. Linville wrote:
> On Fri, Oct 16, 2009 at 08:42:08AM -0500, Larry Finger wrote:
> > [...]
> > For the above reasons, I am suggesting that this patch be accepted and pushed to mainline even though it has faults.
>
> Well, hmmm...ok, we have two or three problems here... :-)
>
> One is whether or not to take this patch. Normally it is against policy or whatnot to trade one bug for another. In this case, it seems we would fix a real bug in exchange for a theoretical bug that we believe no one actually has. Is that the case? If so, that might be acceptable.

The patch reduces the number of affected users, so it's probably OK. I can't say if this reduces it to zero, though.

The real fix for all this crapola is to rewrite the b43 init to never shut down the device (or fix rfkill to never shut down the device completely and instead add an rfkill hook. But that's not going to happen, because people don't like it. Although keeping the wireless device up just for monitoring a stupid pushbutton is not acceptable behavior to me.)

> The other problem is a work/patch flow issue. I have occasionally (some would say too often) snagged a patch directly from this list.

I suggest we move as much stuff as possible _off_ this list. It's horribly unreliable, it mangles messages and it's slow. So it's a good idea to add CC linux-wireless list on the very first reply.

I think, however, it's always a good idea to resend a patch after the discussion has calmed down. It's way easier for the upstream maintainer.

--
Greetings, Michael.
Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem
On 10/16/2009 09:44 AM, John W. Linville wrote:
> On Fri, Oct 16, 2009 at 08:42:08AM -0500, Larry Finger wrote:
> > [...]
> > For the above reasons, I am suggesting that this patch be accepted and pushed to mainline even though it has faults.
>
> Well, hmmm...ok, we have two or three problems here... :-)
>
> One is whether or not to take this patch. Normally it is against policy or whatnot to trade one bug for another. In this case, it seems we would fix a real bug in exchange for a theoretical bug that we believe no one actually has. Is that the case? If so, that might be acceptable.

Yes, I believe that to be the case.

> The other problem is a work/patch flow issue. I have occasionally (some would say too often) snagged a patch directly from this list. But in general I have waited for Michael to repost the patches to linux-wireless before merging them. As such, I'm unaccustomed to collecting patches from here.
>
> In any event, most patches should be posted to linux-wireless for wider review before merging. That would normally be the maintainer's job, but we are effectively without one for b43 now. I don't suppose anyone wants to stand up?

I would, but my RE work precludes that.

> The third problem (related to the second) is that I missed the original post, so if you don't mind I'd like you to resend it (to linux-wireless)! :-)

Will do.

Larry
Re: b43/BCM4312 fails with DMA errors
On Fri, 16 Oct 2009 9:45:40 -0400 fre...@carolina.rr.com wrote:
> 2.6.32-rc? and wireless-compat was thought to work as well as wireless-testing ... hmmm

Ah right. I had been trying 2.6.32-rc4 by itself and compat-wireless with 2.6.31. If I build compat-wireless on top of 2.6.32-rc4, the DMA errors appear to go away, but the encryption modules won't load, so it is not usable.

I think I will try again in a few months' time when things have settled down a bit. (Or possibly pull wireless-testing from git once 2.6.32 comes out.)

Chris
Re: b43/BCM4312 fails with DMA errors
On Fri, Oct 16, 2009 at 6:01 PM, Chris Vine ch...@cvine.freeserve.co.uk wrote:
> On Fri, 16 Oct 2009 9:45:40 -0400 fre...@carolina.rr.com wrote:
> > 2.6.32-rc? and wireless-compat was thought to work as well as wireless-testing ... hmmm
>
> Ah right. I had been trying 2.6.32-rc4 by itself and compat-wireless with 2.6.31. If I build compat-wireless on top of 2.6.32-rc4 the dma errors appear to go away but the encryption modules won't load so it is not usable.
>
> I think I will try again in a few months time when things have settled down a bit. (Or possibly pull wireless-testing from git once 2.6.32 comes out.)
>
> Chris

Could you please try the real wireless-testing tree (as opposed to compat-wireless)?

Also, what I noticed is that everyone with this problem appears to have an Intel Atom CPU... weird. Maybe something odd is going on with the Atom chipset's DMA handling.

--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)