b43/BCM4312 fails with DMA errors

2009-10-16 Thread Chris Vine
Hi,

I have a Lenovo S-12 Netbook, which has the Atom N270 processor and a
Broadcom 14e4:4315 wireless chip with low power PHY.  lspci -vnn | grep
14e4 gives:

  02:00.0 Ethernet controller [0200]: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express [14e4:1713] (rev 02)
  03:00.0 Network controller [0280]: Broadcom Corporation BCM4312 802.11b/g [14e4:4315] (rev 01)
  Subsystem: Broadcom Corporation Unknown device [14e4:04b5]
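
For anyone who wants to pick the Broadcom entries out of lspci output in a script, a minimal Python sketch might look like the following. It assumes `lspci -nn` style device lines without a PCI domain prefix, and the function name find_devices is made up for illustration.

```python
import re

# Matches the "[vendor:device]" pair that lspci -nn prints, e.g. [14e4:4315].
# The class code, e.g. [0200], has no colon and is therefore not matched.
PCI_ID = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]")
# Device lines start with a slot such as "03:00.0"; detail lines are indented.
SLOT = re.compile(r"([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s")


def find_devices(lspci_text, vendor="14e4"):
    """Return (slot, device_id) for each device line from the given vendor."""
    found = []
    for line in lspci_text.splitlines():
        slot = SLOT.match(line)
        if not slot:
            continue  # skip Subsystem: and other indented detail lines
        for ven, dev in PCI_ID.findall(line):
            if ven == vendor:
                found.append((slot.group(1), dev))
    return found
```

Fed the unwrapped lspci lines above, this should report 1713 for slot 02:00.0 and 4315 for slot 03:00.0.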

As suggested for this chip on the linux wireless b43 howto web page, I
am using firmware extracted from broadcom-wl-4.178.10.4.tar.bz2 using
the current b43-fwcutter in git.

Wireless works with the wl driver that Broadcom provides in their
hybrid-portsrc-x86_32-v5.10.91.9.3.tar.gz package, up to and including
kernel 2.6.28 (but not afterwards, presumably because of the
replacement of ieee80211 by lib80211).  However, the b43 driver in both
2.6.32-rc4 and current compat-wireless fails on a cold boot with DMA
errors (I use cold boot advisedly - see the working case 1 for b43
mentioned at the end).

Prior to failure, 'modprobe b43 && dmesg | egrep "ssb|b43"' gives:

  b43-pci-bridge :03:00.0: PCI INT A - GSI 17 (level, low) - IRQ 17
  b43-pci-bridge :03:00.0: setting latency timer to 64
  ssb: Sonics Silicon Backplane found on PCI device :03:00.0
  b43-phy0: Broadcom 4312 WLAN found (core revision 15)
  b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
  b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
  Registered led device: b43-phy0::tx
  Registered led device: b43-phy0::rx
  Registered led device: b43-phy0::radio

So far it looks normal.  However, bringing up the wlan0 interface and
attempting to associate will shortly afterwards trigger DMA errors, and
any further use of the interface thereafter will fail.  Sometimes
failure happens as soon as the interface is brought up; usually I can
get as far as successfully scanning for APs with 'iwlist wlan0 scan',
and sometimes it gets as far as negotiating association with the AP.
But it always ends at some point with logging output such as this
in /var/log/messages (with b43 debugging switched on):

  b43-phy0: Broadcom 4312 WLAN found (core revision 15)
  b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
  b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
  phy0: Selected rate control algorithm 'minstrel'
  Registered led device: b43-phy0::tx
  Registered led device: b43-phy0::rx
  Registered led device: b43-phy0::radio
  Broadcom 43xx driver loaded [ Features: PMLS, Firmware-ID: FW13 ]
  b43 ssb0:0: firmware: requesting b43/ucode15.fw
  b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
  b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
  b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
  b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
  b43-phy0 debug: RC calib: Failed to switch to channel 7, error = -5
  b43-phy0 debug: Chip initialized
  b43-phy0 debug: 64-bit DMA initialized
  b43-phy0 debug: QoS enabled
  b43-phy0 debug: Wireless interface started
  b43-phy0 debug: Adding Interface type 2
  ADDRCONF(NETDEV_UP): wlan0: link is not ready
  b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x
  b43-phy0: Controller RESET (DMA error) ...
  b43-phy0 debug: Wireless interface stopped
  b43-phy0 debug: DMA-64 rx_ring: Used slots 1/64, Failed frames 0/0 = 0.0%, Average tries 0.00
  b43-phy0 debug: DMA-64 tx_ring_AC_BK: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
  b43-phy0 debug: DMA-64 tx_ring_AC_BE: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
  b43-phy0 debug: DMA-64 tx_ring_AC_VI: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
  b43-phy0 debug: DMA-64 tx_ring_AC_VO: Used slots 2/256, Failed frames 0/11 = 0.0%, Average tries 1.00
  b43-phy0 debug: DMA-64 tx_ring_mcast: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
  b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
  b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
  b43-phy0 debug: Chip initialized
  [ ... and so on ... ]

The DMA error message repeats at approximately 5 second intervals (so
filling up /var/log/messages quite nicely after a while).
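
To see how often the error fires and which reason bits are set, a rough Python sketch like this could be run over the log text. It assumes the first hex word after "Fatal DMA error:" is the reason value for the first DMA channel (an assumption on my part, not checked against the driver source), and the name summarize_dma_errors is hypothetical.

```python
import re
from collections import Counter

# Capture the phy name and the first hex word after "Fatal DMA error:".
# Assumption: that first word is the reason value for the first DMA channel;
# the remaining words on the line are ignored here.
FATAL = re.compile(r"b43-(phy\d+) ERROR: Fatal DMA error: (0x[0-9A-Fa-f]+)")


def summarize_dma_errors(log_text):
    """Count fatal DMA errors per (phy, reason) and list the bits that are set."""
    counts = Counter()
    for phy, word in FATAL.findall(log_text):
        counts[(phy, int(word, 16))] += 1
    report = []
    for (phy, reason), n in sorted(counts.items()):
        bits = [b for b in range(32) if reason & (1 << b)]
        report.append((phy, hex(reason), bits, n))
    return report
```

With the log above, the reason 0x0800 decodes to bit 11 alone; a report of 0x0400 would decode to bit 10.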

Two other points which may help locate the problem, when b43 WILL work:

1.  If I boot up ubuntu kernel-2.6.27 with the proprietary wl driver,
and then do a warm reboot to kernel 2.6.32-rc4, b43 works normally.
No DMA errors are reported.  This may point to a firmware loading issue,
but see 2 below.

2.  If I choose the force PIO debugging option b43 works OK (albeit no
doubt not very efficiently).
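
To confirm which of these options a given kernel was built with, a small Python sketch along these lines could check the kernel's .config text. The symbol name CONFIG_B43_FORCE_PIO is my assumption for the force PIO debugging option, and config_enabled is a hypothetical helper.

```python
def config_enabled(config_text, symbol):
    """Return True if a kernel .config sets the symbol to y or m."""
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments, not settings.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == symbol:
            return value in ("y", "m")
    return False


# Example use; the path is an assumption and varies by distribution:
# with open("/boot/config-2.6.32-rc4") as f:
#     print(config_enabled(f.read(), "CONFIG_B43_FORCE_PIO"))
```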

Chris


___
Bcm43xx-dev mailing list
Bcm43xx-dev@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev


Re: b43/BCM4312 fails with DMA errors

2009-10-16 Thread fred99

Hi Chris --

 Chris Vine ch...@cvine.freeserve.co.uk wrote: 
>   b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x

I had 0x0400, rather than 0x0800

> 1.  If I boot up ubuntu kernel-2.6.27 with the proprietary wl driver,
> and then do a warm reboot to kernel 2.6.32-rc4, b43 works normally.
> No DMA errors are reported.  This may point to a firmware loading issue,
> but see 2 below.

I had similar problems, and was advised to try the full wireless-testing kernel,
which worked OK for me (compat-wireless with Fedora 2.6.31 kernel did *not* 
work).


Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem

2009-10-16 Thread Larry Finger
On 10/14/2009 03:06 AM, Michael Buesch wrote:
> On Wednesday 14 October 2009 03:25:30 Larry Finger wrote:
>> Commit 93bad2b757586fb153ef73b028953a8dcaccde77 entitled "b43: Fix PPC crash
>> in rfkill polling on unload" fixed the bug reported in Bugzilla No. 14181;
>> however, it introduced a new bug. Whenever the radio switch was turned off,
>> it was necessary to unload and reload the driver for it to recognize the
>> switch again.
>>
>> I believe this patch fixes the original problem without introducing any new
>> problems.
>>
>> Signed-off-by: Larry Finger <larry.fin...@lwfinger.net>
>> ---


John,

As Michael correctly points out, this patch substitutes one bug for
another. The current bug affects every bcm43xx device with an rfkill
switch except BCM4306/3, and the new bug only affects BCM4306/3 users
with a kill switch. As the latter group may be the empty set, I think
the trade-off is worth it.

An additional complication is that I do not have the hardware to test
the PPC faults. The OP of Bugzilla #14181 has been helpful; however,
if it takes several tries to get a fix, we might miss the 2.6.32
release, which would introduce a significant regression.

For the above reasons, I am suggesting that this patch be accepted and
pushed to mainline even though it has faults.

Thanks,

Larry


Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem

2009-10-16 Thread John W. Linville
On Fri, Oct 16, 2009 at 08:42:08AM -0500, Larry Finger wrote:
> For the above reasons, I am suggesting that this patch be accepted and
> pushed to mainline even though it has faults.

Well, hmmm...ok, we have two or three problems here... :-)

One is whether or not to take this patch.  Normally it is against
policy or whatnot to trade one bug for another.  In this case,
it seems we would fix a real bug in exchange for a theoretical
bug that we believe no one actually has.  Is that the case?  If so,
that might be acceptable.

The other problem is a work/patch flow issue.  I have occasionally
(some would say too often) snagged a patch directly from this list.
But in general I have waited for Michael to repost the patches to
linux-wireless before merging them.  As such, I'm unaccustomed to
collecting patches from here.  In any event, most patches should be
posted to linux-wireless for wider review before merging.  That would
normally be the maintainer's job, but we are effectively without one
for b43 now.  I don't suppose anyone wants to stand up?

The third problem (related to the second) is that I missed the
original post, so if you don't mind I'd like you to resend it (to
linux-wireless)! :-)

Thanks,

John
-- 
John W. Linville          Someday the world will need a hero, and you
linvi...@tuxdriver.com    might be all we have.  Be ready.


Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem

2009-10-16 Thread Michael Buesch
On Friday 16 October 2009 16:44:14 John W. Linville wrote:
> Well, hmmm...ok, we have two or three problems here... :-)
>
> One is whether or not to take this patch.  Normally it is against
> policy or whatnot to trade one bug for another.  In this case,
> it seems we would fix a real bug in exchange for a theoretical
> bug that we believe no one actually has.  Is that the case?  If so,
> that might be acceptable.

The patch reduces the number of affected users, so it's probably OK.
I can't say if this reduces it to zero, though.
The real fix for all this crapola is to rewrite the b43 init to never
shut down the device (or fix rfkill to never shut down the device completely
and instead add an rfkill hook. But that's not going to happen, because
people don't like it. Although keeping the wireless device up just for
monitoring a stupid pushbutton is not acceptable behavior to me.)

> The other problem is a work/patch flow issue.  I have occasionally
> (some would say too often) snagged a patch directly from this list.

I suggest we move as much stuff as possible _off_ this list. It's
horribly unreliable, it mangles messages and it's slow.
So it's a good idea to CC the linux-wireless list on the very first reply.

I think, however, it's always a good idea to resend a patch after the
discussion has calmed down. It's way easier for the upstream maintainer.

-- 
Greetings, Michael.


Re: [PATCH] b43: Fix Bugzilla #14181 without introducing a new problem

2009-10-16 Thread Larry Finger
On 10/16/2009 09:44 AM, John W. Linville wrote:
> Well, hmmm...ok, we have two or three problems here... :-)
>
> One is whether or not to take this patch.  Normally it is against
> policy or whatnot to trade one bug for another.  In this case,
> it seems we would fix a real bug in exchange for a theoretical
> bug that we believe no one actually has.  Is that the case?  If so,
> that might be acceptable.

Yes, I believe that to be the case.

> The other problem is a work/patch flow issue.  I have occasionally
> (some would say too often) snagged a patch directly from this list.
> But in general I have waited for Michael to repost the patches to
> linux-wireless before merging them.  As such, I'm unaccustomed to
> collecting patches from here.  In any event, most patches should be
> posted to linux-wireless for wider review before merging.  That would
> normally be the maintainer's job, but we are effectively without one
> for b43 now.  I don't suppose anyone wants to stand up?

I would, but my RE work precludes that.

> The third problem (related to the second) is that I missed the
> original post, so if you don't mind I'd like you to resend it (to
> linux-wireless)! :-)

Will do.

Larry


Re: b43/BCM4312 fails with DMA errors

2009-10-16 Thread Chris Vine
On Fri, 16 Oct 2009 9:45:40 -0400
fre...@carolina.rr.com wrote:

> 2.6.32-rc? and wireless-compat was thought to work as well as
> wireless-testing ... hmmm

Ah right.  I had been trying 2.6.32-rc4 by itself and compat-wireless
with 2.6.31.  If I build compat-wireless on top of 2.6.32-rc4 the DMA
errors appear to go away but the encryption modules won't load so it
is not usable.

I think I will try again in a few months' time when things have settled
down a bit.  (Or possibly pull wireless-testing from git once 2.6.32
comes out.)

Chris




Re: b43/BCM4312 fails with DMA errors

2009-10-16 Thread Gábor Stefanik
On Fri, Oct 16, 2009 at 6:01 PM, Chris Vine ch...@cvine.freeserve.co.uk wrote:
> Ah right.  I had been trying 2.6.32-rc4 by itself and compat-wireless
> with 2.6.31.  If I build compat-wireless on top of 2.6.32-rc4 the DMA
> errors appear to go away but the encryption modules won't load so it
> is not usable.
>
> I think I will try again in a few months' time when things have settled
> down a bit.  (Or possibly pull wireless-testing from git once 2.6.32
> comes out.)


Could you please try the real wireless-testing tree (as opposed to
compat-wireless)?

Also, what I noticed is that everyone with this problem appears to
have an Intel Atom CPU... weird. Maybe something weird is going on
with the Atom chipset's DMA handling.

-- 
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)