Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm

2016-04-13 Thread Andrew Worsley
Thank you very much for your comments in your reply.

Actually the patch did work - I confirmed it was run and the iomap
call was successful by adding a pr_info() after the pci_iomap()
success branch.

The only time I get the "irq 17: nobody cared" message is on
suspend / resume. A fresh boot has always stayed below the 100k
interrupt threshold.
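
For readers following along, here is a minimal sketch of the threshold
being discussed. It is a simplified model of the check in the kernel's
note_interrupt() (kernel/irq/spurious.c), not the kernel code itself. The
constants are the ones I believe mainline uses, and the function name
irq_gets_disabled is made up for illustration.

```python
# Simplified model of the kernel's spurious-interrupt check.
# Assumption: the constants mirror mainline's note_interrupt(); the real
# code has extra cases (polling mode, time-based resets) omitted here.
WINDOW = 100_000         # interrupts counted before a decision is taken
MAX_UNHANDLED = 99_900   # more unhandled than this in a window => disable


def irq_gets_disabled(events):
    """events: iterable of booleans, True if a handler claimed the interrupt.

    Returns True if the model would print "nobody cared" and disable the IRQ.
    """
    count = 0
    unhandled = 0
    for handled in events:
        count += 1
        if not handled:
            unhandled += 1
        if count < WINDOW:
            continue
        if unhandled > MAX_UNHANDLED:
            return True          # IRQ disabled, kernel falls back to polling
        count = 0                # otherwise start a fresh window
        unhandled = 0
    return False


if __name__ == "__main__":
    print(irq_gets_disabled([False] * 30_088))   # storm stopped early
    print(irq_gets_disabled([False] * 126_856))  # storm ran past the window
```

Under this model, a storm that is stopped at around 30,000 interrupts
never completes a window, while one that runs past 100,000 does. That
would be consistent with the message showing up only after resume.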

I tried your new patch and the number is even lower: about 30,000 or
fewer on each of three boots.

BUT after suspend / resume it is again 126856.

Have you any insights on fixing the suspend to disk / resume paths, which
presumably face the same issue of being passed live hardware on boot
up?


On 13 April 2016 at 04:32, Lukas Wunner <lu...@wunner.de> wrote:
> Hi Andrew,
>
> thank you for the extensive testing.
>
> On Sun, Apr 10, 2016 at 08:09:29PM +1000, Andrew Worsley wrote:
>> Further testing of the "Broadcom 4331 reset quirk to prevent IRQ storm"
>> patch reveals that:
>>   1. quirk is run on initial boot up and this time appears to have
>> vastly reduced the interrupts (only 81 this time):
>> cat /proc/interrupts| grep 17
>>  17: 81  0  0  0  0  0
>>  0  0   IO-APIC-fasteoi   snd_hda_intel
>
> Something in the ballpark of 81 interrupt requests is fine.
>
> The kernel will print the error message about spurious interrupts and
> switch to polling at 100,000 requests. But even 20,000 is way too much.
> This just means that b43 loaded quickly enough to stop the interrupts
> before the kernel limit of 100,000 was reached, but the wireless card
> wasn't reset early on as it should have been.
>
> It looks like the patch didn't work at all on your machine for some
> reason. Do you see a message "cannot iomap device, IRQ storm ahead"
> in dmesg?

Results from three full boots (two reboots) with my 3.16 kernel and your
new patch, all at roughly 30k interrupts or fewer:
 17:  23978  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
 17:  30088  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
 17:  26853  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel


dmesg output showing quirk running
dmesg | grep -C 5 quirk
[    3.270315] pci 0000:00:1c.0: PCI bridge to [bus 03]
[    3.270323] pci 0000:00:1c.0:   bridge window [mem 0xc1a00000-0xc1afffff]
[    3.270331] pci 0000:00:1c.0:   bridge window [mem 0xc1800000-0xc18fffff 64bit pref]
[    3.270463] pci 0000:04:00.0: [14e4:4331] type 00 class 0x028000
[    3.270495] pci 0000:04:00.0: reg 0x10: [mem 0xc1900000-0xc1903fff 64bit]
[    3.270574] pci 0000:04:00.0: b43 quirk: resetting controller
[    3.270711] pci 0000:04:00.0: supports D1 D2
[    3.270712] pci 0000:04:00.0: PME# supported from D0 D3hot D3cold
[    3.270759] pci 0000:04:00.0: System wakeup disabled by ACPI
[    3.278239] pci 0000:00:1c.1: PCI bridge to [bus 04]
[    3.278251] pci 0000:00:1c.1:   bridge window [mem 0xc1900000-0xc19fffff]

Output after resume. Note: sometimes it looks like the message can also
appear during the suspend to disk, but a new one is always present after
the resume.

 17: 126856  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
[   53.404157] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called
with disabled ep 88045d495540
[   53.468249] irq 17: nobody cared (try booting with the "irqpoll" option)
[   53.468253] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G C O
3.16.7-ckt25-3.16-bcm4331-patch2 #7
[   53.468254] Hardware name: Apple Inc.
MacBookPro10,1/Mac-C3EC7CD22292981F, BIOS
MBP101.88Z.00EE.B00.1205101839 05/10/2012
[   53.468259]   81520370 88045a8a8c00
88045a8a8cc4
[   53.468262]  810bfe7d 88045a8a8c00 
0011
[   53.468264]  810c022f  0011

[   53.468265] Call Trace:
[   53.468275][] ? dump_stack+0x5d/0x78
[   53.468282]  [] ? __report_bad_irq+0x2d/0xd0
[   53.468286]  [] ? note_interrupt+0x25f/0x2b0
[   53.468290]  [] ? handle_irq_event_percpu+0x121/0x190
[   53.468294]  [] ? handle_irq_event+0x38/0x50
[   53.468296]  [] ? handle_fasteoi_irq+0x7f/0x150
[   53.468302]  [] ? handle_irq+0x1d/0x30
[   53.468307]  [] ? do_IRQ+0x48/0xe0
[   53.468311]  [] ? common_interrupt+0x6d/0x6d
[   53.468317][] ? cpuidle_enter_state+0x4c/0xc0
[   53.468320]  [] ? cpuidle_enter_state+0x42/0xc0
[   53.468323]  [] ? cpu_startup_entry+0x33a/0x460
[   53.468326]  [] ? start_kernel+0x473/0x47b
[   53.468331]  [] ? early_idt_handler_array+0x120/0x120
[   53.468335]  [] ? x86_64_start_kernel+0x14d/0x15c
[   53.468336] handlers:
[   53.468367] [] azx_interrupt [snd_hda_controller]
[   53.468368] Disabling IRQ #17
[   53.513740] usb 3-1: reset h

Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm

2016-04-10 Thread Andrew Worsley
Further testing of the "Broadcom 4331 reset quirk to prevent IRQ storm"
patch reveals that:
  1. quirk is run on initial boot up and this time appears to have
vastly reduced the interrupts (only 81 this time):
cat /proc/interrupts| grep 17
 17: 81  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel

 2. But it is apparently *NOT* run after a suspend/resume and we get
the problem:
 17: 100084  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel

Rebooting a further nine times shows that the low number (below 100)
only happens about a third of the time:
boot #2
 17:  38706  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot #3
 17: 87  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
LOC:   2494   2031   2094   1831   1157   1171
  1573   1271   Local timer interrupts
boot #4
 17:  50616  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#5
 17:  26454  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#6
 17:  34440  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#7
 17: 79  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#8
 17: 84  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#9
 17:  37054  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
boot#10
 17:  24648  0  0  0  0  0
 0  0   IO-APIC-fasteoi   snd_hda_intel
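
A side note on collecting these numbers: "grep 17" also matched the LOC
line in boot #3, because some of its per-CPU counts contain "17". Below
is a small sketch that picks the line by its exact IRQ label instead and
sums the per-CPU columns. The helper name irq_total and the SAMPLE text
are made up for illustration; SAMPLE is the boot #3 output re-joined onto
single lines.

```python
def irq_total(interrupts_text, irq="17"):
    """Sum the per-CPU counts on the line whose first field is '<irq>:'.

    Returns None if no such line exists.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == irq + ":":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break        # reached the controller / handler names
                total += int(field)
            return total
    return None


# Boot #3 from the list above, re-joined onto single lines.
SAMPLE = """\
 17:     87      0      0      0      0      0      0      0   IO-APIC-fasteoi   snd_hda_intel
LOC:   2494   2031   2094   1831   1157   1171   1573   1271   Local timer interrupts
"""

if __name__ == "__main__":
    print(irq_total(SAMPLE))     # IRQ 17 only, the LOC line is ignored
```

On a live system the same function could be fed the contents of
/proc/interrupts, for example irq_total(open("/proc/interrupts").read()).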

Is there an easy setpci command we can add to grub to stop this?

Presently I have a grub workaround for the black screen, as described here:
  
http://askubuntu.com/questions/264247/proprietary-nvidia-drivers-with-efi-on-mac-to-prevent-overheating/613573#613573

which basically involves adding a grub scriptlet to enable PCI-E bus
mastering on graphics cards:

In /etc/grub.d/01_enable_vga.conf:

setpci -s "00:01.0" 3e.b=8
setpci -s "01:00.0" 04.b=7

Can we use some similar magic setpci commands to disable 04:00.0,
which is my BCM4331?

  lspci | grep 4331
04:00.0 Network controller: Broadcom Corporation BCM4331 802.11a/b/g/n (rev 02)
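
To make the "magic" values in the scriptlet above less opaque, here is a
rough sketch that decodes them. The bit names are taken from the PCI
specification as I understand it: offset 04 is the Command register and
offset 3e on a bridge is the Bridge Control register. The tables list
only a few bits and the helper name decode is made up. Bit 10 (INTx
disable) is included only because it is the spec-defined interrupt mask.
Nothing in this thread shows whether setting it would quiet the BCM4331,
so treat that as an untested assumption.

```python
# Partial bit tables, assumed from the PCI specification (not exhaustive).
COMMAND_BITS = {
    0: "I/O space",
    1: "memory space",
    2: "bus master",
    10: "INTx disable",   # untested assumption that the card honours this
}
BRIDGE_CONTROL_BITS = {
    3: "VGA enable",
}


def decode(value, names):
    """Return the names of the known bits that are set in value."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # setpci takes hexadecimal values, so 3e.b=8 is 0x08 and 04.b=7 is 0x07.
    print("00:01.0 3e.b=8 ->", decode(0x08, BRIDGE_CONTROL_BITS))
    print("01:00.0 04.b=7 ->", decode(0x07, COMMAND_BITS))
```

Read this way, the scriptlet turns on VGA forwarding on the bridge and
enables I/O, memory and bus mastering on the graphics card, which matches
the description of the workaround given above.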



On 7 April 2016 at 22:04, Andrew Worsley <amwors...@gmail.com> wrote:
> Sorry but testing the patch shows no difference.
>
> I have just compiled debian jessie kernel 3.16.7-ckt25 and booted it
> and hibernated it twice, then did the same with your patch applied.
> There appeared to be no difference
>


Thanks for any suggestions

Andrew
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm

2016-04-07 Thread Andrew Worsley
Sorry but testing the patch shows no difference.

I have just compiled debian jessie kernel 3.16.7-ckt25 and booted it
and hibernated it twice, then did the same with your patch applied.
There appeared to be no difference

On first boot I didn't get the "nobody cared" IRQ-disabling problem, but
after each hibernate I did. I did get 51130 IRQ 17 interrupts on the
first boot, and after each hibernate restore I got 100,000 extra
interrupts in /proc/interrupts and the irq 17: nobody cared message.

I could not see any difference with or without the patch.
I boot with grub-efi using the linux/initrd commands

So perhaps the hibernate-restore needs the fix?

Andrew



On 3 April 2016 at 21:49, Lukas Wunner <lu...@wunner.de> wrote:
> Hi Andrew,
>
> On Sat, Apr 02, 2016 at 10:40:41PM +1100, Andrew Worsley wrote:
>> On 30 March 2016 at 04:41, Lukas Wunner <lu...@wunner.de> wrote:
>> > Broadcom 4331 wireless cards built into Apple Macs unleash an IRQ storm
>> > on boot until they are reset, causing spurious interrupts if the IRQ is
>> > shared. Apparently the EFI bootloader enables the device and does not
>> > disable it before passing control to the OS. The bootloader contains a
>> > driver for the wireless card which allows it to phone home to Cupertino.
>> > This is used for Internet Recovery (download and install OS X images)
>> > and probably also for Back to My Mac (remote access, RFC 6281) and to
>> > discover stolen hardware.
>> >
>> > The issue is most pronounced on 2011 and 2012 MacBook Pros where the IRQ
>> > is shared with 3 other devices (Light Ridge Thunderbolt controller, SDXC
>> > reader, HDA card on discrete GPU). As soon as an interrupt handler is
>> > installed for one of these devices, the ensuing storm of spurious IRQs
>> > causes the kernel to disable the IRQ and switch to polling. This lasts
>> > until the b43 driver loads and resets the device.
>> >
>> > Loading the b43 driver first is not always an option, in particular with
>> > the Light Ridge Thunderbolt controller: The PCI hotplug IRQ handler gets
>> > installed early on because it is built in, unlike b43 which is usually
>> > a module.
>> >
>> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632
>>
>> I do see an irq 17 problem on my macbook, but I thought grub is
>> supposed to stop the Broadcom wireless?
>>
>> Investigating grub2 git://git.savannah.gnu.org/grub.git I see this
>> patch  rev 9d34bb8 which says it disables Broadcom wireless hardware
>> on Apples:
>
> Thanks for the pointer to the grub2 commit, I wasn't aware of that.
>
> The commit puts the wireless card in power state D3hot but that doesn't
> stop it from sending interrupts. I have just tested that. So it's
> perfectly plausible that you're still seeing spurious interrupts
> despite using grub. Please test the patch I've posted, the spurious
> interrupts should disappear. If you "cat /proc/interrupts", you should
> then only see a few hundred interrupts on IRQ 17. Without the patch it
> should be in the 100,000+ range.
>
> Best regards,
>
> Lukas
>
>>
>> * commit 9d34bb8
>> | Author: Matthew Garrett <m...@redhat.com>
>> | Date:   Thu May 3 17:26:55 2012 +0200
>> |
>> |   Suspend broadcom cards in order to stop their DMA.
>> |
>> |   * grub-core/Makefile.am (KERNEL_HEADER_FILES): Add pci.h on x86 EFI.
>> |   * grub-core/Makefile.core.def (kernel): Add pci.c on x86 EFI.
>> |   (pci): Don't build on x86 EFI.
>> |   * grub-core/bus/pci.c (grub_pci_find_capability): New function.
>> |   * grub-core/kern/efi/mm.c (stop_broadcom) [__i386__ || __x86_64__]:
>> |   New function.
>> |   (grub_efi_finish_boot_services) [__i386__ || __x86_64__]: Call
>> |   stop_broadcom if running on EFI.
>> |   * include/grub/pci.h (GRUB_PCI_CLASS_NETWORK): New enum value.
>> |   (GRUB_PCI_CAP_POWER_MANAGEMENT): Likewise.
>> |   (GRUB_PCI_VENDOR_BROADCOM): Likewise.
>> |   (grub_pci_find_capability): New proto.
>> |
>> |   Also-By: Vladimir Serbinenko <phco...@gmail.com>
>> |
>> | M ChangeLog
>> | M grub-core/Makefile.am
>> | M grub-core/Makefile.core.def
>> | M grub-core/bus/pci.c
>> | M grub-core/kern/efi/mm.c
>> | M include/grub/pci.h
>>
>> But I run debian gru

Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm

2016-04-06 Thread Andrew Worsley
That patch appears to be the grub 1 equivalent to grub2
git://git.savannah.gnu.org/grub.git rev 9d34bb8, which puts the network
device into the D3 power state.
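
As a rough illustration of what "putting the device into D3" means at the
register level: the PCI Power Management capability has a control/status
register (PMCSR) whose two lowest bits select the power state. The sketch
below only shows the bit manipulation. It is not the grub code, the
function name with_power_state is made up, and the offsets are the ones I
understand the PCI PM specification to define.

```python
PM_CAP_ID = 0x01          # capability ID of PCI Power Management
PMCSR_OFFSET = 4          # PMCSR sits 4 bytes into the PM capability
POWER_STATE_MASK = 0x3    # PMCSR bits 1:0 select the power state
D0, D1, D2, D3HOT = 0, 1, 2, 3


def with_power_state(pmcsr, state):
    """Return the 16-bit PMCSR value with only the power-state bits changed."""
    if state not in (D0, D1, D2, D3HOT):
        raise ValueError("power state must be 0..3")
    return (pmcsr & 0xFFFF & ~POWER_STATE_MASK) | state


if __name__ == "__main__":
    # Hypothetical starting value: device in D0 with one other bit set.
    print(hex(with_power_state(0x0008, D3HOT)))
```

Changing these bits affects only the device's power state. As Lukas
reports elsewhere in this thread, the card kept sending interrupts in
D3hot during his test.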

I am running grub2 with that patch and it doesn't fix my irq 17 problem.

Can we not fix this in grub2 via something like Lukas's original patch,
or by disabling any DMA transfers before the kernel starts?

Andrew

On 6 April 2016 at 05:59, Matthew Garrett wrote:
> On Tue, Apr 05, 2016 at 02:40:15PM -0500, Bjorn Helgaas wrote:
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=111781 and
>> https://mjg59.dreamwidth.org/11235.html describe a sort of similar
>> issue, but with DMA.  An interrupt from the device is probably to
>> signal a DMA completion, but these problem reports only mention the
>> "IRQ nobody cared" issue; I don't see anything about memory
>> corruption.
>
> I "fixed" this with
> https://github.com/mjg59/grub-fedora/commit/21fcd6d79b7601e4b20ad70c5408adff2dabbc1d
> - doing the same in the kernel EFI stub would probably be the best way
> to handle it. This way you're guaranteed to stop DMA before the kernel
> reclaims boot services memory, which guarantees you won't have any
> corruption.