On Monday 29 May 2006 17:51, you wrote:
> On Mon, May 29, 2006 at 12:44:05AM +0200, Michael Buesch wrote:
> > Ok Jason, could you please test the following patch and try to reproduce
> > with it?
>
> This patch crashes immediately:
>
> http://gehennom.net/~lunz/bcm43xx_crash2.jpg

Whoa, that's damn weird. I have no idea what's going on. I'm sorry.
Does someone else have an idea? Otherwise I think we must drop that patch
and live with the high periodic work latency.
I really don't see how a race is still possible here. Even on UP(!).

Jason, I added an assertion that should be able to catch periodic work
vs IRQ handler races. Did you see it trigger, by any chance?
Besides that, this looks like a race of periodic work against the tasklet.
But I really don't see how that is possible.

> The first patch, otoh, has only crashed twice, and even then only under
> prolonged heavy load. (the first time after 4 million bcm43xx
> interrupts, the second after 11 million).
>
> This second patch is so bad I couldn't log in and check /proc/interrupts
> without it blowing up. I gave up after the third crash; the backtraces
> aren't identical but they do all mention bcm43xx and math error, for
> some reason.
>
> Also, this new patch produced a lot of warnings during boot before
> crashing. I got them from syslog:
>
> kernel: Bootdata ok (command line is ro resume2=swap:/dev/mapper/swap )
> kernel: Linux version 2.6.17-rc5-git4-suspend2 ([EMAIL PROTECTED]) (gcc
> version 4.1.1 20060511 (prerelease) (Debian 4.1.0-4)) #2 SMP PREEMPT Sun May
> 28 21:40:25 EDT 2006
> [...]
>
> kernel: bcm43xx driver
> kernel: ACPI: PCI Interrupt Link [LNK3] enabled at IRQ 17
> kernel: GSI 21 sharing vector 0xD9 and IRQ 21
> kernel: ACPI: PCI Interrupt 0000:02:02.0[A] -> Link [LNK3] -> GSI 17 (level,
> low) -> IRQ 21
> kernel: bcm43xx: Chip ID 0x4306, rev 0x3
> kernel: bcm43xx: Number of cores: 5
> kernel: bcm43xx: Core 0: ID 0x800, rev 0x4, vendor 0x4243, enabled
> kernel: bcm43xx: Core 1: ID 0x812, rev 0x5, vendor 0x4243, disabled
> kernel: bcm43xx: Core 2: ID 0x80d, rev 0x2, vendor 0x4243, enabled
> kernel: bcm43xx: Core 3: ID 0x807, rev 0x2, vendor 0x4243, disabled
> kernel: bcm43xx: Core 4: ID 0x804, rev 0x9, vendor 0x4243, enabled
> kernel: bcm43xx: PHY connected
> kernel: bcm43xx: Detected PHY: Version: 2, Type 2, Revision 2
> kernel: bcm43xx: Detected Radio: ID: 2205017f (Manuf: 17f Ver: 2050 Rev: 2)
> kernel: bcm43xx: Radio turned off
> kernel: bcm43xx: Radio turned off
> kernel: ACPI: PCI Interrupt Link [LMCI] enabled at IRQ 22
> kernel: ACPI: PCI Interrupt 0000:00:06.1[B] -> Link [LMCI] -> GSI 22 (level,
> low) -> IRQ 18
> kernel: PCI: Setting latency timer of device 0000:00:06.1 to 64
> kernel: ACPI: PCI Interrupt Link [LACI] enabled at IRQ 21
> kernel: ACPI: PCI Interrupt 0000:00:06.0[A] -> Link [LACI] -> GSI 21 (level,
> low) -> IRQ 19
> kernel: PCI: Setting latency timer of device 0000:00:06.0 to 64
> kernel: input: PS/2 Mouse as /class/input/input1
> kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing
> enabled
> kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2
> kernel: ieee1394: Host added: ID:BUS[0-00:1023] GUID[453f0200453f0200]
> kernel: eth1394: eth0: IEEE-1394 IPv4 over 1394 Ethernet (fw-host0)
> kernel: floppy0: no floppy controllers found
> kernel: intel8x0_measure_ac97_clock: measured 55368 usecs
> kernel: intel8x0: clocking to 47459
> kernel: EXT3 FS on dm-1, internal journal
> kernel: ACPI: CPU0 (power states: C1[C1] C2[C2])
> kernel: powernow-k8: Found 1 AMD Athlon 64 / Opteron processors (version
> 1.60.2)
> kernel: powernow-k8: 0 : fid 0xe (2200 MHz), vid 0x2 (1500 mV)
> kernel: powernow-k8: 1 : fid 0xa (1800 MHz), vid 0xa (1300 mV)
> kernel: powernow-k8: 2 : fid 0x0 (800 MHz), vid 0x12 (1100 mV)
> kernel: cpu_init done, current fid 0xe, vid 0x2
> kernel: fuse init (API version 7.6)
> kernel: ieee80211_crypt: registered algorithm 'WEP'
> kernel: ieee80211_crypt: registered algorithm 'TKIP'
> kernel: ieee80211_crypt: registered algorithm 'CCMP'
> kernel: kjournald starting. Commit interval 5 seconds
> kernel: EXT3 FS on hda1, internal journal
> kernel: EXT3-fs: mounted filesystem with ordered data mode.
> kernel: Adding 1750560k swap on /dev/mapper/swap. Priority:-1 extents:1
> across:1750560k
> kernel: pcmcia: Detected deprecated PCMCIA ioctl usage from process: discover.
> kernel: pcmcia: This interface will soon be removed from the kernel; please
> expect breakage unless you upgrade to new tools.
> kernel: pcmcia: see
> http://www.kernel.org/pub/linux/utils/kernel/pcmcia/pcmcia.html for details.
> kernel: bcm43xx: PHY connected
> kernel: bcm43xx: Radio turned on
> kernel: bcm43xx: Chip initialized
> kernel: bcm43xx: DMA initialized
> kernel: bcm43xx: 80211 cores initialized
> kernel: bcm43xx: Keys cleared
> kernel: NET: Registered protocol family 17
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: Losing some ticks... checking if CPU frequency changed.
> kernel: SoftMAC: Open Authentication completed with 00:12:17:3a:e2:c7
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .level = 0
> kernel: bcm43xx: .enabled = 0
> kernel: bcm43xx: .encrypt = 0
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .active_key = 0
> kernel: bcm43xx: .level = 4
> kernel: bcm43xx: .enabled = 1
> kernel: bcm43xx: .encrypt = 1
> kernel: bcm43xx: set security called
> kernel: bcm43xx: .enabled = 1
> kernel: bcm43xx: .encrypt = 1
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
> kernel: CCMP: decrypt failed: STA=00:12:17:3a:e2:c7
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
> kernel: CCMP: decrypt failed: STA=00:12:17:3a:e2:c7
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
> kernel: CCMP: decrypt failed: STA=00:12:17:3a:e2:c7
> kernel: CCMP: decrypt failed: STA=00:12:17:3a:e2:c7
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
> kernel: BUG: spinlock already unlocked on CPU#0, sh/1920
> kernel: lock: ffff81001e8bf108, .magic: dead4ead, .owner: <none>/-1,
> .owner_cpu: -1
> kernel:
> kernel: Call Trace: <IRQ> <ffffffff802d597b>{_raw_spin_unlock+46}
> kernel: <ffffffff803921a2>{_spin_unlock+9}
> <ffffffff88212469>{:bcm43xx:bcm43xx_interrupt_handler+518}
> kernel: <ffffffff8025a970>{handle_IRQ_event+41}
> <ffffffff8025aa3b>{__do_IRQ+154}
> kernel: <ffffffff8020ba8c>{do_IRQ+50}
> <ffffffff80209ce8>{ret_from_intr+0}
> kernel: <ffffffff88217238>{:bcm43xx:bcm43xx_interrupt_tasklet+2}
> kernel: <ffffffff88212469>{:bcm43xx:bcm43xx_interrupt_handler+518}
> kernel: <ffffffff80231321>{tasklet_action+98}
> <ffffffff80230eb7>{__do_softirq+73}
> kernel: <ffffffff8020a9ba>{call_softirq+30}
> <ffffffff8020b9a0>{do_softirq+44}
> kernel: <ffffffff802312b4>{irq_exit+63} <ffffffff8020ba91>{do_IRQ+55}
> kernel: <ffffffff880a9562>{:processor:acpi_processor_idle+0}
> kernel: <ffffffff80209ce8>{ret_from_intr+0} <EOI>
> <ffffffff8026244f>{blockable_page_cache_readahead+86}
> kernel: <ffffffff802d4007>{clear_page+7}
> <ffffffff802609dd>{get_page_from_freelist+822}
> kernel: <ffffffff80260ae7>{__alloc_pages+113}
> <ffffffff80260df1>{get_zeroed_page+67}
> kernel: <ffffffff8026802d>{__pte_alloc+26}
> <ffffffff802681f1>{__handle_mm_fault+263}
> kernel: <ffffffff80391d28>{_spin_lock_irqsave+30}
> <ffffffff8039416b>{do_page_fault+1098}
> kernel: <ffffffff8026caaa>{do_mmap_pgoff+1487}
> <ffffffff80391d28>{_spin_lock_irqsave+30}
> kernel: <ffffffff8020a4b1>{error_exit+0}
> kernel: ----------- [cut here ] --------- [please bite here ] ---------
> kernel: Kernel BUG at kernel/sched.c:2875
> kernel: invalid opcode: 0000 [1] PREEMPT SMP
> kernel: CPU 0
> kernel: Modules linked in: af_packet ieee80211_crypt_ccmp
> ieee80211_crypt_tkip ieee80211_crypt_wep fuse cpufreq_ondemand
> cpufreq_conservative powernow_k8 freq_table processor eth1394 8250_pci 8250
> serial_core snd_intel8x0 snd_pcm_oss snd_mixer_oss snd_intel8x0m
> snd_ac97_codec snd_ac97_bus bcm43xx snd_pcm snd_timer pcmcia psmouse pcspkr
> firmware_class ehci_hcd ohci_hcd ohci1394 ieee1394 serio_raw ide_cd cdrom snd
> soundcore snd_page_alloc i2c_nforce2 ieee80211softmac usbcore parport_pc
> parport 8139too mii yenta_socket rsrc_nonstatic pcmcia_core ieee80211
> ieee80211_crypt rtc unix ext3 jbd mbcache lzf dm_crypt dm_mod sha256
> aes_x86_64 ide_disk amd74xx generic ide_core evdev fbcon tileblit font
> bitblit softcursor
> kernel: Pid: 1920, comm: sh Not tainted 2.6.17-rc5-git4-suspend2 #2
> kernel: RIP: 0010:[<ffffffff802243e6>]
> <ffffffff802243e6>{sub_preempt_count+21}
> kernel: RSP: 0000:ffffffff8049af88 EFLAGS: 00010002
> kernel: RAX: ffff81001df17fd8 RBX: ffff81001df17c18 RCX: 0000000000000001
> kernel: RDX: ffff8100811f3000 RSI: ffffffff80445240 RDI: 0000000000000001
> kernel: RBP: ffffffff8049af88 R08: ffffffff804bd310 R09: 000000000001be31
> kernel: R10: 0000000000000000 R11: ffffffff80441240 R12: 0000000000000001
> kernel: R13: 0000000000000001 R14: 0000000000000256 R15: ffffffff80445240
> kernel: FS: 00002aef1c6b56d0(0000) GS:ffffffff804e6000(0000)
> knlGS:0000000000000000
> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> kernel: CR2: 00002b9079000d18 CR3: 000000001d4d9000 CR4: 00000000000006e0
> kernel: Process sh (pid: 1920, threadinfo ffff81001df16000, task
> ffff81001e57ef20)
> kernel: Stack: 0000000000000015 ffffffff8020ba91 ffffffff880a9562
> ffff81000168e438
> kernel: ffff81000168e400 ffffffff80209ce8 ffff81001df17c18 <EOI>
> 0000000000000000
> kernel: ffff81001f2fec00 ffff81001f4e83b8
> kernel: Call Trace: <IRQ> <ffffffff8020ba91>{do_IRQ+55}
> <ffffffff880a9562>{:processor:acpi_processor_idle+0}
> kernel: <ffffffff80209ce8>{ret_from_intr+0} <EOI>
> <ffffffff8026244f>{blockable_page_cache_readahead+86}
> kernel: <ffffffff802d4007>{clear_page+7}
> <ffffffff802609dd>{get_page_from_freelist+822}
> kernel: <ffffffff80260ae7>{__alloc_pages+113}
> <ffffffff80260df1>{get_zeroed_page+67}
> kernel: <ffffffff8026802d>{__pte_alloc+26}
> <ffffffff802681f1>{__handle_mm_fault+263}
> kernel: <ffffffff80391d28>{_spin_lock_irqsave+30}
> <ffffffff8039416b>{do_page_fault+1098}
> kernel: <ffffffff8026caaa>{do_mmap_pgoff+1487}
> <ffffffff80391d28>{_spin_lock_irqsave+30}
> kernel: <ffffffff8020a4b1>{error_exit+0}
> kernel:
> kernel: Code: 0f 0b 68 b9 e1 3a 80 c2 3b 0b 81 ff fe 00 00 00 77 1c 65 48
> kernel: RIP <ffffffff802243e6>{sub_preempt_count+21} RSP <ffffffff8049af88>
> kernel: <3>BUG: sleeping function called from invalid context at
> include/linux/rwsem.h:43
> kernel: in_atomic():0, irqs_disabled():1
> kernel:
> kernel: Call Trace: <IRQ> <ffffffff80238b2c>{blocking_notifier_call_chain+31}
> kernel: <ffffffff8022ecec>{do_exit+34}
> <ffffffff80318cd8>{do_unblank_screen+39}
> kernel: <ffffffff8020afb0>{kernel_math_error+0}
> <ffffffff8020b54d>{do_invalid_op+173}
> kernel: <ffffffff802243e6>{sub_preempt_count+21}
> <ffffffff88226f30>{:bcm43xx:bcm43xx_leds_update+299}
> kernel: <ffffffff80392274>{_spin_unlock_irqrestore+47}
> <ffffffff88217b4b>{:bcm43xx:bcm43xx_interrupt_tasklet+2325}
> kernel: <ffffffff8020a4b1>{error_exit+0}
> <ffffffff802243e6>{sub_preempt_count+21}
> kernel: <ffffffff802312b4>{irq_exit+63} <ffffffff8020ba91>{do_IRQ+55}
> kernel: <ffffffff880a9562>{:processor:acpi_processor_idle+0}
> kernel: <ffffffff80209ce8>{ret_from_intr+0} <EOI>
> <ffffffff8026244f>{blockable_page_cache_readahead+86}
> kernel: <ffffffff802d4007>{clear_page+7}
> <ffffffff802609dd>{get_page_from_freelist+822}
> kernel: <ffffffff80260ae7>{__alloc_pages+113}
> <ffffffff80260df1>{get_zeroed_page+67}
> kernel: <ffffffff8026802d>{__pte_alloc+26}
> <ffffffff802681f1>{__handle_mm_fault+263}
> kernel: <ffffffff80391d28>{_spin_lock_irqsave+30}
> <ffffffff8039416b>{do_page_fault+1098}
> kernel: <ffffffff8026caaa>{do_mmap_pgoff+1487}
> <ffffffff80391d28>{_spin_lock_irqsave+30}
> kernel: <ffffffff8020a4b1>{error_exit+0}
> kernel: printk: 205 messages suppressed.
> kernel: CCMP: decrypt failed: STA=00:13:02:25:08:7b
>

--
Greetings Michael.
[2007: Knorkator (Germany) - 12 points]

_______________________________________________
Bcm43xx-dev mailing list
[email protected]
http://lists.berlios.de/mailman/listinfo/bcm43xx-dev
