Hello,

I am experiencing a hard lockup in interrupt context while using an external USB 2.0 IDE drive on an SMP system with the EHCI HCD and an additional VIA controller.
The oops happens after a few minutes of heavy activity on the drive, which holds an ext3 file system on which rsync is running, and it looks SMP-related. I first noticed the problem on 2.6.8.1 and could reproduce it on 2.6.9-rc3.

Detailed information gathered with kdb at the time of the oops follows.

Thanks for your help,

Gaël.

-- Gaël Roualland -+- [EMAIL PROTECTED] --

* Kernel version :

Linux version 2.6.9-rc3 ([EMAIL PROTECTED]) (version gcc 3.3.4 (Debian 1:3.3.4-6sarge1)) #1 SMP Mon Oct 4 02:57:13 CEST 2004

* Oops (retyped by hand) :

Unable to handle kernel paging request at virtual address 00100104
 printing eip:
c02b03bf
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in: snd_sbwave snd_opl3_lib snd_sb16_dsp snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_sb16_csp snd_sb_common snd_hwdep snd_mpu401 snd_rawmidi snd_seq_device snd_8250 serial_core 3c59x nls_iso8859_1 nls_cp437 w83781d i2c_sensor i2c_piix4 i2c_core ipv6 sg sr_mod sd_mod aic7xxxx
CPU:    1
EIP:    0060:[<c02b03bf>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-rc3)
EIP is at qh_completions+0x13f/0x330
eax: 00100100   ebx: 00000000   ecx: d7d783f8   edx: 00200200
esi: d7d783c0   edi: d7d783c0   ebp: d78c53e0   esp: d4245c84
ds: 007b   es: 007b   ss: 0068
Process kjournald [...]
Stack: d7870c00 d78c53e0 00001000 00001c00 d7d7714c 012659b2 00000000 00000007
       00000000 d7d78458 d7d78720 d7d77160 d4245d78 d7d77100 d7d7714c c02b1265
       d7870c00 d7d77100 d4245d78 00000000 d7870c00 d4245d78 d7870c00 00000001
Call Trace:
 [<c02b1265>]: scan_async+0x95/0x170
 [<c02b3842>]: ehci_work+0x32/0xc0
 [<c02b39b0>]: ehci_irq+0xe0/0x170
 [<c013dc07>]: mempool_alloc_slab+0x17/0x20
 [<c02a53f6>]: usb_hcd_irq+0x36/0x70
 [<c0107d24>]: handle_IRQ_event+0x34/0x70
 [<c01080ed>]: do_IRQ+0x9d/0x140
 [<c0105c70>]: common_interrupt+0x18/0x20
 [<c015d95c>]: ll_rw_block+0x7c/0x90
 [<c01a7df5>]: journal_commit_transaction+0x1175/0x1e0
 [<c011c1b0>]: autoremove_wake_function+0x0/0x60
 [<c02ef1d7>]: ip_rcv+0x387/0x510
 [<c011c1b0>]: autoremove_wake_function+0x0/0x60
 [<c01aa7dd>]: kjournald+0xdd/0x250
 [<c0120c31>]: do_exit+0x2a1/0x430

* Output of 'btc' under kdb :

CPU 0 : Pid 2571, rsync
0xd7b89000 2571 2519
ESP        EIP        Function
0xd7867d08 0xc034eef9 _spin_lock+0x39
0xd7867e3c 0xc02b0249 ehci_urb_done+0x89
0xd7867e5c 0xc02b0312 qh_completions+0x92
0xd7867e9c 0xc02b1265 scan_async+0x95
0xd7867ec0 0xc02b3842 ehci_work+0x32
0xd7867ed4 0xc02b2fd8 ehci_watchdog+0x58
0xd7867ee8 0xc01272ea run_timer_softirq+0xda
0xd7867f18 0xc0122f7a __do_softirq+0xba
0xd7867f34 0xc0122fbd do_softirq+0x2d
0xd7867f3c 0xc01142c7 smp_apic_timer_interrupt+0x97

CPU 1: Pid 2506, kjournald
ESP        EIP        Function
0xd4245ad4 0xc02b03bf qh_completions+0x13f
0xd4245cc4 0xc02b1265 scan_async+0x95
0xd4245ce8 0xc02b3842 ehci_work+0x32
0xd4245d20 0xc02a53f6 usb_hcd_irq+0x36
0xd4245d34 0xc0107d24 handle_IRQ_event+0x34
0xd4245d54 0xc01080ed do_IRQ+0x9d
0xd4245d78 0xc0105c70 common_interrupt+0x18
0xd4245e38 0xc02ef1d7 ip_rcv+0x387
0xd4245ff4 0xc01032e5 kernel_thread_helper+0x5

This looks like a concurrency problem in qh_completions: rsync and kjournald were both operating on the drive at the time of the crash, and both CPUs have qh_completions (called from scan_async) on their stack.

* Hardware information if needed

$ lspci -vvv
[...]
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin A routed to IRQ 9
        Region 4: I/O ports at b800 [size=32]
        Capabilities: <available only to root>

0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin B routed to IRQ 9
        Region 4: I/O ports at b400 [size=32]
        Capabilities: <available only to root>

0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 63) (prog-if 20 [EHCI])
        Subsystem: VIA Technologies, Inc. USB 2.0
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin C routed to IRQ 10
        Region 0: Memory at d5000000 (32-bit, non-prefetchable) [size=256]
        Capabilities: <available only to root>

* Disassembled "qh_completions" around oops :

$ gdb vmlinux
Dump of assembler code for function qh_completions:
[...]
0xc02b0330 <qh_completions+176>: call   0xc0263530 <dma_pool_free>
0xc02b0335 <qh_completions+181>: cmp    0x28(%esp),%esi
0xc02b0339 <qh_completions+185>: je     0xc02b03f0 <qh_completions+368>
0xc02b033f <qh_completions+191>: lock addl $0x0,0x0(%esp)
0xc02b0345 <qh_completions+197>: mov    0x8(%esi),%ebx
0xc02b0348 <qh_completions+200>: test   %bl,%bl
0xc02b034a <qh_completions+202>: js     0xc02b0534 <qh_completions+692>
0xc02b0350 <qh_completions+208>: test   $0x40,%bl
0xc02b0353 <qh_completions+211>: je     0xc02b04e0 <qh_completions+608>
0xc02b0359 <qh_completions+217>: mov    $0x1,%eax
0xc02b035e <qh_completions+222>: mov    %eax,0x20(%esp)
0xc02b0362 <qh_completions+226>: lea    0x4(%ebp),%edi
0xc02b0365 <qh_completions+229>: mov    %edi,%eax
0xc02b0367 <qh_completions+231>: call   0xc034eec0 <_spin_lock>
0xc02b036c <qh_completions+236>: mov    %ebx,0xc(%esp)
0xc02b0370 <qh_completions+240>: mov    0x44(%esi),%eax
0xc02b0373 <qh_completions+243>: xor    %ebx,%ebx
0xc02b0375 <qh_completions+245>: mov    %ebp,0x4(%esp)
0xc02b0379 <qh_completions+249>: mov    %eax,0x8(%esp)
0xc02b037d <qh_completions+253>: mov    0x40(%esp),%eax
0xc02b0381 <qh_completions+257>: mov    %eax,(%esp)
0xc02b0384 <qh_completions+260>: call   0xc02b00c0 <qtd_copy_status>
0xc02b0389 <qh_completions+265>: mov    %ebx,0x18(%esp)
0xc02b038d <qh_completions+269>: cmpl   $0xffffff87,0x28(%ebp)
0xc02b0391 <qh_completions+273>: je     0xc02b04c0 <qh_completions+576>
0xc02b0397 <qh_completions+279>: mov    %edi,%eax
0xc02b0399 <qh_completions+281>: call   0xc034f000 <_spin_unlock>
0xc02b039e <qh_completions+286>: mov    0x20(%esp),%ecx
0xc02b03a2 <qh_completions+290>: test   %ecx,%ecx
0xc02b03a4 <qh_completions+292>: je     0xc02b03b4 <qh_completions+308>
0xc02b03a6 <qh_completions+294>: mov    0x3c(%esi),%edx
0xc02b03a9 <qh_completions+297>: cmp    0x10(%esp),%edx
0xc02b03ad <qh_completions+301>: je     0xc02b03b4 <qh_completions+308>
0xc02b03af <qh_completions+303>: mov    (%esi),%eax
0xc02b03b1 <qh_completions+305>: mov    %eax,0xffffffc8(%edx)
0xc02b03b4 <qh_completions+308>: lea    0x38(%esi),%ecx
0xc02b03b7 <qh_completions+311>: mov    0x38(%esi),%eax
0xc02b03ba <qh_completions+314>: mov    %esi,%edi
0xc02b03bc <qh_completions+316>: mov    0x4(%ecx),%edx
0xc02b03bf <qh_completions+319>: mov    %edx,0x4(%eax)    <= OOPS
0xc02b03c2 <qh_completions+322>: mov    %eax,(%edx)
0xc02b03c4 <qh_completions+324>: movl   $0x200200,0x4(%ecx)
0xc02b03cb <qh_completions+331>: movl   $0x100100,0x38(%esi)
0xc02b03d2 <qh_completions+338>: mov    0x24(%esp),%eax
0xc02b03d6 <qh_completions+342>: cmp    0x10(%esp),%eax
0xc02b03da <qh_completions+346>: mov    (%eax),%edx
0xc02b03dc <qh_completions+348>: mov    %edx,0x24(%esp)
0xc02b03e0 <qh_completions+352>: jne    0xc02b02e8 <qh_completions+104>
0xc02b03e6 <qh_completions+358>: lea    0x0(%esi),%esi
0xc02b03e9 <qh_completions+361>: lea    0x0(%edi),%edi
0xc02b03f0 <qh_completions+368>: test   %edi,%edi
0xc02b03f2 <qh_completions+370>: je     0xc02b0430 <qh_completions+432>
0xc02b03f4 <qh_completions+372>: mov    0x48(%esp),%ecx
0xc02b03f8 <qh_completions+376>: mov    %ecx,0x8(%esp)
0xc02b03fc <qh_completions+380>: mov    0x40(%edi),%eax
0xc02b03ff <qh_completions+383>: mov    %eax,0x4(%esp)
0xc02b0403 <qh_completions+387>: mov    0x40(%esp),%eax
0xc02b0407 <qh_completions+391>: mov    %eax,(%esp)
0xc02b040a <qh_completions+394>: call   0xc02b01c0 <ehci_urb_done>
[...]