Hello,
I am experiencing a hard lockup in IRQ context while using an external
USB 2.0 IDE drive on an SMP system with the EHCI HCD and an additional
VIA controller.
The oops happens after a few minutes of heavy activity on the drive,
which holds an ext3 file system that rsync is running against, and it
looks SMP related.
I first noticed the problem on 2.6.8.1 and could reproduce it on
2.6.9-rc3. Detailed information gathered with kdb at the time of the
oops follows.
Thanks for your help,
Gaël.
--
Gaël Roualland -+- [EMAIL PROTECTED]
--
* Kernel version :
Linux version 2.6.9-rc3 ([EMAIL PROTECTED]) (version gcc 3.3.4 (Debian
1:3.3.4-6sarge1)) #1 SMP Mon Oct 4 02:57:13 CEST 2004
* Oops (retyped by hand) :
Unable to handle kernel paging request at virtual address 00100104
printing eip:
c02b03bf
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in: snd_sbwave snd_opl3_lib snd_sb16_dsp snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_sb16_csp
snd_sb_common snd_hwdep snd_mpu401 snd_rawmidi snd_seq_device snd_8250
serial_core 3c59x nls_iso8859_1 nls_cp437 w83781d i2c_sensor i2c_piix4
i2c_core ipv6 sg sr_mod sd_mod aic7xxx
CPU: 1
EIP: 0060:[<c02b03bf>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-rc3)
EIP is at qh_completions+0x13f/0x330
eax: 00100100 ebx: 00000000 ecx: d7d783f8 edx: 00200200
esi: d7d783c0 edi: d7d783c0 ebp: d78c53e0 esp: d4245c84
ds: 007b es: 007b ss: 0068
Process kjournald [...]
Stack: d7870c00 d78c53e0 00001000 00001c00 d7d7714c 012659b2 00000000
00000007
00000000 d7d78458 d7d78720 d7d77160 d4245d78 d7d77100 d7d7714c c02b1265
d7870c00 d7d77100 d4245d78 00000000 d7870c00 d4245d78 d7870c00 00000001
Call Trace:
[<c02b1265>]: scan_async+0x95/0x170
[<c02b3842>]: ehci_work+0x32/0xc0
[<c02b39b0>]: ehci_irq+0xe0/0x170
[<c013dc07>]: mempool_alloc_slab+0x17/0x20
[<c02a53f6>]: usb_hcd_irq+0x36/0x70
[<c0107d24>]: handle_IRQ_event+0x34/0x70
[<c01080ed>]: do_IRQ+0x9d/0x140
[<c0105c70>]: common_interrupt+0x18/0x20
[<c015d95c>]: ll_rw_block+0x7c/0x90
[<c01a7df5>]: journal_commit_transaction+0x1175/0x1e0
[<c011c1b0>]: autoremove_wake_function+0x0/0x60
[<c02ef1d7>]: ip_rcv+0x387/0x510
[<c011c1b0>]: autoremove_wake_function+0x0/0x60
[<c01aa7dd>]: kjournald+0xdd/0x250
[<c0120c31>]: do_exit+0x2a1/0x430
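For readers decoding the trace: the 0002 on the "Oops:" line is the i386 page-fault error code, printed in hex. The sketch below decodes its three low bits as the x86 architecture defines them. The function name decode_oops_error_code is made up for this example and is not a kernel API.

```python
# Decode the low bits of an i386 page-fault error code as shown on the
# "Oops:" line. decode_oops_error_code is a hypothetical helper name.

def decode_oops_error_code(code: int) -> dict:
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


info = decode_oops_error_code(0x0002)
print(info)
```

For this report the result is a write to a not-present page from kernel mode, which agrees with the "Unable to handle kernel paging request" message.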
* Output of 'btc' under kdb :
CPU 0 :
Pid 2571, rsync
0xd7b89000 2571 2519
ESP EIP Function
0xd7867d08 0xc034eef9 _spin_lock+0x39
0xd7867e3c 0xc02b0249 ehci_urb_done+0x89
0xd7867e5c 0xc02b0312 qh_completions+0x92
0xd7867e9c 0xc02b1265 scan_async+0x95
0xd7867ec0 0xc02b3842 ehci_work+0x32
0xd7867ed4 0xc02b2fd8 ehci_watchdog+0x58
0xd7867ee8 0xc01272ea run_timer_softirq+0xda
0xd7867f18 0xc0122f7a __do_softirq+0xba
0xd7867f34 0xc0122fbd do_softirq+0x2d
0xd7867f3c 0xc01142c7 smp_apic_timer_interrupt+0x97
CPU 1:
Pid 2506, kjournald
ESP EIP Function
0xd4245ad4 0xc02b03bf qh_completions+0x13f
0xd4245cc4 0xc02b1265 scan_async+0x95
0xd4245ce8 0xc02b3842 ehci_work+0x32
0xd4245d20 0xc02a53f6 usb_hcd_irq+0x36
0xd4245d34 0xc0107d24 handle_IRQ_event+0x34
0xd4245d54 0xc01080ed do_IRQ+0x9d
0xd4245d78 0xc0105c70 common_interrupt+0x18
0xd4245e38 0xc02ef1d7 ip_rcv+0x387
0xd4245ff4 0xc01032e5 kernel_thread_helper+0x5
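This looks like a concurrency problem in qh_completions: both CPUs were
inside it at the time of the crash, CPU 0 via ehci_watchdog (in the rsync
context) and CPU 1 via ehci_irq (in the kjournald context), while rsync
and kjournald were both operating on the drive.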
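One possible reading of the register dump, offered as a hypothesis rather than a confirmed diagnosis: eax (00100100) and edx (00200200) match the poison values that list_del() in the 2.6 list.h writes into a removed entry (LIST_POISON1 and LIST_POISON2), and the faulting address 00100104 is eax + 4. That would be consistent with one CPU unlinking a list entry that had already been unlinked. The Python sketch below is a toy model, not kernel code: the Memory class, the Fault exception and the HEAD address are invented for illustration, and ENTRY simply reuses the ecx value from the oops.

```python
# Toy model of list_del() on a 32-bit machine, to show where a second
# list_del() on the same entry would fault. Not kernel code.

LIST_POISON1 = 0x00100100  # stored in entry->next by list_del()
LIST_POISON2 = 0x00200200  # stored in entry->prev by list_del()
NEXT, PREV = 0, 4          # byte offsets inside struct list_head on i386


class Fault(Exception):
    """Raised when the model touches an address that is not mapped."""

    def __init__(self, address):
        super().__init__(f"paging request at {address:08x}")
        self.address = address


class Memory:
    """Hypothetical word-addressed memory; only mapped words are usable."""

    def __init__(self):
        self.words = {}

    def map_list_head(self, addr):
        # An empty list head points at itself in both directions.
        self.words[addr + NEXT] = addr
        self.words[addr + PREV] = addr

    def read(self, addr):
        if addr not in self.words:
            raise Fault(addr)
        return self.words[addr]

    def write(self, addr, value):
        if addr not in self.words:
            raise Fault(addr)
        self.words[addr] = value


def list_add_tail(mem, new, head):
    prev = mem.read(head + PREV)
    mem.write(new + NEXT, head)
    mem.write(new + PREV, prev)
    mem.write(prev + NEXT, new)
    mem.write(head + PREV, new)


def list_del(mem, entry):
    nxt = mem.read(entry + NEXT)
    prv = mem.read(entry + PREV)
    mem.write(nxt + PREV, prv)             # next->prev = prev
    mem.write(prv + NEXT, nxt)             # prev->next = next
    mem.write(entry + NEXT, LIST_POISON1)  # poison the removed entry
    mem.write(entry + PREV, LIST_POISON2)


HEAD = 0x00001000    # arbitrary address, chosen for the example
ENTRY = 0xD7D783F8   # ecx in the oops

mem = Memory()
mem.map_list_head(HEAD)
mem.map_list_head(ENTRY)
list_add_tail(mem, ENTRY, HEAD)

list_del(mem, ENTRY)  # first removal succeeds and poisons the entry

fault_address = None
try:
    list_del(mem, ENTRY)  # second removal follows the poisoned pointers
except Fault as exc:
    fault_address = exc.address
    print(f"second list_del faults at {fault_address:08x}")
```

In the model, the second removal faults on the "next->prev = prev" store at 00100104, the same address reported in the oops.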
* Hardware information if needed
$ lspci -vvv
[...]
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB
1.1 Controller (rev 61) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 9
Region 4: I/O ports at b800 [size=32]
Capabilities: <available only to root>
0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB
1.1 Controller (rev 61) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin B routed to IRQ 9
Region 4: I/O ports at b400 [size=32]
Capabilities: <available only to root>
0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 63)
(prog-if 20 [EHCI])
Subsystem: VIA Technologies, Inc. USB 2.0
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 0x08 (32 bytes)
Interrupt: pin C routed to IRQ 10
Region 0: Memory at d5000000 (32-bit, non-prefetchable)
[size=256]
Capabilities: <available only to root>
* Disassembled "qh_completions" around oops :
$ gdb vmlinux
Dump of assembler code for function qh_completions:
[...]
0xc02b0330 <qh_completions+176>: call 0xc0263530
<dma_pool_free>
0xc02b0335 <qh_completions+181>: cmp 0x28(%esp),%esi
0xc02b0339 <qh_completions+185>: je 0xc02b03f0
<qh_completions+368>
0xc02b033f <qh_completions+191>: lock addl $0x0,0x0(%esp)
0xc02b0345 <qh_completions+197>: mov 0x8(%esi),%ebx
0xc02b0348 <qh_completions+200>: test %bl,%bl
0xc02b034a <qh_completions+202>: js 0xc02b0534
<qh_completions+692>
0xc02b0350 <qh_completions+208>: test $0x40,%bl
0xc02b0353 <qh_completions+211>: je 0xc02b04e0
<qh_completions+608>
0xc02b0359 <qh_completions+217>: mov $0x1,%eax
0xc02b035e <qh_completions+222>: mov %eax,0x20(%esp)
0xc02b0362 <qh_completions+226>: lea 0x4(%ebp),%edi
0xc02b0365 <qh_completions+229>: mov %edi,%eax
0xc02b0367 <qh_completions+231>: call 0xc034eec0 <_spin_lock>
0xc02b036c <qh_completions+236>: mov %ebx,0xc(%esp)
0xc02b0370 <qh_completions+240>: mov 0x44(%esi),%eax
0xc02b0373 <qh_completions+243>: xor %ebx,%ebx
0xc02b0375 <qh_completions+245>: mov %ebp,0x4(%esp)
0xc02b0379 <qh_completions+249>: mov %eax,0x8(%esp)
0xc02b037d <qh_completions+253>: mov 0x40(%esp),%eax
0xc02b0381 <qh_completions+257>: mov %eax,(%esp)
0xc02b0384 <qh_completions+260>: call 0xc02b00c0
<qtd_copy_status>
0xc02b0389 <qh_completions+265>: mov %ebx,0x18(%esp)
0xc02b038d <qh_completions+269>: cmpl $0xffffff87,0x28(%ebp)
0xc02b0391 <qh_completions+273>: je 0xc02b04c0
<qh_completions+576>
0xc02b0397 <qh_completions+279>: mov %edi,%eax
0xc02b0399 <qh_completions+281>: call 0xc034f000 <_spin_unlock>
0xc02b039e <qh_completions+286>: mov 0x20(%esp),%ecx
0xc02b03a2 <qh_completions+290>: test %ecx,%ecx
0xc02b03a4 <qh_completions+292>: je 0xc02b03b4
<qh_completions+308>
0xc02b03a6 <qh_completions+294>: mov 0x3c(%esi),%edx
0xc02b03a9 <qh_completions+297>: cmp 0x10(%esp),%edx
0xc02b03ad <qh_completions+301>: je 0xc02b03b4
<qh_completions+308>
0xc02b03af <qh_completions+303>: mov (%esi),%eax
0xc02b03b1 <qh_completions+305>: mov %eax,0xffffffc8(%edx)
0xc02b03b4 <qh_completions+308>: lea 0x38(%esi),%ecx
0xc02b03b7 <qh_completions+311>: mov 0x38(%esi),%eax
0xc02b03ba <qh_completions+314>: mov %esi,%edi
0xc02b03bc <qh_completions+316>: mov 0x4(%ecx),%edx
0xc02b03bf <qh_completions+319>: mov %edx,0x4(%eax) <= OOPS
0xc02b03c2 <qh_completions+322>: mov %eax,(%edx)
0xc02b03c4 <qh_completions+324>: movl $0x200200,0x4(%ecx)
0xc02b03cb <qh_completions+331>: movl $0x100100,0x38(%esi)
0xc02b03d2 <qh_completions+338>: mov 0x24(%esp),%eax
0xc02b03d6 <qh_completions+342>: cmp 0x10(%esp),%eax
0xc02b03da <qh_completions+346>: mov (%eax),%edx
0xc02b03dc <qh_completions+348>: mov %edx,0x24(%esp)
0xc02b03e0 <qh_completions+352>: jne 0xc02b02e8
<qh_completions+104>
0xc02b03e6 <qh_completions+358>: lea 0x0(%esi),%esi
0xc02b03e9 <qh_completions+361>: lea 0x0(%edi),%edi
0xc02b03f0 <qh_completions+368>: test %edi,%edi
0xc02b03f2 <qh_completions+370>: je 0xc02b0430
<qh_completions+432>
0xc02b03f4 <qh_completions+372>: mov 0x48(%esp),%ecx
0xc02b03f8 <qh_completions+376>: mov %ecx,0x8(%esp)
0xc02b03fc <qh_completions+380>: mov 0x40(%edi),%eax
0xc02b03ff <qh_completions+383>: mov %eax,0x4(%esp)
0xc02b0403 <qh_completions+387>: mov 0x40(%esp),%eax
0xc02b0407 <qh_completions+391>: mov %eax,(%esp)
0xc02b040a <qh_completions+394>: call 0xc02b01c0
<ehci_urb_done>
[...]