Hello,

I am experiencing a hard lockup under IRQ while using an external USB
2.0 IDE drive on an SMP system using the EHCI HCD and an additional VIA
controller.

The oops happens after a few minutes of heavy activity on the drive,
which holds an ext3 file system on which rsync is running, and it looks
SMP-related.

I first noticed the problem on 2.6.8.1 and could reproduce it on
2.6.9-rc3. Detailed information gathered with kdb at the time of the
oops follows.

Thanks for your help,

Gaël.

-- 
Gaël Roualland -+- [EMAIL PROTECTED]

-- 

* Kernel version :

Linux version 2.6.9-rc3 ([EMAIL PROTECTED]) (gcc version 3.3.4 (Debian 1:3.3.4-6sarge1)) #1 SMP Mon Oct 4 02:57:13 CEST 2004

* Oops (retyped by hand) :

Unable to handle kernel paging request at virtual address 00100104
 printing eip:
c02b03bf
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in: snd_sbwave snd_opl3_lib snd_sb16_dsp snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_sb16_csp
snd_sb_common snd_hwdep snd_mpu401 snd_rawmidi snd_seq_device snd_8250
serial_core 3c59x nls_iso8859_1 nls_cp437 w83781d i2c_sensor i2c_piix4
i2c_core ipv6 sg sr_mod sd_mod aic7xxx
CPU:    1
EIP:    0060:[<c02b03bf>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-rc3)
EIP is at qh_completions+0x13f/0x330
eax: 00100100  ebx: 00000000  ecx: d7d783f8  edx: 00200200
esi: d7d783c0  edi: d7d783c0  ebp: d78c53e0  esp: d4245c84
ds: 007b  es: 007b  ss: 0068
Process kjournald [...]
Stack:  d7870c00 d78c53e0 00001000 00001c00 d7d7714c 012659b2 00000000 00000007
        00000000 d7d78458 d7d78720 d7d77160 d4245d78 d7d77100 d7d7714c c02b1265
        d7870c00 d7d77100 d4245d78 00000000 d7870c00 d4245d78 d7870c00 00000001
Call Trace:
 [<c02b1265>]: scan_async+0x95/0x170
 [<c02b3842>]: ehci_work+0x32/0xc0
 [<c02b39b0>]: ehci_irq+0xe0/0x170
 [<c013dc07>]: mempool_alloc_slab+0x17/0x20
 [<c02a53f6>]: usb_hcd_irq+0x36/0x70
 [<c0107d24>]: handle_IRQ_event+0x34/0x70
 [<c01080ed>]: do_IRQ+0x9d/0x140
 [<c0105c70>]: common_interrupt+0x18/0x20
 [<c015d95c>]: ll_rw_block+0x7c/0x90
 [<c01a7df5>]: journal_commit_transaction+0x1175/0x1e0
 [<c011c1b0>]: autoremove_wake_function+0x0/0x60
 [<c02ef1d7>]: ip_rcv+0x387/0x510
 [<c011c1b0>]: autoremove_wake_function+0x0/0x60
 [<c01aa7dd>]: kjournald+0xdd/0x250
 [<c0120c31>]: do_exit+0x2a1/0x430

* Output of 'btc' under kdb :

CPU 0:
Pid 2571, rsync
0xd7b89000      2571            2519
ESP             EIP             Function
0xd7867d08      0xc034eef9      _spin_lock+0x39
0xd7867e3c      0xc02b0249      ehci_urb_done+0x89
0xd7867e5c      0xc02b0312      qh_completions+0x92
0xd7867e9c      0xc02b1265      scan_async+0x95
0xd7867ec0      0xc02b3842      ehci_work+0x32
0xd7867ed4      0xc02b2fd8      ehci_watchdog+0x58
0xd7867ee8      0xc01272ea      run_timer_softirq+0xda
0xd7867f18      0xc0122f7a      __do_softirq+0xba
0xd7867f34      0xc0122fbd      do_softirq+0x2d
0xd7867f3c      0xc01142c7      smp_apic_timer_interrupt+0x97

CPU 1:
Pid 2506, kjournald
ESP             EIP             Function
0xd4245ad4      0xc02b03bf      qh_completions+0x13f
0xd4245cc4      0xc02b1265      scan_async+0x95
0xd4245ce8      0xc02b3842      ehci_work+0x32
0xd4245d20      0xc02a53f6      usb_hcd_irq+0x36
0xd4245d34      0xc0107d24      handle_IRQ_event+0x34
0xd4245d54      0xc01080ed      do_IRQ+0x9d
0xd4245d78      0xc0105c70      common_interrupt+0x18
0xd4245e38      0xc02ef1d7      ip_rcv+0x387
0xd4245ff4      0xc01032e5      kernel_thread_helper+0x5

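This looks like a concurrency problem in qh_completions: according to
the backtraces, both CPUs were inside scan_async -> qh_completions at
the time of the crash (CPU 0 from the ehci_watchdog timer, CPU 1 from
the EHCI interrupt), while rsync and kjournald were both operating on
the drive.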
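
The faulting address and registers are consistent with a second
list_del() on an entry that had already been removed. In 2.6 kernels,
list_del() overwrites the removed entry's pointers with LIST_POISON1
(0x00100100) and LIST_POISON2 (0x00200200); here eax and edx hold
exactly those values, and the faulting write targets 0x00100100 + 4,
i.e. next->prev on 32-bit x86. The Python sketch below only redoes that
arithmetic. classify_fault and the offset constants are illustrative
names, not kernel APIs, and the 4-byte pointer layout is an assumption
that holds for i386.

```python
# Illustrative helper for decoding this oops; names are hypothetical,
# not kernel APIs.
LIST_POISON1 = 0x00100100  # stored in entry->next by list_del()
LIST_POISON2 = 0x00200200  # stored in entry->prev by list_del()

# Assumed layout of struct list_head on 32-bit x86: two 4-byte pointers.
OFFSET_NEXT = 0
OFFSET_PREV = 4


def classify_fault(fault_addr):
    """Describe fault_addr if it matches a write through a poisoned list pointer."""
    if fault_addr == LIST_POISON1 + OFFSET_PREV:
        return "next->prev write with next == LIST_POISON1"
    if fault_addr == LIST_POISON2 + OFFSET_NEXT:
        return "prev->next write with prev == LIST_POISON2"
    return None


if __name__ == "__main__":
    # Values copied from the oops: faulting address, eax and edx.
    fault_addr = 0x00100104
    eax, edx = 0x00100100, 0x00200200
    print(classify_fault(fault_addr))
    print(eax == LIST_POISON1, edx == LIST_POISON2)
```

If this reading is right, the qtd being unlinked on CPU 1 had already
been taken off its list, presumably by the other CPU.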

* Hardware information if needed

$ lspci -vvv
[...]
0000:00:0a.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin A routed to IRQ 9
        Region 4: I/O ports at b800 [size=32]
        Capabilities: <available only to root>

0000:00:0a.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 61) (prog-if 00 [UHCI])
        Subsystem: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin B routed to IRQ 9
        Region 4: I/O ports at b400 [size=32]
        Capabilities: <available only to root>

0000:00:0a.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 63) (prog-if 20 [EHCI])
        Subsystem: VIA Technologies, Inc. USB 2.0
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32, Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin C routed to IRQ 10
        Region 0: Memory at d5000000 (32-bit, non-prefetchable) [size=256]
        Capabilities: <available only to root>


* Disassembled "qh_completions" around oops :

$ gdb vmlinux
Dump of assembler code for function qh_completions:
[...]
0xc02b0330 <qh_completions+176>:        call   0xc0263530 <dma_pool_free>
0xc02b0335 <qh_completions+181>:        cmp    0x28(%esp),%esi
0xc02b0339 <qh_completions+185>:        je     0xc02b03f0 <qh_completions+368>
0xc02b033f <qh_completions+191>:        lock addl $0x0,0x0(%esp)
0xc02b0345 <qh_completions+197>:        mov    0x8(%esi),%ebx
0xc02b0348 <qh_completions+200>:        test   %bl,%bl
0xc02b034a <qh_completions+202>:        js     0xc02b0534 <qh_completions+692>
0xc02b0350 <qh_completions+208>:        test   $0x40,%bl
0xc02b0353 <qh_completions+211>:        je     0xc02b04e0 <qh_completions+608>
0xc02b0359 <qh_completions+217>:        mov    $0x1,%eax
0xc02b035e <qh_completions+222>:        mov    %eax,0x20(%esp)
0xc02b0362 <qh_completions+226>:        lea    0x4(%ebp),%edi
0xc02b0365 <qh_completions+229>:        mov    %edi,%eax
0xc02b0367 <qh_completions+231>:        call   0xc034eec0 <_spin_lock>
0xc02b036c <qh_completions+236>:        mov    %ebx,0xc(%esp)
0xc02b0370 <qh_completions+240>:        mov    0x44(%esi),%eax
0xc02b0373 <qh_completions+243>:        xor    %ebx,%ebx
0xc02b0375 <qh_completions+245>:        mov    %ebp,0x4(%esp)
0xc02b0379 <qh_completions+249>:        mov    %eax,0x8(%esp)
0xc02b037d <qh_completions+253>:        mov    0x40(%esp),%eax
0xc02b0381 <qh_completions+257>:        mov    %eax,(%esp)
0xc02b0384 <qh_completions+260>:        call   0xc02b00c0 <qtd_copy_status>
0xc02b0389 <qh_completions+265>:        mov    %ebx,0x18(%esp)
0xc02b038d <qh_completions+269>:        cmpl   $0xffffff87,0x28(%ebp)
0xc02b0391 <qh_completions+273>:        je     0xc02b04c0 <qh_completions+576>
0xc02b0397 <qh_completions+279>:        mov    %edi,%eax
0xc02b0399 <qh_completions+281>:        call   0xc034f000 <_spin_unlock>
0xc02b039e <qh_completions+286>:        mov    0x20(%esp),%ecx
0xc02b03a2 <qh_completions+290>:        test   %ecx,%ecx
0xc02b03a4 <qh_completions+292>:        je     0xc02b03b4 <qh_completions+308>
0xc02b03a6 <qh_completions+294>:        mov    0x3c(%esi),%edx
0xc02b03a9 <qh_completions+297>:        cmp    0x10(%esp),%edx
0xc02b03ad <qh_completions+301>:        je     0xc02b03b4 <qh_completions+308>
0xc02b03af <qh_completions+303>:        mov    (%esi),%eax
0xc02b03b1 <qh_completions+305>:        mov    %eax,0xffffffc8(%edx)
0xc02b03b4 <qh_completions+308>:        lea    0x38(%esi),%ecx
0xc02b03b7 <qh_completions+311>:        mov    0x38(%esi),%eax
0xc02b03ba <qh_completions+314>:        mov    %esi,%edi
0xc02b03bc <qh_completions+316>:        mov    0x4(%ecx),%edx
0xc02b03bf <qh_completions+319>:        mov    %edx,0x4(%eax)   <= OOPS
0xc02b03c2 <qh_completions+322>:        mov    %eax,(%edx)
0xc02b03c4 <qh_completions+324>:        movl   $0x200200,0x4(%ecx)
0xc02b03cb <qh_completions+331>:        movl   $0x100100,0x38(%esi)
0xc02b03d2 <qh_completions+338>:        mov    0x24(%esp),%eax
0xc02b03d6 <qh_completions+342>:        cmp    0x10(%esp),%eax
0xc02b03da <qh_completions+346>:        mov    (%eax),%edx
0xc02b03dc <qh_completions+348>:        mov    %edx,0x24(%esp)
0xc02b03e0 <qh_completions+352>:        jne    0xc02b02e8 <qh_completions+104>
0xc02b03e6 <qh_completions+358>:        lea    0x0(%esi),%esi
0xc02b03e9 <qh_completions+361>:        lea    0x0(%edi),%edi
0xc02b03f0 <qh_completions+368>:        test   %edi,%edi
0xc02b03f2 <qh_completions+370>:        je     0xc02b0430 <qh_completions+432>
0xc02b03f4 <qh_completions+372>:        mov    0x48(%esp),%ecx
0xc02b03f8 <qh_completions+376>:        mov    %ecx,0x8(%esp)
0xc02b03fc <qh_completions+380>:        mov    0x40(%edi),%eax
0xc02b03ff <qh_completions+383>:        mov    %eax,0x4(%esp)
0xc02b0403 <qh_completions+387>:        mov    0x40(%esp),%eax
0xc02b0407 <qh_completions+391>:        mov    %eax,(%esp)
0xc02b040a <qh_completions+394>:        call   0xc02b01c0 <ehci_urb_done>
[...]
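
To match the oops against the gdb listing, note that the oops prints
symbol offsets in hex (qh_completions+0x13f) while gdb prints them in
decimal (qh_completions+319). A small sketch of that cross-check, using
only addresses quoted in this report; symbol_base and same_instruction
are hypothetical helper names:

```python
def symbol_base(address, offset):
    """Start address of a function, given an address inside it and its offset."""
    return address - offset


def same_instruction(oops_addr, oops_hex_offset, gdb_addr, gdb_dec_offset):
    """True if an oops 'sym+0x..' entry and a gdb '<sym+N>' line agree."""
    return (oops_addr == gdb_addr and
            symbol_base(oops_addr, oops_hex_offset) ==
            symbol_base(gdb_addr, gdb_dec_offset))


if __name__ == "__main__":
    # EIP from the oops vs. the gdb line marked "<= OOPS".
    print(same_instruction(0xc02b03bf, 0x13f, 0xc02b03bf, 319))
    # Function start implied by the EIP and by the first gdb line shown.
    print(hex(symbol_base(0xc02b03bf, 0x13f)))
    print(hex(symbol_base(0xc02b0330, 176)))
```

Both computations give the same start address for qh_completions, so
the line marked OOPS is the instruction the oops reports.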

