e1000||ICH5-ATA BUG in 2.6.12-rc1 triggered by ill behaved CD-ROM drive

2005-03-19 Thread Stefan Schmidt
Here's the story again but shorter:
 (-> netdev: "(solved) Re: Fw: 2.6.11-mm2 weird ethernet RTTs" )
It seems that after some reboot my CD-ROM drive went nuts and hogged my
IDE controller (ICH5), which in turn either generated a lot of interrupts
or did not respond to them properly. About half of the time I booted the
machine with the CD-ROM drive attached to ide1 and the HD on ide0, I
experienced heavy disk stalls due to DMA timeouts (°1); the other
half of the time the following kernel BUG was triggered. Removing the
CD-ROM drive resolved all of these problems.

(typos probable - no camera was handy ;-()

Configuring network interfaces ... --- cut here ---
kernel BUG at include/linux/netdevice.h:860!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: ide_cd cdrom ... (left out some here) ... commoncap
e1000 3c59x mii
CPU: 0
EIP: 0060:[<e005159b>] Not tainted VLI
EFLAGS: 00010046 (2.6.12-rc1)
EIP is at e1000_clean+0xcb/0xe0 [e1000]
eax: 00000017 ebx: 00000287 ecx: 00000000 edx: e81e6000
esi: 00000000 edi: de67e800 ebp: 00000040 esp: c042efb0
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c042e000 task=c0360c00)
Stack: de67ea40 00000000 de67e904 de67e800 c13f5fa0 c13f5f80 c02adedf
fffbbf83 0000012c 00000001 c03ec198 c0427bc0 00000000 c0123fc6 0000000a
c03f4f88 00000046 c03f4000 0047a007 c010552a
Call Trace:
 [<c02adedf>] net_rx_action+0x7f/0x110
 [<c0123fc6>] __do_softirq+0xb6/0xd0
 [<c010552a>] do_softirq+0x4a/0x60
 [<c01240a9>] irq_exit+0x39/0x40
 [<c010541d>] do_IRQ+0x4d/0x70
 [<c0103ade>] common_interrupt+0x1a/0x20
 [<c0101053>] default_idle+0x23/0x30
 [<c03f58df>] start_kernel+0x15f/0x180
 [<c03f5350>] unknown_bootoption+0x0/0x1e0
Code: c0 84 c0 74 1a 8b 82 24 02 00 00 b9 9d 00 00 00 89 88 d0 00 00 00
8b 82 24 02 00 00 8b 40 08 31 d2 82 c4 08 89 d0 5b 5e 5f 5d c3 <0f> 0b
5c 03 35 c6 05 e0 eb 91 8d 74 26 00 8d bc 27 00 00 00 00
<0> Kernel panic - not syncing: Fatal exception in interrupt
e1000: eth1: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
--- The End ---

Kernel-config and lspci -vvv attached.

°1:
Mar 18 14:47:53 giscard kernel: irq 177: nobody cared!
Mar 18 14:47:53 giscard kernel:  [__report_bad_irq+36/144] 
__report_bad_irq+0x24/0x90
Mar 18 14:47:53 giscard kernel:  [note_interrupt+97/144] 
note_interrupt+0x61/0x90
Mar 18 14:47:53 giscard kernel:  [__do_IRQ+284/288] __do_IRQ+0x11c/0x120
Mar 18 14:47:53 giscard kernel:  [do_IRQ+70/112] do_IRQ+0x46/0x70
Mar 18 14:47:53 giscard kernel:  ===
Mar 18 14:47:53 giscard kernel:  [common_interrupt+26/32] 
common_interrupt+0x1a/0x20
Mar 18 14:47:53 giscard kernel:  [default_idle+35/48] default_idle+0x23/0x30
Mar 18 14:47:53 giscard kernel:  [cpu_idle+95/112] cpu_idle+0x5f/0x70
Mar 18 14:47:53 giscard kernel:  [start_kernel+351/384] start_kernel+0x15f/0x180
Mar 18 14:47:53 giscard kernel:  [unknown_bootoption+0/448] 
unknown_bootoption+0x0/0x1c0
Mar 18 14:47:53 giscard kernel: handlers:
Mar 18 14:47:53 giscard kernel: [ide_intr+0/336] (ide_intr+0x0/0x150)
Mar 18 14:47:53 giscard kernel: [ide_intr+0/336] (ide_intr+0x0/0x150)
Mar 18 14:47:53 giscard kernel: [usb_hcd_irq+0/96] (usb_hcd_irq+0x0/0x60)
Mar 18 14:47:53 giscard kernel: [pg0+532648896/1069179904] 
(e1000_intr+0x0/0x110 [e1000])
Mar 18 14:47:53 giscard kernel: Disabling IRQ #177
Mar 18 14:47:53 giscard kernel: hda: dma_timer_expiry: dma status == 0x24
Mar 18 14:47:53 giscard kernel: hda: DMA interrupt recovery
Mar 18 14:47:53 giscard kernel: hda: lost interrupt
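
The "nobody cared" report above lists every handler registered on the stuck IRQ line, which shows which devices share it. As a rough illustration only, the following Python sketch pulls the IRQ number and handler names out of such a report. The function name `shared_irq_handlers` and the regular expressions are assumptions made for this example, not part of any kernel tooling, and the sample text is the log above with the mail client's line wrapping undone.

```python
import re

# Sample lines copied from the log above (line wrapping undone).
LOG = """\
Mar 18 14:47:53 giscard kernel: irq 177: nobody cared!
Mar 18 14:47:53 giscard kernel: handlers:
Mar 18 14:47:53 giscard kernel: [ide_intr+0/336] (ide_intr+0x0/0x150)
Mar 18 14:47:53 giscard kernel: [ide_intr+0/336] (ide_intr+0x0/0x150)
Mar 18 14:47:53 giscard kernel: [usb_hcd_irq+0/96] (usb_hcd_irq+0x0/0x60)
Mar 18 14:47:53 giscard kernel: [pg0+532648896/1069179904] (e1000_intr+0x0/0x110 [e1000])
Mar 18 14:47:53 giscard kernel: Disabling IRQ #177
"""

IRQ_RE = re.compile(r"irq (\d+): nobody cared!")
# Handler entries look like "(name+0xOFF/0xLEN)", optionally with " [module]".
HANDLER_RE = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\)")


def shared_irq_handlers(log):
    """Return (irq, [(handler, module_or_None), ...]) for the first report."""
    irq = None
    handlers = []
    in_handlers = False
    for line in log.splitlines():
        match = IRQ_RE.search(line)
        if match:
            irq = int(match.group(1))
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            entry = HANDLER_RE.search(line)
            if entry is None:
                break  # first non-handler line ends the list
            handlers.append((entry.group(1), entry.group(2)))
    return irq, handlers


irq, handlers = shared_irq_handlers(LOG)
print(irq, handlers)
```

For this log the sketch reports IRQ 177 with two `ide_intr` entries, `usb_hcd_irq`, and `e1000_intr` from the e1000 module, i.e. the IDE channels, a USB controller and the e1000 NIC all sit on the same interrupt line.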

Stefan
-- 
"Reality is a poor substitute for my dreams."
- anonymous

0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 
(rev 02) (prog-if 00 [UHCI])
Subsystem: Asustek Computer, Inc. P4P800 Mainboard
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- [...] >SERR- [...]

0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02)
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- [...] >SERR- [...]

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.12-rc1
# Fri Mar 18 13:49:06 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_LOCK_KERNEL=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y

0000:00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub 
Interface (rev 02)
Subsystem: Asustek Computer, Inc.: Unknown device 80a5
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort+ >SERR- <PERR-
Latency: 0
Region 0: Memory at fe800000 (32-bit, prefetchable) [size=4M]
Capabilities: [e4] #09 [0106]

0000:00:02.0 VGA compatible controller: Intel Corp. 82865G Integrated Graphics 
Device (rev 02) (prog-if 00 [VGA])
Subsystem: Asustek Computer, Inc.: Unknown device 80a5
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin A routed to IRQ 169
Region 0: Memory at f0000000 (32-bit, prefetchable) [size=128M]
Region 1: Memory at fe780000 (32-bit, non-prefetchable) [size=512K]
Region 2: I/O ports at efe0 [size=8]
Capabilities: [d0] Power Management version 1
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

0000:00:03.0 PCI bridge: Intel Corp. 82865G/PE/P PCI to CSA Bridge (rev 02) 
(prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+