I'm not sure which layer caused this oops, but it is probably repeatable.

Oops first, then summary:

Linux nevets 2.4.0-test4 #2 Thu Jul 20 08:53:51 EDT 2000 i586 unknown

Unable to handle kernel NULL pointer dereference at virtual address 000000c4
 printing eip:
c01a43b0
*pde = 00000000
Oops: 0002
CPU:    0
EIP:    0010:[scsi_allocate_device+384/612]
EFLAGS: 00010046
eax: 00000006   ebx: 00000000   ecx: c03ed000   edx: c03e05c0
esi: c03e0ec8   edi: c03e0ec0   ebp: c2a31df8   esp: c2a31dc0
ds: 0018   es: 0018   ss: 0018
Process xmms (pid: 5434, stackpage=c2a31000)
Stack: 00000218 c0c72860 c2a31e5c 00000086 c03e0ec8 c03ed000 00000000 00000004
       c01705bf c0c72860 c03e0ef0 c03e0ee8 c03e0ee8 000f4240 c2a31e2c c01a9ced
       c03e0ec0 00000000 00000000 00000282 c0ab9200 c2a31e5c 000f4240 c0c72860
Call Trace: [generic_make_request+1843/1940] [scsi_request_fn+409/760] 
[generic_unplug_device+32/40] [__wait_on_buffer+162/216] [bread+71/112] 
[isofs_find_entry+151/1420] [path_walk+1818/2028]
       [isofs_cmp+22/28] [isofs_lookup+40/128] [real_lookup+79/156] 
[path_walk+1442/2028] [open_namei+120/1452] [<f2f4e865>] [<e8c6dff5>] [filp_open+48/80]
       [sys_open+54/172] [system_call+52/64]
Code: c7 83 c4 00 00 00 ff ff 00 00 c7 83 f8 00 00 00 00 00 00 00


Attached devices: 
Host: scsi1 Channel: 00 Id: 06 Lun: 00
  Vendor: NRC      Model: MBR-7            Rev: 110 
  Type:   CD-ROM                           ANSI SCSI revision: 02
[7 disk changer, LUN 0-6, rest deleted for brevity]

I think this is a long standing bug in something or other.
I think we keep chipping pieces away, and the symptoms keep changing.
Originally, this was a fatal panic caused by a bug in the AHA1542 reset code.
Someone changed something between test-1 and test-5, and now instead
of causing a scsi reset, it causes the above.

Here's how I [believe] this is triggered...

I was flipping between two mp3's with xmms that happened to be on different
cdroms in my cd changer.  I think the start of both files ended up in
cache, so I was playing the cached part of the mp3 on disk 5 while disk 3 was 
loaded.  When it ran out of cached data, it had to suddenly change disks.

The disk change process is lengthy--between 3 and 6 seconds plus spin
down and spin up time.  Rapid changes between disks or any other cdrom
related activity that doesn't happen immediately seems to reliably cause
problems.  (I suspect scanning the sectors of a disk in reverse order
(requiring frequent lengthy seeks) also would cause this, but I've yet
to test it.)

In previous kernels, I would get scsi bus resets until the offending
process was killed, or the cdrom was removed from the drive, causing
an I/O error.  Subsequent attempts to access the device in any way puts
the process in uninterruptable disk wait.  Other devices (i.e., other
slots in the same cd changer) work just fine.

In this kernel, instead of bus resets, I got the OOPS above.
This OOPS (unlike previous ones) doesn't seem to be AHA1542 related.
(Maybe all the bugs in that driver leading up to this have finally
been smoothed over enough to find the underlying bug.)

Can someone make sense of this one?

        Steve
[EMAIL PROTECTED]


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Reply via email to