Kernel oops on st module cycling

Jean Delvare Fri, 22 Feb 2013 05:02:49 -0800

Hi Kai, James,

It only takes a few st module rmmod/modprobe cycles to get a kernel
oops. It was reported to me, and reproduced by me, on kernel 3.0.58 /
SLES11 SP2, but I was also able to reproduce it on more recent kernels
(3.4.6 / openSUSE 12.2 and 3.7.6 / openSUSE 12.3 RC1.)
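
For anyone wanting to reproduce this, the cycling can be scripted. The following is only a minimal sketch in Python, assuming root privileges and that rmmod and modprobe are in PATH. The function names, the cycle count and the delay are illustrative choices, not values from the original report.

```python
import subprocess
import time


def cycle_commands(module="st", cycles=5):
    """Return the rmmod/modprobe command sequence for the given number of cycles."""
    commands = []
    for _ in range(cycles):
        commands.append(["rmmod", module])
        commands.append(["modprobe", module])
    return commands


def run_cycles(module="st", cycles=5, delay=1.0):
    """Run the unload/load cycles (needs root).

    The delay is an assumption: it is meant to give udev time to run
    its rules after each modprobe before the module is removed again.
    """
    for command in cycle_commands(module, cycles):
        subprocess.run(command, check=True)
        time.sleep(delay)
```

Calling run_cycles() as root on a test machine with a tape device attached would perform five unload/load cycles. Whether that is enough to hit the oops will depend on the system.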


The oops doesn't happen on modprobe proper, but on a scsi_id command
run by udev right after modprobe:
KERNEL=="st*[0-9]|nst*[0-9]", ENV{ID_SERIAL}!="?*", WAIT_FOR="$env{BSG_DEV}", 
IMPORT="scsi_id --whitelisted --export --device=$env{BSG_DEV}", 
ENV{ID_BUS}="scsi"

Using kdb I could gather the following backtrace:
Stack traceback for pid 4037
0xffff880039dfa040     4037     4027  1    0   R  0xffff880039dfa4e0 *scsi_id
 [<ffffffff812482d9>] blk_get_queue+0x9/0x30
 [<ffffffff81255f88>] bsg_add_device+0x38/0x1c0
 [<ffffffff81256214>] bsg_get_device+0x104/0x140
 [<ffffffff81256266>] bsg_open+0x16/0x40
 [<ffffffff8117949f>] chrdev_open+0x13f/0x200
 [<ffffffff8117303e>] __dentry_open+0x18e/0x310
 [<ffffffff811732bb>] nameidata_to_filp+0x7b/0x80
 [<ffffffff81182942>] do_last+0x1f2/0x7f0
 [<ffffffff81183ed8>] path_openat+0xc8/0x3f0
 [<ffffffff81184328>] do_filp_open+0x48/0xa0
 [<ffffffff811744c2>] do_sys_open+0x162/0x1f0
 [<ffffffff81174590>] sys_open+0x20/0x30
 [<ffffffff814984c2>] system_call_fastpath+0x16/0x1b
 [<00007f205bf94da0>] 0x7f205bf94da0
     r15 = 0xffff88003b9887b8      r14 = 0xffff88003c469368 
     r13 = 0xffff88003bac5b50      r12 = 0x6b6b6b6b6b6b6b6b 
      bp = 0xffff88003bb23bd8       bx = 0xfffffffffffffffa 
     r11 = 0x0000000000000001      r10 = 0x0000000000000000 
      r9 = 0xffff88003d637290       r8 = 0x0000000000000000 
      ax = 0x0000000000000000       cx = 0xffff88003fc00000 
      dx = 0xffff88003bac5b50       si = 0x6b6b6b6b6b6b6b6b 
      di = 0x6b6b6b6b6b6b6b6b  orig_ax = 0xffffffffffffffff 
      ip = 0xffffffff812482d9       cs = 0x0000000000000010 
   flags = 0x0000000000010286       sp = 0xffff88003bb23bc0 
      ss = 0x0000000000000018 &regs = 0xffff88003bb23b28
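
Several registers in this dump hold 0x6b repeated in every byte. 0x6b is POISON_FREE in include/linux/poison.h, the byte the slab allocator writes into freed objects when slab poisoning is enabled. Assuming this kernel was built with that debugging option (an assumption, not something verified here), such values would be consistent with a pointer having been read from already-freed memory. The sketch below picks those registers out of a kdb-style dump. The function name and the regular expression are illustrative only.

```python
import re

POISON_FREE_BYTE = 0x6B  # POISON_FREE in include/linux/poison.h


def poisoned_registers(dump):
    """Return the names of registers whose 64-bit value is 0x6b in every byte."""
    poison = int.from_bytes(bytes([POISON_FREE_BYTE]) * 8, "big")
    found = []
    # Matches "name = 0x<16 hex digits>" pairs as printed in the dump above.
    for name, value in re.findall(r"(\w+) = (0x[0-9a-fA-F]{16})", dump):
        if int(value, 16) == poison:
            found.append(name)
    return found


excerpt = """
     r13 = 0xffff88003bac5b50      r12 = 0x6b6b6b6b6b6b6b6b
      dx = 0xffff88003bac5b50       si = 0x6b6b6b6b6b6b6b6b
      di = 0x6b6b6b6b6b6b6b6b  orig_ax = 0xffffffffffffffff
"""
print(poisoned_registers(excerpt))  # ['r12', 'si', 'di']
```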

Note that the kernel log messages right before the oops are suspicious.
Normally I would get:

[  272.155460] st: Version 20101219, fixed bufsize 32768, s/g segs 256
[  272.156586] st 3:0:4:0: Attached scsi tape st0
[  272.156592] st 3:0:4:0: st0: try direct i/o: yes (alignment 4 B)

but before the oops I get:

[  482.428527] st: Version 20101219, fixed bufsize 32768, s/g segs 256
[  482.429509] st 3:0:4:0: Attached scsi tape st0
[  482.429515] st 3:0:4:0: st0: try direct i/o: yes (alignment 1802201964 B)
[  482.449542] general protection fault: 0000 [#1] SMP 

Note the odd alignment value.
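
The value is less random than it looks. Assuming st prints the request queue's DMA alignment mask plus one (which is how I read the driver, but treat it as an assumption), the mask behind 1802201964 is 0x6b6b6b6b, that is, the byte 0x6b (POISON_FREE in include/linux/poison.h) repeated four times. A small Python sketch of the arithmetic, with illustrative function names:

```python
POISON_FREE_BYTE = 0x6B  # POISON_FREE in include/linux/poison.h


def mask_from_reported_alignment(reported):
    """Undo the assumed '+ 1' that st applies when printing the alignment."""
    return reported - 1


def is_poison_word(value, nbytes=4):
    """True if value consists of the 0x6b byte repeated nbytes times."""
    return value == int.from_bytes(bytes([POISON_FREE_BYTE]) * nbytes, "big")


good_mask = mask_from_reported_alignment(4)           # normal case: mask 3
bad_mask = mask_from_reported_alignment(1802201964)   # case before the oops

print(hex(good_mask), is_poison_word(good_mask))  # 0x3 False
print(hex(bad_mask), is_poison_word(bad_mask))    # 0x6b6b6b6b True
```

If that reading is right, st was already looking at a freed request queue when it printed this message.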

According to gdb, blk_get_queue+0x9 is:

563     if (likely(!test_bit(QUEUE_FLAG_DEAD, &q->queue_flags))) {

where test_bit is implemented by inline function constant_test_bit().

With kernel 3.4.6 I got a different backtrace. I had no serial console
set up at the time, so I could only take a picture. Below is a manual copy
of the trace; I hope I didn't make any typos:

RIP: elv_may_queue+0x7/0x20
Call trace:
 get_request+0x112/0x4a0
 get_request_wait+0x2d/0x210
 blk_get_request+0x6c/0x90
 bsg_map_hdr.isra.7+0xbe/0x340
 bsg_ioctl+0x187/0x230
 do_vfs_ioctl+0x8f/0x530
 sys_ioctl+0x98/0xa0
 system_call_fastpath+0x1a/0x1f

Original pictures are here if needed:
http://users.suse.com/~jdelvare/work/st-oops/

I'd like this bug to be fixed. What extra information can I provide that
would be helpful?

Thanks,
-- 
Jean Delvare
Suse L3

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
