On Sat, May 13, 2000 at 02:01:47PM -0400, Eric Youngdale wrote:
> > With no (or less) logging, the result code is 26030000, and scsi_error.c
> > goes into error recovery mode. Total time to scan is about 5 secs.
>
> And this corresponds to SUGGEST_ABORT | DRIVER_TIMEOUT | DID_TIME_OUT
>
> This doesn't make much sense to me. In the event that logging is turned
> off, you should also be getting the DID_BAD_TARGET message, but you are not.
> I guess I can suggest tossing some printk in both advansys.c and in
> scsi_old_done in scsi_obsolete.c, and try and narrow down more about the
> sequence of events. You can reduce the amount of spew by ignoring status
> results of 0 - that is what you would get for normal completion of a
> command.
More clues... I've hacked scsi_scan.c to only scan IDs 0 and 1. I've also
hacked advansys_interrupt() to set asc_dbglvl to 2 during second and
subsequent interrupts. Then, by playing with the mlqueue logging level,
I can cause the scan of device 1 to fail or not.
Here's what happens:
# echo scsi log none > /proc/scsi/scsi
# echo scsi log mlqueue 7 > /proc/scsi/scsi
# insmod advansys.o && rmmod advansys
scsi0 : AdvanSys SCSI 3.3A: PCI Ultra 16 CDB: IO 6000/F, IRQ 11
scsi : 1 host.
scsi_do_req (host = 0, channel = 0 target = 0, buffer =00000000,
bufflen = 0, done = c01a0d28, timeout = 600, retries = 5)
command : 00 00 00 00 00 00 67 c2 f8 09
Leaving scsi_do_cmd()
scsi_dispatch_cmnd (host = 0, channel = 0, target = 0,
command = c1c5f258, buffer = 00000000,
bufflen = 0, done = c01a0d28)
queuecommand : routine at c4825f88
leaving scsi_dispatch_cmnd()
Leaving scsi_do_cmd()
scsi_do_req (host = 0, channel = 0 target = 1, buffer =00000000,
bufflen = 0, done = c01a0d28, timeout = 600, retries = 5)
command : 00 00 00 00 00 00 67 c2 f8 09
Leaving scsi_do_cmd()
scsi_dispatch_cmnd (host = 0, channel = 0, target = 1,
command = c1c5f058, buffer = 00000000,
bufflen = 0, done = c01a0d28)
queuecommand : routine at c4825f88
leaving scsi_dispatch_cmnd()
Leaving scsi_do_cmd()
advansys_interrupt: begin
advansys_interrupt: i 0, boardp c0281880
advansys_interrupt: before AscISR()
DvcGetQinfo: (16 bytes)
881828C0 03000000 01000001 CCD482C4
asc_isr_callback: asc_dvc_varp c0281888, qdonep c0257eec
ASC_QDONE_INFO at addr c0257eec
srb_ptr 1c5f000, target_ix 1, cdb_len 6, tag_code 0, done_stat 4
host_stat 11, scsi_stat 0, scsi_msg 0
asc_isr_callback: scp c1c5f000
CDB: (6 bytes)
00000000 0000
asc_isr_callback: shp c0281800
asc_rmqueue: new q_min_tim[1] 26
asc_rmqueue: new q_max_tim[1] 26
asc_isr_callback: QD_WITH_ERROR
asc_isr_callback: host_stat 11
asc_enqueue: new q_max_cnt[1] 1
advansys_interrupt: done_scp 0, last_scp 0
asc_dequeue_list: new q_min_tim[1] 0
asc_scsi_done_list: begin
asc_scsi_done_list: done
advansys_interrupt: end
scsi : 0 hosts.
What's of interest here is the stuff between "advansys_interrupt: begin"
and "advansys_interrupt: end". Now, here's what it looks like when I
switch off mlqueue logging:
# insmod advansys.o && rmmod advansys
scsi0 : AdvanSys SCSI 3.3A: PCI Ultra 16 CDB: IO 6000/F, IRQ 11
scsi : 1 host.
advansys_interrupt: begin
advansys_interrupt: i 0, boardp c0281080
advansys_interrupt: before AscISR()
advansys_interrupt: done_scp 0, last_scp 0
asc_scsi_done_list: begin
asc_scsi_done_list: done
advansys_interrupt: end
advansys_interrupt: begin
advansys_interrupt: i 0, boardp c0281080
advansys_interrupt: before AscISR()
DvcGetQinfo: (16 bytes)
881028C0 03000000 01000001 CCD482C4
asc_isr_callback: asc_dvc_varp c0281088, qdonep c0257eec
ASC_QDONE_INFO at addr c0257eec
srb_ptr 1c5f400, target_ix 1, cdb_len 6, tag_code 0, done_stat 4
host_stat 11, scsi_stat 0, scsi_msg 0
asc_isr_callback: scp c1c5f400
CDB: (6 bytes)
00000000 0000
asc_isr_callback: shp c0281000
asc_rmqueue: new q_min_tim[1] 25
asc_isr_callback: QD_WITH_ERROR
asc_isr_callback: host_stat 11
asc_enqueue: new q_max_cnt[1] 1
advansys_interrupt: done_scp 0, last_scp 0
asc_dequeue_list: new q_min_tim[1] 0
asc_scsi_done_list: begin
asc_scsi_done_list: done
advansys_interrupt: end
scsi : 0 hosts.
Scanning of ID 0 is now silent, and that makes things a bit
different when the card interrupts after the scan of ID 1. From
my reading of this, it looks like either:
o The card is interrupting, but doesn't actually have any results
for us. AscISR() doesn't find any work to do, so asc_isr_callback
never gets called.
o Something else is interrupting on IRQ 11. I don't think this is
likely because I can see all my cards generating interrupts on
other IRQs (via /proc/interrupts).
I'm going to try printk()s in AscISR and friends and see if I can
figure out what condition is causing asc_isr_callback() not to be
called.
Later,
Kenn
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]