On Sat, May 13, 2000 at 02:01:47PM -0400, Eric Youngdale wrote:
> > With no (or less) logging, the result code is 26030000, and scsi_error.c
> > goes into error recovery mode.  Total time to scan is about 5 secs.
> 
>     And this corresponds to SUGGEST_ABORT | DRIVER_TIMEOUT | DID_TIME_OUT
> 
>     This doesn't make much sense to me.  In the event that logging is turned
> off, you should also be getting the DID_BAD_TARGET message, but you are not.
> I guess I can suggest tossing some printk in both advansys.c and in
> scsi_old_done in scsi_obsolete.c, and try and narrow down more about the
> sequence of events.  You can reduce the amount of spew by ignoring status
> results of 0 - that is what you would get for normal completion of a
> command.

More clues...  I've hacked scsi_scan.c to only scan IDs 0 and 1.  I've also
hacked advansys_interrupt() to set asc_dbglvl to 2 during second and
subsequent interrupts.  Then, by playing with the mlqueue logging level,
I can cause the scan of device 1 to fail or not.

Here's what happens:

   # echo scsi log none > /proc/scsi/scsi
   # echo scsi log mlqueue 7 > /proc/scsi/scsi
   # insmod advansys.o && rmmod advansys
   scsi0 : AdvanSys SCSI 3.3A: PCI Ultra 16 CDB: IO 6000/F, IRQ 11
   scsi : 1 host.
   scsi_do_req (host = 0, channel = 0 target = 0, buffer =00000000,
                   bufflen = 0, done = c01a0d28, timeout = 600, retries = 5)
   command : 00  00  00  00  00  00  67  c2  f8  09
   Leaving scsi_do_cmd()
   scsi_dispatch_cmnd (host = 0, channel = 0, target = 0,
                   command = c1c5f258, buffer = 00000000,
   bufflen = 0, done = c01a0d28)
   queuecommand : routine at c4825f88
   leaving scsi_dispatch_cmnd()
   Leaving scsi_do_cmd()
   scsi_do_req (host = 0, channel = 0 target = 1, buffer =00000000,
                   bufflen = 0, done = c01a0d28, timeout = 600, retries = 5)
   command : 00  00  00  00  00  00  67  c2  f8  09
   Leaving scsi_do_cmd()
   scsi_dispatch_cmnd (host = 0, channel = 0, target = 1,
                   command = c1c5f058, buffer = 00000000,
   bufflen = 0, done = c01a0d28)
   queuecommand : routine at c4825f88
   leaving scsi_dispatch_cmnd()
   Leaving scsi_do_cmd()
   advansys_interrupt: begin
   advansys_interrupt: i 0, boardp c0281880
   advansys_interrupt: before AscISR()
   DvcGetQinfo: (16 bytes)
    881828C0 03000000 01000001 CCD482C4
   asc_isr_callback: asc_dvc_varp c0281888, qdonep c0257eec
   ASC_QDONE_INFO at addr c0257eec
    srb_ptr 1c5f000, target_ix 1, cdb_len 6, tag_code 0, done_stat 4
    host_stat 11, scsi_stat 0, scsi_msg 0
   asc_isr_callback: scp c1c5f000
   CDB: (6 bytes)
    00000000 0000
   asc_isr_callback: shp c0281800
   asc_rmqueue: new q_min_tim[1] 26
   asc_rmqueue: new q_max_tim[1] 26
   asc_isr_callback: QD_WITH_ERROR
   asc_isr_callback: host_stat 11
   asc_enqueue: new q_max_cnt[1] 1
   advansys_interrupt: done_scp 0, last_scp 0
   asc_dequeue_list: new q_min_tim[1] 0
   asc_scsi_done_list: begin
   asc_scsi_done_list: done
   advansys_interrupt: end
   scsi : 0 hosts.

What's of interest here is the stuff between "advansys_interrupt: begin"
and "advansys_interrupt: end".  Now, here's what it looks like when I
switch off mlqueue logging:

   # insmod advansys.o && rmmod advansys
   scsi0 : AdvanSys SCSI 3.3A: PCI Ultra 16 CDB: IO 6000/F, IRQ 11
   scsi : 1 host.
   advansys_interrupt: begin
   advansys_interrupt: i 0, boardp c0281080
   advansys_interrupt: before AscISR()
   advansys_interrupt: done_scp 0, last_scp 0
   asc_scsi_done_list: begin
   asc_scsi_done_list: done
   advansys_interrupt: end
   advansys_interrupt: begin
   advansys_interrupt: i 0, boardp c0281080
   advansys_interrupt: before AscISR()
   DvcGetQinfo: (16 bytes)
    881028C0 03000000 01000001 CCD482C4
   asc_isr_callback: asc_dvc_varp c0281088, qdonep c0257eec
   ASC_QDONE_INFO at addr c0257eec
    srb_ptr 1c5f400, target_ix 1, cdb_len 6, tag_code 0, done_stat 4
    host_stat 11, scsi_stat 0, scsi_msg 0
   asc_isr_callback: scp c1c5f400
   CDB: (6 bytes)
    00000000 0000
   asc_isr_callback: shp c0281000
   asc_rmqueue: new q_min_tim[1] 25
   asc_isr_callback: QD_WITH_ERROR
   asc_isr_callback: host_stat 11
   asc_enqueue: new q_max_cnt[1] 1
   advansys_interrupt: done_scp 0, last_scp 0
   asc_dequeue_list: new q_min_tim[1] 0
   asc_scsi_done_list: begin
   asc_scsi_done_list: done
   advansys_interrupt: end
   scsi : 0 hosts.

Scanning of ID 0 is now silent, and that makes things a bit 
different when the card interrupts after the scan of ID 1.  From
my reading of this, it looks like either:

o  The card is interrupting, but doesn't actually have any results
   for us.  AscISR() doesn't find any work to do, so asc_isr_callback
   never gets called.

o  Something else is interrupting on IRQ 11.  I don't think this is 
   likely because I can see all my cards generating interrupts on
   other IRQs (via /proc/interrupts).

I'm going to try printk()s in AscISR and friends and see if I can
figure out what condition is causing asc_isr_callback() not to be
called.

Later,
Kenn


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]

Reply via email to