Re: scsi_unjam_host and scsi_try to abort_command

Eric Youngdale Sat, 10 Feb 2001 12:40:45 -0800
    Yes, I must admit that I still don't like how the SMP locking is done
even in 2.4.  One of the projects for 2.5 is to essentially fix it so that
each low-level driver is responsible for all of its own locking, and the
mid-level will handle all of the mid-layer locking.  Ultimately the
io_request_lock would no longer be needed once this is done - each low-level
driver *should* be using its own private lock instead of misusing
io_request_lock.

    The locking in the mid-layer proper is currently fairly clean in the 2.4
kernel.  If you look closely, you will see that io_request_lock is grabbed
prior to each call into a low-level driver, as this is what the low-level
drivers currently expect.

    WRT your specific question, you are correct that this would lead to a
deadlock.   The current design is that the lock is held when the low-level
entrypoints are called.   Given that the error handling entrypoints are all
called from the error handler thread, it is permissible for the error
handler functions in the low-level driver to sleep and wait for a timer
interrupt, but the driver would obviously need to release io_request_lock
before doing this.  Note that the interrupt service routine should also be
grabbing io_request_lock when the interrupt arrives, and releasing it before
the interrupt service routine returns.   Thus when the error handler thread
wakes up from the sleep (as a result of either a timer firing or an
interrupt), the error handler function would need to grab io_request_lock
again.
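
    A minimal sketch of the release/sleep/reacquire sequence described
above, for illustration only.  It is a userspace analogy for Linux/POSIX,
not kernel code: a pthread mutex stands in for io_request_lock, a POSIX
semaphore for whatever the driver sleeps on, and a thread for the interrupt
service routine.  Every name other than io_request_lock is hypothetical,
including the EH_SUCCESS/EH_FAILED return codes.

```c
/* Userspace analogy only, not kernel code. Assumes Linux/POSIX threads. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

enum { EH_SUCCESS = 0, EH_FAILED = 1 };   /* hypothetical return codes */

static pthread_mutex_t io_request_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t abort_done;       /* posted by the simulated interrupt */
static int abort_completed;    /* protected by io_request_lock */

static void sleep_ms(int ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Simulated ISR: grabs the lock on entry, releases it before returning. */
static void *fake_isr(void *arg)
{
    sleep_ms(*(int *)arg);
    pthread_mutex_lock(&io_request_lock);
    abort_completed = 1;
    sem_post(&abort_done);
    pthread_mutex_unlock(&io_request_lock);
    return NULL;
}

/* Entered with io_request_lock held; returns with it held again. */
static int demo_eh_abort_handler(int timeout_ms)
{
    struct timespec deadline;

    /* (a real driver would send the abort to the adapter here) */
    pthread_mutex_unlock(&io_request_lock);   /* never sleep holding it */

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    /* Sleep until the "interrupt" posts or the timeout expires. */
    while (sem_timedwait(&abort_done, &deadline) == -1 && errno == EINTR)
        continue;

    pthread_mutex_lock(&io_request_lock);     /* reacquire before returning */
    return abort_completed ? EH_SUCCESS : EH_FAILED;
}

/* Plays the mid-layer: takes the lock, then calls the driver entry point.
 * isr_delay_ms < 0 means the interrupt never arrives. */
int run_abort(int isr_delay_ms, int timeout_ms)
{
    pthread_t isr;
    int rc, result;

    abort_completed = 0;
    rc = sem_init(&abort_done, 0, 0);
    assert(rc == 0);
    if (isr_delay_ms >= 0) {
        rc = pthread_create(&isr, NULL, fake_isr, &isr_delay_ms);
        assert(rc == 0);
    }
    pthread_mutex_lock(&io_request_lock);
    result = demo_eh_abort_handler(timeout_ms);
    pthread_mutex_unlock(&io_request_lock);

    if (isr_delay_ms >= 0)
        pthread_join(isr, NULL);
    sem_destroy(&abort_done);
    return result;
}
```

    Calling run_abort(10, 2000) models an abort that completes by interrupt
before the timeout, and run_abort(-1, 50) models one that times out.  If the
handler slept without first dropping the lock, fake_isr() could never take
it, which is the deadlock in question.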

    Is this a good enough explanation?  I could come up with a little bit
of example code if it would help.

-Eric

----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, February 09, 2001 5:59 PM
Subject: RE: scsi_unjam_host and scsi_try to abort_command


> There is one problem here (at least I think, unless I misunderstood
> something). The scsi_error_handler() routine calls spin_lock_irqsave()
> before calling scsi_unjam_host(). On a UP system at least,
> spin_lock_irqsave() is nothing but save_flags(); cli(). So basically, do
> we disable all the interrupts when any of the entry points (e.g.
> eh_abort_handler()) gets called? If that is the case, how is the timer
> interrupt going to come? And how is the spawned timer in the HBA driver
> going to help?
> On an SMP system, I understand that the other processor's interrupts are
> still active. So probably this should work. But on a UP system, once
> the interrupts are disabled, nothing can proceed until you come out
> of the HBA driver entry point. This is my understanding. Unless the
> timer interrupts are non-maskable interrupts, according to the code
> this is how it is.
>
> Thanks
> -hiren
>
>
> > -----Original Message-----
> > From: Eric Youngdale [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, February 08, 2001 11:33 AM
> > To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> > Subject: Re: scsi_unjam_host and scsi_try to abort_command
> >
> >
> >
> >     The behavior you are seeing is by design.   Essentially the problem is
> > that when the mid-level keeps timers and aborts a pending abort, the host
> > adapter itself can become quite confused.  This is essentially what was
> > happening with the old error handling code, and it resulted in a sort of
> > power struggle as the low-level driver and the mid-layer both took it for
> > granted that they "owned" the command in question, and both of them were
> > trying to operate on it.
> >
> >     With the new error handling code it becomes the responsibility of the
> > hba driver to spawn a timer if the driver author feels the need.  The
> > driver should then wait for either an interrupt indicating completion or
> > for the timer to fire.  In any case, the hba driver can then accurately
> > inform the mid-layer as to what really happened.  In the event of a
> > timeout, the hba driver has a much better idea of the current status of
> > the command, and can make much better decisions as to how to handle it.
> >
> > -Eric
> >
> >
> > ----- Original Message -----
> > From: <[EMAIL PROTECTED]>
> > To: <[EMAIL PROTECTED]>
> > Sent: Thursday, February 08, 2001 1:05 PM
> > Subject: scsi_unjam_host and scsi_try to abort_command
> >
> >
> > > Hi All,
> > >
> > > I was looking at scsi_error.c and in particular at the
> > > scsi_unjam_host() routine. It calls scsi_try_to_abort_command(), which
> > > calls the eh_abort_handler() routine of the hba driver. We are passing
> > > a timeout to scsi_try_to_abort_command(), but we don't seem to be
> > > making use of it. Shouldn't scsi_try_to_abort_command() spawn a timer
> > > for the specified timeout and then wait for the completion of the
> > > abort operation instead of depending on the return value of
> > > eh_abort_handler()?
> > > The reason is because, aborting a command can be an asynchronous
> > > operation in the sense that, the eh_abort_handler() routine will
> > > send an abort to the device and then the abort will be completed
> > > through interrupt.
> > >
> > > Any input on this ?
> > >
> > > Thanks and regards,
> > > -hiren
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> > > in the body of a message to [EMAIL PROTECTED]
> > >
> >
