Re: Need help with libata error handling in libsas

2008-02-25 Thread Jeff Garzik

James Bottomley wrote:
> On Mon, 2008-02-25 at 10:34 -0600, Brian King wrote:
>> The new libata-eh is used for more than just EH. It is used for device
>> probing, device revalidation, and power management. It is also woken for
>> all command failures and is where the request sense for ATAPI devices is
>> issued. Device revalidation following reset is also critical for ATA and
>> ATAPI devices. One example of this is some SATA/PATA converter chips
>> lose their DMA xfer settings following a reset and default to PIO mode
>> only. Any DMA transfer that is attempted simply hangs.


Strongly seconded.  Doing your own ATA EH would be foolish, as that 
would imply duplicating all that carefully-time-tested logic handling 
devices which follow the ATA specs... about 98% of the time :)


Just the set-transfer-mode logic took years to get right for the 
majority of ATA devices.




> OK ... I'm grepping around in the source trying to figure out all of
> this.  Is it documented anywhere?  That would really help me out at the
> moment.


Unfortunately, not really.  The simplistic version is...  freeze, set 
some flags, call a function to schedule EH as needed -- most notably 
when your HBA signals an ATA device error or some other error in the ATA 
domain.
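
As a rough illustration of that three-step sequence, here is a minimal
userspace sketch.  Every type, flag, and function name in it is a
hypothetical stand-in invented for this example; none of them are the
actual libata structures or entry points, and the choice of requested
actions is only an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT libata's flags or structs. */
enum sketch_eh_action {
    SKETCH_EH_REVALIDATE = 1u << 0,
    SKETCH_EH_HARDRESET  = 1u << 1,
};

struct sketch_port {
    bool frozen;          /* set while the port must not touch the hardware */
    unsigned int action;  /* bitmask of work requested from the EH thread */
    bool eh_scheduled;    /* set once the EH thread has been asked to run */
};

/* Step 1: freeze the port. */
void sketch_freeze(struct sketch_port *port)
{
    assert(port != NULL);
    port->frozen = true;
}

/* Step 2: set flags describing what EH should do. */
void sketch_request_action(struct sketch_port *port, unsigned int action)
{
    assert(port != NULL);
    port->action |= action;
}

/* Step 3: schedule EH. */
void sketch_schedule_eh(struct sketch_port *port)
{
    assert(port != NULL);
    port->eh_scheduled = true;
}

/* What a driver might do when its HBA signals an error in the ATA domain.
 * Requesting a hard reset plus revalidation is an assumption made here. */
void sketch_hba_error(struct sketch_port *port)
{
    sketch_freeze(port);
    sketch_request_action(port, SKETCH_EH_HARDRESET | SKETCH_EH_REVALIDATE);
    sketch_schedule_eh(port);
}
```

The point of the ordering is that the port is quiesced before any
recovery work is requested, and the recovery itself runs later in EH
context rather than in the error path.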



Regardless of all this...   libsas IMO will cause some libata-EH growing 
pains.  libsas needs libata-EH for probing, revalidation, 
initialization, etc.  But libsas probably does NOT need libata-EH for 
certain duties like SATA PHY diagnosis and link handling.


libsas needs libata-EH.  Unfortunately for libsas, libata-EH was written 
from the "libata controls the world" point of view, and probably needs 
some modifications to play well in the new SATA/SAS shared worldview.


Brian's recommendation is quite sane...  your ->error_handler() probably 
just needs hard reset (aka COMRESET) capability.


Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need help with libata error handling in libsas

2008-02-25 Thread James Bottomley
On Mon, 2008-02-25 at 10:34 -0600, Brian King wrote:
> The new libata-eh is used for more than just EH. It is used for device
> probing, device revalidation, and power management. It is also woken for
> all command failures and is where the request sense for ATAPI devices is
> issued. Device revalidation following reset is also critical for ATA and
> ATAPI devices. One example of this is some SATA/PATA converter chips
> lose their DMA xfer settings following a reset and default to PIO mode
> only. Any DMA transfer that is attempted simply hangs.

OK ... I'm grepping around in the source trying to figure out all of
this.  Is it documented anywhere?  That would really help me out at the
moment.

James




Re: Need help with libata error handling in libsas

2008-02-25 Thread Brian King
James Bottomley wrote:
> I keep hearing that we need to convert libsas to use libata's new error
> handling.  Unfortunately, I have very little conception of what that
> means.  Right at the moment, libsas doesn't use any error handling
> functions of libata at all.
> 
> I've looked through the libata-eh functions, and I find them frankly
> incomprehensible.
> 
> Firstly, let me say what SAS error handling actually does:

Let me chime in with what ipr error handling does/can do. The ipr firmware
provides two basic SATA error handling methods with some modifiers to each.

Cancel All - This cancels all outstanding commands to the device. When issued
to an ATA device, this gets escalated by the firmware to an SRST. When issued
to an ATAPI device, an ATA NOOP is issued.

Reset Device - This command has modifiers to indicate either a soft reset
or a hard reset.
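
The mapping described above can be written down as a small lookup, if
that helps.  This is only a sketch: the enum and function names are
hypothetical and are not the ipr driver's interface, and treating the
soft-reset modifier as an SRST is an assumption on top of what is
stated above.

```c
#include <assert.h>

/* Hypothetical names for illustration; NOT the ipr driver's interface. */
enum sketch_dev_type { SKETCH_DEV_ATA, SKETCH_DEV_ATAPI };

enum sketch_ipr_req {
    SKETCH_CANCEL_ALL,   /* cancel all outstanding commands */
    SKETCH_RESET_SOFT,   /* Reset Device, soft reset modifier */
    SKETCH_RESET_HARD,   /* Reset Device, hard reset modifier */
};

enum sketch_effect {
    SKETCH_EFFECT_SRST,        /* ATA software reset */
    SKETCH_EFFECT_NOOP,        /* ATA NOOP issued to the device */
    SKETCH_EFFECT_HARD_RESET,  /* hard reset of the link */
};

/* Return what ends up being issued for a given request and device type. */
enum sketch_effect sketch_ipr_effect(enum sketch_ipr_req req,
                                     enum sketch_dev_type dev)
{
    assert(dev == SKETCH_DEV_ATA || dev == SKETCH_DEV_ATAPI);

    switch (req) {
    case SKETCH_CANCEL_ALL:
        /* Escalated to SRST for ATA devices; a NOOP for ATAPI devices. */
        return dev == SKETCH_DEV_ATA ? SKETCH_EFFECT_SRST
                                     : SKETCH_EFFECT_NOOP;
    case SKETCH_RESET_SOFT:
        return SKETCH_EFFECT_SRST;  /* assumption: soft reset means SRST */
    case SKETCH_RESET_HARD:
        return SKETCH_EFFECT_HARD_RESET;
    }
    /* Unknown request: fall back to the strongest recovery action. */
    return SKETCH_EFFECT_HARD_RESET;
}
```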

Currently, the only SATA devices that ipr officially attaches are ATAPI
DVD devices. In our testing we've come to the conclusion that trying to
use anything but a hard reset for ERP is generally more trouble than it
is worth.

> All of this leads me to conclude, that all libsas needs is to plumb in
> the ATA equivalent of abort, junk the task query for libata devices and
> simply proceed, as if the task is held at the target, along the
> escalating reset path.

The new libata-eh is used for more than just EH. It is used for device
probing, device revalidation, and power management. It is also woken for
all command failures and is where the request sense for ATAPI devices is
issued. Device revalidation following reset is also critical for ATA and
ATAPI devices. One example of this is some SATA/PATA converter chips
lose their DMA xfer settings following a reset and default to PIO mode
only. Any DMA transfer that is attempted simply hangs.
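
To make the converter-chip case concrete, here is a toy model of why
revalidation has to follow a reset.  All names are hypothetical and
nothing here is libata code; it only models the host's idea of the
transfer mode drifting away from what the device is actually using.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model for illustration; NOT libata code. */
enum sketch_xfer { SKETCH_XFER_PIO, SKETCH_XFER_DMA };

struct sketch_bridge {
    enum sketch_xfer active;      /* mode the converter chip is really in */
};

struct sketch_host_view {
    enum sketch_xfer configured;  /* mode the host believes was negotiated */
};

/* A reset makes the chip forget its DMA settings and default to PIO. */
void sketch_reset(struct sketch_bridge *dev)
{
    assert(dev != NULL);
    dev->active = SKETCH_XFER_PIO;
}

/* A DMA command sent to a chip that is back in PIO mode never completes. */
bool sketch_dma_would_hang(const struct sketch_host_view *host,
                           const struct sketch_bridge *dev)
{
    assert(host != NULL && dev != NULL);
    return host->configured == SKETCH_XFER_DMA &&
           dev->active != SKETCH_XFER_DMA;
}

/* Revalidation reprograms the transfer mode the host expects. */
void sketch_revalidate(const struct sketch_host_view *host,
                       struct sketch_bridge *dev)
{
    assert(host != NULL && dev != NULL);
    dev->active = host->configured;
}
```

Skipping the revalidate step after the reset leaves the model in
exactly the state where sketch_dma_would_hang() reports true.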

The other issue is PMP support. The more that gets pushed into libsas,
the more libsas needs to know about things such as PMP.

-Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center


