Re: Need help with libata error handling in libsas
James Bottomley wrote:
> I keep hearing that we need to convert libsas to use libata's new error handling. Unfortunately, I have very little conception of what that means. Right at the moment, libsas doesn't use any error handling functions of libata at all. I've looked through the libata-eh functions, and I find them frankly incomprehensible.
>
> Firstly, let me say what SAS error handling actually does:

Let me chime in with what ipr error handling does/can do. The ipr firmware provides two basic SATA error handling methods with some modifiers to each.

Cancel All - This cancels all outstanding commands to the device. When issued to an ATA device, this gets escalated by the firmware to an SRST. When issued to an ATAPI device, an ATA NOOP is issued.

Reset Device - This command has modifiers to indicate either a soft reset or a hard reset.

Currently, the only SATA devices that ipr officially attaches are ATAPI DVD devices. In our testing we've come to the conclusion that trying to use anything but a hard reset for ERP is generally more trouble than it is worth.

> All of this leads me to conclude, that all libsas needs is to plumb in the ATA equivalent of abort, junk the task query for libata devices and simply proceed, as if the task is held at the target, along the escalating reset path.

The new libata-eh is used for more than just EH. It is used for device probing, device revalidation, and power management. It is also woken for all command failures and is where the request sense for ATAPI devices is issued. Device revalidation following reset is also critical for ATA and ATAPI devices. One example of this is some SATA/PATA converter chips lose their DMA xfer settings following a reset and default to PIO mode only. Any DMA transfer that is attempted simply hangs.

The other issue is PMP support. The more that gets pushed into libsas, the more libsas needs to know about things such as PMP.

-Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
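
The escalation described in this message (abort first, then go straight to a hard reset, skipping soft reset) can be pictured with a small userspace model. This is only a sketch: every identifier below is hypothetical and none of it is ipr, libsas, or libata code. The "offline" terminal state is an assumption added so the ladder has an end, and the mapping of Cancel All to SRST or an ATA NOOP is taken from the description above.

```c
#include <stdbool.h>

/* Toy model only: all names are hypothetical, not ipr/libsas/libata APIs. */
enum toy_dev_class { TOY_DEV_ATA, TOY_DEV_ATAPI };
enum toy_fw_effect { TOY_FW_SRST, TOY_FW_ATA_NOOP };

/* What the message says "Cancel All" becomes for each device class. */
static enum toy_fw_effect toy_cancel_all_effect(enum toy_dev_class cls)
{
    return cls == TOY_DEV_ATA ? TOY_FW_SRST : TOY_FW_ATA_NOOP;
}

/* Recovery ladder: abort, then hard reset, then give up (assumed). */
enum toy_step { TOY_STEP_ABORT, TOY_STEP_HARD_RESET, TOY_STEP_OFFLINE };

/* Move one rung up the ladder; soft reset is deliberately not a rung. */
static enum toy_step toy_escalate(enum toy_step cur)
{
    return cur == TOY_STEP_ABORT ? TOY_STEP_HARD_RESET : TOY_STEP_OFFLINE;
}

/*
 * Walk the ladder. step_ok[s] tells the model whether step s would
 * recover the device. Returns the step that succeeded, or
 * TOY_STEP_OFFLINE if neither abort nor hard reset helped.
 */
static enum toy_step toy_recover(const bool step_ok[2])
{
    enum toy_step s = TOY_STEP_ABORT;

    while (s != TOY_STEP_OFFLINE && !step_ok[s])
        s = toy_escalate(s);
    return s;
}
```

Calling toy_recover() with an array in which only the hard-reset entry is true returns TOY_STEP_HARD_RESET, which mirrors the "abort, then escalate" path suggested in the quoted text.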
Re: Need help with libata error handling in libsas
On Mon, 2008-02-25 at 10:34 -0600, Brian King wrote:
> The new libata-eh is used for more than just EH. It is used for device probing, device revalidation, and power management. It is also woken for all command failures and is where the request sense for ATAPI devices is issued. Device revalidation following reset is also critical for ATA and ATAPI devices. One example of this is some SATA/PATA converter chips lose their DMA xfer settings following a reset and default to PIO mode only. Any DMA transfer that is attempted simply hangs.

OK ... I'm grepping around in the source trying to figure out all of this. Is it documented anywhere? That would really help me out at the moment.

James

Re: Need help with libata error handling in libsas
James Bottomley wrote:
> On Mon, 2008-02-25 at 10:34 -0600, Brian King wrote:
>> The new libata-eh is used for more than just EH. It is used for device probing, device revalidation, and power management. It is also woken for all command failures and is where the request sense for ATAPI devices is issued. Device revalidation following reset is also critical for ATA and ATAPI devices. One example of this is some SATA/PATA converter chips lose their DMA xfer settings following a reset and default to PIO mode only. Any DMA transfer that is attempted simply hangs.

Strongly seconded. Doing your own ATA EH would be foolish, as that would imply duplicating all that carefully-time-tested logic handling devices which follow the ATA specs... about 98% of the time :) Just the set-transfer-mode logic took years to get right for the majority of ATA devices.

> OK ... I'm grepping around in the source trying to figure out all of this. Is it documented anywhere? That would really help me out at the moment.

Unfortunately, not really. The simplistic version is... freeze, set some flags, call a function to schedule EH as needed -- most notably when your HBA signals an ATA device error or some other error in the ATA domain.

Regardless of all this... libsas IMO will cause some libata-EH growing pains. libsas needs libata-EH for probing, revalidation, initialization, etc. But libsas probably does NOT need libata-EH for certain duties like SATA PHY diagnosis and link handling.

libsas needs libata-EH. Unfortunately for libsas, libata-EH was written from the "libata controls the world" point of view, and probably needs some modifications to play well in the new SATA/SAS shared worldview.

Brian's recommendation is quite sane... your ->error_handler() probably just needs hard reset (aka COMRESET) capability.

Jeff
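
The "freeze, set some flags, schedule EH" sequence, together with the point that revalidation must follow a reset, can be sketched as a toy userspace state machine. Everything here is an assumption made for illustration: the struct, flag names, and functions are hypothetical and are not the libata data structures or entry points. Reset and revalidation outcomes are passed in as booleans so the model stays self-contained.

```c
#include <stdbool.h>

/* Toy model only: every type and function here is hypothetical. */
enum toy_eh_action {
    TOY_EH_REVALIDATE = 1u << 0,
    TOY_EH_HARD_RESET = 1u << 1,
};

struct toy_port {
    bool frozen;          /* no new commands are accepted while set */
    unsigned int action;  /* recovery work requested from EH */
    bool eh_scheduled;    /* EH has been asked to run */
};

/* Error path: freeze the port, record what EH should do, schedule EH. */
static void toy_report_error(struct toy_port *p, unsigned int action)
{
    p->frozen = true;
    p->action |= action;
    p->eh_scheduled = true;
}

/*
 * EH pass: hard reset if requested, then revalidate, then thaw.
 * reset_ok / revalidate_ok stand in for the hardware's response.
 * Returns 0 on recovery, -1 if the port must stay frozen.
 */
static int toy_run_eh(struct toy_port *p, bool reset_ok, bool revalidate_ok)
{
    if (!p->eh_scheduled)
        return 0; /* nothing pending */

    if (p->action & TOY_EH_HARD_RESET) {
        if (!reset_ok)
            return -1;
        /* A reset may wipe transfer-mode settings: force revalidation. */
        p->action |= TOY_EH_REVALIDATE;
    }

    if ((p->action & TOY_EH_REVALIDATE) && !revalidate_ok)
        return -1;

    p->action = 0;
    p->eh_scheduled = false;
    p->frozen = false;
    return 0;
}
```

In this model a failed revalidation leaves the port frozen with its flags intact, so a later EH pass can retry; that retry behavior is a design choice of the sketch, not something stated in the thread.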