Hello, Jeff & Albert.
This is another patchset fixing races in libata.
The first patch implements ata_poll_qc_complete, which grabs the host
lock, turns irq back on, completes the qc and releases the host lock.
All command completions in polling tasks are replaced with calls to
this function.
The effect is...
* Previously, if an interrupt from another port or a spurious
interrupt occurred between ata_irq_on and ata_qc_complete, the qc
could be completed twice. This race is fixed.
* atapi_packet_task() now reenables IRQ on error exit.
* ata_qc_complete() is now always called with the host lock held.
The second patch adds locking to ata_scsi_error such that
->eng_timeout is called only after ata_qc_complete has finished.
Previously, EH and the latter part of ata_qc_complete could run side
by side.
Albert, I think this patch should kill the race you tried to fix by
moving ATA_QCFLAG_ACTIVE clearing above ->complete_fn invocation.
[ Start of patch descriptions ]
01_libata_add-ata_poll_qc_complete.patch
: implement ata_poll_qc_complete and use it in polling functions
Previously, libata polling functions turned irq back on and
completed qc commands without holding host lock. This creates
a race condition between the polling task and interrupts from
other ports on the same host set or spurious interrupt from
itself.
This patch implements ata_poll_qc_complete, which enables irq
and completes the qc atomically, and converts all polling
functions to use it.
Note: Unlike other functions, atapi_packet_task() did not
turn irq back on when exiting on error. This patch makes it
use ata_poll_qc_complete, so irq is now turned back on when
it exits on error.
Note: With this change, ALL invocations of ata_qc_complete()
are now done under host_set lock.
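
The idea behind ata_poll_qc_complete can be sketched as a small
userspace model. This is only an illustration under stated
assumptions, not the libata code: every model_* name is hypothetical,
a pthread mutex stands in for the host_set lock, and the check made
by the modelled interrupt handler is a simplification.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the libata structures. */
struct model_host {
	pthread_mutex_t lock;	/* stands in for the host_set lock */
};

struct model_port {
	struct model_host *host;
	bool irq_enabled;	/* polling runs with the port irq off */
};

struct model_qc {
	struct model_port *ap;
	bool active;		/* stands in for ATA_QCFLAG_ACTIVE */
	int completions;	/* how many times the qc was completed */
};

/* Caller must hold host->lock. Completing an inactive qc is a bug. */
void model_qc_complete(struct model_qc *qc)
{
	assert(qc->active);
	qc->active = false;
	qc->completions++;
}

/*
 * Simplified interrupt handler: under the lock, complete the qc only
 * if the port irq is on and the qc is still active.
 */
void model_interrupt(struct model_qc *qc)
{
	struct model_host *host = qc->ap->host;

	pthread_mutex_lock(&host->lock);
	if (qc->ap->irq_enabled && qc->active)
		model_qc_complete(qc);
	pthread_mutex_unlock(&host->lock);
}

/*
 * Model of the polled completion helper: irq enabling and qc
 * completion happen in one critical section, so the interrupt
 * handler cannot observe "irq on, qc still active" in between.
 */
void model_poll_qc_complete(struct model_qc *qc)
{
	struct model_host *host = qc->ap->host;

	pthread_mutex_lock(&host->lock);
	qc->ap->irq_enabled = true;
	model_qc_complete(qc);
	pthread_mutex_unlock(&host->lock);
}
```

In this model, an interrupt arriving before the helper runs sees the
irq off and does nothing, and one arriving after it sees an inactive
qc, so the qc is completed once.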
02_libata_add-locking-to-ata_scsi_error.patch
: add host_set locking to ata_scsi_error()
SCSI EH can start before ata_qc_complete has fully finished,
so the latter part of ata_qc_complete can run side-by-side
with ->eng_timeout(), interfering with EH.
This patch makes ata_scsi_error() grab and release the
host_set lock before invoking ->eng_timeout().
Note: host_failed decrementing and eh_cmd_q banging are moved
above ->eng_timeout() invocation such that they're done while
holding the lock.
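
The ordering described above can be sketched as follows. Again this
is a userspace model under stated assumptions rather than the patch
itself: the eh_* names are hypothetical, a pthread mutex stands in
for the host_set lock, and plain counters stand in for host_failed
and eh_cmd_q.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the host state touched by EH. */
struct eh_host {
	pthread_mutex_t lock;	/* stands in for the host_set lock */
	int host_failed;	/* failed commands waiting for EH */
	int eh_cmd_q_len;	/* stands in for the eh_cmd_q list */
	int timeouts_handled;	/* bumped by the sample handler below */
};

/* Stands in for the ->eng_timeout() hook. */
typedef void (*eh_timeout_fn)(struct eh_host *host);

/* Sample handler used only to observe that the hook was called. */
void eh_count_timeout(struct eh_host *host)
{
	host->timeouts_handled++;
}

/*
 * Model of the error handler entry point. Taking the lock waits for
 * any completion path that currently holds it, so by the time the
 * timeout hook runs, that completion has finished. The bookkeeping
 * is done inside the same critical section.
 */
int eh_scsi_error(struct eh_host *host, eh_timeout_fn eng_timeout)
{
	pthread_mutex_lock(&host->lock);
	host->host_failed--;
	host->eh_cmd_q_len = 0;
	pthread_mutex_unlock(&host->lock);

	/* Called without the lock, after the synchronization point. */
	eng_timeout(host);
	return 0;
}
```

The lock is dropped before the hook is called, so the hook is free to
take it again if it needs to.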
[ End of patch descriptions ]
Thanks.
--
tejun