Having the checker check for this change isn't a solution; it won't prevent any data corruption. Even if you check prior to every IO, you still have an obvious race between the check and the subsequent IO. What you gain is the ability to recognise the problem after the fact and somehow warn the user (e.g. disallow all subsequent IO to catch their attention). So then what is a reasonable frequency for the check? How much corruption will you tolerate? A lot of IO can go over an FC connection in a short time.
Also, I'd think your #1 prerequisite isn't quite right. I'd say the hole is wider: you could probably have IO running heavily, as long as none of it is failed by the lower layers in response to the missing cable (and the default retries/timeouts could easily mask even a clumsy cable swap).

Personally, I'm also not inclined to think triggering hotplug events on cable removal/replacement is ideal. Fibre Channel isn't exactly intended as a dynamic environment; the SAN topology is static, or at least expected to be static, from a given host's perspective. (I could be wrong.) If that's the case, it would be helpful for a host (or its admin) to be able to know that topology and its health over time, instead of only knowing the healthy portion of the topology because the unhealthy parts are removed from its mappings. Problem determination seems easier if you have a list of bad parts instead of just a shrinking list of good parts.

Shouldn't the HBA hardware/software be able to recognise and discern between different classes of fabric failures, HBA port link loss, and the return of a link that puts the port in a different place in the SAN? Does the HBA API give a common way to get detailed info out to userspace daemons?