Having the checker check for this change isn't a solution...it
wouldn't prevent any data corruption.  Even if you check prior to
every IO, you still have an obvious race between the check and the
subsequent IO.  All you gain is the ability to recognise the problem
and somehow magically warn the user (eg: disallow all subsequent IO
to catch their attention).  So then what is a reasonable frequency
for the check?  How much corruption will you tolerate?  A lot of IO
can go out over an FC connection in a short time.
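
To put a rough number on that exposure, here is a back-of-envelope
sketch.  The 200 MB/s figure is my assumption (roughly a 2 Gbit/s FC
link running flat out) and exposure_mb() is just an illustrative
helper, not anything from the checker code.

```python
def exposure_mb(link_mb_per_s: float, check_interval_s: float) -> float:
    """Upper bound (in MB) on data that can go out between two checks."""
    return link_mb_per_s * check_interval_s


# Assumed throughput: ~200 MB/s, roughly a saturated 2 Gbit/s FC link.
ASSUMED_LINK_MB_PER_S = 200.0

for interval_s in (0.01, 1.0, 30.0):
    at_risk = exposure_mb(ASSUMED_LINK_MB_PER_S, interval_s)
    print(f"{interval_s:>6} s between checks -> up to {at_risk:.0f} MB at risk")
```

Shortening the interval shrinks the window but never closes it, which
is the race again.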

Also, I'd think your #1 pre-requisite isn't quite right.  I'd say the
hole is wider and that you probably could have IO running heavily, as
long as none of it is failed by the lower layers in response to the
missing cable (and the default retries/timeouts could easily mask even
a clumsy cable swap).
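
A deliberately crude model of that masking, to show the shape of the
hole.  It assumes an IO is only failed upward once every retry has
timed out; real error handling is more involved, and the 30 s / 5
retries in the example are assumed values, not something I've checked
against any particular driver.

```python
def outage_is_masked(outage_s: float, cmd_timeout_s: float, retries: int) -> bool:
    """True if the link is back before the retry budget runs out.

    Crude model: the first attempt plus each retry may wait a full
    timeout, so nothing is failed to the upper layers until
    cmd_timeout_s * (retries + 1) seconds have passed.
    """
    budget_s = cmd_timeout_s * (retries + 1)
    return outage_s < budget_s


# Assumed values: 30 s command timeout, 5 retries -> 180 s budget.
print(outage_is_masked(10.0, 30.0, 5))   # a 10 s cable swap
print(outage_is_masked(200.0, 30.0, 5))  # a 200 s outage
```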

Personally I'm also not inclined to think triggering hot plug events
on the cable removal/replacement is ideal.  I think Fibre Channel
isn't exactly intended as a dynamic environment and the SAN topology
is static or expected to be static from a given host's perspective.
(I could be wrong.)  If this is the case, it would be helpful for a
host (or its admin) to be able to know that topology and its health
over time instead of only knowing the healthy portion of the topology
because the unhealthy parts are removed from its mappings.  Problem
determination seems like it's easier if you have a list of bad parts
instead of just a shrinking list of good parts.
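
Something like the following is what I have in mind, as a userspace
sketch only.  Topology, PathRecord and the path names are all made up
for illustration; the point is just that a failed path gets marked and
kept, not deleted.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PathRecord:
    name: str
    healthy: bool = True
    # (timestamp, healthy) pairs, so health over time is visible
    history: list = field(default_factory=list)


class Topology:
    """Keeps every known path; failures change state, never membership."""

    def __init__(self, names):
        self.paths = {n: PathRecord(n) for n in names}

    def mark(self, name, healthy, now=None):
        rec = self.paths[name]
        rec.healthy = healthy
        rec.history.append((time.time() if now is None else now, healthy))

    def bad_parts(self):
        return sorted(n for n, r in self.paths.items() if not r.healthy)


# Hypothetical path names.
topo = Topology(["hba0:port0", "hba0:port1"])
topo.mark("hba0:port1", False)
print(topo.bad_parts())   # the failed path is listed...
print(len(topo.paths))    # ...and the full topology is still known
```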

Shouldn't the HBA hardware/software be able to recognise and discern
between different classes of fabric failure, HBA port link loss, and
the return of a link that puts the port in a different place in the
SAN?  Does the HBA API give a common way to get detailed info out to
userspace daemons?