Hi All,

We have a number of RHEL5 hosts that we want to connect to a
backend SAN (a Sun 6540).  The hosts have two Emulex HBAs and they are
connected to the SAN via a pair of redundant switches.

This means there are *four* paths to each LUN - two active and two passive.

We are using the standard "lpfc" driver, configured with these options
in /etc/modprobe.conf:

    options lpfc lpfc_nodev_tmo=28 lpfc_lun_queue_depth=16 lpfc_discovery_threads=32

The problem is that when the machines boot and the lightpulse driver
is started, large numbers of I/O errors are output on the console and
the bootup process subsequently takes a _long_ time.

          Reading all physical volumes.  This may take a while...
        end_request: I/O error, dev sdr, sector 0
          /dev/sdr: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdr, sector 573312
          /dev/sdr: read failed after 0 of 4096 at 293535744: Input/output error
        end_request: I/O error, dev sdr, sector 573424
          /dev/sdr: read failed after 0 of 4096 at 293593088: Input/output error
        end_request: I/O error, dev sdr, sector 0
          /dev/sdr: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdr, sector 8
          /dev/sdr: read failed after 0 of 4096 at 4096: Input/output error
        end_request: I/O error, dev sdr, sector 0
          /dev/sdr: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdah, sector 0
          /dev/sdah: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdah, sector 573312
          /dev/sdah: read failed after 0 of 4096 at 293535744: Input/output error
        end_request: I/O error, dev sdah, sector 573424
          /dev/sdah: read failed after 0 of 4096 at 293593088: Input/output error
        end_request: I/O error, dev sdah, sector 0
          /dev/sdah: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdah, sector 8
          /dev/sdah: read failed after 0 of 4096 at 4096: Input/output error
        end_request: I/O error, dev sdah, sector 0
          /dev/sdah: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdc, sector 0
          /dev/sdc: read failed after 0 of 4096 at 0: Input/output error
        end_request: I/O error, dev sdc, sector 573312
          /dev/sdc: read failed after 0 of 4096 at 293535744: Input/output error
        end_request: I/O error, dev sdc, sector 573424
          /dev/sdc: read failed after 0 of 4096 at 293593088: Input/output error
        end_request: I/O error, dev sdc, sector 0
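
For anyone wanting to tally which devices are affected, here is a rough
sketch of how the console output above could be summarised.  The helper
names and the regular expression are illustrative assumptions of mine, not
part of any existing tool, and the 512-byte sector size is assumed (it is
consistent with the log: sector 573312 corresponds to byte offset 293535744).

```python
import re
from collections import Counter

# Matches kernel lines such as "end_request: I/O error, dev sdr, sector 573312".
END_REQUEST = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize_io_errors(log_text):
    """Count end_request I/O errors per device (hypothetical helper)."""
    counts = Counter()
    for match in END_REQUEST.finditer(log_text):
        counts[match.group(1)] += 1
    return counts


def sector_to_offset(sector, sector_size=512):
    """Convert a kernel sector number to a byte offset (512-byte sectors assumed)."""
    return sector * sector_size


# A few lines copied from the console output above.
sample = """\
end_request: I/O error, dev sdr, sector 0
  /dev/sdr: read failed after 0 of 4096 at 0: Input/output error
end_request: I/O error, dev sdr, sector 573312
  /dev/sdr: read failed after 0 of 4096 at 293535744: Input/output error
end_request: I/O error, dev sdah, sector 0
"""

if __name__ == "__main__":
    for dev, n in sorted(summarize_io_errors(sample).items()):
        print(dev, n)
```

Comparing the resulting device list against the passive paths reported by
multipath would confirm whether only the passive paths are producing errors.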

There are also some errors output to the console when the machine is shut down:

        Please stand by while rebooting the system...
        md: stopping all md devices.
        md: md0 switched to read-only mode.
        md: md1 still in use.
        Synchronizing SCSI cache for disk sdah: 
        Synchronizing SCSI cache for disk sdag: 
        FAILED
          status = 1, message = 00, host = 0, driver = 08
          <6>sd: Current: sense key: Illegal Request
            <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1

        Synchronizing SCSI cache for disk sdaf: 
        Synchronizing SCSI cache for disk sdae: 
        Synchronizing SCSI cache for disk sdad: 
        Synchronizing SCSI cache for disk sdac: 
        FAILED
          status = 1, message = 00, host = 0, driver = 08
          <6>sd: Current: sense key: Illegal Request
            <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1

        Synchronizing SCSI cache for disk sdab: 
        Synchronizing SCSI cache for disk sdaa: 
        FAILED
          status = 1, message = 00, host = 0, driver = 08
          <6>sd: Current: sense key: Illegal Request
            <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
        [...]

We have been able to mitigate the long boot times using the modprobe.conf
parameters as outlined above, but the I/O errors are still coming up.

Do you have some recommendations as to how we can prevent these I/O errors
from happening?  Is there some way to make the lpfc driver handle the passive
paths more elegantly?

(I should add that we have a perfectly functioning multipathd
configuration, so all's well once the system is booted.  It's just the
slow boot times and excessive console errors I'm interested in finding
a solution for).

Regards,

Robert Sturrock

_______________________________________________
rhelv5-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/rhelv5-list