If the problem is device naming, use multipath to create a static /dev/mapper/<alias> device name, which will always map to the same disk regardless of the order in which devices are detected.
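
As a rough illustration of that suggestion, the sketch below prints a "multipaths" stanza for /etc/multipath.conf that ties an alias to a disk's WWID. The function name multipath_alias_stanza is made up for this example, and the WWID and alias are placeholders supplied by the caller. The real WWID has to be read from the iSCSI disk itself (usually with scsi_id, whose invocation differs between releases).

```shell
#!/bin/sh
# Sketch only: print a multipath.conf stanza that pins an alias to a WWID.
# multipath_alias_stanza is a hypothetical helper, not part of any package.
multipath_alias_stanza() {
    wwid=$1
    alias_name=$2
    if [ -z "$wwid" ] || [ -z "$alias_name" ]; then
        echo "usage: multipath_alias_stanza WWID ALIAS" >&2
        return 1
    fi
    # The multipaths/multipath/wwid/alias keywords are standard multipath.conf syntax.
    cat <<EOF
multipaths {
    multipath {
        wwid  $wwid
        alias $alias_name
    }
}
EOF
}
```

With such a stanza in /etc/multipath.conf and multipathd running, the disk should show up as /dev/mapper/ALIAS whether the kernel detects it as sda or sdb.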

Anuj Singh (अनुज) wrote:
Thanks,
I changed the script a bit and things are working now, by resetting the iscsi service.
But a device name that is independent of detection order would be better.

Thanks and regards
Anuj Singh



On Fri, Sep 5, 2008 at 5:02 PM, Anuj Singh (अनुज) <[EMAIL PROTECTED]> wrote:

    Hi,
    I configured a cluster using gfs1 on RHEL 4, kernel version
    2.6.9-55.16.EL, using an iscsi target and initiator.
    The gfs1 mount is exported via the nfs service.

    I can manually stop all services in the following sequence:
    nfs, portmap, rgmanager, gfs, clvmd, fenced, cman, ccsd.
    To stop my iscsi service, I first run 'vgchange -aln' and then stop
    the iscsi service; otherwise I get a "module in use" error, because
    I have a clustered LVM volume on the iscsi device (/dev/sda1).
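
The shutdown order described above could be scripted roughly as follows. This is only a sketch and not the script linked later in this message. The function name stop_cluster_then_iscsi and the RUN variable are assumptions made for illustration; setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the stop order from the message; nothing runs until the function is called.
# RUN is a hypothetical dry-run hook: RUN=echo prints each command instead of executing it.
CLUSTER_SERVICES="nfs portmap rgmanager gfs clvmd fenced cman ccsd"

stop_cluster_then_iscsi() {
    run=${RUN:-}
    # Stop the cluster stack from the top down, aborting on the first failure.
    for svc in $CLUSTER_SERVICES; do
        $run /etc/init.d/"$svc" stop || return 1
    done
    # Deactivate local volume groups so nothing keeps the iscsi disk open.
    $run vgchange -aln || return 1
    # Only now stop iscsi, so the iscsi_sfnet module is no longer in use.
    $run /etc/init.d/iscsi stop
}
```

A dry run with RUN=echo is a cheap way to check the order before calling the function for real on a node.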

    Everything works fine, but when I try to simulate a possible
    problem, e.g. the iscsi service being stopped, I get the following error.

    Test1:
    While the cluster is running, I stop the iscsi service with:
     /etc/init.d/iscsi stop
    Searching for iscsi-based multipath maps
    Found 0 maps
    Stopping iscsid:                                           [  OK  ]
    Removing iscsi driver: ERROR: Module iscsi_sfnet is in use
                                                               [FAILED]
    To stop my iscsi service without a failure, I stop all cluster
    services as follows:
    /etc/init.d/nfs stop
    /etc/init.d/portmap stop
    /etc/init.d/rgmanager stop
    /etc/init.d/gfs stop
    /etc/init.d/clvmd stop
    /etc/init.d/fenced stop
    /etc/init.d/cman stop
    /etc/init.d/ccsd stop
    Every service stops with an OK message. But when I stop my iscsi
    service again, I get the same error:
     /etc/init.d/iscsi stop
    Removing iscsi driver: ERROR: Module iscsi_sfnet is in use
                                                               [FAILED]

    On my iscsi device (which is /dev/sda1), I have an LVM volume with a
    gfs1 file system. As all the cluster services are stopped, I try to
    deactivate the LVM volume with:

     vgchange -aln
      /dev/dm-0: read failed after 0 of 4096 at 0: Input/output error
      No volume groups found

    At this point, if I start my iscsi service, my /dev/sda becomes
    /dev/sdb, and the iscsi service gives me the following error:

    [EMAIL PROTECTED] new]# /sbin/service iscsi start
    Checking iscsi config:                                     [  OK  ]
    Loading iscsi driver:                                      [  OK  ]
    mknod: `/dev/iscsictl': File exists
    Starting iscsid:                                           [  OK  ]

    Sep  5 16:42:37 pr0031 iscsi: iscsi config check succeeded
    Sep  5 16:42:37 pr0031 iscsi: Loading iscsi driver:  succeeded
    Sep  5 16:42:42 pr0031 iscsid[20732]: version 4:0.1.11-7 variant (14-Apr-2008)
    Sep  5 16:42:42 pr0031 iscsi: iscsid startup succeeded
    Sep  5 16:42:42 pr0031 iscsid[20736]: Connected to Discovery Address 192.168.10.199
    Sep  5 16:42:42 pr0031 kernel: iscsi-sfnet:host16: Session established
    Sep  5 16:42:42 pr0031 kernel: scsi16 : SFNet iSCSI driver
    Sep  5 16:42:42 pr0031 kernel:   Vendor: IET       Model: VIRTUAL-DISK Rev: 0
    Sep  5 16:42:42 pr0031 kernel:   Type: Direct-Access ANSI SCSI revision: 04
    Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: 1975932 512-byte hdwr sectors (1012 MB)
    Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: drive cache: write through
    Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: 1975932 512-byte hdwr sectors (1012 MB)
    Sep  5 16:42:42 pr0031 kernel: SCSI device sdb: drive cache: write through
    Sep  5 16:42:42 pr0031 kernel:  sdb: sdb1
    Sep  5 16:42:42 pr0031 kernel: Attached scsi disk sdb at scsi16, channel 0, id 0, lun 0
    Sep  5 16:42:43 pr0031 scsi.agent[20764]: disk at /devices/platform/host16/target16:0:0/16:0:0:0

    Because my /dev/sda1 became /dev/sdb1, I have no gfs mount when I
    start the cluster services.

    clurgmgrd[21062]: <notice> Starting stopped service flx
    Sep  5 16:47:16 pr0031 kernel: scsi15 (0:0): rejecting I/O to dead device
    Sep  5 16:47:16 pr0031 clurgmgrd: [21062]: <err> 'mount -t gfs /dev/mapper/VG01-LV01 /u01' failed, error=32
    Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <notice> start on clusterfs:gfsmount_u01 returned 2 (invalid argument(s))
    Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <warning> #68: Failed to start flx; return value: 1
    Sep  5 16:47:16 pr0031 clurgmgrd[21062]: <notice> Stopping service flx


    After the above situation I need to restart the nodes, which I
    don't want to do, so I created a script to handle all this. The
    first time I restart all the services, I still get /dev/sdb (which
    should be /dev/sda so that my cluster can have a gfs mount).
    When I restart all the services a second time, I get no error
    (this time the iscsi disk is attached as /dev/sda, and I don't
    see the "/dev/iscsictl: File exists" error at iscsi startup) and
    the cluster starts working.
    My script: http://www.grex.org/~anuj/cluster.txt

    So, how can I get my cluster working if my /dev/sda becomes /dev/sdb?

    Thanks and Regards
    Anuj Singh






------------------------------------------------------------------------

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
