Hello all,

I am running into a strange issue while configuring a 3-node cluster.
Background: I have successfully deployed a 2-node active-passive cluster for Virtuozzo with shared iSCSI storage.

Problem: I now need a 3-node cluster with 2 active nodes and 1 passive node. I added a Filesystem resource for the /vz partition and pinned this new resource to the 3rd node with a location constraint (sketched below), but unfortunately the resource (the Filesystem for the new LUN) shifts to the active node that already has another LUN mounted on /vz, instead of staying on the new 3rd node. At that point my whole cluster becomes unmanageable, since that active node already has a different LUN bound to /vz.

Details:

iSCSI server: I am using a common iSCSI storage server, on which I have created 2 LUNs for my 2 active nodes.

Resources created:
Locations: I have created 2 location constraints: one for the first active server and one for the new active server.
Groups: I have created 2 groups, one for each active node. In each group I have added 3 resources (virtual IP, Filesystem, vz-cluster script).

Everything was working fine; I was even able to add my 2nd virtual IP on node 3. But whenever I try to add the Filesystem resource for /vz on the new node, the filesystem shifts to the other active running node and the whole cluster becomes unmanageable. For testing I mounted this partition on a different directory, say /mnt, and everything was fine; this time the resource stayed on my 3rd node. So I think the problem is with the /vz directory, but it should not behave like that.

Please help me figure out whether the above scenario is even possible: binding different LUNs on the same /vz directory, but on different machines.
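For reference, the node3 Filesystem resource and its location constraint are defined roughly as in the sketch below. This is a from-memory sketch, not a copy of my cib.xml: the resource/node names and the device UUID match the log further down, but the XML ids, the fstype value, and the exact cibadmin invocation are illustrative, and in my real configuration the primitive sits inside the node3 group rather than standalone.
___________________________________________________________________________________________
# Sketch: Filesystem resource for the new LUN, mounted on /vz
# (Heartbeat 2.x CRM; fstype=ext3 is a placeholder).
cat > resource_mount_node3.xml <<'EOF'
<primitive id="resource_mount_node3" class="ocf" provider="heartbeat" type="Filesystem">
  <instance_attributes id="resource_mount_node3_ia">
    <attributes>
      <nvpair id="rmn3_dev" name="device"
              value="/dev/disk/by-uuid/81c3845e-c2f6-4cb0-a0cd-e00c074942fb"/>
      <nvpair id="rmn3_dir" name="directory" value="/vz"/>
      <nvpair id="rmn3_fs" name="fstype" value="ext3"/>
    </attributes>
  </instance_attributes>
</primitive>
EOF
cibadmin -C -o resources -x resource_mount_node3.xml

# Sketch: location constraint preferring node3 for that resource.
cat > loc_mount_node3.xml <<'EOF'
<rsc_location id="loc_mount_node3" rsc="resource_mount_node3">
  <rule id="loc_mount_node3_rule" score="INFINITY">
    <expression id="loc_mount_node3_expr"
                attribute="#uname" operation="eq" value="node3"/>
  </rule>
</rsc_location>
EOF
cibadmin -C -o constraints -x loc_mount_node3.xml
___________________________________________________________________________________________
One thing I am not sure about: as far as I understand, a positive score like this only makes node3 preferred; it does not forbid the other nodes from running the resource, so perhaps that is why it can migrate when something goes wrong on node3.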
The details of my scenario are:

1) I am running CentOS 5.3.
2) The kernel version is 2.6.18-028stab059.6 (the Virtuozzo kernel).
3) The Heartbeat version is 2.1.3-3.
4) Installation was done through RPMs.
5) I am using Heartbeat version 2 (CRM mode).
6) cat /etc/ha.d/ha.cf
___________________________________________________________________________________________
deadtime 10
bcast eth1
crm yes
node node_master
node node_slave
node node3
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
___________________________________________________________________________________________
This file is identical on all 3 nodes.

7) tail -f /var/log/ha-log
___________________________________________________________________________________________
crmd[8325]: 2009/12/18_13:39:55 info: do_lrm_rsc_op: Performing op=resource_mount_node3_monitor_0 key=6:166:9acb7de2-c2b3-42ab-9ee0-89a3c3ad1b88)
lrmd[8322]: 2009/12/18_13:39:55 info: rsc:resource_mount_node3: monitor
Filesystem[11186]: 2009/12/18_13:39:55 WARNING: Couldn't find device [/dev/disk/by-uuid/81c3845e-c2f6-4cb0-a0cd-e00c074942fb]. Expected /dev/??? to exist
cib[11213]: 2009/12/18_13:39:55 info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
cib[11213]: 2009/12/18_13:39:55 info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
cib[11213]: 2009/12/18_13:39:55 info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest: /var/lib/heartbeat/crm/cib.xml.sig.last)
cib[11213]: 2009/12/18_13:39:55 info: write_cib_contents: Wrote version 0.344.3 of the CIB to disk (digest: 221e874690c5176734064408319181b0)
crmd[8325]: 2009/12/18_13:39:55 info: process_lrm_event: LRM operation resource_mount_node3_monitor_0 (call=73, rc=0) complete
cib[11213]: 2009/12/18_13:39:55 info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
cib[11213]: 2009/12/18_13:39:55 info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest: /var/lib/heartbeat/crm/cib.xml.sig.last)
crmd[8325]: 2009/12/18_13:39:57 info: do_lrm_rsc_op: Performing op=resource_mount_node3_stop_0 key=21:167:9acb7de2-c2b3-42ab-9ee0-89a3c3ad1b88)
lrmd[8322]: 2009/12/18_13:39:57 info: rsc:resource_mount_node3: stop
Filesystem[11225]: 2009/12/18_13:39:57 WARNING: Couldn't find device [/dev/disk/by-uuid/81c3845e-c2f6-4cb0-a0cd-e00c074942fb]. Expected /dev/??? to exist
Filesystem[11225]: 2009/12/18_13:39:57 INFO: Running stop for /dev/disk/by-uuid/81c3845e-c2f6-4cb0-a0cd-e00c074942fb on /vz
Filesystem[11225]: 2009/12/18_13:39:57 INFO: Trying to unmount /vz
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:39:57 ERROR: Couldn't unmount /vz; trying cleanup with SIGTERM
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) /vz:
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stdout) 6778 8857 8858 8862 8869 8880 8883 8884 8898 9002
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) mmmmmmmmmm
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8857: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8858: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8862: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8869: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8880: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8883: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 8884: No such process
lrmd[8322]: 2009/12/18_13:39:57 info: RA output: (resource_mount_node3:stop:stderr) Could not kill process 9002: No such process
Filesystem[11225]: 2009/12/18_13:39:57 INFO: Some processes on /vz were signalled
lrmd[8322]: 2009/12/18_13:39:58 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
lrmd[8322]: 2009/12/18_13:39:58 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:39:58 ERROR: Couldn't unmount /vz; trying cleanup with SIGTERM
Filesystem[11225]: 2009/12/18_13:39:58 INFO: No processes on /vz were signalled
lrmd[8322]: 2009/12/18_13:39:59 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:39:59 ERROR: Couldn't unmount /vz; trying cleanup with SIGTERM
Filesystem[11225]: 2009/12/18_13:39:59 INFO: No processes on /vz were signalled
lrmd[8322]: 2009/12/18_13:40:00 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
lrmd[8322]: 2009/12/18_13:40:00 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:40:00 ERROR: Couldn't unmount /vz; trying cleanup with SIGKILL
Filesystem[11225]: 2009/12/18_13:40:00 INFO: No processes on /vz were signalled
lrmd[8322]: 2009/12/18_13:40:01 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
lrmd[8322]: 2009/12/18_13:40:01 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:40:01 ERROR: Couldn't unmount /vz; trying cleanup with SIGKILL
Filesystem[11225]: 2009/12/18_13:40:01 INFO: No processes on /vz were signalled
lrmd[8322]: 2009/12/18_13:40:02 info: RA output: (resource_mount_node3:stop:stderr) umount: /vz: device is busy umount: /vz: device is busy
Filesystem[11225]: 2009/12/18_13:40:02 ERROR: Couldn't unmount /vz; trying cleanup with SIGKILL
Filesystem[11225]: 2009/12/18_13:40:02 INFO: No processes on /vz were signalled
Filesystem[11225]: 2009/12/18_13:40:03 ERROR: Couldn't unmount /vz, giving up!
___________________________________________________________________________________________

Waiting for a positive response.

Cheers,
Jaspal
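P.S. Since the log shows both a missing /dev/disk/by-uuid symlink and a /vz that stays busy, below are the standard checks I can run on node3 while the resource is placed there. This is just a sketch using stock CentOS tools; the UUID is the one from the log above.
___________________________________________________________________________________________
# Does node3 actually see the LUN under the UUID the Filesystem agent expects?
ls -l /dev/disk/by-uuid/ | grep 81c3845e
blkid

# What is mounted on /vz at that moment, and which processes are keeping it busy?
grep ' /vz ' /proc/mounts
fuser -vm /vz
___________________________________________________________________________________________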
