Hi all,

I set up an HTTP HA cluster consisting of 3 nodes. Node 1 is the gnbd server used for fencing. Nodes 2 and 3 provide the HTTP HA service. Suppose the http service is running on node 3. When the network cable of node 3 is unplugged, the service fails over to node 2 properly, but the cman service on node 3 is killed after the cable is plugged back in, and cman's pid file is still there. Worse, the cman service cannot be started again, and node 3 cannot be shut down.

OS: RHEL 5 (2.6.18-8.el5)

Related rpms:
Cluster_Administration-en-US-5.0.0-5.noarch.rpm
cluster-cim-0.8-27.el5.i386.rpm
cluster-snmp-0.8-27.el5.i386.rpm
modcluster-0.8-27.el5.i386.rpm
rgmanager-2.0.23-1.i386.rpm
system-config-cluster-1.0.50-1.0.noarch.rpm
gnbd-1.1.5-1.el5.i386.rpm
kmod-gnbd-0.1.3-4.2.6.18_8.el5.i686.rpm
kmod-gnbd-PAE-0.1.3-4.2.6.18_8.el5.i686.rpm
kmod-gnbd-xen-0.1.3-4.2.6.18_8.el5.i686.rpm

Partial log messages on node 3:
openais[6621]: [CPG ] got joinlist message from node 1
openais[6621]: [CPG ] got joinlist message from node 2
openais[6621]: [CMAN ] cman killed by node 3 for reason 2
gnbd_import: ERROR [../../utils/gnbd_utils.c:78] cman_init failed : Connection refused
gfs_controld[6648]: cman_start_notification error -1 104
dlm_controld[6641]: cluster is down, exiting
fenced[6635]: cluster is down, exiting
fence_node[6645]: agent "fence_gnbd" reports: gnbd_import: ERROR cannot get node name : Connection refused
gnbd_import: ERROR If you are not planning to use a cluster manager, use -n
failed: fence_gnbd, node03
kernel: dlm: closing connection to node 3
fence_node[6645]: Fence of "node03" was unsuccessful
kernel: dlm: closing connection to node 2
kernel: dlm: closing connection to node 1
ccsd[6615]: Unable to connect to cluster infrastructure after 30 seconds.
ccsd[6615]: Unable to connect to cluster infrastructure after 60 seconds.

Any help would be greatly appreciated.

--
Regards,
Changer

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
