A curious observation: there has been a sudden surge of emails sent to private addresses rather than to the mailing list.
Please send your doubts/questions to the mailing list (linux-cluster@redhat.com) instead of addressing them to me personally.

Regarding the configuration for manual fencing: I don't have it with me; it was available with RHEL 5.5. Check the system-config-cluster tool to see whether it lets you add manual fencing.

Thanks,
Parvez
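For reference, here is a rough sketch of what the manual-fencing pieces of cluster.conf looked like on RHEL 5.x, written from memory and untested. The device name "manual" and the node names node1/node2 are placeholders; adjust them to match your own config:

    <!-- placeholder node and device names; adjust to your cluster.conf -->
    <clusternodes>
      <clusternode name="node1" nodeid="1">
        <fence>
          <method name="1">
            <device name="manual" nodename="node1"/>
          </method>
        </fence>
      </clusternode>
      <clusternode name="node2" nodeid="2">
        <fence>
          <method name="1">
            <device name="manual" nodename="node2"/>
          </method>
        </fence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice agent="fence_manual" name="manual"/>
    </fencedevices>

After a node failure you acknowledge the fence by hand with fence_ack_manual on the surviving node. Bear in mind that manual fencing is not supported on RHEL 6, and your corosync/cman logs below look like RHEL 6, so a real fence device is the better long-term answer.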
On Wed, Oct 3, 2012 at 10:46 AM, Renchu Mathew <rench...@gmail.com> wrote:
> Hi Parvez,
>
> I am trying to set up a test cluster environment, but I haven't done
> fencing. Please find the error messages below. Sometimes, after the
> nodes restart, the other node goes down. Can you please send me the
> configuration for manual fencing?
>
>> > Please find attached my cluster setup. It is not stable,
>> > and /var/log/messages shows the errors below.
>> >
>> > Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2
>> > Sep 11 08:49:10 node1 corosync[1814]: [QUORUM] Members[2]: 1 2
>> > Sep 11 08:49:10 node1 corosync[1814]: [CPG ] chosen downlist: sender r(0) ip(192.168.1.251) ; members(old:2 left:1)
>> > Sep 11 08:49:10 node1 corosync[1814]: [MAIN ] Completed service synchronization, ready to provide service.
>> > Sep 11 08:49:11 node1 corosync[1814]: cman killed by node 2 because we were killed by cman_tool or other application
>> > Sep 11 08:49:11 node1 fenced[1875]: telling cman to remove nodeid 2 from cluster
>> > Sep 11 08:49:11 node1 fenced[1875]: cluster is down, exiting
>> > Sep 11 08:49:11 node1 gfs_controld[1950]: cluster is down, exiting
>> > Sep 11 08:49:11 node1 gfs_controld[1950]: daemon cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 gfs_controld[1950]: cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 dlm_controld[1889]: cluster is down, exiting
>> > Sep 11 08:49:11 node1 dlm_controld[1889]: daemon cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 dlm_controld[1889]: cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 fenced[1875]: daemon cpg_dispatch error 2
>> > Sep 11 08:49:11 node1 rgmanager[2409]: #67: Shutting down uncleanly
>> > Sep 11 08:49:11 node1 rgmanager[17059]: [clusterfs] unmounting /Data
>> > Sep 11 08:49:11 node1 rgmanager[17068]: [clusterfs] Sending SIGTERM to processes on /Data
>> > Sep 11 08:49:16 node1 rgmanager[17104]: [clusterfs] unmounting /Data
>> > Sep 11 08:49:16 node1 rgmanager[17113]: [clusterfs] Sending SIGKILL to processes on /Data
>> > Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 2
>> > Sep 11 08:49:19 node1 kernel: dlm: closing connection to node 1
>> > Sep 11 08:49:19 node1 kernel: dlm: gfs2: no userland control daemon, stopping lockspace
>> > Sep 11 08:49:22 node1 rgmanager[17149]: [clusterfs] unmounting /Data
>> > Sep 11 08:49:22 node1 rgmanager[17158]: [clusterfs] Sending SIGKILL to processes on /Data
>> >
>> > Also, when I try to restart the cman service, the error below appears:
>> >
>> > Starting cluster:
>> >    Checking if cluster has been disabled at boot...   [ OK ]
>> >    Checking Network Manager...                        [ OK ]
>> >    Global setup...                                    [ OK ]
>> >    Loading kernel modules...                          [ OK ]
>> >    Mounting configfs...                               [ OK ]
>> >    Starting cman...                                   [ OK ]
>> >    Waiting for quorum...                              [ OK ]
>> >    Starting fenced...                                 [ OK ]
>> >    Starting dlm_controld...                           [ OK ]
>> >    Starting gfs_controld...                           [ OK ]
>> >    Unfencing self... fence_node: cannot connect to cman   [FAILED]
>> > Stopping cluster:
>> >    Leaving fence domain...                            [ OK ]
>> >    Stopping gfs_controld...                           [ OK ]
>> >    Stopping dlm_controld...                           [ OK ]
>> >    Stopping fenced...                                 [ OK ]
>> >    Stopping cman...                                   [ OK ]
>> >    Unloading kernel modules...                        [ OK ]
>> >    Unmounting configfs...                             [ OK ]
>> >
>> > Thanks again.
>> > Renchu Mathew
>> >
>> > On Tue, Sep 11, 2012 at 9:10 PM, Arun Eapen CISSP, RHCA <a...@redhat.com> wrote:
>> >
>> >     Put fenced in debug mode and copy the error messages for me to debug.
>> >
>> >     On Tue, 2012-09-11 at 11:52 +0400, Renchu Mathew wrote:
>> >     > Hi Arun,
>> >     >
>> >     > I have done the RH436 course conducted by you at Red Hat Bangalore.
>> >     > How are you?
>> >     >
>> >     > I have configured a two-node failover cluster (almost the same as
>> >     > our RH436 lab setup in Bangalore). It is almost OK except for
>> >     > fencing. If I pull the active node's network cable, it does not
>> >     > switch over to the other node automatically; it hangs, and I have
>> >     > to do the switchover manually. Is there any script for creating
>> >     > dummy fencing in RHCS that will restart or shut down the other
>> >     > node? Please find attached my cluster.conf file. Is there any way
>> >     > we can power fence using an APC UPS?
>> >     >
>> >     > Could you please help me if you get some time.
>> >     >
>> >     > Thanks and regards
>> >     > Renchu Mathew
>> >
>> >     --
>> >     Arun Eapen
>> >     CISSP, RHC{A,DS,E,I,SS,VA,X}
>> >     Senior Technical Consultant & Certification Poobah
>> >     Red Hat India Pvt. Ltd.,
>> >     No - 4/1, Bannergatta Road,
>> >     IBC Knowledge Park,
>> >     11th floor, Tower D,
>> >     Bangalore - 560029, INDIA.
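On the two questions quoted above: the hang after pulling the network cable is what you get when there is no working fence device, because the surviving node blocks on fencing before it will recover GFS2 or relocate services. And if the APC unit is a switched rack PDU rather than just a UPS, the stock fence_apc agent can usually drive it. Below is a rough, untested sketch of the relevant cluster.conf entries; the IP address, login, password and outlet number are placeholders you would need to replace with your own values:

    <cman two_node="1" expected_votes="1"/>
    <fencedevices>
      <!-- placeholder address and credentials for the APC PDU -->
      <fencedevice agent="fence_apc" name="apc" ipaddr="192.168.1.200" login="apc" passwd="apc"/>
    </fencedevices>
    <!-- inside each clusternode's <fence><method name="1"> block,
         with port set to that node's outlet number: -->
    <device name="apc" port="1"/>

The two_node="1" expected_votes="1" attributes on the <cman> line are also worth double-checking for a two-node cluster; without them the cluster loses quorum as soon as one node drops out.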
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster