On 3/14/12 6:02 AM, emmanuel segura wrote:
I think it's better you make clvmd start at boot:

    chkconfig cman on ; chkconfig clvmd on
I've already tried it. It doesn't work. The problem is that my LVM information is on the drbd. If I start up clvmd before drbd, it won't find the logical volumes.
I also don't see why that would make a difference (although this could be part of the confusion): a service is a service. I've tried starting up clvmd inside and outside pacemaker control, with the same problem. Why would starting clvmd at boot make a difference?
On 13 March 2012 23:29, William Seligman <[email protected]> wrote:

On 3/13/12 5:50 PM, emmanuel segura wrote:

So if you are using cman, why do you use lsb::clvmd? I think you are very confused.

I don't dispute that I may be very confused! However, from what I can tell, I still need to run clvmd even if I'm running cman (I'm not using rgmanager). If I just run cman, gfs2 and any other form of mount fails. If I run cman, then clvmd, then gfs2, everything behaves normally.

Going by these instructions:

<https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial>

the resources he puts under "cluster control" (rgmanager) I have to put under pacemaker control. Those include drbd, clvmd, and gfs2.

The difference between what I've got and what's in "Clusters From Scratch" is that in CFS they assign one DRBD volume to a single filesystem. I create an LVM physical volume on my DRBD resource, as in the above tutorial, and so I have to start clvmd or the logical volumes in the DRBD partition won't be recognized.

Is there some way to get logical volumes recognized automatically by cman without rgmanager that I've missed?

On 13 March 2012 22:42, William Seligman <[email protected]> wrote:

On 3/13/12 12:29 PM, William Seligman wrote:

I'm not sure if this is a "Linux-HA" question; please direct me to the appropriate list if it's not.

I'm setting up a two-node cman+pacemaker+gfs2 cluster as described in "Clusters From Scratch." Fencing is through forcibly rebooting a node by cutting and restoring its power via UPS.

My fencing/failover tests have revealed a problem. If I gracefully turn off one node ("crm node standby"; "service pacemaker stop"; "shutdown -r now") all the resources transfer to the other node with no problems. If I cut power to one node (as would happen if it were fenced), the lsb::clvmd resource on the remaining node eventually fails.
Since all the other resources depend on clvmd, all the resources on the remaining node stop and the cluster is left with nothing running.

I've traced why the lsb::clvmd fails: The monitor/status command includes "vgdisplay", which hangs indefinitely. Therefore the monitor will always time out.

So this isn't a problem with pacemaker, but with clvmd/dlm: If a node is cut off, the cluster isn't handling it properly. Has anyone on this list seen this before? Any ideas?

Details:

versions:
Redhat Linux 6.2 (kernel 2.6.32)
cman-3.0.12.1
corosync-1.4.1
pacemaker-1.1.6
lvm2-2.02.87
lvm2-cluster-2.02.87

This may be a Linux-HA question after all!

I ran a few more tests. Here's the output from a typical test of

    grep -E "(dlm|gfs2|clvmd|fenc|syslogd)" /var/log/messages

<http://pastebin.com/uqC6bc1b>

It looks like what's happening is that the fence agent (one I wrote) is not returning the proper error code when a node crashes. According to this page, if a fencing agent fails GFS2 will freeze to protect the data:

<http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/s1-gfs2hand-allnodes.html>

As a test, I tried to fence my test node via standard means:

    stonith_admin -F orestes-corosync.nevis.columbia.edu

These were the log messages, which show that stonith_admin did its job and CMAN was notified of the fencing: <http://pastebin.com/jaH820Bv>.

Unfortunately, I still got the gfs2 freeze, so this is not the complete story.

First things first. I vaguely recall a web page that went over the STONITH return codes, but I can't locate it again. Is there any reference to the return codes expected from a fencing agent, perhaps as a function of the state of the fencing device?
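
For anyone who wants to reproduce the "vgdisplay" hang described above without tying up a terminal indefinitely, here is a minimal sketch. It assumes the GNU coreutils "timeout" command is installed; the function name bounded_status and the 10-second limit are hypothetical choices for illustration and are not part of the clvmd init script or of pacemaker.

```shell
#!/bin/sh
# Hypothetical helper (not part of clvmd or pacemaker): run a status
# command, but give up after a time limit instead of blocking forever.
# Usage: bounded_status SECONDS COMMAND [ARGS...]
bounded_status() {
    limit=$1
    shift
    rc=0
    # timeout(1) kills the command if it runs longer than $limit seconds.
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq 124 ]; then
        # timeout(1) exits with 124 when the time limit was reached.
        echo "hung"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed"
    fi
    return "$rc"
}

# Example, to be run by hand on a cluster node:
#   bounded_status 10 vgdisplay
```

A result of "hung" only shows that the command did not return within the limit; it does not say whether clvmd, dlm, or fencing is the cause.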
-- Bill Seligman | mailto://[email protected] Nevis Labs, Columbia Univ | http://www.nevis.columbia.edu/~seligman/ PO Box 137 | Irvington NY 10533 USA | Phone: (914) 591-2823
_______________________________________________ Linux-HA mailing list [email protected] http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
