On 03/15/2012 11:50 PM, William Seligman wrote:
> On 3/15/12 6:07 PM, William Seligman wrote:
>> On 3/15/12 6:05 PM, William Seligman wrote:
>>> On 3/15/12 4:57 PM, emmanuel segura wrote:
>>>
>>>> We can try to understand what happens when clvmd hangs.
>>>>
>>>> Edit /etc/lvm/lvm.conf, change level = 7 in the log section, and
>>>> uncomment this line:
>>>>
>>>> file = "/var/log/lvm2.log"
>>>
>>> Here's the tail end of the file (the original is 1.6M). Because there are
>>> no times in the log, it's hard for me to point you to the point where I
>>> crashed the other system. I think (though I'm not sure) that the crash
>>> happened after the last occurrence of
>>>
>>> cache/lvmcache.c:1484 Wiping internal VG cache
>>>
>>> Honestly, it looks like a wall of text to me. Does it suggest anything
>>> to you?
>>
>> Maybe it would help if I included the link to the pastebin where I put
>> the output: <http://pastebin.com/8pgW3Muw>
>
> Could the problem be with lvm+drbd?
>
> In lvm2.log, I see this sequence of lines pre-crash:
>
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:271 /dev/md0: size is 1027968 sectors
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> device/dev-io.c:588 Closed /dev/md0
> device/dev-io.c:271 /dev/md0: size is 1027968 sectors
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> device/dev-io.c:588 Closed /dev/md0
> filters/filter-composite.c:31 Using /dev/md0
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> label/label.c:186 /dev/md0: No label detected
> device/dev-io.c:588 Closed /dev/md0
> device/dev-io.c:535 Opened /dev/drbd0 RO O_DIRECT
> device/dev-io.c:271 /dev/drbd0: size is 5611549368 sectors
> device/dev-io.c:137 /dev/drbd0: block size is 4096 bytes
> device/dev-io.c:588 Closed /dev/drbd0
> device/dev-io.c:271 /dev/drbd0: size is 5611549368 sectors
> device/dev-io.c:535 Opened /dev/drbd0 RO O_DIRECT
> device/dev-io.c:137 /dev/drbd0: block size is 4096 bytes
> device/dev-io.c:588 Closed /dev/drbd0
>
> I interpret this: Look at /dev/md0, get some info, close; look at
> /dev/drbd0, get some info, close.
>
> Post-crash, I see:
>
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:271 /dev/md0: size is 1027968 sectors
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> device/dev-io.c:588 Closed /dev/md0
> device/dev-io.c:271 /dev/md0: size is 1027968 sectors
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> device/dev-io.c:588 Closed /dev/md0
> filters/filter-composite.c:31 Using /dev/md0
> device/dev-io.c:535 Opened /dev/md0 RO O_DIRECT
> device/dev-io.c:137 /dev/md0: block size is 1024 bytes
> label/label.c:186 /dev/md0: No label detected
> device/dev-io.c:588 Closed /dev/md0
> device/dev-io.c:535 Opened /dev/drbd0 RO O_DIRECT
> device/dev-io.c:271 /dev/drbd0: size is 5611549368 sectors
> device/dev-io.c:137 /dev/drbd0: block size is 4096 bytes
>
> ... and then it hangs. Comparing the two, it looks like it can't close
> /dev/drbd0.
>
> If I look at /proc/drbd when I crash one node, I see this:
>
> # cat /proc/drbd
> version: 8.3.12 (api:88/proto:86-96)
> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by
> [email protected], 2012-02-28 18:01:34
>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C s-----
>     ns:7000064 nr:0 dw:0 dr:7049728 al:0 bm:516 lo:0 pe:0 ua:0 ap:0
>     ep:1 wo:b oos:0
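The open/close comparison above can be mechanized: count Opened vs Closed events per device in each excerpt. A minimal sketch over a hard-coded fragment of the post-crash log (in practice you would feed it the relevant slice of /var/log/lvm2.log instead):

```shell
# Post-crash excerpt (abridged, hard-coded for the sketch):
# drbd0 is opened but never closed.
log='device/dev-io.c:535 Opened /dev/drbd0 RO O_DIRECT
device/dev-io.c:271 /dev/drbd0: size is 5611549368 sectors
device/dev-io.c:137 /dev/drbd0: block size is 4096 bytes'

opened=$(printf '%s\n' "$log" | grep -c 'Opened /dev/drbd0')
closed=$(printf '%s\n' "$log" | grep -c 'Closed /dev/drbd0')
echo "drbd0 opened=$opened closed=$closed"
if [ "$opened" -gt "$closed" ]; then
  echo "unbalanced: an open on /dev/drbd0 never completed"
fi
```

An imbalance here matches William's reading: the final read of /dev/drbd0 blocked before LVM could close the device.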
s----- ... DRBD suspended I/O, most likely because of its fencing policy.
For valid dual-primary setups you have to use the "resource-and-stonith"
policy and a working "fence-peer" handler. In this mode I/O is suspended
until fencing of the peer is successful. The question is why the peer does
_not_ also suspend its I/O, because obviously fencing was not successful...

So with a correct DRBD configuration, one of your nodes should already have
been fenced because of the connection loss between the nodes (on the drbd
replication link). You can use e.g. that nice fencing script:
http://goo.gl/O4N8f

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

>
> If I look at /proc/drbd when I bring down one node gracefully (crm node
> standby), I get this:
>
> # cat /proc/drbd
> version: 8.3.12 (api:88/proto:86-96)
> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by
> [email protected], 2012-02-28 18:01:34
>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/Outdated C r-----
>     ns:7000064 nr:40 dw:40 dr:7036496 al:0 bm:516 lo:0 pe:0 ua:0 ap:0
>     ep:1 wo:b oos:0
>
> Could it be that drbd can't respond to certain requests from lvm if the
> state of the peer is DUnknown instead of Outdated?
>
>>>> On 15 March 2012 20:50, William Seligman <[email protected]>
>>>> wrote:
>>>>
>>>>> On 3/15/12 12:55 PM, emmanuel segura wrote:
>>>>>
>>>>>> I don't see any error, and the answer to your question is yes.
>>>>>>
>>>>>> Can you show me your /etc/cluster/cluster.conf and your "crm
>>>>>> configure show"? That way, later on, I can try to see if I can
>>>>>> find a fix.
>>>>>
>>>>> Thanks for taking a look.
>>>>>
>>>>> My cluster.conf: <http://pastebin.com/w5XNYyAX>
>>>>> crm configure show: <http://pastebin.com/atVkXjkn>
>>>>>
>>>>> Before you spend a lot of time on the second file, remember that
>>>>> clvmd will hang whether or not I'm running pacemaker.
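Andreas's advice (resource-and-stonith plus a fence-peer handler) maps to a drbd.conf fragment along these lines. This is a sketch, not the poster's configuration: the resource name "r0" is a placeholder, and the handler paths are the crm-fence-peer scripts shipped with DRBD 8.3's pacemaker integration; the script behind the goo.gl link may differ, so check the paths on your own installation:

```
resource r0 {                   # "r0" is a placeholder resource name
  disk {
    # On replication-link loss, suspend I/O and call the fence-peer
    # handler; resume only after the peer has been fenced.
    fencing resource-and-stonith;
  }
  handlers {
    # Handlers shipped with DRBD 8.3 for pacemaker clusters:
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```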
>>>>>
>>>>>> On 15 March 2012 17:42, William Seligman <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On 3/15/12 12:15 PM, emmanuel segura wrote:
>>>>>>>
>>>>>>>> How did you create your volume group?
>>>>>>>
>>>>>>> pvcreate /dev/drbd0
>>>>>>> vgcreate -c y ADMIN /dev/drbd0
>>>>>>> lvcreate -L 200G -n usr ADMIN # ... and so on
>>>>>>> # "Nevis-HA" is the cluster name I used in cluster.conf
>>>>>>> mkfs.gfs2 -p lock_dlm -j 2 -t Nevis_HA:usr /dev/ADMIN/usr # ... and so on
>>>>>>>
>>>>>>>> Give me the output of the vgs command when the cluster is up.
>>>>>>>
>>>>>>> Here it is:
>>>>>>>
>>>>>>> Logging initialised at Thu Mar 15 12:40:39 2012
>>>>>>> Set umask from 0022 to 0077
>>>>>>> Finding all volume groups
>>>>>>> Finding volume group "ROOT"
>>>>>>> Finding volume group "ADMIN"
>>>>>>>   VG    #PV #LV #SN Attr   VSize   VFree
>>>>>>>   ADMIN   1   5   0 wz--nc   2.61t 765.79g
>>>>>>>   ROOT    1   2   0 wz--n- 117.16g      0
>>>>>>> Wiping internal VG cache
>>>>>>>
>>>>>>> I assume the "c" in the ADMIN attributes means that clustering is
>>>>>>> turned on?
>>>>>>>
>>>>>>>> On 15 March 2012 17:06, William Seligman <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On 3/15/12 11:50 AM, emmanuel segura wrote:
>>>>>>>>>> Yes, William.
>>>>>>>>>>
>>>>>>>>>> Now try clvmd -d and see what happens.
>>>>>>>>>>
>>>>>>>>>> locking_type = 3 is the lvm cluster lock type.
>>>>>>>>>
>>>>>>>>> Since you asked for confirmation, here it is: the output of
>>>>>>>>> 'clvmd -d' just now: <http://pastebin.com/bne8piEw>. I crashed the
>>>>>>>>> other node at Mar 15 12:02:35, when you see the only additional
>>>>>>>>> line of output.
>>>>>>>>>
>>>>>>>>> I don't see any particular difference between this and the previous
>>>>>>>>> result <http://pastebin.com/sWjaxAEF>, which suggests that I had
>>>>>>>>> cluster locking enabled before, and still do now.
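William's guess about the "c" is right: in `vgs` output, the sixth character of the VG attribute string is "c" when the volume group is clustered (as created by `vgcreate -c y`). A small sketch of that check, parsing a hard-coded sample of the output shown above so it runs without LVM installed:

```shell
# Sample lines as `vgs --noheadings -o vg_name,vg_attr` might print them;
# hard-coded here so the sketch runs without LVM installed.
sample='ADMIN wz--nc
ROOT wz--n-'

printf '%s\n' "$sample" | while read -r name attr; do
  # Character 6 of vg_attr is the clustered bit: 'c' = clustered.
  if [ "$(printf '%s' "$attr" | cut -c6)" = "c" ]; then
    echo "$name: clustered"
  else
    echo "$name: not clustered"
  fi
done
```

Against a live system you would drop the hard-coded sample and pipe `vgs --noheadings -o vg_name,vg_attr` in directly.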
>>>>>>>>>
>>>>>>>>>> On 15 March 2012 16:15, William Seligman <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 3/15/12 5:18 AM, emmanuel segura wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The first thing I saw in your clvmd log is this:
>>>>>>>>>>>>
>>>>>>>>>>>> =============================================
>>>>>>>>>>>> WARNING: Locking disabled. Be careful! This could corrupt your metadata.
>>>>>>>>>>>> =============================================
>>>>>>>>>>>
>>>>>>>>>>> I saw that too, and thought the same as you did. I did some
>>>>>>>>>>> checks (see below), but some web searches suggest that this
>>>>>>>>>>> message is a normal consequence of clvmd initialization; e.g.,
>>>>>>>>>>>
>>>>>>>>>>> <http://markmail.org/message/vmy53pcv52wu7ghx>
>>>>>>>>>>>
>>>>>>>>>>>> Use this command:
>>>>>>>>>>>>
>>>>>>>>>>>> lvmconf --enable-cluster
>>>>>>>>>>>>
>>>>>>>>>>>> And remember, for cman+pacemaker you don't need qdisk.
>>>>>>>>>>>
>>>>>>>>>>> Before I tried your lvmconf suggestion, here was my
>>>>>>>>>>> /etc/lvm/lvm.conf: <http://pastebin.com/841VZRzW> and the output
>>>>>>>>>>> of "lvm dumpconfig": <http://pastebin.com/rtw8c3Pf>.
>>>>>>>>>>>
>>>>>>>>>>> Then I did as you suggested, but with a check to see if anything
>>>>>>>>>>> changed:
>>>>>>>>>>>
>>>>>>>>>>> # cd /etc/lvm/
>>>>>>>>>>> # cp lvm.conf lvm.conf.cluster
>>>>>>>>>>> # lvmconf --enable-cluster
>>>>>>>>>>> # diff lvm.conf lvm.conf.cluster
>>>>>>>>>>> #
>>>>>>>>>>>
>>>>>>>>>>> So the key lines have been there all along:
>>>>>>>>>>> locking_type = 3
>>>>>>>>>>> fallback_to_local_locking = 0
>>>>>>>>>>>
>>>>>>>>>>>> On 14 March 2012 23:17, William Seligman <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/14/12 9:20 AM, emmanuel segura wrote:
>>>>>>>>>>>>>> Hello William,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I did know you are using drbd, but I don't know what type of
>>>>>>>>>>>>>> configuration you are using.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But it's better you try to start clvm with clvmd -d;
>>>>>>>>>>>>>> that way we can see what the problem is.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For what it's worth, here's the output of running clvmd -d on
>>>>>>>>>>>>> the node that stays up: <http://pastebin.com/sWjaxAEF>
>>>>>>>>>>>>>
>>>>>>>>>>>>> What's probably important in that big mass of output are the
>>>>>>>>>>>>> last two lines. Up to that point, I have both nodes up and
>>>>>>>>>>>>> running cman + clvmd; cluster.conf is here:
>>>>>>>>>>>>> <http://pastebin.com/w5XNYyAX>
>>>>>>>>>>>>>
>>>>>>>>>>>>> At the time of the next-to-the-last line, I cut power to the
>>>>>>>>>>>>> other node.
>>>>>>>>>>>>>
>>>>>>>>>>>>> At the time of the last line, I ran "vgdisplay" on the
>>>>>>>>>>>>> remaining node, which hangs forever.
>>>>>>>>>>>>>
>>>>>>>>>>>>> After a lot of web searching, I found that I'm not the only one
>>>>>>>>>>>>> with this problem. Here's one case that doesn't seem relevant
>>>>>>>>>>>>> to me, since I don't use qdisk:
>>>>>>>>>>>>> <http://www.redhat.com/archives/linux-cluster/2007-October/msg00212.html>.
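The two settings William confirms can also be checked mechanically rather than by diffing configs. A sketch that parses a hard-coded sample of the relevant lvm.conf lines (a real check would read /etc/lvm/lvm.conf or the output of `lvm dumpconfig` instead):

```shell
# Minimal sample of the relevant lvm.conf lines; hard-coded so the
# sketch runs anywhere. Swap in /etc/lvm/lvm.conf for a real check.
conf='locking_type = 3
fallback_to_local_locking = 0'

lt=$(printf '%s\n' "$conf" | awk -F'=' \
  '$1 ~ /locking_type/ && $1 !~ /fallback/ {gsub(/[[:space:]]/,"",$2); print $2}')
if [ "$lt" = "3" ]; then
  echo "cluster locking (type 3) enabled"
else
  echo "cluster locking NOT enabled (locking_type=$lt)"
fi
```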
>>>>>>>>>>>>> Here's one with the same problem on the same OS:
>>>>>>>>>>>>> <http://bugs.centos.org/view.php?id=5229>, but with no
>>>>>>>>>>>>> resolution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Out of curiosity, has anyone on this list made a two-node
>>>>>>>>>>>>> cman+clvmd cluster work for them?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 14 March 2012 14:02, William Seligman <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/14/12 6:02 AM, emmanuel segura wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think it's better you make clvmd start at boot:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> chkconfig cman on ; chkconfig clvmd on
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've already tried it. It doesn't work. The problem is that
>>>>>>>>>>>>>>> my LVM information is on the drbd. If I start up clvmd
>>>>>>>>>>>>>>> before drbd, it won't find the logical volumes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also don't see why that would make a difference (although
>>>>>>>>>>>>>>> this could be part of the confusion): a service is a
>>>>>>>>>>>>>>> service. I've tried starting up clvmd inside and outside
>>>>>>>>>>>>>>> pacemaker control, with the same problem. Why would
>>>>>>>>>>>>>>> starting clvmd at boot make a difference?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 13 March 2012 23:29, William Seligman <[email protected]>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 3/13/12 5:50 PM, emmanuel segura wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So if you're using cman, why do you use lsb::clvmd?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I think you are very confused.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I don't dispute that I may be very confused!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, from what I can tell, I still need to run
>>>>>>>>>>>>>>>>> clvmd even if I'm running cman (I'm not using
>>>>>>>>>>>>>>>>> rgmanager).
>>>>>>>>>>>>>>>>> If I just run cman, gfs2 and any other form
>>>>>>>>>>>>>>>>> of mount fails. If I run cman, then clvmd, then gfs2,
>>>>>>>>>>>>>>>>> everything behaves normally.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Going by these instructions:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> <https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> the resources he puts under "cluster control"
>>>>>>>>>>>>>>>>> (rgmanager) I have to put under pacemaker control.
>>>>>>>>>>>>>>>>> Those include drbd, clvmd, and gfs2.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The difference between what I've got and what's in
>>>>>>>>>>>>>>>>> "Clusters From Scratch" is that in CFS they assign one
>>>>>>>>>>>>>>>>> DRBD volume to a single filesystem. I create an LVM
>>>>>>>>>>>>>>>>> physical volume on my DRBD resource, as in the above
>>>>>>>>>>>>>>>>> tutorial, and so I have to start clvmd or the logical
>>>>>>>>>>>>>>>>> volumes in the DRBD partition won't be recognized.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is there some way to get logical volumes recognized
>>>>>>>>>>>>>>>>> automatically by cman without rgmanager that I've missed?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 13 March 2012 22:42, William Seligman <[email protected]>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 3/13/12 12:29 PM, William Seligman wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm not sure if this is a "Linux-HA" question;
>>>>>>>>>>>>>>>>>>>> please direct me to the appropriate list if it's
>>>>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm setting up a two-node cman+pacemaker+gfs2
>>>>>>>>>>>>>>>>>>>> cluster as described in "Clusters From Scratch."
>>>>>>>>>>>>>>>>>>>> Fencing is through forcibly rebooting a node by
>>>>>>>>>>>>>>>>>>>> cutting and restoring its power via UPS.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My fencing/failover tests have revealed a
>>>>>>>>>>>>>>>>>>>> problem. If I gracefully turn off one node ("crm
>>>>>>>>>>>>>>>>>>>> node standby"; "service pacemaker stop";
>>>>>>>>>>>>>>>>>>>> "shutdown -r now"), all the resources transfer to
>>>>>>>>>>>>>>>>>>>> the other node with no problems. If I cut power
>>>>>>>>>>>>>>>>>>>> to one node (as would happen if it were fenced),
>>>>>>>>>>>>>>>>>>>> the lsb::clvmd resource on the remaining node
>>>>>>>>>>>>>>>>>>>> eventually fails. Since all the other resources
>>>>>>>>>>>>>>>>>>>> depend on clvmd, all the resources on the
>>>>>>>>>>>>>>>>>>>> remaining node stop and the cluster is left with
>>>>>>>>>>>>>>>>>>>> nothing running.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I've traced why the lsb::clvmd fails: The
>>>>>>>>>>>>>>>>>>>> monitor/status command includes "vgdisplay",
>>>>>>>>>>>>>>>>>>>> which hangs indefinitely. Therefore the monitor
>>>>>>>>>>>>>>>>>>>> will always time out.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So this isn't a problem with pacemaker, but with
>>>>>>>>>>>>>>>>>>>> clvmd/dlm: If a node is cut off, the cluster
>>>>>>>>>>>>>>>>>>>> isn't handling it properly. Has anyone on this
>>>>>>>>>>>>>>>>>>>> list seen this before? Any ideas?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Details:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> versions:
>>>>>>>>>>>>>>>>>>>> Redhat Linux 6.2 (kernel 2.6.32)
>>>>>>>>>>>>>>>>>>>> cman-3.0.12.1
>>>>>>>>>>>>>>>>>>>> corosync-1.4.1
>>>>>>>>>>>>>>>>>>>> pacemaker-1.1.6
>>>>>>>>>>>>>>>>>>>> lvm2-2.02.87
>>>>>>>>>>>>>>>>>>>> lvm2-cluster-2.02.87
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This may be a Linux-HA question after all!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I ran a few more tests.
>>>>>>>>>>>>>>>>>>> Here's the output from a typical test of
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> grep -E "(dlm|gfs2|clvmd|fenc|syslogd)" /var/log/messages
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> <http://pastebin.com/uqC6bc1b>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It looks like what's happening is that the fence
>>>>>>>>>>>>>>>>>>> agent (one I wrote) is not returning the proper
>>>>>>>>>>>>>>>>>>> error code when a node crashes. According to this
>>>>>>>>>>>>>>>>>>> page, if a fencing agent fails, GFS2 will freeze to
>>>>>>>>>>>>>>>>>>> protect the data:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> <http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Global_File_System_2/s1-gfs2hand-allnodes.html>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As a test, I tried to fence my test node via
>>>>>>>>>>>>>>>>>>> standard means:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> stonith_admin -F \
>>>>>>>>>>>>>>>>>>> orestes-corosync.nevis.columbia.edu
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> These were the log messages, which show that
>>>>>>>>>>>>>>>>>>> stonith_admin did its job and CMAN was notified of
>>>>>>>>>>>>>>>>>>> the fencing: <http://pastebin.com/jaH820Bv>.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Unfortunately, I still got the gfs2 freeze, so this
>>>>>>>>>>>>>>>>>>> is not the complete story.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> First things first. I vaguely recall a web page
>>>>>>>>>>>>>>>>>>> that went over the STONITH return codes, but I
>>>>>>>>>>>>>>>>>>> can't locate it again. Is there any reference to
>>>>>>>>>>>>>>>>>>> the return codes expected from a fencing agent,
>>>>>>>>>>>>>>>>>>> perhaps as a function of the state of the fencing
>>>>>>>>>>>>>>>>>>> device?
>
> See also: http://linux-ha.org/ReportingProblems
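On the return-code question: as I understand the cman/fenced agent contract (the "FenceAgentAPI"), the daemon passes option=value pairs to the agent on stdin and judges success purely by the exit status, 0 meaning the node was successfully fenced and non-zero meaning failure. A minimal sketch of that contract with the actual UPS call stubbed out; the function name, recognized keys, and echoed text are illustrative, not William's actual agent:

```shell
# Hedged sketch of a fence agent: read option=value pairs on stdin,
# act, and report the result purely via exit status (0 = fenced).
fence_sketch() {
  action=reboot node=""
  while IFS='=' read -r key val; do
    case "$key" in
      action|option) action=$val ;;
      port|nodename) node=$val ;;
    esac
  done
  case "$action" in
    on|off|reboot)
      # A real agent would command the UPS here and verify power state.
      echo "would power-$action $node"
      return 0 ;;              # 0 tells fenced the node is safely down
    status|monitor) return 0 ;;
    *) return 1 ;;             # unknown action: report failure
  esac
}

# Example invocation, as fenced would drive it over stdin:
printf 'action=off\nport=orestes\n' | fence_sketch
```

The key point for the freeze William sees: if the agent exits non-zero (or hangs), fenced retries indefinitely and GFS2/DLM stay blocked, so an agent that cannot verify the node is really off must keep failing rather than optimistically returning 0.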
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
