emmanuel segura <[email protected]> replied:

https://bugzilla.redhat.com/show_bug.cgi?id=1127289#c4
https://bugzilla.redhat.com/show_bug.cgi?id=1127289

2014-12-29 11:57 GMT+01:00 Marlon Guao <[email protected]>:

here it is..

==Dumping header on disk /dev/mapper/sbd
Header version     : 2.1
UUID               : 36074673-f48e-4da2-b4ee-385e83e6abcc
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10
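An aside on those numbers (not part of the original exchange): the usual sbd guidance is that msgwait be at least twice the watchdog timeout, and that Pacemaker's stonith-timeout be longer than msgwait, otherwise the cluster can give up on a fencing operation that sbd would still have completed. A minimal sketch of that sanity check, assuming the /dev/mapper/sbd path from the dump above:

    # Sanity-check the sbd timeouts dumped above (sketch; device path assumed).
    DEV=/dev/mapper/sbd
    WATCHDOG=$(sbd -d "$DEV" dump | awk '/Timeout \(watchdog\)/ {print $NF}')
    MSGWAIT=$(sbd -d "$DEV" dump | awk '/Timeout \(msgwait\)/ {print $NF}')
    echo "watchdog=$WATCHDOG msgwait=$MSGWAIT"
    # Rule of thumb: msgwait >= 2 * watchdog (here 10 >= 2 * 5, so this box is fine).
    [ "$MSGWAIT" -ge $((2 * WATCHDOG)) ] || echo "WARNING: msgwait < 2*watchdog"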
On Mon, Dec 29, 2014 at 6:42 PM, emmanuel segura <[email protected]> wrote:

DLM isn't the problem; I think it's your fencing. When you powered off the
active node, did the dead node stay in the unclean state? Can you show me
your sbd timeouts?

    sbd -d /dev/path_of_your_device dump

Thanks

2014-12-29 11:02 GMT+01:00 Marlon Guao <[email protected]>:

Hi,

Ah yeah, I tried powering off the active node and running pvscan on the
passive one, and indeed it didn't work: it never returns to the shell. So
the problem is in DLM?

On Mon, Dec 29, 2014 at 5:51 PM, emmanuel segura <[email protected]> wrote:

Power off the active node and, after a second, try an LVM command, for
example pvscan. If the command doesn't respond, it is because DLM relies on
cluster fencing: if fencing doesn't work, DLM stays in a blocked state.
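A concrete version of that test (a sketch, not from the thread; it assumes the dlm userland tools are installed on the surviving node):

    # Run on the survivor right after powering off the active node.
    # If fencing never completes, DLM keeps its lockspaces blocked and any
    # cluster-aware LVM command hangs instead of returning an error.
    timeout 10 pvscan || echo "pvscan hung -- DLM is probably waiting for fencing"
    # List the DLM lockspaces; one stuck in recovery here usually means the
    # cluster is still waiting for the dead node to be fenced.
    dlm_tool ls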
2014-12-29 10:43 GMT+01:00 Marlon Guao <[email protected]>:

Perhaps we need to focus on this message. As mentioned, the cluster works
fine under normal circumstances. My only concern is that the LVM resource
agent doesn't try to re-activate the VG on the passive node when the active
node goes down ungracefully (powered off); hence it cannot mount the
filesystems, etc.

Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation sbd_monitor_0: not running (node=s1, call=5, rc=7, cib-update=35, confirmed=true)
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 13: monitor dlm:0_monitor_0 on s2
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 5: monitor dlm:1_monitor_0 on s1 (local)
Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation dlm_monitor_0: not running (node=s1, call=10, rc=7, cib-update=36, confirmed=true)
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 14: monitor clvm:0_monitor_0 on s2
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 6: monitor clvm:1_monitor_0 on s1 (local)
Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation clvm_monitor_0: not running (node=s1, call=15, rc=7, cib-update=37, confirmed=true)
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 15: monitor cluIP_monitor_0 on s2
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 7: monitor cluIP_monitor_0 on s1 (local)
Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation cluIP_monitor_0: not running (node=s1, call=19, rc=7, cib-update=38, confirmed=true)
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 8: monitor vg1_monitor_0 on s1 (local)
Dec 29 17:12:26 s1 LVM(vg1)[1583]: WARNING: LVM Volume cluvg1 is not available (stopped)
Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=39, confirmed=true)
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 17: monitor fs1_monitor_0 on s2
Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
Dec 29 17:12:26 s1 Filesystem(fs1)[1600]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist
Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation fs1_monitor_0: not running (node=s1, call=27, rc=7, cib-update=40, confirmed=true)

On Mon, Dec 29, 2014 at 5:38 PM, emmanuel segura <[email protected]> wrote:

Dec 27 15:38:00 s1 cib[1514]: error: crm_xml_err: XML Error: Permission deniedPermission deniedI/O warning : failed to load external entity "/var/lib/pacemaker/cib/cib.xml"
Dec 27 15:38:00 s1 cib[1514]: error: write_cib_contents: Cannot link /var/lib/pacemaker/cib/cib.xml to /var/lib/pacemaker/cib/cib-0.raw: Operation not permitted (1)

2014-12-29 10:33 GMT+01:00 emmanuel segura <[email protected]>:

Hi,

You have a problem with the cluster stonithd: "error: crm_abort:
crm_glib_handler: Forked child 6186 to record non-fatal assert at
logging.c:73". Try to post your cluster version (packages); maybe someone
can tell you whether this is a known bug or a new one.
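Collecting the version information emmanuel asks for is quick (a sketch for an RPM-based distribution like the openSUSE 13.2 setup described later in the thread; not part of the original exchange):

    # Gather the cluster stack package versions to accompany a bug report.
    rpm -qa | grep -Ei 'pacemaker|corosync|sbd|resource-agents|lvm2|dlm'
    corosync -v        # corosync build/version info
    crm_mon --version  # Pacemaker version as the tools report it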
2014-12-29 10:29 GMT+01:00 Marlon Guao <[email protected]>:

OK, sorry for that; please use this instead.

http://pastebin.centos.org/14771/

Thanks.

On Mon, Dec 29, 2014 at 5:25 PM, emmanuel segura <[email protected]> wrote:

Sorry, but your paste is empty.

2014-12-29 10:19 GMT+01:00 Marlon Guao <[email protected]>:

Hi, I uploaded it here.

http://susepaste.org/45413433

Thanks.

On Mon, Dec 29, 2014 at 5:09 PM, Marlon Guao <[email protected]> wrote:

OK, I attached the log file of one of the nodes.

On Mon, Dec 29, 2014 at 4:42 PM, emmanuel segura <[email protected]> wrote:

Please use pastebin and show your whole logs.

2014-12-29 9:06 GMT+01:00 Marlon Guao <[email protected]>:

By the way, just to note: under normal testing (manual failover, rebooting
the active node) the cluster works fine. I only encounter this error if I
power off / shut off the active node.

On Mon, Dec 29, 2014 at 4:05 PM, Marlon Guao <[email protected]> wrote:

Hi.

Dec 29 13:47:16 s1 LVM(vg1)[1601]: WARNING: LVM Volume cluvg1 is not available (stopped)
Dec 29 13:47:16 s1 crmd[1515]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=40, confirmed=true)
Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
Dec 29 13:47:16 s1 Filesystem(fs1)[1618]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist

The LVM agent checks whether the volume group is already available and
raises the warning above if it is not, but I don't see it try to activate
the VG before raising the error. Perhaps it assumes the VG is already
activated, so I'm not sure who should be activating it (should it be LVM
itself?).

    if [ $rc -ne 0 ]; then
            ocf_log $loglevel "LVM Volume $1 is not available (stopped)"
            rc=$OCF_NOT_RUNNING
    else
            case $(get_vg_mode) in
            1) # exclusive with tagging.
                    # If vg is running, make sure the correct tag is
                    # present. Otherwise we can not guarantee exclusive
                    # activation.
                    if ! check_tags; then
                            ocf_exit_reason "WARNING: $OCF_RESKEY_volgrpname is active without the cluster tag, \"$OUR_TAG\""
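On the "who should be activating it" question: the snippet above comes from the agent's status/monitor path, which only reports whether the VG is active; activation itself happens in the agent's start action, and Pacemaker will not schedule that start on the survivor until the dead node has been successfully fenced. For a clustered VG with exclusive=yes, the start amounts to an exclusive activation; a sketch of the manual equivalent, using the names from this thread:

    # Manual equivalent of the LVM agent's start action (sketch; run on the
    # surviving node once fencing has completed, never on both nodes at once).
    vgchange -a ey cluvg1                    # exclusive activation via clvmd/DLM
    lvs cluvg1                               # LVs should now report as active
    mount /dev/mapper/cluvg1-clulv1 /data    # what the fs1 resource does next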
On Mon, Dec 29, 2014 at 3:36 PM, emmanuel segura <[email protected]> wrote:

Logs?

2014-12-29 6:54 GMT+01:00 Marlon Guao <[email protected]>:

Hi,

I just want to ask about the LVM resource agent on pacemaker/corosync.

I set up a two-node cluster (openSUSE 13.2; my config is below). The cluster
works as expected for a manual failover (via crm resource move) and for an
automatic failover (by rebooting the active node, for instance). But if I
just "shut off" the active node (it's a VM, so I can power it off), the
resources won't fail over to the passive node.

When I investigated, it was due to an LVM resource not starting
(specifically, the VG). I found that the LVM resource won't try to activate
the volume group on the passive node. Is this expected behaviour?

What I really expect is that, in the event the active node is shut off (by a
power outage, for instance), all resources fail over automatically to the
passive node, and LVM re-activates the VG.

Here's my config:

node 1: s1
node 2: s2
primitive cluIP IPaddr2 \
        params ip=192.168.13.200 cidr_netmask=32 \
        op monitor interval=30s
primitive clvm ocf:lvm2:clvmd \
        params daemon_timeout=30 \
        op monitor timeout=90 interval=30
primitive dlm ocf:pacemaker:controld \
        op monitor interval=60s timeout=90s on-fail=ignore \
        op start interval=0 timeout=90
primitive fs1 Filesystem \
        params device="/dev/mapper/cluvg1-clulv1" directory="/data" fstype=btrfs
primitive mariadb mysql \
        params config="/etc/my.cnf"
primitive sbd stonith:external/sbd \
        op monitor interval=15s timeout=60s
primitive vg1 LVM \
        params volgrpname=cluvg1 exclusive=yes \
        op start timeout=10s interval=0 \
        op stop interval=0 timeout=10 \
        op monitor interval=10 timeout=30 on-fail=restart depth=0
group base-group dlm clvm
group rgroup cluIP vg1 fs1 mariadb \
        meta target-role=Started
clone base-clone base-group \
        meta interleave=true target-role=Started
property cib-bootstrap-options: \
        dc-version=1.1.12-1.1.12.git20140904.266d5c2 \
        cluster-infrastructure=corosync \
        no-quorum-policy=ignore \
        last-lrm-refresh=1419514875 \
        cluster-name=xxx \
        stonith-enabled=true
rsc_defaults rsc-options: \
        resource-stickiness=100
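Given where the diagnosis lands (fencing, not the LVM agent), one direct check on a configuration like this is to trigger fencing by hand and watch whether the survivor finishes recovery. A sketch, not from the thread, using the node names above:

    # Run on s1 to verify that sbd-based fencing actually completes.
    stonith_admin --reboot s2   # ask the cluster to fence s2
    crm_mon -1                  # s2 should go OFFLINE (not UNCLEAN) and
                                # rgroup should recover on s1
    # If s2 stays UNCLEAN, DLM/clvmd stay blocked and vg1 can never be
    # activated on the survivor -- exactly the symptom reported above.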
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
