On Fri, Apr 30, 2010 at 9:46 AM, Gianluca Cecchi <[email protected]> wrote:
> Hello,
> I have configured a drbd0 resource (nfsdata) in pacemaker, acting as
> active/passive, using the linbit resource agent with master/slave config.
> It works ok in different operations I tried with pacemaker.
>
> Then on both nodes I'm going to test ocfs2 for another drbd1 resource I
> have created (ocfs2data).
>
> Env for drbd seems ok on both nodes, with the ocfs2 fs mounted on both, but
> the drbd0 pacemaker resource failed
>
> [r...@ha1 ~]# cat /proc/drbd
> version: 8.3.6 (api:88/proto:86-91)
> GIT-hash: f3606c47cc6fcf6b3f086e425cb34af8b7a81bbf build by r...@ha1,
> 2010-04-28 09:01:04
>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
>     ns:3128688 nr:8 dw:1027576 dr:2586722 al:524 bm:130 lo:0 pe:0 ua:0
>     ap:0 ep:1 wo:d oos:0
>  1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
>     ns:1280472 nr:3146699 dw:4427171 dr:28918 al:343 bm:129 lo:0 pe:0
>     ua:0 ap:0 ep:1 wo:d oos:0
>
> [r...@ha1 ~]# crm_mon -1
> ============
> Last updated: Fri Apr 30 09:46:28 2010
> Stack: openais
> Current DC: ha1 - partition with quorum
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
> 2 Nodes configured, 2 expected votes
> 3 Resources configured.
> ============
>
> Online: [ ha1 ha2 ]
>
> Master/Slave Set: NfsData
>     nfsdrbd:0 (ocf::linbit:drbd): Slave ha1 (unmanaged) FAILED
>     nfsdrbd:1 (ocf::linbit:drbd): Slave ha2 (unmanaged) FAILED
>
> Failed actions:
>     nfsdrbd:0_demote_0 (node=ha1, call=590, rc=5, status=complete): not installed
>     nfsdrbd:0_stop_0 (node=ha1, call=593, rc=5, status=complete): not installed
>     nfsdrbd:1_monitor_60000 (node=ha2, call=33, rc=5, status=complete): not installed
>     nfsdrbd:1_stop_0 (node=ha2, call=38, rc=5, status=complete): not installed
It _looks_ like you deleted the ocf::linbit:drbd agent (or something it
needs). As indicated, rc=5 means "not installed". Maybe someone from linbit
can comment.

> drbd1 is not mentioned (yet) in my pacemaker config, but I notice I got
> this error in messages, that seems to say that pacemaker tries to take
> care of drbd1 too...

Nope. nfsdrbd:1 != drbd1

> Apr 29 17:58:25 ha1 pengine: [1616]: notice: unpack_rsc_op: Hard error -
>     nfsdrbd:1_monitor_60000 failed with rc=5: Preventing NfsData from
>     re-starting on ha2
> Apr 29 17:58:25 ha1 pengine: [1616]: WARN: unpack_rsc_op: Processing failed
>     op nfsdrbd:1_monitor_60000 on ha2: not installed (5)
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: SitoWeb
>     (ocf::heartbeat:apache): Started ha1
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: clone_print: Master/Slave Set:
>     NfsData
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: nfsdrbd:1
>     (ocf::linbit:drbd): Slave ha2 FAILED
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: short_print: Masters: [ ha1 ]
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: group_print: Resource Group:
>     nfs-group
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: ClusterIP
>     (ocf::heartbeat:IPaddr2): Started ha1
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: lv_drbd0
>     (ocf::heartbeat:LVM): Started ha1
> Apr 29 17:58:25 ha1 pengine: [1616]: notice: native_print: NfsFS
>     (ocf::heartbeat:Filesystem): Started ha1
> Apr 29 17:58:25 ha1 pengine: [1616]: info: get_failcount: NfsData has failed
>     1 times on ha2
>
> As I didn't see anything about caveats with multiple resources, both
> managed and not managed by pacemaker, on the linbit site, I presumed it
> is possible. Is this correct?
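Something like the sketch below (untested; it assumes the standard OCF
layout under /usr/lib/ocf and that the crm shell is on the PATH, and
NfsData is the ms resource name from your config) would confirm on each
node whether the agent is actually there, and clear the fail records once
it is so the PE re-probes the resource:

```shell
#!/bin/sh
# rc=5 is OCF_ERR_INSTALLED: the agent itself, or something it needs
# (e.g. drbdadm), is missing. First check the agent file; the path
# follows the standard OCF layout, adjust OCF_ROOT if yours differs.
OCF_ROOT=${OCF_ROOT:-/usr/lib/ocf}
AGENT="$OCF_ROOT/resource.d/linbit/drbd"

if [ -x "$AGENT" ]; then
    agent_status="present"
else
    agent_status="missing"
fi
echo "linbit drbd agent: $agent_status ($AGENT)"

# Once the agent is back in place, clear the fail records so pacemaker
# stops treating the resource as unmanaged/FAILED and retries it.
if command -v crm >/dev/null 2>&1; then
    crm resource cleanup NfsData
else
    echo "crm shell not on PATH; run the cleanup on a cluster node"
fi
```

Run it on both ha1 and ha2; the monitor failures were recorded on each
node separately.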
>
> My current config is this (btw: there are nfs words in resources, but not
> a nfs server inside at the moment...):
>
> [r...@ha1 ~]# crm configure show
> node ha1 \
>         attributes standby="off"
> node ha2 \
>         attributes standby="off"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>         params ip="192.168.101.53" cidr_netmask="32" \
>         op monitor interval="30s"
> primitive NfsFS ocf:heartbeat:Filesystem \
>         params device="/dev/vg_drbd0/lv_drbd0" directory="/nfsdata" fstype="ext3" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60"
> primitive SitoWeb ocf:heartbeat:apache \
>         params configfile="/etc/httpd/conf/httpd.conf" \
>         op monitor interval="1min" \
>         op start interval="0" timeout="40" \
>         op stop interval="0" timeout="60"
> primitive lv_drbd0 ocf:heartbeat:LVM \
>         params volgrpname="vg_drbd0" exclusive="yes" \
>         op monitor interval="10" timeout="30" depth="0" \
>         op start interval="0" timeout="30" \
>         op stop interval="0" timeout="30"
> primitive nfsdrbd ocf:linbit:drbd \
>         params drbd_resource="nfsdata" \
>         op monitor interval="60s" \
>         op start interval="0" timeout="240" \
>         op stop interval="0" timeout="100"
> group nfs-group ClusterIP lv_drbd0 NfsFS
> ms NfsData nfsdrbd \
>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> location prefer-ha1 SitoWeb 50: ha1
> colocation nfs_on_drbd0 inf: nfs-group NfsData:Master
> colocation website-with-ip inf: SitoWeb nfs-group
> order NfsFS-after-NfsData inf: NfsData:promote nfs-group:start
> order apache-after-ip inf: nfs-group SitoWeb
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
> Thanks,
> Gianluca
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
