On 09.06.2011 01:05, Anton Altaparmakov wrote:
> Hi Klaus,
>
> On 8 Jun 2011, at 22:21, Klaus Darilion wrote:
>> Hi!
>>
>> Currently I have a 2 node cluster and I want to add a 3rd node to use
>> quorum to avoid split brain.
>>
>> The service (DRBD+DB) should only run either on node1 or node2. Node3
>> cannot provide the service - it should just help the other nodes to
>> find out if their network is broken or the other node's network is
>> broken.
>>
>> Is my idea useful?
>
> Yes. That is what we do for all our Pacemaker-based setups.
>
>> How do I add such a "simple" 3rd node - just by using location
>> constraints for the service to be run only on node1 or node2?
>
> Here is an example:
>
> [...]
Hi Anton!

Thanks for the config snippet. I'm trying to add things to my config one
at a time, and I'm already stuck even before adding the 3rd node.
Currently I have configured just the DRBD resource and the filesystem
resource:

node db1-bh
node db2-bh
primitive drbd0 ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15s"
primitive drbd0_fs ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/mnt" fstype="ext4"
group grp_database drbd0_fs
ms ms_drbd0 drbd0 \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true"
colocation database_on_drbd0 inf: grp_database ms_drbd0:Master
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        pe-error-series-max="100" \
        pe-warn-series-max="100" \
        pe-input-series-max="100"
rsc_defaults $id="rsc-options" \
        resource-stickiness="5"

I start node 1 (node 2 is down). The problem is that the filesystem
cannot be started; crm_mon shows:

============
Last updated: Thu Jun 9 16:12:35 2011
Stack: openais
Current DC: db1-bh - partition WITHOUT quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ db1-bh ]
OFFLINE: [ db2-bh ]

 Master/Slave Set: ms_drbd0
     Masters: [ db1-bh ]
     Stopped: [ drbd0:1 ]

Failed actions:
    drbd0_fs_start_0 (node=db1-bh, call=7, rc=1, status=complete): unknown error

Analysing the log file, it seems that the filesystem primitive is started
before ms_drbd0 is promoted to Primary:

Jun 9 15:56:49 db1-bh pengine: [8667]: notice: clone_print: Master/Slave Set: ms_drbd0
Jun 9 15:56:49 db1-bh pengine: [8667]: notice: short_print: Slaves: [ db1-bh ]
Jun 9 15:56:49 db1-bh pengine: [8667]: notice: short_print: Stopped: [ drbd0:1 ]
Jun 9 15:56:49 db1-bh pengine: [8667]: info: native_color: Resource drbd0:1 cannot run anywhere
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: Promoting drbd0:0 (Slave db1-bh)
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: ms_drbd0: Promoted 1 instances of a possible 1 to master
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: Promoting drbd0:0 (Slave db1-bh)
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: ms_drbd0: Promoted 1 instances of a possible 1 to master
...
Jun 9 15:56:49 db1-bh Filesystem[8865]: INFO: Running start for /dev/drbd0 on /mnt
Jun 9 15:56:49 db1-bh lrmd: [8665]: info: RA output: (drbd0_fs:start:stderr) FATAL: Module scsi_hostadapter not found.
...
Jun 9 15:56:49 db1-bh Filesystem[8865]: ERROR: Couldn't sucessfully fsck filesystem for /dev/drbd0
...
Jun 9 15:56:50 db1-bh kernel: [21875.203353] block drbd0: role( Secondary -> Primary )

I suspect that Pacemaker tells DRBD to promote the Secondary to Primary
and immediately starts the Filesystem primitive - before DRBD has
actually finished the promotion to Primary.

Any ideas how to solve this?

Thanks,
Klaus
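PS: Re-reading the constraints documentation, I wonder if the colocation
alone is my problem: it only ties grp_database to the node where ms_drbd0
is Master, but it does not enforce any start order. If I understand the
docs correctly, an explicit order constraint along these lines (the
constraint name is just a placeholder of mine) should make Pacemaker wait
for the promotion before starting the filesystem:

order database_after_drbd0 inf: ms_drbd0:promote grp_database:start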
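PPS: For the later step of adding the 3rd node, my current plan (untested,
and the hostname db3-bh is made up) would be location constraints that
keep both DRBD and the database off the quorum-only node, roughly:

location no_database_on_db3 grp_database -inf: db3-bh
location no_drbd_on_db3 ms_drbd0 -inf: db3-bh

plus raising expected-quorum-votes to 3 and no longer setting
no-quorum-policy="ignore". Does that match your setup?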