Thank you for the quick answer.
I'm trying to build a cluster using Pacemaker on openSUSE 12.1 with DRBD +
OCFS2 + MySQL on top of the filesystem.
The system will have two DRBD resources to be mounted on /var/lib/mysql
and on /export.
Be sure you have configured the "resource-and-stonith" fencing policy for DRBD
and that you use a correct fencing script like: goo.gl/O4N8f
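For reference, a minimal sketch of what that looks like in a DRBD resource configuration; the resource name is taken from this thread, and the handler paths assume the crm-fence-peer.sh script shipped with current DRBD packages:

```
resource export {
    disk {
        fencing resource-and-stonith;
    }
    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
```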
I have configured it this way:
stonith-enabled="false" \
Bad idea! You should really use stonith in such a setup ... in any
cluster setup.
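On openSUSE/SLES, external/sbd is a common stonith choice; a hypothetical sketch (the shared-disk path is an assumption for your hardware):

```
primitive stonith-sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/<shared-disk>-part1" \
    op monitor interval="15" timeout="60"
property stonith-enabled="true"
```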
It's disabled while I'm getting the cluster to mount the filesystems.
colocation col_ocfs2 inf: .......
Use "crm configure help colocation" to find out more, the same for order.
You could also add the two file system primitives to the cl_ocfs2_mgmt
group, then only the constraints between this group and DRBD are needed.
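A hypothetical sketch of that approach, reusing the cl_ocfs2_mgmt group name from your earlier configuration; the resFSmysql primitive and clone name are assumptions, and with two DRBD resources the same colocation/order pair would be needed against the mysql ms resource as well:

```
group cl_ocfs2_mgmt resDLM resO2CB resFSexport resFSmysql
clone cloneOCFS2mgmt cl_ocfs2_mgmt \
meta interleave="true"
colocation colOCFS2DRBD inf: cloneOCFS2mgmt msDRBD_export:Master
order ordDRBDOCFS2 inf: msDRBD_export:promote cloneOCFS2mgmt:start
```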
Regards,
Andreas
I was not able to get any of the partitions mounted. I thought there was something very
wrong with my configuration, so I changed it to resemble the configuration shown at
http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2 and tried to get at least one of the
partitions mounted.
Now this is my running configuration:
node artemis
node jupiter
primitive ip_mysql ocf:heartbeat:IPaddr2 \
params ip="10.10.10.5" cidr_netmask="32" nic="vlan0" \
op monitor interval="30s"
primitive resDLM ocf:pacemaker:controld \
op monitor interval="60" timeout="60"
primitive resDRBD_export ocf:linbit:drbd \
params drbd_resource="export" \
operations $id="opsDRBD_export" \
op monitor interval="20" role="Master" timeout="20" \
op monitor interval="30" role="Slave" timeout="20" \
meta target-role="started"
primitive resDRBD_mysql ocf:linbit:drbd \
params drbd_resource="mysql" \
operations $id="opsDRBD_mysql" \
op monitor interval="20" role="Master" timeout="20" \
op monitor interval="30" role="Slave" timeout="20" \
meta target-role="started"
primitive resFSexport ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/export" directory="/export" fstype="ocfs2" options="rw,noatime" \
op monitor interval="120s"
primitive resO2CB ocf:ocfs2:o2cb \
op monitor interval="60" timeout="60"
ms msDRBD_export resDRBD_export \
meta resource-stickiness="100" master-max="2" clone-max="2" notify="true" interleave="true"
ms msDRBD_mysql resDRBD_mysql \
meta resource-stickiness="100" master-max="2" clone-max="2" notify="true" interleave="true"
clone cloneDLM resDLM \
meta globally-unique="false" interleave="true"
clone cloneFSexport resFSexport \
meta interleave="true" ordered="true"
clone cloneO2CB resO2CB \
meta globally-unique="false" interleave="true"
colocation colDLMDRBD inf: cloneDLM msDRBD_export:Master
colocation colFSO2CB inf: cloneFSexport cloneO2CB
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
order ordDRBDDLM 0: msDRBD_export:promote cloneDLM
order ordO2CBFS 0: cloneO2CB cloneFSexport
property $id="cib-bootstrap-options" \
dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1332330463" \
default-resource-stickiness="1000" \
maintenance-mode="false"
I committed the configuration to see if I would end up with /export mounted, but no luck there
either.
Then I stopped Pacemaker on both hosts and started it just on jupiter. The filesystem did not get
mounted, and taking a look at /var/log/messages I could see these entries:
Mar 21 10:11:35 jupiter pengine: [28282]: WARN: unpack_rsc_op: Processing failed op
resFSexport:0_last_failure_0 on jupiter: unknown error (1)
Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness: Forcing cloneFSexport away
from jupiter after 1000000 failures (max=1000000)
Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness: Forcing cloneFSexport away
from jupiter after 1000000 failures (max=1000000)
Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: rsc_expand_action: Couldn't expand
cloneDLM_demote_0
Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort: clone_update_actions_interleave:
Triggered assert at clone.c:1200 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: clone_update_actions_interleave: No action found
for demote in resDLM:0 (first)
Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort: clone_update_actions_interleave:
Triggered assert at clone.c:1200 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: clone_update_actions_interleave: No action found
for demote in resDLM:0 (first)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
ip_mysql#011(Started jupiter)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave resDRBD_mysql:0#011(Master
jupiter)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resDRBD_mysql:1#011(Stopped)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave resDRBD_export:0#011(Master
jupiter)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resDRBD_export:1#011(Stopped)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resDLM:0#011(Started jupiter)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resDLM:1#011(Stopped)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resO2CB:0#011(Started jupiter)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resO2CB:1#011(Stopped)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resFSexport:0#011(Stopped)
Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave
resFSexport:1#011(Stopped)
Mar 21 10:11:35 jupiter crmd: [28283]: info: do_state_transition: State transition
S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE
origin=handle_response ]
But looking back at the log gave no clues. Then I started Pacemaker on the second host, took
a look at the log, and found this:
Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 start[26] (pid
3315)
Mar 21 10:28:13 artemis lrmd: [2429]: info: operation monitor[25] on resO2CB:1 for client 2432: pid
3314 exited with return code 0
Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resO2CB:1_monitor_60000
(call=25, rc=0, cib-update=26, confirmed=false) ok
Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3362]: INFO: Running start for
/dev/drbd/by-res/export on /export
Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) FATAL: Module
scsi_hostadapter not found.
Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) mount.ocfs2:
Cluster stack specified does not match the one currently running while trying to join the group
Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3382]: ERROR: Couldn't mount filesystem
/dev/drbd/by-res/export on /export
Mar 21 10:28:13 artemis lrmd: [2429]: info: operation start[26] on resFSexport:0 for client 2432:
pid 3315 exited with return code 1
Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resFSexport:0_start_0
(call=26, rc=1, cib-update=27, confirmed=true) unknown error
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch: Update
relayed from jupiter
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update: Sending flush op to all hosts
for: fail-count-resFSexport:0 (INFINITY)
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update: Sent update 13:
fail-count-resFSexport:0=INFINITY
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch: Update
relayed from jupiter
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update: Sending flush op to all hosts
for: last-failure-resFSexport:0 (1332336493)
Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update: Sent update 16:
last-failure-resFSexport:0=1332336493
Mar 21 10:28:13 artemis crmd: [2432]: info: do_lrm_rsc_op: Performing
key=8:10:0:0c5a17ef-3075-47e7-a0c0-a564ec772af8 op=resFSexport:0_stop_0 )
Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 stop[27] (pid
3389)
Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3389]: [3423]: INFO: Running stop for
/dev/drbd/by-res/export on /export
Mar 21 10:28:13 artemis lrmd: [2429]: info: operation stop[27] on resFSexport:0 for client 2432: pid
3389 exited with return code 0
Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resFSexport:0_stop_0
(call=27, rc=0, cib-update=28, confirmed=true) ok
The weird thing is this line:
Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) FATAL: Module
scsi_hostadapter not found
Why is Pacemaker looking for a SCSI device when it is configured to use DRBD?
Please, can someone shed some light on this?
Regards,
Carlos
----- Original Message -----
From: "Andreas Kurz" <andr...@hastexo.com>
To: <openais@lists.linux-foundation.org>
Sent: Wednesday, March 21, 2012 7:49 AM
Subject: Re: [Openais] Help on mounting ocfs2 filesystems
_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/openais