Hello, ... see comments inline ...
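One remark up front, since fencing comes up again in the quoted discussion below: for a dual-primary DRBD setup the resource-level fencing really has to be in place before you rely on the cluster. Just as a sketch of what I mean on the DRBD side (untested here; the handler paths assume the scripts shipped with drbd-utils, adapt them if you use the script behind the short link quoted below):

  resource export {
    disk {
      fencing resource-and-stonith;
    }
    handlers {
      fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    ...
  }

The same goes for the mysql resource, and of course it only helps together with working stonith in Pacemaker.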
On 03/21/2012 03:24 PM, Carlos Xavier wrote:
> Thank you for the quick answer.
>>>
>>> I'm trying to make a cluster using Pacemaker on openSUSE 12.1 with DRBD +
>>> OCFS2 + MySQL on top of the file system.
>>> The system will have two DRBD resources, to be mounted on /var/lib/mysql
>>> and on /export.
>>
>> Be sure you configured the "resource-and-stonith" fencing policy for DRBD
>> and you use a correct fencing script like: goo.gl/O4N8f
>>
>
> I have configured it this way.
>
>>> stonith-enabled="false" \
>>
>> Bad idea! You should really use stonith in such a setup ... in any
>> cluster setup.
>>
>
> It is disabled while I'm getting the cluster to mount the file systems.
>
>>>
>>
>> colocation col_ocfs2 inf: .......
>>
>> Use "crm configure help colocation" to find out more, the same for order.
>>
>> You could also add the two file system primitives to the cl_ocfs2_mgmt
>> group, then only the constraints between this group and DRBD are needed.
>>
>> Regards,
>> Andreas
>
> I was not able to get any of the partitions mounted, so I thought there was
> something very wrong in my configuration. I changed it to resemble the
> configuration shown on
> http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2 and tried to
> get at least one of the partitions mounted.
> Now this is my running configuration:
>
> node artemis
> node jupiter
> primitive ip_mysql ocf:heartbeat:IPaddr2 \
>         params ip="10.10.10.5" cidr_netmask="32" nic="vlan0" \
>         op monitor interval="30s"
> primitive resDLM ocf:pacemaker:controld \
>         op monitor interval="60" timeout="60"
> primitive resDRBD_export ocf:linbit:drbd \
>         params drbd_resource="export" \
>         operations $id="opsDRBD_export" \
>         op monitor interval="20" role="Master" timeout="20" \
>         op monitor interval="30" role="Slave" timeout="20" \
>         meta target-role="started"
> primitive resDRBD_mysql ocf:linbit:drbd \
>         params drbd_resource="mysql" \
>         operations $id="opsDRBD_mysql" \
>         op monitor interval="20" role="Master" timeout="20" \
>         op monitor interval="30" role="Slave" timeout="20" \
>         meta target-role="started"
> primitive resFSexport ocf:heartbeat:Filesystem \
>         params device="/dev/drbd/by-res/export" directory="/export" fstype="ocfs2" options="rw,noatime" \
>         op monitor interval="120s"
> primitive resO2CB ocf:ocfs2:o2cb \
>         op monitor interval="60" timeout="60"
> ms msDRBD_export resDRBD_export \
>         meta resource-stickines="100" master-max="2" clone-max="2" notify="true" interleave="true"
> ms msDRBD_mysql resDRBD_mysql \
>         meta resource-stickines="100" master-max="2" clone-max="2" notify="true" interleave="true"
> clone cloneDLM resDLM \
>         meta globally-unique="false" interleave="true"
> clone cloneFSexport resFSexport \
>         meta interleave="true" ordered="true"
> clone cloneO2CB resO2CB \
>         meta globally-unique="false" interleave="true"
> colocation colDLMDRBD inf: cloneDLM msDRBD_export:Master
> colocation colFSO2CB inf: cloneFSexport cloneO2CB
> colocation colO2CBDLM inf: cloneO2CB cloneDLM
> order ordDLMO2CB 0: cloneDLM cloneO2CB
> order ordDRBDDLM 0: msDRBD_export:promote cloneDLM
.. should be cloneDLM:start

> order ordO2CBFS 0: cloneO2CB cloneFSexport
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1332330463" \
>         default-resource-stickiness="1000" \
>         maintenance-mode="false"
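Two more notes on this configuration. The ms meta attribute is spelled resource-stickiness, the "resource-stickines" above will have no effect. And as suggested further up, the whole DLM/O2CB/Filesystem part gets much simpler if you put the primitives into one cloned group. An untested sketch only, the names grpOCFS2 and cloneOCFS2 are just placeholders; it would replace cloneDLM, cloneO2CB, cloneFSexport and all the constraints above:

  group grpOCFS2 resDLM resO2CB resFSexport
  clone cloneOCFS2 grpOCFS2 \
          meta globally-unique="false" interleave="true"
  colocation colOCFS2onDRBD inf: cloneOCFS2 msDRBD_export:Master
  order ordDRBDbeforeOCFS2 inf: msDRBD_export:promote cloneOCFS2:start

The explicit :start on the right-hand side of the order is exactly the point of the comment above.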
>
> I committed the configuration to see if I would end up with /export
> mounted, but no luck on this either.
> Then I stopped Pacemaker on both hosts and started it just on jupiter.
> The file system did not get mounted, and taking a look at
> /var/log/messages I could see these entries:
>
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: unpack_rsc_op: Processing failed op resFSexport:0_last_failure_0 on jupiter: unknown error (1)
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness: Forcing cloneFSexport away from jupiter after 1000000 failures (max=1000000)
> Mar 21 10:11:35 jupiter pengine: [28282]: WARN: common_apply_stickiness: Forcing cloneFSexport away from jupiter after 1000000 failures (max=1000000)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: rsc_expand_action: Couldn't expand cloneDLM_demote_0
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort: clone_update_actions_interleave: Triggered assert at clone.c:1200 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: clone_update_actions_interleave: No action found for demote in resDLM:0 (first)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: crm_abort: clone_update_actions_interleave: Triggered assert at clone.c:1200 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Mar 21 10:11:35 jupiter pengine: [28282]: ERROR: clone_update_actions_interleave: No action found for demote in resDLM:0 (first)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   ip_mysql#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDRBD_mysql:0#011(Master jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDRBD_mysql:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDRBD_export:0#011(Master jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDRBD_export:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDLM:0#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resDLM:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resO2CB:0#011(Started jupiter)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resO2CB:1#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resFSexport:0#011(Stopped)
> Mar 21 10:11:35 jupiter pengine: [28282]: notice: LogActions: Leave   resFSexport:1#011(Stopped)
> Mar 21 10:11:35 jupiter crmd: [28283]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
>
> But looking back at the log gave no clues. Then I started Pacemaker on
> the second host, took a look at the log, and found this:
>
> Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 start[26] (pid 3315)
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation monitor[25] on resO2CB:1 for client 2432: pid 3314 exited with return code 0
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resO2CB:1_monitor_60000 (call=25, rc=0, cib-update=26, confirmed=false) ok
> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3362]: INFO: Running start for /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) FATAL: Module scsi_hostadapter not found.
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) mount.ocfs2: Cluster stack specified does not match the one currently running while trying to join the group

You created the ocfs2 file system without Pacemaker running? You need to do a:

tunefs.ocfs2 --update-cluster-stack <device>

> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3315]: [3382]: ERROR: Couldn't mount filesystem /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation start[26] on resFSexport:0 for client 2432: pid 3315 exited with return code 1
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resFSexport:0_start_0 (call=26, rc=1, cib-update=27, confirmed=true) unknown error
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch: Update relayed from jupiter
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-resFSexport:0 (INFINITY)
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update: Sent update 13: fail-count-resFSexport:0=INFINITY
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_ais_dispatch: Update relayed from jupiter
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-resFSexport:0 (1332336493)
> Mar 21 10:28:13 artemis attrd: [2430]: notice: attrd_perform_update: Sent update 16: last-failure-resFSexport:0=1332336493
> Mar 21 10:28:13 artemis crmd: [2432]: info: do_lrm_rsc_op: Performing key=8:10:0:0c5a17ef-3075-47e7-a0c0-a564ec772af8 op=resFSexport:0_stop_0 )
> Mar 21 10:28:13 artemis lrmd: [2429]: info: rsc:resFSexport:0 stop[27] (pid 3389)
> Mar 21 10:28:13 artemis Filesystem(resFSexport:0)[3389]: [3423]: INFO: Running stop for /dev/drbd/by-res/export on /export
> Mar 21 10:28:13 artemis lrmd: [2429]: info: operation stop[27] on resFSexport:0 for client 2432: pid 3389 exited with return code 0
> Mar 21 10:28:13 artemis crmd: [2432]: info: process_lrm_event: LRM operation resFSexport:0_stop_0 (call=27, rc=0, cib-update=28, confirmed=true) ok
>
> The weird thing is this line:
>
> Mar 21 10:28:13 artemis lrmd: [2429]: info: RA output: (resFSexport:0:start:stderr) FATAL: Module scsi_hostadapter not found

A leftover from older days, it is already gone in the latest resource-agents ... but not a problem for you here.

Regards,
Andreas

-- 
Need help with Pacemaker?
http://www.hastexo.com/now

> Why is Pacemaker looking for a SCSI device since it is configured to use DRBD?
>
> Please, can someone shed some light on this?
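To spell the fix out as a sketch (please double-check before running): with the file system unmounted on both nodes and the DRBD device accessible (Primary) on the node where you run it, update the on-disk cluster stack and then clear the fail-count so Pacemaker tries the mount again:

  tunefs.ocfs2 --update-cluster-stack /dev/drbd/by-res/export
  crm resource cleanup cloneFSexport

The second command is needed because of the fail-count-resFSexport:0=INFINITY you can see in the attrd lines above; without the cleanup the policy engine keeps "Forcing cloneFSexport away" from the node even after the cluster-stack problem is gone.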
>
> Regards,
> Carlos
>
>
> ----- Original Message -----
> From: "Andreas Kurz" <[email protected]>
> To: <[email protected]>
> Sent: Wednesday, March 21, 2012 7:49 AM
> Subject: Re: [Openais] Help on mounting ocfs2 filesystems
_______________________________________________
Openais mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/openais
