It's done. Here is the log from the second node. Thanks for your help, it's very much appreciated!
Mar 23 13:54:53 node2 attrd: [3102]: info: find_hash_entry: Creating hash entry for last-failure-mysqld
Mar 23 13:54:53 node2 attrd: [3102]: info: attrd_perform_update: Delaying operation last-failure-mysqld=<null>: cib not connected
Mar 23 13:54:53 node2 attrd: [3102]: info: find_hash_entry: Creating hash entry for fail-count-mysqld
Mar 23 13:54:53 node2 attrd: [3102]: info: attrd_perform_update: Delaying operation fail-count-mysqld=<null>: cib not connected
Mar 23 13:54:53 node2 lrmd: [3101]: debug: on_msg_add_rsc:client [3104] adds resource mysqld
Mar 23 13:54:53 node2 crmd: [3104]: info: do_lrm_rsc_op: Performing key=10:46:7:1f6f7a59-8e04-46d9-8a47-4b1ada0e6ea1 op=mysqld_monitor_0 )
Mar 23 13:54:53 node2 lrmd: [3101]: debug: on_msg_perform_op:2359: copying parameters for rsc mysqld
Mar 23 13:54:53 node2 lrmd: [3101]: debug: on_msg_perform_op: add an operation operation monitor[6] on ocf::mysql::mysqld for client 3104, its parameters: socket=[/var/lib/mysql/mysql.sock] binary=[/usr/bin/mysqld_safe] group=[mysql] CRM_meta_timeout=[20000] crm_feature_set=[3.0.5] pid=[/var/run/mysqld/mysqld.pid] user=[mysql] config=[/etc/my.cnf] datadir=[/data/mysql/databases] to the operation list.
Mar 23 13:54:53 node2 lrmd: [3101]: info: rsc:mysqld:6: probe
Mar 23 13:54:53 node2 lrmd: [3101]: WARN: Managed mysqld:monitor process 3223 exited with return code 1.
Mar 23 13:54:53 node2 lrmd: [3101]: info: RA output: (mysqld:monitor:stderr) /usr/lib/ocf/resource.d//heartbeat/mysql: line 45: /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs: No such file or directory
Mar 23 13:54:53 node2 crmd: [3104]: debug: create_operation_update: do_update_resource: Updating resouce mysqld after complete monitor op (interval=0)
Mar 23 13:54:53 node2 crmd: [3104]: info: process_lrm_event: LRM operation mysqld_monitor_0 (call=6, rc=1, cib-update=10, confirmed=true) unknown error
Mar 23 13:54:54 node2 crmd: [3104]: info: do_lrm_rsc_op: Performing key=3:47:0:1f6f7a59-8e04-46d9-8a47-4b1ada0e6ea1 op=mysqld_stop_0 )
Mar 23 13:54:54 node2 lrmd: [3101]: debug: on_msg_perform_op: add an operation operation stop[7] on ocf::mysql::mysqld for client 3104, its parameters: crm_feature_set=[3.0.5] to the operation list.
Mar 23 13:54:54 node2 lrmd: [3101]: info: rsc:mysqld:7: stop
Mar 23 13:54:54 node2 lrmd: [3101]: info: RA output: (mysqld:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/mysql: line 45: /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs: No such file or directory
    -> I just saw this; I will look into it.
Mar 23 13:54:54 node2 lrmd: [3101]: WARN: Managed mysqld:stop process 3241 exited with return code 1.
Mar 23 13:54:54 node2 crmd: [3104]: debug: create_operation_update: do_update_resource: Updating resouce mysqld after complete stop op (interval=0)
Mar 23 13:54:54 node2 crmd: [3104]: info: process_lrm_event: LRM operation mysqld_stop_0 (call=7, rc=1, cib-update=12, confirmed=true) unknown error
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_local_callback: update message from node1: fail-count-mysqld=INFINITY
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_local_callback: New value of fail-count-mysqld is INFINITY
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-mysqld (INFINITY)
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_perform_update: Delaying operation fail-count-mysqld=INFINITY: cib not connected
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_local_callback: update message from node1: last-failure-mysqld=1332507294
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_local_callback: New value of last-failure-mysqld is 1332507294
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-mysqld (1332507294)
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_perform_update: Delaying operation last-failure-mysqld=1332507294: cib not connected
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-mysqld (1332507294)
Mar 23 13:54:54 node2 cib: [3100]: debug: cib_process_xpath: cib_query: //cib/status//node_state[@id='node2']//transient_attributes//nvpair[@name='last-failure-mysqld'] does not exist
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_perform_update: Sent update 4: last-failure-mysqld=1332507294
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-mysqld (INFINITY)
Mar 23 13:54:54 node2 cib: [3100]: debug: cib_process_xpath: cib_query: //cib/status//node_state[@id='node2']//transient_attributes//nvpair[@name='fail-count-mysqld'] does not exist
Mar 23 13:54:54 node2 attrd: [3102]: info: attrd_perform_update: Sent update 11: fail-count-mysqld=INFINITY
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_cib_callback: Update 4 for last-failure-mysqld=1332507294 passed
Mar 23 13:54:54 node2 attrd: [3102]: debug: attrd_cib_callback: Update 11 for fail-count-mysqld=INFINITY passed
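Both failed operations point at the same root cause: the agent cannot source its shell function library on node2. A quick way to confirm (a sketch, assuming the standard EL5 paths; the owning package may differ on your distribution):

    # on node2: does the library the RA sources at line 45 actually exist?
    ls -l /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs
    # compare with the working node
    ssh node1 ls -l /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs
    # which package should own it (typically resource-agents or cluster-glue on EL5)
    rpm -qf /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs

Once the file is back in place, clear the recorded probe/stop failures so the cluster re-evaluates node2:

    crm resource cleanup mysqld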
2012/3/23 emmanuel segura <emi2f...@gmail.com>

> The first thing you can do is eliminate this:
>
>     location master-prefer-node-1 Cluster-VIP 25: node1
>
> because you already have your virtual IP in the group. And I would like
> to see the log from the second node.
>
> Thanks :-)
>
> On 23 March 2012 13:42, coma <coma....@gmail.com> wrote:
>
>> Thank you for your responses,
>>
>> I have fixed my migration-threshold problem with the lsb:mysqld resource
>> (I can see migration-threshold=2 in the crm_mon failcounts, so that is
>> OK), but failover still doesn't work when mysql fails (it works fine when
>> the node fails or goes into standby).
>> So I have tried the ocf resource agent instead; it works fine on my first
>> node but fails on my second node with an unknown error.
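When an OCF agent works on one node but not the other, running it by hand on the failing node is usually the quickest comparison. A sketch using ocf-tester (shipped with the resource agents; the parameter values are taken from the primitive quoted below, adjust as needed):

    ocf-tester -n mysqld \
        -o binary=/usr/bin/mysqld_safe -o config=/etc/my.cnf \
        -o user=mysql -o group=mysql \
        -o pid=/var/run/mysqld/mysqld.pid \
        -o datadir=/data/mysql/databases \
        -o socket=/var/lib/mysql/mysql.sock \
        /usr/lib/ocf/resource.d/heartbeat/mysql

On node2 this would presumably fail immediately with the same "ocf-shellfuncs: No such file or directory" message seen in the log above.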
>>
>> crm_mon --failcount:
>>
>> Failed actions:
>>     mysqld_monitor_0 (node=node2, call=6, rc=1, status=complete): unknown error
>>     mysqld_stop_0 (node=node2, call=7, rc=1, status=complete): unknown error
>>
>> I have exactly the same mysql package versions and configuration on my
>> two nodes (with proper permissions); corosync/heartbeat and pacemaker
>> are the same versions too:
>>
>>     corosynclib-1.2.7-1.1.el5
>>     corosync-1.2.7-1.1.el5
>>     pacemaker-libs-1.1.5-1.1.el5
>>     pacemaker-1.1.5-1.1.el5
>>     heartbeat-3.0.3-2.3.el5
>>     heartbeat-libs-3.0.3-2.3.el5
>>     heartbeat-debuginfo-3.0.2-2.el5
>>
>> So I don't understand why it works on one node but not on the second?
>>
>> Resource config:
>>
>> primitive mysqld ocf:heartbeat:mysql \
>>     params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
>>         user="mysql" group="mysql" pid="/var/run/mysqld/mysqld.pid" \
>>         datadir="/data/mysql/databases" socket="/var/lib/mysql/mysql.sock" \
>>     op start interval="0" timeout="120" \
>>     op stop interval="0" timeout="120" \
>>     op monitor interval="30" timeout="30" depth="0" \
>>     target-role="Started"
>>
>> And the same with this variant (I have created a test database/table and
>> granted a test user on it):
>>
>> primitive mysqld ocf:heartbeat:mysql \
>>     params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
>>         datadir="/data/mysql/databases" user="mysql" \
>>         pid="/var/run/mysqld/mysqld.pid" socket="/var/lib/mysql/mysql.sock" \
>>         test_passwd="test" test_table="Cluster.dbcheck" test_user="test" \
>>     op start interval="0" timeout="120" \
>>     op stop interval="0" timeout="120" \
>>     op monitor interval="30s" timeout="30s" OCF_CHECK_LEVEL="1" \
>>     meta migration-threshold="3" target-role="Started"
>>
>> Full config:
>>
>> node node2 \
>>     attributes standby="off"
>> node node1 \
>>     attributes standby="off"
>> primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
>>     params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" cidr_netmask="21" iflabel="VIP1" \
>>     op monitor interval="10s" timeout="20s" \
>>     meta is-managed="true"
>> primitive datavg ocf:heartbeat:LVM \
>>     params volgrpname="datavg" exclusive="true" \
>>     op start interval="0" timeout="30" \
>>     op stop interval="0" timeout="30"
>> primitive drbd_mysql ocf:linbit:drbd \
>>     params drbd_resource="drbd-mysql" \
>>     op monitor interval="15s"
>> primitive fs_mysql ocf:heartbeat:Filesystem \
>>     params device="/dev/datavg/data" directory="/data" fstype="ext3"
>> primitive mysqld ocf:heartbeat:mysql \
>>     params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" user="mysql" group="mysql" pid="/var/run/mysqld/mysqld.pid" datadir="/data/mysql/databases" socket="/var/lib/mysql/mysql.sock" \
>>     op start interval="0" timeout="120" \
>>     op stop interval="0" timeout="120" \
>>     op monitor interval="30" timeout="30" depth="0"
>> group mysql datavg fs_mysql Cluster-VIP mysqld
>> ms ms_drbd_mysql drbd_mysql \
>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> location master-prefer-node-1 Cluster-VIP 25: node1
>> colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
>> order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>     cluster-infrastructure="openais" \
>>     expected-quorum-votes="2" \
>>     stonith-enabled="false" \
>>     no-quorum-policy="ignore" \
>>     last-lrm-refresh="1332504626"
>> rsc_defaults $id="rsc-options" \
>>     resource-stickiness="100"
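Emmanuel's suggestion above, dropping the Cluster-VIP location constraint because the VIP is already placed by the group and the DRBD colocation, can be done with the crm shell. A sketch; if a node preference is still wanted, moving the score onto the whole group is one option:

    crm configure delete master-prefer-node-1
    # optionally, keep the preference but apply it to the group instead:
    crm configure location master-prefer-node-1 mysql 25: node1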
>>
>> 2012/3/22 Andreas Kurz <andr...@hastexo.com>
>>
>>> On 03/22/2012 03:23 PM, coma wrote:
>>> > Thank you for your responses,
>>> >
>>> > I have added the migration-threshold on my mysqld resource; when I
>>> > kill or manually stop mysql on one node, there is no failover to the
>>> > second node.
>>> > Also, when I look at crm_mon --failcounts, I can see "mysqld:
>>> > migration-threshold=1000000 fail-count=1000000", so I don't understand
>>> > why migration-threshold does not equal 2?
>>> >
>>> > Migration summary:
>>> > * Node node1:
>>> >     mysqld: migration-threshold=1000000 fail-count=1000000
>>> > * Node node2:
>>> >
>>> > Failed actions:
>>> >     mysqld_monitor_10000 (node=node1, call=90, rc=7, status=complete): not running
>>> >     mysqld_stop_0 (node=node1, call=93, rc=1, status=complete): unknown error
>>>
>>> The lsb init script you are using seems not to be LSB compliant ...
>>> it looks like it returns an error on stopping an already stopped mysql.
>>>
>>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#ap-lsb
>>>
>>> Fix the script ... or better use the ocf resource agent.
>>>
>>> Regards,
>>> Andreas
>>>
>>> --
>>> Need help with Pacemaker?
>>> http://www.hastexo.com/now
>>>
>>> > configuration:
>>> >
>>> > node node1 \
>>> >     attributes standby="off"
>>> > node node2 \
>>> >     attributes standby="off"
>>> > primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
>>> >     params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" cidr_netmask="21" iflabel="VIP1" \
>>> >     op monitor interval="10s" timeout="20s" \
>>> >     meta is-managed="true"
>>> > primitive datavg ocf:heartbeat:LVM \
>>> >     params volgrpname="datavg" exclusive="true" \
>>> >     op start interval="0" timeout="30" \
>>> >     op stop interval="0" timeout="30"
>>> > primitive drbd_mysql ocf:linbit:drbd \
>>> >     params drbd_resource="drbd-mysql" \
>>> >     op monitor interval="15s"
>>> > primitive fs_mysql ocf:heartbeat:Filesystem \
>>> >     params device="/dev/datavg/data" directory="/data" fstype="ext3"
>>> > primitive mysqld lsb:mysqld \
>>> >     op monitor interval="10s" timeout="30s" \
>>> >     op start interval="0" timeout="120" \
>>> >     op stop interval="0" timeout="120" \
>>> >     meta target-role="Started" migration-threshold="2" failure-timeout="20s"
>>> > group mysql datavg fs_mysql Cluster-VIP mysqld
>>> > ms ms_drbd_mysql drbd_mysql \
>>> >     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>>> > location master-prefer-node-1 Cluster-VIP 25: node1
>>> > colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
>>> > order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
>>> > property $id="cib-bootstrap-options" \
>>> >     dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>> >     cluster-infrastructure="openais" \
>>> >     expected-quorum-votes="2" \
>>> >     stonith-enabled="false" \
>>> >     no-quorum-policy="ignore" \
>>> >     last-lrm-refresh="1332425337"
>>> > rsc_defaults $id="rsc-options" \
>>> >     resource-stickiness="100"
>>> >
>>> > 2012/3/22 Andreas Kurz <andr...@hastexo.com>
>>> >
>>> > On 03/22/2012 01:51 PM, coma wrote:
>>> > > Ah yes, thank you, the mysql service status is now monitored, but
>>> > > the failover is not performed?
>>> >
>>> > As long as local restarts are successful there is no need for a
>>> > failover ... there is migration-threshold to limit local restart tries.
>>> >
>>> > Regards,
>>> > Andreas
>>> >
>>> > --
>>> > Need help with Pacemaker?
>>> > http://www.hastexo.com/now
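Andreas's LSB point above is easy to verify by hand. Per the appendix he links, "stop" on an already-stopped service must still exit 0, and "status" on a stopped service should exit 3. A quick check on either node, something like:

    /etc/init.d/mysqld stop;   echo "first stop:  $?"   # expect 0
    /etc/init.d/mysqld stop;   echo "second stop: $?"   # must also be 0 for LSB compliance
    /etc/init.d/mysqld status; echo "status:      $?"   # expect 3 when stopped, not 0

A non-zero exit on the second stop is exactly the kind of behaviour that produces the mysqld_stop_0 "unknown error" shown above.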
>>> > >
>>> > > 2012/3/22 emmanuel segura <emi2f...@gmail.com>
>>> > >
>>> > >     sorry
>>> > >     I think you missed the op monitor operation in your primitive
>>> > >     definition
>>> > >
>>> > >     On 22 March 2012 11:52, emmanuel segura <emi2f...@gmail.com> wrote:
>>> > >
>>> > >         I think you missed the op monitor operation in your primitive
>>> > >         definition
>>> > >
>>> > >         On 22 March 2012 11:33, coma <coma....@gmail.com> wrote:
>>> > >
>>> > >             Hello,
>>> > >
>>> > >             I have a question about mysql service monitoring in a
>>> > >             MySQL HA cluster with pacemaker and DRBD.
>>> > >             I have set up a configuration that allows a failover
>>> > >             between two nodes; it works fine when a node is offline
>>> > >             (or standby), but I want to know if it is possible to
>>> > >             monitor the mysql service and perform a failover if mysql
>>> > >             is stopped or unavailable?
>>> > >
>>> > >             Thank you in advance for any response.
>>> > >
>>> > >             My crm configuration:
>>> > >
>>> > >             node node1 \
>>> > >                 attributes standby="off"
>>> > >             node node2 \
>>> > >                 attributes standby="off"
>>> > >             primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
>>> > >                 params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" cidr_netmask="21" iflabel="VIP1" \
>>> > >                 op monitor interval="10s" timeout="20s" \
>>> > >                 meta is-managed="true"
>>> > >             primitive datavg ocf:heartbeat:LVM \
>>> > >                 params volgrpname="datavg" exclusive="true" \
>>> > >                 op start interval="0" timeout="30" \
>>> > >                 op stop interval="0" timeout="30"
>>> > >             primitive drbd_mysql ocf:linbit:drbd \
>>> > >                 params drbd_resource="drbd-mysql" \
>>> > >                 op monitor interval="15s"
>>> > >             primitive fs_mysql ocf:heartbeat:Filesystem \
>>> > >                 params device="/dev/datavg/data" directory="/data" fstype="ext3"
>>> > >             primitive mysqld lsb:mysqld
>>> > >             group mysql datavg fs_mysql Cluster-VIP mysqld
>>> > >             ms ms_drbd_mysql drbd_mysql \
>>> > >                 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>>> > >             location master-prefer-node-1 Cluster-VIP 25: node1
>>> > >             colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
>>> > >             order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
>>> > >             property $id="cib-bootstrap-options" \
>>> > >                 dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>> > >                 cluster-infrastructure="openais" \
>>> > >                 expected-quorum-votes="2" \
>>> > >                 stonith-enabled="false" \
>>> > >                 no-quorum-policy="ignore" \
>>> > >                 last-lrm-refresh="1332254494"
>>> > >             rsc_defaults $id="rsc-options" \
>>> > >                 resource-stickiness="100"
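The configuration quoted just above is where the thread started: the lsb:mysqld primitive has no monitor operation, so Pacemaker never checks whether mysqld is actually running. The fix emmanuel points to is a one-line addition; a sketch, using the interval/timeout values that appear later in the thread:

    crm configure edit mysqld
    # change the primitive to include a monitor op, e.g.:
    #   primitive mysqld lsb:mysqld \
    #       op monitor interval="10s" timeout="30s"

Without a monitor op, only start/stop results reach the cluster, which is why a manually stopped mysql went unnoticed.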
>>> > >
>>> > > --
>>> > > this is my life and I live it as long as God wills
>
> --
> this is my life and I live it as long as God wills
>
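One more generally useful check for "why didn't it fail over" questions like this one: besides the failcounts, the allocation scores show where Pacemaker wants to place each resource. On a 1.1.x cluster, something like:

    crm_simulate -sL    # show scores against the live CIB (ptest -sL on older builds)

makes it easy to see how the group, the colocation with the DRBD master, and any remaining location constraints combine.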
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org