OK, that is one more reason why Fridays are better spent writing documentation or something like that, instead of fixing problems on clusters. Sorry, that was my fault.
Here are the logs of that service group before the shutdown:

Aug 17 13:45:33 hydra2 crmd: [26828]: info: te_rsc_command: Initiating action 413: start mysqltest03-db_start_0 on hydra2 (local)
Aug 17 13:45:33 hydra2 crmd: [26828]: info: do_lrm_rsc_op: Performing key=413:850:0:eb13866d-3a8f-4d87-bc81-82e893dc72d6 op=mysqltest03-db_start_0 )
Aug 17 13:45:35 hydra2 lrmd: [26825]: info: rsc:mysqltest03-db start[238] (pid 11388)
Aug 17 13:45:39 hydra2 lrmd: [26825]: info: operation start[238] on mysqltest03-db for client 26828: pid 11388 exited with return code 1
Aug 17 13:45:39 hydra2 crmd: [26828]: info: process_lrm_event: LRM operation mysqltest03-db_start_0 (call=238, rc=1, cib-update=1320, confirmed=true) unknown error
Aug 17 13:45:39 hydra2 crmd: [26828]: WARN: status_from_rc: Action 413 (mysqltest03-db_start_0) on hydra2 failed (target: 0 vs. rc: 1): Error
Aug 17 13:45:39 hydra2 crmd: [26828]: WARN: update_failcount: Updating failcount for mysqltest03-db on hydra2 after failed start: rc=1 (update=INFINITY, time=1345203939)
Aug 17 13:45:39 hydra2 crmd: [26828]: info: abort_transition_graph: match_graph_event:277 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=mysqltest03-db_last_failure_0, magic=0:1;413:850:0:eb13866d-3a8f-4d87-bc81-82e893dc72d6, cib=0.244.114) : Event failed
Aug 17 13:45:39 hydra2 crmd: [26828]: info: match_graph_event: Action mysqltest03-db_start_0 (413) confirmed on hydra2 (rc=4)
Aug 17 13:45:39 hydra2 attrd: [26826]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-mysqltest03-db (INFINITY)
Aug 17 13:45:40 hydra2 attrd: [26826]: notice: attrd_perform_update: Sent update 23: fail-count-mysqltest03-db=INFINITY
Aug 17 13:45:40 hydra2 attrd: [26826]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-mysqltest03-db (1345203939)
Aug 17 13:45:40 hydra2 attrd: [26826]: notice: attrd_perform_update: Sent update 26: last-failure-mysqltest03-db=1345203939

Be patient with me.
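As a side note on reading these logs: the "unknown error" reported by crmd is just the generic name for return code 1 from the resource agent. The mapping below is a sketch based on the standard OCF return-code convention (the function name is mine, not part of any cluster tool):

```shell
# Map an OCF resource-agent exit code to its symbolic name.
# The table follows the standard OCF return-code convention;
# rc=1 is the generic "unknown error" seen in the crmd log above.
ocf_rc_name() {
  case "$1" in
    0) echo OCF_SUCCESS ;;
    1) echo OCF_ERR_GENERIC ;;
    2) echo OCF_ERR_ARGS ;;
    3) echo OCF_ERR_UNIMPLEMENTED ;;
    4) echo OCF_ERR_PERM ;;
    5) echo OCF_ERR_INSTALLED ;;
    6) echo OCF_ERR_CONFIGURED ;;
    7) echo OCF_NOT_RUNNING ;;
    *) echo "unknown rc $1" ;;
  esac
}

ocf_rc_name 1   # the code lrmd reported for mysqltest03-db_start_0
```

So the logs only say the start action failed generically; they do not say why mysqld refused to start under the cluster.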
I only see in the logs that there was an unknown error. As I wrote before, if I start the resource agent by hand, it works without problems:

export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_binary=/srv/mysql-server/releases/mysql/bin/mysqld
export OCF_RESKEY_config="/srv/mysql/mysqltest03/admin/etc/my.cnf"
export OCF_RESKEY_user=mysql
export OCF_RESKEY_group=mysql
export OCF_RESKEY_datadir=/srv/mysql/mysqltest03/data
export OCF_RESKEY_log="/srv/mysql/mysqltest03/admin/log/mysqld.log"
export OCF_RESKEY_pid="/srv/mysql/mysqltest03/admin/run/mysqld.pid"
export OCF_RESKEY_socket="/srv/mysql/mysqltest03/admin/run/mysqld.sock"
export OCF_RESKEY_additional_parameters="--bind-address=xx.xx.xx.xx"
/usr/lib/ocf/resource.d/heartbeat/mysql start; echo $?

The only reason I can imagine for this behavior is that the cluster sends a monitor call right after startup and gets a negative response, because mysqld needs more time to finish starting.

Thanks
Josef

Hi,

thanks for the logs. The logs after 13:45:40 show that Pacemaker is stopping the mysql resource successfully. But it is the logs UNTIL 13:45:40 that would tell WHY Pacemaker does this.

-- 
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München
Tel: (0163) 172 50 98

_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
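[Editorial sketch] Josef's hypothesis — that mysqld is simply not ready when the first monitor call arrives — can be checked by polling the agent's monitor action for a while after a manual start. The sketch below illustrates that polling loop only; the agent_monitor stub is hypothetical and simulates mysqld becoming ready on the third poll. On a real node it would instead run /usr/lib/ocf/resource.d/heartbeat/mysql monitor with the same OCF_RESKEY_* environment as in the manual test above:

```shell
# Hypothetical stub: simulates mysqld answering the agent's monitor
# action only from the third poll on. On a real cluster node this
# would call the heartbeat mysql agent's monitor action instead.
polls=0
agent_monitor() {
  polls=$((polls + 1))
  [ "$polls" -ge 3 ]   # rc=0 ("running") from the third call on
}

# Poll the monitor action until it reports running or a deadline
# (number of polls) passes.
wait_until_running() {
  deadline=$1
  i=0
  while [ "$i" -lt "$deadline" ]; do
    if agent_monitor; then
      echo "running after $((i + 1)) polls"
      return 0
    fi
    i=$((i + 1))
    # sleep 1   # pause between polls on a real node
  done
  echo "not running after $deadline polls"
  return 1
}

wait_until_running 10   # prints "running after 3 polls"
```

If the real agent behaves like this stub (failing for the first seconds, then succeeding), raising the start operation's timeout in the resource's operation definitions would be the natural fix.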
