On Mar 22, 2008, at 5:51 PM, [EMAIL PROTECTED] wrote:
I (think) I am seeing strange behavior with my groups, and would
appreciate some assistance to figure it out.
I currently have four groups, all colocated and ordered:
* GROUP_bofus_assp
* bofus_assp_ip
* bofus_assp_initscript
* GROUP_nagios
* resource_glusterfs_mount_nagios
* resource_nagios_daemon
* GROUPS_KNWORKS_mail
* drbddisk_knworks_mail
* ip_knworks_mail
* fs_knworks_mail
* ip_knworks_mail_external
* GROUP_KNWORKS_mysql
* drbddisk_knworks_mysql
* ip_knworks_mysql
* fs_knworks_mysql
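(For anyone trying to reproduce this: a minimal sketch of how I'd check where each group currently sits and which location constraints are in place, assuming the Heartbeat 2.x CRM tools. The commands are echoed rather than executed so they can be reviewed before running on a live node; group names are the ones listed above.)

```shell
# Dry-run sketch: print the placement/constraint queries to run on a cluster node.
for g in GROUP_bofus_assp GROUP_nagios GROUPS_KNWORKS_mail GROUP_KNWORKS_mysql; do
  echo "crm_resource -W -r $g"      # -W reports which node the resource is on
done
echo "cibadmin -Q -o constraints"   # dump constraints, incl. any cli-prefer-* entries
```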
When I try to fail a group over to the other server (specifying the
host):
# crm_resource -M -r GROUP_KNWORKS_mysql -H asknmapr01
nothing happens...
When I try to fail a group over to the other server (not specifying
the host):
# crm_resource -M -r GROUP_KNWORKS_mysql
I see in the logs that it does the following:
- sets the preference for the group I am trying to fail over to
INFINITY on asknmapr01
-------------------------------------------------------
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <cib num_updates="41" epoch="385">
Mar 22 11:26:19 asknmapr02 tengine: [22390]: info:
update_abort_priority: Abort priority upgraded to 1000000
Mar 22 11:26:19 asknmapr02 crmd: [22372]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rsc_location id="cli-prefer-GROUP_KNWORKS_mysql"
rsc="GROUP_KNWORKS_mysql">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rule id="cli-prefer-rule-GROUP_KNWORKS_mysql"
score="INFINITY">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <expression id="cli-prefer-expr-
GROUP_KNWORKS_mysql" attribute="#uname" operation="eq"
value="asknmapr01" type="string"/>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rule>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rsc_location>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </cib>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: + <cib num_updates="1" epoch="386"/>
-------------------------------------------------------
- then says it is handling a failed start for the filesystem (I don't
know what the failure reason is)
That's what you're going to need to find out.
The failed start op is what is preventing the group from moving to (or
staying on) asknmapr01.
Figure out the reason (see the logs) and then use crm_resource -C to
clean up the failure.
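A sketch of the cleanup-and-retry sequence that suggests, with the resource and node names taken from this thread (assuming the Heartbeat 2.x crm_resource options: -C cleanup, -M migrate, -U un-migrate). Commands are echoed so the sequence can be reviewed before running on the cluster:

```shell
RSC=fs_knworks_mysql        # the resource whose start op failed
GROUP=GROUP_KNWORKS_mysql   # the group being moved
NODE=asknmapr01             # the intended target node

# 1. find the actual failure reason in the resource agent's log output
echo "grep $RSC /var/log/messages | tail -n 50"
# 2. clear the recorded start failure so the policy engine will retry
echo "crm_resource -C -r $RSC -H $NODE"
# 3. retry the migration of the whole group
echo "crm_resource -M -r $GROUP -H $NODE"
# 4. once it has moved, remove the cli-prefer-* constraint again
echo "crm_resource -U -r $GROUP"
```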
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Processing failed op (fs_knworks_mysql_start_0) on asknmapr01
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Handling failed start for fs_knworks_mysql on asknmapr01
- then it restarts another, unrelated group (knworks_mail)??
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource bofus_assp_ip (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource bofus_assp_initscript (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_glusterfs_mount_nagios cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_nagios_daemon cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: custom_action:
Action resource_glusterfs_mount_nagios_stop_0 (unmanaged)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource drbddisk_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource fs_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail_external (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource drbddisk_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource ip_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource fs_knworks_mysql (asknmapr02)
Is it normal behavior for Heartbeat to restart another group if the
one I am trying to fail over fails to start?
Thanks!!
-JPH
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems