I think I am seeing strange behavior with my resource groups, and would
appreciate some assistance figuring it out.
I currently have four groups, all collocated and ordered:
* GROUP_bofus_assp
  * bofus_assp_ip
  * bofus_assp_initscript
* GROUP_nagios
  * resource_glusterfs_mount_nagios
  * resource_nagios_daemon
* GROUPS_KNWORKS_mail
  * drbddisk_knworks_mail
  * ip_knworks_mail
  * fs_knworks_mail
  * ip_knworks_mail_external
* GROUP_KNWORKS_mysql
  * drbddisk_knworks_mysql
  * ip_knworks_mysql
  * fs_knworks_mysql
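For reference, each group is defined with the ordered/collocated group
attributes, roughly like this (a simplified sketch from memory, not a dump
of my actual CIB; the class/provider/type attributes here are illustrative
guesses and may differ by version):

```xml
<group id="GROUP_KNWORKS_mysql" ordered="true" collocated="true">
  <!-- DRBD disk must come up first, then the IP, then the filesystem -->
  <primitive id="drbddisk_knworks_mysql" class="heartbeat" type="drbddisk"/>
  <primitive id="ip_knworks_mysql" class="ocf" provider="heartbeat" type="IPaddr2"/>
  <primitive id="fs_knworks_mysql" class="ocf" provider="heartbeat" type="Filesystem"/>
</group>
```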
When I try to fail a group over to the other server (specifying the target host):
# crm_resource -M -r GROUP_KNWORKS_mysql -H asknmapr01
nothing happens...
When I try to fail a group over to the other server (not specifying the
host):
# crm_resource -M -r GROUP_KNWORKS_mysql
I see in the logs that it does the following:
- it sets the preference for the group I am trying to fail over to INFINITY
on asknmapr01:
-------------------------------------------------------
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <cib num_updates="41" epoch="385">
Mar 22 11:26:19 asknmapr02 tengine: [22390]: info:
update_abort_priority: Abort priority upgraded to 1000000
Mar 22 11:26:19 asknmapr02 crmd: [22372]: info: do_state_transition: All
2 cluster nodes are eligible to run resources.
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rsc_location id="cli-prefer-GROUP_KNWORKS_mysql"
rsc="GROUP_KNWORKS_mysql">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rule id="cli-prefer-rule-GROUP_KNWORKS_mysql"
score="INFINITY">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <expression
id="cli-prefer-expr-GROUP_KNWORKS_mysql" attribute="#uname"
operation="eq" value="asknmapr01" type="string"/>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rule>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rsc_location>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </cib>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: + <cib num_updates="1" epoch="386"/>
-------------------------------------------------------
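From the diff above, the move only adds a cli-prefer rsc_location
constraint. In case it matters, this is roughly how I inspect and clear
that constraint between attempts (as I understand it, crm_resource -U
removes the constraint that -M created; exact flag spelling may vary by
heartbeat/crm version):

```shell
# show the constraints currently in the CIB, including any cli-prefer-* ones
cibadmin -Q -o constraints

# remove the cli-prefer-GROUP_KNWORKS_mysql constraint that -M inserted
crm_resource -U -r GROUP_KNWORKS_mysql
```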
- then it says it is handling a failed start for the filesystem (I don't
know what the failure reason is):
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Processing failed op (fs_knworks_mysql_start_0) on asknmapr01
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Handling failed start for fs_knworks_mysql on asknmapr01
- then it restarts another, seemingly unrelated group, knworks_mail:
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange: Leave
resource bofus_assp_ip (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange: Leave
resource bofus_assp_initscript (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_glusterfs_mount_nagios cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_nagios_daemon cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: custom_action: Action
resource_glusterfs_mount_nagios_stop_0 (unmanaged)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource drbddisk_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource fs_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail_external (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange: Leave
resource drbddisk_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange: Leave
resource ip_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange: Leave
resource fs_knworks_mysql (asknmapr02)
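My working theory is that the earlier failed start (fs_knworks_mysql_start_0)
is still recorded against asknmapr01, which forces the policy engine to
recompute placement for everything. If that is the case, something like the
following should clear the recorded failure so the move can be retried (a
sketch only; flag syntax depends on the heartbeat/crm version):

```shell
# clean up the recorded start failure for the filesystem resource
crm_resource -C -r fs_knworks_mysql -H asknmapr01

# inspect, then reset, the failcount on that node
crm_failcount -G -r fs_knworks_mysql -U asknmapr01
crm_failcount -D -r fs_knworks_mysql -U asknmapr01
```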
Is it normal behavior for heartbeat to restart another group when the one I
am trying to manage (fail over) fails to start?
Thanks!!
-JPH
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems