On Mar 22, 2008, at 5:51 PM, [EMAIL PROTECTED] wrote:
I (think) I am seeing strange behavior with my groups, and would
appreciate some assistance to figure it out.
I currently have four groups, all colocated and ordered:
* GROUP_bofus_assp
* bofus_assp_ip
* bofus_assp_initscript
* GROUP_nagios
* resource_glusterfs_mount_nagios
* resource_nagios_daemon
* GROUPS_KNWORKS_mail
* drbddisk_knworks_mail
* ip_knworks_mail
* fs_knworks_mail
* ip_knworks_mail_external
* GROUP_KNWORKS_mysql
* drbddisk_knworks_mysql
* ip_knworks_mysql
* fs_knworks_mysql
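(For anyone trying to reproduce this: a minimal sketch of how I'd check where each group currently sits and which location constraints are in place, assuming the Heartbeat 2.x CRM tools. The commands are echoed rather than executed so they can be reviewed before running on a live node; group names are the ones listed above.)

```shell
# Dry-run sketch: print the placement/constraint queries to run on a cluster node.
for g in GROUP_bofus_assp GROUP_nagios GROUPS_KNWORKS_mail GROUP_KNWORKS_mysql; do
  echo "crm_resource -W -r $g"      # -W reports which node the resource is on
done
echo "cibadmin -Q -o constraints"   # dump constraints, incl. any cli-prefer-* entries
```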
When I try to fail a group over to the other server (specifying the
host):
# crm_resource -M -r GROUP_KNWORKS_mysql -H asknmapr01
nothing happens...
When I try to fail a group over to the other server (not specifying
the host):
# crm_resource -M -r GROUP_KNWORKS_mysql
I see in the logs that it does the following:
- sets the preference for the group I am trying to fail over to
INFINITY on asknmapr01
-------------------------------------------------------
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <cib num_updates="41" epoch="385">
Mar 22 11:26:19 asknmapr02 tengine: [22390]: info:
update_abort_priority: Abort priority upgraded to 1000000
Mar 22 11:26:19 asknmapr02 crmd: [22372]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rsc_location id="cli-prefer-GROUP_KNWORKS_mysql"
rsc="GROUP_KNWORKS_mysql">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <rule id="cli-prefer-rule-GROUP_KNWORKS_mysql"
score="INFINITY">
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - <expression id="cli-prefer-expr-
GROUP_KNWORKS_mysql" attribute="#uname" operation="eq"
value="asknmapr01" type="string"/>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rule>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </rsc_location>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </constraints>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </configuration>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: - </cib>
Mar 22 11:26:19 asknmapr02 cib: [22368]: info: log_data_element:
cib:diff: + <cib num_updates="1" epoch="386"/>
-------------------------------------------------------
- then says it is handling a failed start for the filesystem (I don't
know what the failure reason is)
That's what you're going to need to find out.
The failed start op is what is preventing the group from moving to (or
staying on) asknmapr01.
Figure out the reason (see the logs) and then use crm_resource -C to
clean up the failure.
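A sketch of the cleanup-and-retry sequence that suggests, with the resource and node names taken from this thread (assuming the Heartbeat 2.x crm_resource options: -C cleanup, -M migrate, -U un-migrate). Commands are echoed so the sequence can be reviewed before running on the cluster:

```shell
RSC=fs_knworks_mysql        # the resource whose start op failed
GROUP=GROUP_KNWORKS_mysql   # the group being moved
NODE=asknmapr01             # the intended target node

# 1. find the actual failure reason in the resource agent's log output
echo "grep $RSC /var/log/messages | tail -n 50"
# 2. clear the recorded start failure so the policy engine will retry
echo "crm_resource -C -r $RSC -H $NODE"
# 3. retry the migration of the whole group
echo "crm_resource -M -r $GROUP -H $NODE"
# 4. once it has moved, remove the cli-prefer-* constraint again
echo "crm_resource -U -r $GROUP"
```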
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Processing failed op (fs_knworks_mysql_start_0) on asknmapr01
Mar 22 11:26:19 asknmapr02 pengine: [22391]: WARN: unpack_rsc_op:
Handling failed start for fs_knworks_mysql on asknmapr01
- then it restarts another, unrelated group (knworks_mail)??
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource bofus_assp_ip (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource bofus_assp_initscript (asknmapr01)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_glusterfs_mount_nagios cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: native_color:
Resource resource_nagios_daemon cannot run anywhere
Mar 22 11:26:20 asknmapr02 pengine: [22391]: WARN: custom_action:
Action resource_glusterfs_mount_nagios_stop_0 (unmanaged)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start drbddisk_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource drbddisk_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start fs_knworks_mail
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource fs_knworks_mail (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StopRsc:
asknmapr02 Stop ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: StartRsc:
asknmapr02 Start ip_knworks_mail_external
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Restart resource ip_knworks_mail_external (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource drbddisk_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource ip_knworks_mysql (asknmapr02)
Mar 22 11:26:20 asknmapr02 pengine: [22391]: notice: NoRoleChange:
Leave resource fs_knworks_mysql (asknmapr02)
Is it normal behavior for Heartbeat to restart another group if the
one I am trying to fail over fails to start?
Thanks!!
-JPH
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems