Hi folks, (Sorry for the long post)
I'm working on a two-node cluster with OpenAIS + DRBD on Debian Lenny. I'm trying to build a Zimbra Suite HA cluster, with RAs for the failover IP, DRBD, and the Zimbra daemon controller. I've modified the Zimbra LSB init script to comply with http://wiki.linux-ha.org/LSBResourceAgent

The Zimbra suite normally takes up to 4 minutes to start on my hardware, and I've modified the timeouts accordingly:

node2:/home/zadmin# date; /etc/init.d/zimbraha start; echo $?; date
Sat Oct 17 10:37:15 ART 2009
0
Sat Oct 17 10:40:57 ART 2009
node2:/home/zadmin#

But I'm still struggling to get it working. I'd greatly appreciate any hints to spot the problem.

Here's my scenario:

node2:/home/zadmin# uname -a
Linux node2 2.6.26-2-amd64 #1 SMP Wed Aug 19 22:33:18 UTC 2009 x86_64 GNU/Linux

node2:/home/zadmin# crm_mon --one-shot -V

============
Last updated: Sat Oct 17 10:23:05 2009
Stack: openais
Current DC: node2 - partition with quorum
Version: 1.0.5-unknown
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ node1 node2 ]

Resource Group: zimbra_group
    zimbra_fs (ocf::heartbeat:Filesystem): Started node1
    zimbra_ip (ocf::heartbeat:IPaddr2): Started node1
    zimbra_daemon (lsb:zimbraha): Stopped
Master/Slave Set: ms_zimbra_drbd
    Masters: [ node1 ]
    Slaves: [ node2 ]

Failed actions:
    zimbra_daemon_start_0 (node=node2, call=19, rc=-2, status=Timed Out): unknown exec error
    zimbra_daemon_start_0 (node=node1, call=18, rc=-2, status=Timed Out): unknown exec error

node2:/home/zadmin# cat /proc/drbd
version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by p...@fat-tyre, 2008-11-12 16:40:33
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:8 nr:16928 dw:16936 dr:117 al:1 bm:7 lo:0 pe:0 ua:0 ap:0
    resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
    act_log: used:0/257 hits:1 misses:1 starving:0 dirty:0 changed:1

node2:/home/zadmin# crm_verify -LV
crm_verify[4435]: 2009/10/17_10:23:21 WARN: unpack_rsc_op: Processing failed op zimbra_daemon_start_0 on node2: unknown exec error
crm_verify[4435]: 2009/10/17_10:23:21 WARN: unpack_rsc_op: Processing failed op zimbra_daemon_start_0 on node1: unknown exec error
crm_verify[4435]: 2009/10/17_10:23:21 WARN: common_apply_stickiness: Forcing zimbra_daemon away from node1 after 1000000 failures (max=1000000)
crm_verify[4435]: 2009/10/17_10:23:21 WARN: common_apply_stickiness: Forcing zimbra_daemon away from node2 after 1000000 failures (max=1000000)
crm_verify[4435]: 2009/10/17_10:23:21 WARN: native_color: Resource zimbra_daemon cannot run anywhere
Warnings found during check: config may not be valid

node2:/home/zadmin# crm
crm(live)# configure
crm(live)configure# show
node node1
node node2
primitive zimbra_daemon lsb:zimbraha \
    op stop interval="600s" timeout="20s" \
    op status interval="600s" timeout="20s" \
    op start interval="600s" timeout="360s" \
    meta target-role="Started"
primitive zimbra_drbd ocf:heartbeat:drbd \
    params drbd_resource="r0" \
    op monitor interval="25s" role="Master" timeout="10s" \
    op monitor interval="30s" role="Slave" timeout="20s" \
    meta is-managed="true"
primitive zimbra_fs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/opt/zimbra/store" fstype="ext3" \
    op monitor interval="25s" timeout="10s" \
    meta is-managed="true"
primitive zimbra_ip ocf:heartbeat:IPaddr2 \
    params ip="192.168.2.80" interface="lan" \
    op monitor interval="25s" timeout="10s" \
    meta is-managed="true"
group zimbra_group zimbra_fs zimbra_ip zimbra_daemon
ms ms_zimbra_drbd zimbra_drbd \
    meta clone_max="2" clone_node_max="1" master_max="1" master_node_max="1" notify="true" is-managed="true"
location zimbra_daemon_loc zimbra_daemon 100: node2
colocation zimbra_on_drbd inf: zimbra_group ms_zimbra_drbd:Master
order zimbra_after_drbd inf: ms_zimbra_drbd:promote zimbra_group:start
property $id="cib-bootstrap-options" \
    stonith-enabled="false" \
    dc-version="1.0.5-unknown" \
    cluster-infrastructure="openais" \
    last-lrm-refresh="1255203334" \
    no-quorum-policy="ignore" \
    expected-quorum-votes="2"
crm(live)configure#

node2:/home/zadmin# grep zimbra_daemon /var/log/debug | tail -n 30
Oct 17 09:17:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:17:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:03:42 node2 cib: [2765]: debug: cib_process_xpath: Processing cib_query op for //cib/configuration/resources//*...@id="zimbra_daemon"]//meta_attributes//nvpa...@name="is-managed"] (/cib/configuration/resources/group/primitive[3]/meta_attributes/nvpair)
Oct 17 10:03:42 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:07:30 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:10:50 node2 cib: [2765]: debug: cib_process_xpath: cib_query: //cib/configuration/resources//*...@id="zimbra_daemon"]//meta_attributes//nvpa...@name="target-role"] does not exist
Oct 17 10:10:50 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:11:39 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:12:01 node2 cib: [2765]: debug: cib_process_xpath: cib_query: //cib/configuration/resources//*...@id="zimbra_daemon"]//meta_attributes//nvpa...@name="target-role"] does not exist
Oct 17 10:12:01 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:27:01 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
node2:/home/zadmin#

Cheers,
Bdab
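
P.S. For reference, the LSB compliance check mentioned at the top boils down to calling the init script with start, status and stop in a fixed order and comparing the exit codes (0 for success, 3 for "status" on a stopped service). Below is a minimal sketch of that sequence, not the exact commands run above. The function names expect_rc and lsb_check are made up for illustration, and the sketch assumes the init script can be executed directly. Be aware that it really starts and stops the service.

```shell
#!/bin/sh
# Sketch of an LSB compliance check for an init script.
# expect_rc and lsb_check are hypothetical helper names, not part of any tool.

# expect_rc SCRIPT ACTION EXPECTED
# Run one init-script action and compare its exit code with EXPECTED.
expect_rc() {
    rc=0
    "$1" "$2" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq "$3" ]; then
        echo "ok   $2 -> $rc"
        return 0
    fi
    echo "FAIL $2 -> $rc (expected $3)"
    return 1
}

# lsb_check SCRIPT
# Walk through the start/status/stop sequence; return 0 only if every step
# produced the expected exit code.
lsb_check() {
    script=$1
    failures=0
    expect_rc "$script" start  0 || failures=$((failures + 1))  # start a stopped service
    expect_rc "$script" status 0 || failures=$((failures + 1))  # status while running
    expect_rc "$script" start  0 || failures=$((failures + 1))  # start again while running
    expect_rc "$script" stop   0 || failures=$((failures + 1))  # stop a running service
    expect_rc "$script" status 3 || failures=$((failures + 1))  # status while stopped
    expect_rc "$script" stop   0 || failures=$((failures + 1))  # stop again while stopped
    echo "$failures check(s) failed"
    [ "$failures" -eq 0 ]
}
```

After sourcing the functions, something like "lsb_check /etc/init.d/zimbraha" would run the whole sequence. With a start time of around 4 minutes, expect it to take a while.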
