Hi folks,

(Sorry for the long post)

I'm working on a two-node cluster with OpenAIS + DRBD on Debian Lenny. I'm
trying to build a Zimbra Suite HA cluster, with RAs for the failover IP, DRBD,
and the Zimbra daemon controller.

I've modified the Zimbra LSB init script to comply with
http://wiki.linux-ha.org/LSBResourceAgent
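
For reference, the exit-code checks I mean can be scripted roughly as
follows. This is only a sketch: expect_rc and lsb_check are names made up
for illustration, it assumes the service is stopped when it begins, and it
really starts and stops the service, so it should only be run while the
cluster is not managing the resource.

```shell
#!/bin/sh
# Sketch of the LSB exit-code checks; expect_rc and lsb_check are
# hypothetical helper names, not part of any Linux-HA tool.

expect_rc() {
    # usage: expect_rc <expected-rc> <init-script> <action>
    want=$1; init=$2; action=$3
    rc=0
    "$init" "$action" >/dev/null 2>&1 || rc=$?
    if [ "$rc" -eq "$want" ]; then
        echo "ok   $action -> $rc"
    else
        echo "FAIL $action -> $rc (expected $want)"
        return 1
    fi
}

lsb_check() {
    # usage: lsb_check /etc/init.d/zimbraha
    # WARNING: really starts and stops the service.
    script=$1
    failures=0
    expect_rc 0 "$script" start  || failures=$((failures + 1))  # start while stopped
    expect_rc 0 "$script" status || failures=$((failures + 1))  # status while running
    expect_rc 0 "$script" start  || failures=$((failures + 1))  # start while running
    expect_rc 0 "$script" stop   || failures=$((failures + 1))  # stop while running
    expect_rc 3 "$script" status || failures=$((failures + 1))  # status while stopped
    expect_rc 0 "$script" stop   || failures=$((failures + 1))  # stop while stopped
    echo "failures: $failures"
    [ "$failures" -eq 0 ]
}
```

Calling lsb_check /etc/init.d/zimbraha prints one line per action and
returns non-zero if any exit code differs from what LSB expects.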

The Zimbra suite normally takes up to 4 minutes to start on my hardware, so
I've raised the start timeout accordingly (360s). Run by hand, the init
script starts cleanly in about 3 minutes 42 seconds:

node2:/home/zadmin# date; /etc/init.d/zimbraha start; echo $?; date
Sat Oct 17 10:37:15 ART 2009
0
Sat Oct 17 10:40:57 ART 2009
node2:/home/zadmin#
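
The same measurement can be scripted so that the elapsed time is compared
directly against a timeout. A rough sketch, where timed_run is a made-up
helper name and date +%s assumes GNU date (as on Debian):

```shell
#!/bin/sh
# Sketch: run a command, report its exit code and wall-clock duration,
# and succeed only if it exited 0 within the given number of seconds.
# timed_run is a hypothetical helper, not an existing tool.

timed_run() {
    # usage: timed_run <limit-seconds> <command> [args...]
    limit=$1; shift
    begin=$(date +%s)
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    elapsed=$(( $(date +%s) - begin ))
    echo "rc=$rc elapsed=${elapsed}s limit=${limit}s"
    [ "$rc" -eq 0 ] && [ "$elapsed" -lt "$limit" ]
}
```

For example, timed_run 360 /etc/init.d/zimbraha start would check the
manual start against the 360s start timeout from my configuration.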

Even so, under cluster control the start of zimbra_daemon times out on both
nodes. I'd greatly appreciate any hints to help spot the problem.

Here's my scenario:

node2:/home/zadmin# uname -a
Linux node2 2.6.26-2-amd64 #1 SMP Wed Aug 19 22:33:18 UTC 2009 x86_64 GNU/Linux
node2:/home/zadmin# crm_mon --one-shot -V

============
Last updated: Sat Oct 17 10:23:05 2009
Stack: openais
Current DC: node2 - partition with quorum
Version: 1.0.5-unknown
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ node1 node2 ]

Resource Group: zimbra_group
    zimbra_fs   (ocf::heartbeat:Filesystem):    Started node1
    zimbra_ip   (ocf::heartbeat:IPaddr2):       Started node1
    zimbra_daemon       (lsb:zimbraha): Stopped
Master/Slave Set: ms_zimbra_drbd
        Masters: [ node1 ]
        Slaves: [ node2 ]

Failed actions:
    zimbra_daemon_start_0 (node=node2, call=19, rc=-2, status=Timed Out): unknown exec error
    zimbra_daemon_start_0 (node=node1, call=18, rc=-2, status=Timed Out): unknown exec error
node2:/home/zadmin# cat /proc/drbd
version: 8.0.14 (api:86/proto:86)
GIT-hash: bb447522fc9a87d0069b7e14f0234911ebdab0f7 build by p...@fat-tyre, 2008-11-12 16:40:33
 0: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---
    ns:8 nr:16928 dw:16936 dr:117 al:1 bm:7 lo:0 pe:0 ua:0 ap:0
        resync: used:0/61 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:1 misses:1 starving:0 dirty:0 changed:1
node2:/home/zadmin# crm_verify -LV
crm_verify[4435]: 2009/10/17_10:23:21 WARN: unpack_rsc_op: Processing failed op zimbra_daemon_start_0 on node2: unknown exec error
crm_verify[4435]: 2009/10/17_10:23:21 WARN: unpack_rsc_op: Processing failed op zimbra_daemon_start_0 on node1: unknown exec error
crm_verify[4435]: 2009/10/17_10:23:21 WARN: common_apply_stickiness: Forcing zimbra_daemon away from node1 after 1000000 failures (max=1000000)
crm_verify[4435]: 2009/10/17_10:23:21 WARN: common_apply_stickiness: Forcing zimbra_daemon away from node2 after 1000000 failures (max=1000000)
crm_verify[4435]: 2009/10/17_10:23:21 WARN: native_color: Resource zimbra_daemon cannot run anywhere
Warnings found during check: config may not be valid
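
To pull just the failed actions out of the crm_mon output above, a filter
along these lines should do. It is a sketch only: failed_actions is a name
made up here, and the pattern assumes the exact "Failed actions" line
format shown above.

```shell
#!/bin/sh
# Sketch: reduce `crm_mon --one-shot` output (read from stdin) to one line
# per failed action, in the form "<operation> <node> <rc> <status>".
# failed_actions is a hypothetical helper name.

failed_actions() {
    sed -n 's/^ *\([^ ]*\) (node=\([^,]*\), call=[0-9]*, rc=\(-\{0,1\}[0-9]*\), status=\([^)]*\)).*/\1 \2 \3 \4/p'
}
```

Usage would be: crm_mon --one-shot | failed_actions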

node2:/home/zadmin# crm
crm(live)# configure
crm(live)configure# show
node node1
node node2
primitive zimbra_daemon lsb:zimbraha \
    op stop interval="600s" timeout="20s" \
    op status interval="600s" timeout="20s" \
    op start interval="600s" timeout="360s" \
    meta target-role="Started"
primitive zimbra_drbd ocf:heartbeat:drbd \
    params drbd_resource="r0" \
    op monitor interval="25s" role="Master" timeout="10s" \
    op monitor interval="30s" role="Slave" timeout="20s" \
    meta is-managed="true"
primitive zimbra_fs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/opt/zimbra/store" fstype="ext3" \
    op monitor interval="25s" timeout="10s" \
    meta is-managed="true"
primitive zimbra_ip ocf:heartbeat:IPaddr2 \
    params ip="192.168.2.80" interface="lan" \
    op monitor interval="25s" timeout="10s" \
    meta is-managed="true"
group zimbra_group zimbra_fs zimbra_ip zimbra_daemon
ms ms_zimbra_drbd zimbra_drbd \
    meta clone_max="2" clone_node_max="1" master_max="1" master_node_max="1" notify="true" is-managed="true"
location zimbra_daemon_loc zimbra_daemon 100: node2
colocation zimbra_on_drbd inf: zimbra_group ms_zimbra_drbd:Master
order zimbra_after_drbd inf: ms_zimbra_drbd:promote zimbra_group:start
property $id="cib-bootstrap-options" \
    stonith-enabled="false" \
    dc-version="1.0.5-unknown" \
    cluster-infrastructure="openais" \
    last-lrm-refresh="1255203334" \
    no-quorum-policy="ignore" \
    expected-quorum-votes="2"
crm(live)configure#

node2:/home/zadmin# grep zimbra_daemon /var/log/debug | tail -n 30
Oct 17 09:17:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:17:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:32:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 09:47:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_monitor_900000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 10:02:45 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: native_assign_node: Could not allocate a node for zimbra_daemon
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_stop_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_status_600000 (cancelled : start un-runnable)
Oct 17 10:03:27 node2 pengine: [2768]: debug: RecurringOp: <null>#011 zimbra_daemon_start_600000 (cancelled : start un-runnable)
Oct 17 10:03:42 node2 cib: [2765]: debug: cib_process_xpath: Processing cib_query op for //cib/configuration/resources//*[@id="zimbra_daemon"]//meta_attributes//nvpair[@name="is-managed"] (/cib/configuration/resources/group/primitive[3]/meta_attributes/nvpair)
Oct 17 10:03:42 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:07:30 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:10:50 node2 cib: [2765]: debug: cib_process_xpath: cib_query: //cib/configuration/resources//*[@id="zimbra_daemon"]//meta_attributes//nvpair[@name="target-role"] does not exist
Oct 17 10:10:50 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:11:39 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:12:01 node2 cib: [2765]: debug: cib_process_xpath: cib_query: //cib/configuration/resources//*[@id="zimbra_daemon"]//meta_attributes//nvpair[@name="target-role"] does not exist
Oct 17 10:12:01 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
Oct 17 10:27:01 node2 pengine: [2768]: debug: native_assign_node: All nodes for resource zimbra_daemon are unavailable, unclean or shutting down (node2: 1, -1000000)
node2:/home/zadmin#

Cheers,

Bdab
