Hello all,
I have DRBD 8.2.7 and Heartbeat 2.1.3 configured in CRM mode on an
Openfiler 2.3 system. Heartbeat runs successfully on the primary server,
but when I test failover, the resources do not all start. The cluster
begins stopping the CRM resource group, although everything works fine when
I mount the resources manually. The system information is given below.
*1) Output of the crm_mon command.*
*[r...@gtt5 ~]# crm_mon *
============
Last updated: Fri Aug 21 17:26:02 2009
Current DC: gtt5.linux.com (7d892d6c-d277-45c2-beb6-331fca5b3920)
2 Nodes configured.
1 Resources configured.
============
Node: gtt5.linux.com (7d892d6c-d277-45c2-beb6-331fca5b3920): online
Node: gtt4.linux.com (87dc2dcc-791b-4bfb-a971-b30fbd909255): online
Resource Group: group_1
open-iscsi_1 (lsb:open-iscsi): Started gtt5.linux.com
MailTo_2 (heartbeat::ocf:MailTo): Started gtt5.linux.com
IPaddr_192_168_2_20 (heartbeat::ocf:IPaddr): Started gtt5.linux.com
drbddisk_4 (heartbeat:drbddisk): Started gtt5.linux.com
LVM_5 (heartbeat::ocf:LVM): Started gtt5.linux.com
Filesystem_6 (heartbeat::ocf:Filesystem): Started gtt5.linux.com
MakeMounts_7 (heartbeat:MakeMounts): Started gtt5.linux.com
*Filesystem_8 (heartbeat::ocf:Filesystem): Stopped*
*nfs_9 (lsb:nfs): Stopped*
*smb_10 (lsb:smb): Stopped*
*acpid_11 (lsb:acpid): Stopped*
*openfiler_12 (lsb:openfiler): Stopped*
*Failed actions:*
*Filesystem_8_start_0 (node=gtt5.linux.com, call=32, rc=1): Error*
*IPaddr_192_168_2_20_start_0 (node=gtt4.linux.com, call=56, rc=1): Error*
*2) This is the DRBD status of the resources I have created.*
*[r...@gtt5 ~]# service drbd status*
drbd driver loaded OK; device status:
version: 8.2.7 (api:88/proto:86-88)
GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by p...@fat-tyre,
2008-11-12 16:47:11
m:res               cs         st                 ds                 p  mounted            fstype
0:cluster_metadata  Connected  Primary/Secondary  UpToDate/UpToDate  C  /cluster_metadata  ext3
1:vg0_drbd          Connected  Primary/Secondary  Diskless/UpToDate  C
[r...@gtt5 ~]#
*3) The mount list shows no LVM mount point; only /cluster_metadata is
mounted.*
*[r...@gtt5 ~]# mount*
/dev/sda5 on / type ext3 (rw)
/proc on /proc type proc (rw)
/sys on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda7 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
*/dev/drbd0 on /cluster_metadata type ext3 (rw,noatime)*
[r...@gtt5 ~]#
*4) This is my cib.xml configuration file.*
<cib generated="true" admin_epoch="0" epoch="1" num_updates="1"
have_quorum="true" ignore_dtd="false" num_peers="2" ccm_transition="2"
cib_feature_revision="2.0" dc_uuid="7d892d6c-d277-45c2-beb6-331fca5b3920"
cib-last-written="Sat Aug 22 11:56:53 2009">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="2.1.3-node: 4a3eac571f442c7cfcefc18fcaad35314460c1f6"/>
<nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster" value="true"/>
<nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="stop"/>
<nvpair id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness" value="0"/>
<nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
<nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="false"/>
<nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action" value="reboot"/>
<nvpair id="cib-bootstrap-options-startup-fencing"
name="startup-fencing" value="true"/>
<nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources" value="true"/>
<nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions" value="true"/>
<nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop" value="false"/>
<nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names" value="true"/>
<nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout" value="5min"/>
<nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout" value="20s"/>
<nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default" value="true"/>
<nvpair id="cib-bootstrap-options-cluster-delay"
name="cluster-delay" value="60s"/>
<nvpair id="cib-bootstrap-options-pe-error-series-max"
name="pe-error-series-max" value="-1"/>
<nvpair id="cib-bootstrap-options-pe-warn-series-max"
name="pe-warn-series-max" value="-1"/>
<nvpair id="cib-bootstrap-options-pe-input-series-max"
name="pe-input-series-max" value="-1"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="7d892d6c-d277-45c2-beb6-331fca5b3920" uname="gtt5.linux.com"
type="normal"/>
<node id="87dc2dcc-791b-4bfb-a971-b30fbd909255" uname="gtt4.linux.com"
type="normal"/>
</nodes>
<resources>
<group id="group_1">
<primitive class="lsb" id="open-iscsi_1" provider="heartbeat"
type="open-iscsi">
<operations>
<op id="open-iscsi_1_mon" interval="120s" name="monitor" timeout="60s"/>
</operations>
</primitive>
<primitive class="ocf" id="MailTo_2" provider="heartbeat"
type="MailTo">
<operations>
<op id="MailTo_2_mon" interval="120s" name="monitor"
timeout="60s"/>
</operations>
<instance_attributes id="MailTo_2_inst_attr">
<attributes>
<nvpair id="MailTo_2_attr_0" name="email"
value="r...@localhost"/>
<nvpair id="MailTo_2_attr_1" name="subject"
value="ClusterFailover"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" id="IPaddr_192_168_2_20"
provider="heartbeat" type="IPaddr">
<operations>
<op id="IPaddr_192_168_2_20_mon" interval="5s" name="monitor"
timeout="5s"/>
</operations>
<instance_attributes id="IPaddr_192_168_2_20_inst_attr">
<attributes>
<nvpair id="IPaddr_192_168_2_20_attr_0" name="ip"
value="192.168.2.20"/>
<nvpair id="IPaddr_192_168_2_20_attr_3" name="broadcast"
value="255.255.255.0"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="heartbeat" id="drbddisk_4" provider="heartbeat"
type="drbddisk">
<operations>
<op id="drbddisk_4_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
<instance_attributes id="drbddisk_4_inst_attr">
<attributes>
<nvpair id="drbddisk_4_attr_1" name="1"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" id="LVM_5" provider="heartbeat" type="LVM">
<operations>
<op id="LVM_5_mon" interval="120s" name="start" timeout="60s"/>
</operations>
<instance_attributes id="LVM_5_inst_attr">
<attributes>
<nvpair id="LVM_5_attr_0" name="volgrpname"
value="vg0_drbd"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" id="Filesystem_6" provider="heartbeat"
type="Filesystem">
<operations>
<op id="Filesystem_6_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
<instance_attributes id="Filesystem_6_inst_attr">
<attributes>
<nvpair id="Filesystem_6_attr_0" name="device"
value="/dev/drbd0"/>
<nvpair id="Filesystem_6_attr_1" name="directory"
value="/cluster_metadata"/>
<nvpair id="Filesystem_6_attr_2" name="fstype" value="ext3"/>
<nvpair id="Filesystem_6_attr_3" name="options"
value="defaults,noatime"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="heartbeat" id="MakeMounts_7" provider="heartbeat"
type="MakeMounts">
<operations>
<op id="MakeMounts_7_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
</primitive>
<primitive class="ocf" id="Filesystem_8" provider="heartbeat"
type="Filesystem">
<operations>
<op id="Filesystem_8_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
<instance_attributes id="Filesystem_8_inst_attr">
<attributes>
<nvpair id="Filesystem_8_attr_0" name="device"
value="/dev/vg0_drbd/lvm0"/>
<nvpair id="Filesystem_8_attr_1" name="directory"
value="/mnt/vg0_drbd/lvm0"/>
<nvpair id="Filesystem_8_attr_2" name="fstype" value="ext3"/>
<nvpair id="Filesystem_8_attr_3" name="options"
value="defaults,usrquota,grpquota,acl,user_xattr"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="lsb" id="nfs_9" provider="heartbeat" type="nfs">
<operations>
<op id="nfs_9_mon" interval="120s" name="start" timeout="60s"/>
</operations>
</primitive>
<primitive class="lsb" id="smb_10" provider="heartbeat" type="smb">
<operations>
<op id="smb_10_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
</primitive>
<primitive class="lsb" id="acpid_11" provider="heartbeat"
type="acpid">
<operations>
<op id="acpid_11_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
</primitive>
<primitive class="lsb" id="openfiler_12" provider="heartbeat"
type="openfiler">
<operations>
<op id="openfiler_12_mon" interval="120s" name="start"
timeout="60s"/>
</operations>
</primitive>
</group>
</resources>
<constraints>
<rsc_location id="rsc_location_group_1" rsc="group_1">
<rule id="prefered_location_group_1" score="100">
<expression attribute="#uname"
id="prefered_location_group_1_expr" operation="eq" value="gtt4.linux.com"/>
</rule>
</rsc_location>
</constraints>
</configuration>
</cib>
*5) The system log on the secondary node is as follows.*
crmd[3828]: 2009/08/21_17:20:32 info: process_lrm_event: LRM operation
drbddisk_4_start_0 (call=45, rc=0) complete
tengine[3833]: 2009/08/21_17:20:32 info: match_graph_event: Action
drbddisk_4_start_0 (20) confirmed on gtt5.linux.com (rc=0)
LVM[5307]: 2009/08/21_17:20:32 INFO: Activating volume group vg0_drbd
LVM[5307]: 2009/08/21_17:20:33 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
10 left open File descriptor 11 left open File descriptor 13 left open File
descriptor 14 left open File descriptor 16 left open Device '/dev/drbd1' has
been left open. Reading all physical volumes. This may take a while... Found
volume group "vg0_drbd" using metadata type lvm2
LVM[5307]: 2009/08/21_17:20:33 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
10 left open File descriptor 11 left open File descriptor 13 left open File
descriptor 14 left open File descriptor 16 left open 1 logical volume(s) in
volume group "vg0_drbd" now active
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 4 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 5 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 6 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 7 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 8 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 9 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 10 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 11 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 13 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 14 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 16 left open
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr)
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr)
lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) Using
volume group(s) on command line
Finding volume group "vg0_drbd"
crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
LVM_5_start_0 (call=46, rc=0) complete
tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
LVM_5_start_0 (12) confirmed on gtt5.linux.com (rc=0)
tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
22: LVM_5_start_120000 on gtt5.linux.com
tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
11: Filesystem_6_start_0 on gtt5.linux.com
crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval
crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=LVM_5_start_0 key=22:6:b2620470-2f4f-4191-b958-fb74219903cf)
lrmd[3825]: 2009/08/21_17:20:33 info: rsc:LVM_5: start
crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=Filesystem_6_start_0 key=11:6:b2620470-2f4f-4191-b958-fb74219903cf)
lrmd[3825]: 2009/08/21_17:20:33 info: rsc:Filesystem_6: start
LVM[5382]: 2009/08/21_17:20:33 INFO: Activating volume group vg0_drbd
Filesystem[5383]: 2009/08/21_17:20:33 INFO: Running start for /dev/drbd0 on
/cluster_metadata
crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
Filesystem_6_start_0 (call=48, rc=0) complete
tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
Filesystem_6_start_0 (11) confirmed on gtt5.linux.com (rc=0)
tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
24: Filesystem_6_start_120000 on gtt5.linux.com
tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
10: MakeMounts_7_start_0 on gtt5.linux.com
crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval
crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=Filesystem_6_start_0 key=24:6:b2620470-2f4f-4191-b958-fb74219903cf)
lrmd[3825]: 2009/08/21_17:20:33 info: rsc:Filesystem_6: start
crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=MakeMounts_7_start_0 key=10:6:b2620470-2f4f-4191-b958-fb74219903cf)
lrmd[3825]: 2009/08/21_17:20:33 info: rsc:MakeMounts_7: start
MakeMounts[5451]: 2009/08/21_17:20:33 Openfiler making mount paths...
Filesystem[5448]: 2009/08/21_17:20:33 INFO: Running start for /dev/drbd0 on
/cluster_metadata
Filesystem[5448]: 2009/08/21_17:20:33 INFO: Filesystem /cluster_metadata is
already mounted.
crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
Filesystem_6_start_0 (call=49, rc=0) complete
tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
Filesystem_6_start_0 (24) confirmed on gtt5.linux.com (rc=0)
crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
MakeMounts_7_start_0 (call=50, rc=0) complete
tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
MakeMounts_7_start_0 (10) confirmed on gtt5.linux.com (rc=0)
tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
26: MakeMounts_7_start_120000 on gtt5.linux.com
crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval
crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=MakeMounts_7_start_0 key=26:6:b2620470-2f4f-4191-b958-fb74219903cf)
lrmd[3825]: 2009/08/21_17:20:33 info: rsc:MakeMounts_7: start
LVM[5382]: 2009/08/21_17:20:34 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
16 left open Device '/dev/drbd1' has been left open. Reading all physical
volumes. This may take a while... Found volume group "vg0_drbd" using
metadata type lvm2
MakeMounts[5523]: 2009/08/21_17:20:34 Openfiler making mount paths...
crmd[3828]: 2009/08/21_17:20:34 info: process_lrm_event: LRM operation
MakeMounts_7_start_0 (call=51, rc=0) complete
LVM[5382]: 2009/08/21_17:20:34 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
16 left open 1 logical volume(s) in volume group "vg0_drbd" now active
tengine[3833]: 2009/08/21_17:20:34 info: match_graph_event: Action
MakeMounts_7_start_0 (26) confirmed on gtt5.linux.com (rc=0)
lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr) File
descriptor 4 left open
File descriptor 5 left open
File descriptor 6 left open
File descriptor 7 left open
File descriptor 8 left open
File descriptor 9 left open
File descriptor 16 left open
lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr) Using
volume group(s) on command line
lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr)
Finding volume group "vg0_drbd"
crmd[3828]: 2009/08/21_17:20:34 info: process_lrm_event: LRM operation
LVM_5_start_0 (call=47, rc=0) complete
tengine[3833]: 2009/08/21_17:20:34 info: match_graph_event: Action
LVM_5_start_0 (22) confirmed on gtt5.linux.com (rc=0)
tengine[3833]: 2009/08/21_17:20:34 info: run_graph: Transition 6:
(Complete=24, Pending=0, Fired=0, Skipped=0, Incomplete=0)
tengine[3833]: 2009/08/21_17:20:34 info: notify_crmd: Transition 6 status:
te_complete - <null>
crmd[3828]: 2009/08/21_17:20:34 info: do_state_transition: State transition
S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE
origin=route_message ]
heartbeat[3398]: 2009/08/21_17:22:03 info: Link gtt4.linux.com:eth0 dead.
cib[3824]: 2009/08/21_17:26:28 info: cib_stats: Processed 104 operations
(12403.00us average, 0% utilization) in the last 10min
cib[3824]: 2009/08/21_17:36:28 info: cib_stats: Processed 40 operations
(8250.00us average, 0% utilization) in the last 10min
cib[3824]: 2009/08/21_17:46:28 info: cib_stats: Processed 40 operations
(7750.00us average, 0% utilization) in the last 10min
cib[3824]: 2009/08/21_17:56:28 info: cib_stats: Processed 14 operations
(9285.00us average, 0% utilization) in the last 10min
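The construct_op errors in the log above ("Start and Stop actions cannot have an interval") appear to line up with the op entries in cib.xml that combine name="start" with interval="120s". A minimal Python sketch for listing such entries follows. The function name and the embedded sample are only illustrative; the sample copies two op entries from the cib.xml above rather than reading the live CIB.

```python
import xml.etree.ElementTree as ET


def ops_with_bad_interval(cib_xml):
    """Return (op id, name, interval) for start/stop ops with a non-zero interval."""
    root = ET.fromstring(cib_xml)
    bad = []
    for op in root.iter("op"):
        name = op.get("name")
        interval = op.get("interval", "0").strip()
        # Only recurring actions such as monitor are expected to carry an interval.
        if name in ("start", "stop") and interval not in ("", "0", "0s"):
            bad.append((op.get("id"), name, interval))
    return bad


# Illustrative fragment: two op entries copied from the cib.xml in this post.
SAMPLE = """
<resources>
  <primitive class="ocf" id="LVM_5" provider="heartbeat" type="LVM">
    <operations>
      <op id="LVM_5_mon" interval="120s" name="start" timeout="60s"/>
    </operations>
  </primitive>
  <primitive class="ocf" id="MailTo_2" provider="heartbeat" type="MailTo">
    <operations>
      <op id="MailTo_2_mon" interval="120s" name="monitor" timeout="60s"/>
    </operations>
  </primitive>
</resources>
"""

if __name__ == "__main__":
    for op_id, name, interval in ops_with_bad_interval(SAMPLE):
        print(f"{op_id}: name={name} interval={interval}")
```

This only locates the entries that match the construct_op message. Whether correcting them also resolves the failed Filesystem_8 start is not something the log confirms.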
Please advise what I should do.
Thanks,
Prakash KH