Hello all,

I have DRBD 8.2.7 and Heartbeat 2.1.3 configured in CRM mode on an
Openfiler 2.3 system. Heartbeat runs fine on the primary server, but when
I test failover the resource group cannot start on the other node: the CRM
resource group begins to stop, and everything only works when I mount the
resources manually. I have given the system information below.


 *1) crm_mon command output:*

*[r...@gtt5 ~]# crm_mon*

============

Last updated: Fri Aug 21 17:26:02 2009

Current DC: gtt5.linux.com (7d892d6c-d277-45c2-beb6-331fca5b3920)

2 Nodes configured.

1 Resources configured.

============

Node: gtt5.linux.com (7d892d6c-d277-45c2-beb6-331fca5b3920): online

Node: gtt4.linux.com (87dc2dcc-791b-4bfb-a971-b30fbd909255): online


 Resource Group: group_1

open-iscsi_1 (lsb:open-iscsi): Started gtt5.linux.com

MailTo_2 (heartbeat::ocf:MailTo): Started gtt5.linux.com

IPaddr_192_168_2_20 (heartbeat::ocf:IPaddr): Started gtt5.linux.com

drbddisk_4 (heartbeat:drbddisk): Started gtt5.linux.com

LVM_5 (heartbeat::ocf:LVM): Started gtt5.linux.com

Filesystem_6 (heartbeat::ocf:Filesystem): Started gtt5.linux.com

MakeMounts_7 (heartbeat:MakeMounts): Started gtt5.linux.com

*Filesystem_8 (heartbeat::ocf:Filesystem): Stopped*

*nfs_9 (lsb:nfs): Stopped*

*smb_10 (lsb:smb): Stopped*

*acpid_11 (lsb:acpid): Stopped*

*openfiler_12 (lsb:openfiler): Stopped*


 *Failed actions:*

*Filesystem_8_start_0 (node=gtt5.linux.com, call=32, rc=1): Error*

*IPaddr_192_168_2_20_start_0 (node=gtt4.linux.com, call=56, rc=1): Error*


*2) This is the DRBD status I have created:*

*[r...@gtt5 ~]# service drbd status*

drbd driver loaded OK; device status:

version: 8.2.7 (api:88/proto:86-88)

GIT-hash: 61b7f4c2fc34fe3d2acf7be6bcc1fc2684708a7d build by p...@fat-tyre,
2008-11-12 16:47:11

m:res cs st ds p mounted fstype

0:cluster_metadata Connected Primary/Secondary UpToDate/UpToDate C
/cluster_metadata ext3

1:vg0_drbd Connected Primary/Secondary Diskless/UpToDate C

[r...@gtt5 ~]#
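Note that resource 1 (vg0_drbd) reports a disk state of Diskless/UpToDate, i.e. the local backing device on gtt5 seems to be detached, while resource 0 is fully UpToDate on both sides. As a quick sanity check I scan the status table for any resource whose local disk state is not UpToDate (a minimal sketch; it assumes the status output above has been saved verbatim to a file):

```shell
# Reproduce the resource lines of 'service drbd status' shown above.
cat > /tmp/drbd-status.txt <<'EOF'
0:cluster_metadata Connected Primary/Secondary UpToDate/UpToDate C /cluster_metadata ext3
1:vg0_drbd Connected Primary/Secondary Diskless/UpToDate C
EOF

# Column 4 is the ds field (local/peer disk state). Flag any resource
# whose local side is not UpToDate.
awk '$4 ~ /\// && $4 !~ /^UpToDate\// { print $1, "local disk:", $4 }' /tmp/drbd-status.txt
# prints: 1:vg0_drbd local disk: Diskless/UpToDate
```

This is how I noticed that only drbd1 (the device backing the LVM volume group) is degraded.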



   *3) If I look at the mount list, there is no LVM mount point; only
   /cluster_metadata is mounted.*

*[r...@gtt5 ~]# mount*

/dev/sda5 on / type ext3 (rw)

/proc on /proc type proc (rw)

/sys on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

/dev/sda7 on /boot type ext3 (rw)

tmpfs on /dev/shm type tmpfs (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

*/dev/drbd0 on /cluster_metadata type ext3 (rw,noatime)*

[r...@gtt5 ~]#


*4) This is my cib.xml configuration file:*

 <cib generated="true" admin_epoch="0" epoch="1" num_updates="1"
have_quorum="true" ignore_dtd="false" num_peers="2" ccm_transition="2"
cib_feature_revision="2.0" dc_uuid="7d892d6c-d277-45c2-beb6-331fca5b3920"
cib-last-written="Sat Aug 22 11:56:53 2009">
   <configuration>
     <crm_config>
       <cluster_property_set id="cib-bootstrap-options">
         <attributes>
           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="2.1.3-node: 4a3eac571f442c7cfcefc18fcaad35314460c1f6"/>
           <nvpair id="cib-bootstrap-options-symmetric-cluster"
name="symmetric-cluster" value="true"/>
           <nvpair id="cib-bootstrap-options-no-quorum-policy"
name="no-quorum-policy" value="stop"/>
           <nvpair id="cib-bootstrap-options-default-resource-stickiness"
name="default-resource-stickiness" value="0"/>
           <nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>
           <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="false"/>
           <nvpair id="cib-bootstrap-options-stonith-action"
name="stonith-action" value="reboot"/>
           <nvpair id="cib-bootstrap-options-startup-fencing"
name="startup-fencing" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-resources"
name="stop-orphan-resources" value="true"/>
           <nvpair id="cib-bootstrap-options-stop-orphan-actions"
name="stop-orphan-actions" value="true"/>
           <nvpair id="cib-bootstrap-options-remove-after-stop"
name="remove-after-stop" value="false"/>
           <nvpair id="cib-bootstrap-options-short-resource-names"
name="short-resource-names" value="true"/>
           <nvpair id="cib-bootstrap-options-transition-idle-timeout"
name="transition-idle-timeout" value="5min"/>
           <nvpair id="cib-bootstrap-options-default-action-timeout"
name="default-action-timeout" value="20s"/>
           <nvpair id="cib-bootstrap-options-is-managed-default"
name="is-managed-default" value="true"/>
           <nvpair id="cib-bootstrap-options-cluster-delay"
name="cluster-delay" value="60s"/>
           <nvpair id="cib-bootstrap-options-pe-error-series-max"
name="pe-error-series-max" value="-1"/>
           <nvpair id="cib-bootstrap-options-pe-warn-series-max"
name="pe-warn-series-max" value="-1"/>
           <nvpair id="cib-bootstrap-options-pe-input-series-max"
name="pe-input-series-max" value="-1"/>
         </attributes>
       </cluster_property_set>
     </crm_config>
     <nodes>
       <node id="7d892d6c-d277-45c2-beb6-331fca5b3920" uname="gtt5.linux.com"
type="normal"/>
       <node id="87dc2dcc-791b-4bfb-a971-b30fbd909255" uname="gtt4.linux.com"
type="normal"/>
     </nodes>
     <resources>
       <group id="group_1">
         <primitive class="lsb" id="open-iscsi_1" provider="heartbeat"
type="open-iscsi">
           <operations>
  <op id="open-iscsi_1_mon" interval="120s" name="monitor" timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="ocf" id="MailTo_2" provider="heartbeat"
type="MailTo">
           <operations>
             <op id="MailTo_2_mon" interval="120s" name="monitor"
timeout="60s"/>
           </operations>
           <instance_attributes id="MailTo_2_inst_attr">
             <attributes>
               <nvpair id="MailTo_2_attr_0" name="email"
value="r...@localhost"/>
               <nvpair id="MailTo_2_attr_1" name="subject"
value="ClusterFailover"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="ocf" id="IPaddr_192_168_2_20"
provider="heartbeat" type="IPaddr">
           <operations>
             <op id="IPaddr_192_168_2_20_mon" interval="5s" name="monitor"
timeout="5s"/>
           </operations>
           <instance_attributes id="IPaddr_192_168_2_20_inst_attr">
             <attributes>
               <nvpair id="IPaddr_192_168_2_20_attr_0" name="ip"
value="192.168.2.20"/>
               <nvpair id="IPaddr_192_168_2_20_attr_3" name="broadcast"
value="255.255.255.0"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="heartbeat" id="drbddisk_4" provider="heartbeat"
type="drbddisk">
           <operations>
             <op id="drbddisk_4_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
           <instance_attributes id="drbddisk_4_inst_attr">
             <attributes>
               <nvpair id="drbddisk_4_attr_1" name="1"/>
             </attributes>
           </instance_attributes>
 </primitive>
         <primitive class="ocf" id="LVM_5" provider="heartbeat" type="LVM">
           <operations>
             <op id="LVM_5_mon" interval="120s" name="start" timeout="60s"/>
           </operations>
           <instance_attributes id="LVM_5_inst_attr">
             <attributes>
               <nvpair id="LVM_5_attr_0" name="volgrpname"
value="vg0_drbd"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="ocf" id="Filesystem_6" provider="heartbeat"
type="Filesystem">
           <operations>
             <op id="Filesystem_6_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
           <instance_attributes id="Filesystem_6_inst_attr">
             <attributes>
               <nvpair id="Filesystem_6_attr_0" name="device"
value="/dev/drbd0"/>
               <nvpair id="Filesystem_6_attr_1" name="directory"
value="/cluster_metadata"/>
               <nvpair id="Filesystem_6_attr_2" name="fstype" value="ext3"/>
               <nvpair id="Filesystem_6_attr_3" name="options"
value="defaults,noatime"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="heartbeat" id="MakeMounts_7" provider="heartbeat"
type="MakeMounts">
           <operations>
             <op id="MakeMounts_7_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="ocf" id="Filesystem_8" provider="heartbeat"
type="Filesystem">
           <operations>
             <op id="Filesystem_8_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
           <instance_attributes id="Filesystem_8_inst_attr">
             <attributes>
               <nvpair id="Filesystem_8_attr_0" name="device"
value="/dev/vg0_drbd/lvm0"/>
               <nvpair id="Filesystem_8_attr_1" name="directory"
value="/mnt/vg0_drbd/lvm0"/>
 <nvpair id="Filesystem_8_attr_2" name="fstype" value="ext3"/>
               <nvpair id="Filesystem_8_attr_3" name="options"
value="defaults,usrquota,grpquota,acl,user_xattr"/>
             </attributes>
           </instance_attributes>
         </primitive>
         <primitive class="lsb" id="nfs_9" provider="heartbeat" type="nfs">
           <operations>
             <op id="nfs_9_mon" interval="120s" name="start" timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="lsb" id="smb_10" provider="heartbeat" type="smb">
           <operations>
             <op id="smb_10_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="lsb" id="acpid_11" provider="heartbeat"
type="acpid">
           <operations>
             <op id="acpid_11_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
         </primitive>
         <primitive class="lsb" id="openfiler_12" provider="heartbeat"
type="openfiler">
           <operations>
             <op id="openfiler_12_mon" interval="120s" name="start"
timeout="60s"/>
           </operations>
         </primitive>
       </group>
     </resources>
     <constraints>
       <rsc_location id="rsc_location_group_1" rsc="group_1">
         <rule id="prefered_location_group_1" score="100">
           <expression attribute="#uname"
id="prefered_location_group_1_expr" operation="eq" value="gtt4.linux.com"/>
         </rule>
       </rsc_location>
     </constraints>
   </configuration>
 </cib>
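One thing I notice is that the system log (section 5) repeatedly shows "construct_op: Start and Stop actions cannot have an interval", and in the configuration above the `<op>` entries such as drbddisk_4_mon, LVM_5_mon and Filesystem_6_mon are defined with name="start" together with interval="120s". If these were meant to be recurring health checks, I assume they should use name="monitor" instead, for example:

```xml
<!-- Sketch only: my assumption is that this op was meant as a recurring
     monitor, not a "start" action carrying an interval. -->
<op id="LVM_5_mon" interval="120s" name="monitor" timeout="60s"/>
```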




*5) The system log from the secondary node is as follows:*

crmd[3828]: 2009/08/21_17:20:32 info: process_lrm_event: LRM operation
drbddisk_4_start_0 (call=45, rc=0) complete

tengine[3833]: 2009/08/21_17:20:32 info: match_graph_event: Action
drbddisk_4_start_0 (20) confirmed on gtt5.linux.com (rc=0)

LVM[5307]: 2009/08/21_17:20:32 INFO: Activating volume group vg0_drbd

LVM[5307]: 2009/08/21_17:20:33 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
10 left open File descriptor 11 left open File descriptor 13 left open File
descriptor 14 left open File descriptor 16 left open Device '/dev/drbd1' has
been left open. Reading all physical volumes. This may take a while... Found
volume group "vg0_drbd" using metadata type lvm2

LVM[5307]: 2009/08/21_17:20:33 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
10 left open File descriptor 11 left open File descriptor 13 left open File
descriptor 14 left open File descriptor 16 left open 1 logical volume(s) in
volume group "vg0_drbd" now active

lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 4 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 5 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 6 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 7 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 8 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 9 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 10 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 11 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 13 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 14 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) File
descriptor 16 left open


 lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr)

lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr)

lrmd[3825]: 2009/08/21_17:20:33 info: RA output: (LVM_5:start:stderr) Using
volume group(s) on command line

Finding volume group "vg0_drbd"

crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
LVM_5_start_0 (call=46, rc=0) complete

tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
LVM_5_start_0 (12) confirmed on gtt5.linux.com (rc=0)

tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
22: LVM_5_start_120000 on gtt5.linux.com

tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
11: Filesystem_6_start_0 on gtt5.linux.com

crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval

crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=LVM_5_start_0 key=22:6:b2620470-2f4f-4191-b958-fb74219903cf)

lrmd[3825]: 2009/08/21_17:20:33 info: rsc:LVM_5: start

crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=Filesystem_6_start_0 key=11:6:b2620470-2f4f-4191-b958-fb74219903cf)

lrmd[3825]: 2009/08/21_17:20:33 info: rsc:Filesystem_6: start

LVM[5382]: 2009/08/21_17:20:33 INFO: Activating volume group vg0_drbd

Filesystem[5383]: 2009/08/21_17:20:33 INFO: Running start for /dev/drbd0 on
/cluster_metadata

crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
Filesystem_6_start_0 (call=48, rc=0) complete

tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
Filesystem_6_start_0 (11) confirmed on gtt5.linux.com (rc=0)

tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
24: Filesystem_6_start_120000 on gtt5.linux.com

tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
10: MakeMounts_7_start_0 on gtt5.linux.com

crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval

crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=Filesystem_6_start_0 key=24:6:b2620470-2f4f-4191-b958-fb74219903cf)

lrmd[3825]: 2009/08/21_17:20:33 info: rsc:Filesystem_6: start

crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=MakeMounts_7_start_0 key=10:6:b2620470-2f4f-4191-b958-fb74219903cf)

lrmd[3825]: 2009/08/21_17:20:33 info: rsc:MakeMounts_7: start

MakeMounts[5451]: 2009/08/21_17:20:33 Openfiler making mount paths...

Filesystem[5448]: 2009/08/21_17:20:33 INFO: Running start for /dev/drbd0 on
/cluster_metadata

Filesystem[5448]: 2009/08/21_17:20:33 INFO: Filesystem /cluster_metadata is
already mounted.

crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
Filesystem_6_start_0 (call=49, rc=0) complete

tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
Filesystem_6_start_0 (24) confirmed on gtt5.linux.com (rc=0)

crmd[3828]: 2009/08/21_17:20:33 info: process_lrm_event: LRM operation
MakeMounts_7_start_0 (call=50, rc=0) complete

tengine[3833]: 2009/08/21_17:20:33 info: match_graph_event: Action
MakeMounts_7_start_0 (10) confirmed on gtt5.linux.com (rc=0)

tengine[3833]: 2009/08/21_17:20:33 info: send_rsc_command: Initiating action
26: MakeMounts_7_start_120000 on gtt5.linux.com

crmd[3828]: 2009/08/21_17:20:33 ERROR: construct_op: Start and Stop actions
cannot have an interval

crmd[3828]: 2009/08/21_17:20:33 info: do_lrm_rsc_op: Performing
op=MakeMounts_7_start_0 key=26:6:b2620470-2f4f-4191-b958-fb74219903cf)

lrmd[3825]: 2009/08/21_17:20:33 info: rsc:MakeMounts_7: start

LVM[5382]: 2009/08/21_17:20:34 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
16 left open Device '/dev/drbd1' has been left open. Reading all physical
volumes. This may take a while... Found volume group "vg0_drbd" using
metadata type lvm2

MakeMounts[5523]: 2009/08/21_17:20:34 Openfiler making mount paths...

crmd[3828]: 2009/08/21_17:20:34 info: process_lrm_event: LRM operation
MakeMounts_7_start_0 (call=51, rc=0) complete

LVM[5382]: 2009/08/21_17:20:34 INFO: File descriptor 4 left open File
descriptor 5 left open File descriptor 6 left open File descriptor 7 left
open File descriptor 8 left open File descriptor 9 left open File descriptor
16 left open 1 logical volume(s) in volume group "vg0_drbd" now active

tengine[3833]: 2009/08/21_17:20:34 info: match_graph_event: Action
MakeMounts_7_start_0 (26) confirmed on gtt5.linux.com (rc=0)

lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr) File
descriptor 4 left open

File descriptor 5 left open

File descriptor 6 left open

File descriptor 7 left open

File descriptor 8 left open

File descriptor 9 left open

File descriptor 16 left open


 lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr) Using
volume group(s) on command line

lrmd[3825]: 2009/08/21_17:20:34 info: RA output: (LVM_5:start:stderr)

Finding volume group "vg0_drbd"


 crmd[3828]: 2009/08/21_17:20:34 info: process_lrm_event: LRM operation
LVM_5_start_0 (call=47, rc=0) complete

tengine[3833]: 2009/08/21_17:20:34 info: match_graph_event: Action
LVM_5_start_0 (22) confirmed on gtt5.linux.com (rc=0)

tengine[3833]: 2009/08/21_17:20:34 info: run_graph: Transition 6:
(Complete=24, Pending=0, Fired=0, Skipped=0, Incomplete=0)

tengine[3833]: 2009/08/21_17:20:34 info: notify_crmd: Transition 6 status:
te_complete - <null>

crmd[3828]: 2009/08/21_17:20:34 info: do_state_transition: State transition
S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE
origin=route_message ]

heartbeat[3398]: 2009/08/21_17:22:03 info: Link gtt4.linux.com:eth0 dead.

cib[3824]: 2009/08/21_17:26:28 info: cib_stats: Processed 104 operations
(12403.00us average, 0% utilization) in the last 10min

cib[3824]: 2009/08/21_17:36:28 info: cib_stats: Processed 40 operations
(8250.00us average, 0% utilization) in the last 10min

cib[3824]: 2009/08/21_17:46:28 info: cib_stats: Processed 40 operations
(7750.00us average, 0% utilization) in the last 10min

cib[3824]: 2009/08/21_17:56:28 info: cib_stats: Processed 14 operations
(9285.00us average, 0% utilization) in the last 10min


Please help me understand what I should do.


Thanks,

Prakash KH
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
