On Monday 04 February 2008, Mike Toler wrote:
> I'm finally able to run my DRBD/HA NFS server on a V1 setup without
> serious issue. My failovers work correctly and NFS service takes only a
> minor interruption when a server is lost. The only thing I'm still
> having problems using V1 with is SNMP.
>
> Now, as an exercise in masochism, I'm trying to convert it over to V2 so
> that I can use all the nifty new HA V2 functions. (We also already are
> using SNMP with a V2 HA setup for some of our other components, so I'm
> hoping this will also fix my last issue there.)
>
> My problem:
>
> Using the information from the "DRBD/HowTov2: Linux HA" page I should be
> able to easily set up the DRBD portion. However, my config fails to pass
> the "crm_verify" command.
>
> crm_verify -L -V
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing
> failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability
> handling for failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing
> failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability
> handling for failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource
> drbd0:0 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource
> drbd0:1 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource fs0
> cannot run anywhere
> Warnings found during check: config may not be valid
>
> My nodes seem to be named correctly (when viewed through uname -a):
>
> [EMAIL PROTECTED] ha.d]# uname -a
> Linux nfs_server1.prodea.local.lab 2.6.9-55.ELsmp #1 SMP
> Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
>
> Why would the DRBD resource not be able to run anywhere? I followed
> the instructions from the setup page pretty much to the letter, with the
> only change being that the DRBD resource name on my system is "r0"
> instead of "drbd0".
As I recall, I worked around this a week or two ago by adding a location
constraint for each group of resources; the drbd resource sits in a group
along with a floating IP and a few other things. The constraint snippet is
below.
<constraints>
  <rsc_location id="location_iscsi" rsc="group_iscsi">
    <rule id="prefered_location_iscsi" score="100">
      <expression attribute="#uname"
          id="33825ff5-9614-462f-bc25-d371d863a155" operation="eq" value="host1"/>
    </rule>
  </rsc_location>
  <rsc_location id="location_mysql" rsc="group_mysql">
    <rule id="prefered_location_mysql" score="100">
      <expression attribute="#uname"
          id="1be8da01-4dc8-495c-b6bd-ecf013534a72" operation="eq" value="host2"/>
    </rule>
  </rsc_location>
</constraints>
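If it helps, I load snippets like that with cibadmin rather than editing
cib.xml by hand (this is from memory, so double-check the flags against your
cibadmin version):

```shell
# Save the <constraints> snippet to a file, e.g. constraints.xml, then
# create those objects in the constraints section of the live CIB:
cibadmin -C -o constraints -x constraints.xml

# Re-check the resulting configuration afterwards:
crm_verify -L -V
```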
> Here are my CIB file and the important parts of the drbd.conf file,
> along with a snippet from the /var/log/messages file.
> /var/log/messages
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: info: do_lrm_rsc_op:
> Performing op=drbd0:0_start_0
> key=5:1:bffa1a55-8ea4-4c1c-91bc-599bf9e6d49e)
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: rsc:drbd0:0: start
> Feb 4 15:46:11 nfs_server1 drbd[4850]: INFO: r0: Using hostname node_0
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: RA output:
> (drbd0:0:start:stdout) /etc/drbd.conf:395: in resource r0, on
> nfs_server1.prodea.local.lab { ... } ... on nfs_server2.prodea.local.lab
> { ... }: There are multiple host sections for the peer. Maybe misspelled
> local host name 'node_0'? /etc/drbd.conf:395: in resource r0, there is
> no host section for this host. Missing 'on node_0 {...}' ?
> Feb 4 15:46:11 nfs_server1 drbd[4850]: ERROR: r0 start: not in
> Secondary mode after start.
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: ERROR: process_lrm_event: LRM
> operation drbd0:0_start_0 (call=7, rc=1) Error unknown error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: status_from_rc:
> Action start on nfs_server1.prodea.local.lab failed (target: <null> vs.
> rc: 1): Error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: update_failcount:
> Updating failcount for drbd0:0 on 1d040f02-a506-4c46-b661-319c5e024e10
> after failed start: rc=1
>
>
> cib.xml
> <cib generated="false" admin_epoch="0" have_quorum="true"
> ignore_dtd="false" num_peers="0" cib_feature_revision="2.0" epoch="14"
> num_updates="1" cib-last-written="Mon Feb 4 15:45:54 2008"
> ccm_transition="1">
> <configuration>
> <crm_config>
> <cluster_property_set id="cib-bootstrap-options">
> <attributes>
> <nvpair id="cib-bootstrap-options-dc-version"
> name="dc-version" value="2.1.3-node:
> 552305612591183b1628baa5bc6e903e0f1e26a3"/>
> <nvpair id="cib-bootstrap-options-last-lrm-refresh"
> name="last-lrm-refresh" value="1202136349"/>
> </attributes>
> </cluster_property_set>
> </crm_config>
> <nodes>
> <node id="20f292a2-876b-4b71-a3c1-5802d4af9b2d"
> uname="nfs_server2.prodea.local.lab" type="normal">
> <instance_attributes
> id="nodes-20f292a2-876b-4b71-a3c1-5802d4af9b2d">
> <attributes>
> <nvpair id="standby-20f292a2-876b-4b71-a3c1-5802d4af9b2d"
> name="standby" value="off"/>
> </attributes>
> </instance_attributes>
> </node>
> <node id="1d040f02-a506-4c46-b661-319c5e024e10"
> uname="nfs_server1.prodea.local.lab" type="normal"/>
> </nodes>
> <resources>
> <master_slave id="ms-drbd0">
> <meta_attributes id="ma-ms-drbd0">
> <attributes>
> <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
> <nvpair id="ma-ms-drbd0-2" name="clone_node_max"
> value="1"/>
> <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
> <nvpair id="ma-ms-drbd0-4" name="master_node_max"
> value="1"/>
> <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
> <nvpair id="ma-ms-drbd0-6" name="globally_unique"
> value="false"/>
> <nvpair id="ma-ms-drbd0-7" name="target_role"
> value="stopped"/>
> </attributes>
> </meta_attributes>
> <primitive class="ocf" provider="heartbeat" type="drbd"
> id="drbd0">
> <instance_attributes id="ia-drbd0">
> <attributes>
> <nvpair name="drbd_resource" id="ia-drbd0-1" value="r0"/>
> <nvpair id="ia-drbd0-2" name="clone_overrides_hostname"
> value="yes"/>
> <nvpair id="drbd0:0_target_role" name="target_role"
> value="started"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </master_slave>
> <primitive class="ocf" provider="heartbeat" type="Filesystem"
> id="fs0">
> <meta_attributes id="ma-fs0">
> <attributes>
> <nvpair name="target_role" id="ma-fs0-1" value="stopped"/>
> </attributes>
> </meta_attributes>
> <instance_attributes id="ia-fs0">
> <attributes>
> <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
> <nvpair id="ia-fs0-2" name="directory"
> value="/mnt/share1"/>
> <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
> </attributes>
> </instance_attributes>
> </primitive>
> <primitive class="ocf" provider="heartbeat" type="IPaddr"
> id="ip0">
> <instance_attributes id="ia-ip0">
> <attributes>
> <nvpair id="ia-ip0-1" name="ip" value="172.24.1.167"/>
> </attributes>
> </instance_attributes>
> </primitive>
> </resources>
> <constraints>
> <rsc_location id="location-ip0" rsc="ip0">
> <rule id="ip0-rule-1" score="-INFINITY">
> <expression id="exp-ip0-1" value="a" attribute="site"
> operation="eq"/>
> </rule>
> </rsc_location>
> <rsc_order id="order_drbd0_ip0" to="ip0" from="ms-drbd0"/>
> <rsc_order id="drbd0_before_fs0" from="fs0" action="start"
> to="ms-drbd0" to_action="promote"/>
> <rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master"
> from="fs0" score="infinity"/>
> <rsc_colocation id="colo_drbd0_ip0" to="ip0" from="drbd0:0"
> score="infinity"/>
> </constraints>
> </configuration>
> </cib>
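One guess about your constraints section: since globally_unique is set to
false, I don't think the clone instance name drbd0:0 can be referenced from a
colocation. Tying the IP to the master role of ms-drbd0 instead might be
closer to what you want; untested sketch (the id is made up by me):

```xml
<rsc_colocation id="colo_ip0_on_master" from="ip0" to="ms-drbd0"
    to_role="master" score="infinity"/>
```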
>
> drbd.conf
>
> resource r0 {
> protocol C;
>
> . . .
> on nfs_server1.prodea.local.lab {
> device /dev/drbd0;
> disk /dev/sdc1;
> address 172.24.1.160:7788;
> meta-disk /dev/sdb1[0];
>
> }
> on nfs_server2.prodea.local.lab {
> device /dev/drbd0;
> disk /dev/sdc1;
> address 172.24.1.159:7788;
> meta-disk /dev/sdb1[0];
>
> }
> }
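Also, the log you posted looks like the real clue: with
clone_overrides_hostname set to "yes", the drbd RA hands drbdadm the
clone-derived name (node_0 for instance :0) instead of the real hostname, and
your drbd.conf only has "on" sections for the real hostnames. I'd either drop
that attribute or rename the host sections to match. Untested sketch of the
latter (node_1 for the second instance is my assumption; the log only shows
node_0):

```
resource r0 {
  protocol C;
  # . . .
  on node_0 {   # was: on nfs_server1.prodea.local.lab
    device    /dev/drbd0;
    disk      /dev/sdc1;
    address   172.24.1.160:7788;
    meta-disk /dev/sdb1[0];
  }
  on node_1 {   # was: on nfs_server2.prodea.local.lab
    device    /dev/drbd0;
    disk      /dev/sdc1;
    address   172.24.1.159:7788;
    meta-disk /dev/sdb1[0];
  }
}
```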
>
>
> Michael Toler
> System Test Engineer
> Prodea Systems, Inc.
> 214-278-1834 (office)
> 972-816-7790 (mobile)
-- Michael
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
