I'm finally able to run my DRBD/HA NFS server on a V1 setup without
serious issue. Failovers work correctly, and the NFS service takes only
a minor interruption when a server is lost. The only thing I'm still
having problems with under V1 is SNMP.
Now, as an exercise in masochism, I'm trying to convert it over to V2 so
that I can use all the nifty new HA V2 functions. (We're also already
using SNMP with a V2 HA setup for some of our other components, so I'm
hoping the move will fix that last issue too.)
My problem:
Using the information from the "DRBD/HowTov2: Linux HA" page, I should
be able to set up the DRBD portion easily. However, my config fails the
"crm_verify" check.
crm_verify -L -V
crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing
failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab: Error
crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability
handling for failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab
crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing
failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab: Error
crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability
handling for failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab
crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource
drbd0:0 cannot run anywhere
crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource
drbd0:1 cannot run anywhere
crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource fs0
cannot run anywhere
Warnings found during check: config may not be valid
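(Side note: those warnings reference the earlier failed start
operations, so I assume the recorded failures will keep the resources
pinned even after I fix the underlying problem. Once the config is
sorted out, my plan is to clear the history with something like the
following, a sketch assuming I have the resource and node names right:

  crm_resource -C -r drbd0:0 -H nfs_server1.prodea.local.lab
  crm_resource -C -r drbd0:1 -H nfs_server1.prodea.local.lab

i.e. one cleanup per failed clone instance on the node that failed.)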
My nodes seem to be named correctly (when viewed through uname -a)
[EMAIL PROTECTED] ha.d]# uname -a
Linux nfs_server1.prodea.local.lab 2.6.9-55.ELsmp #1 SMP
Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
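As an extra sanity check (my own sketch), my understanding is that drbd
matches host sections against the node name from "uname -n", so that
exact string has to appear in an "on <name> {" line in drbd.conf:

  uname -n
  grep -n 'on .* {' /etc/drbd.conf

As far as I can tell, those do match on my systems.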
Why would the DRBD resource not be able to run anywhere? I followed
the instructions from the setup page pretty much to the letter, the
only change being that the DRBD resource on my system is named "r0"
instead of "drbd0".
Here are a snippet from the /var/log/messages file, my CIB file, and
the important parts of my drbd.conf.
/var/log/messages
Feb 4 15:46:11 nfs_server1 crmd: [4573]: info: do_lrm_rsc_op:
Performing op=drbd0:0_start_0
key=5:1:bffa1a55-8ea4-4c1c-91bc-599bf9e6d49e)
Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: rsc:drbd0:0: start
Feb 4 15:46:11 nfs_server1 drbd[4850]: INFO: r0: Using hostname node_0
Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: RA output:
(drbd0:0:start:stdout) /etc/drbd.conf:395: in resource r0, on
nfs_server1.prodea.local.lab { ... } ... on nfs_server2.prodea.local.lab
{ ... }: There are multiple host sections for the peer. Maybe misspelled
local host name 'node_0'? /etc/drbd.conf:395: in resource r0, there is
no host section for this host. Missing 'on node_0 {...}' ?
Feb 4 15:46:11 nfs_server1 drbd[4850]: ERROR: r0 start: not in
Secondary mode after start.
Feb 4 15:46:11 nfs_server1 crmd: [4573]: ERROR: process_lrm_event: LRM
operation drbd0:0_start_0 (call=7, rc=1) Error unknown error
Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: status_from_rc:
Action start on nfs_server1.prodea.local.lab failed (target: <null> vs.
rc: 1): Error
Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: update_failcount:
Updating failcount for drbd0:0 on 1d040f02-a506-4c46-b661-319c5e024e10
after failed start: rc=1
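Looking at that log, the RA appears to be handing drbd the name
"node_0" instead of my real hostname. As far as I can tell, that comes
from the "clone_overrides_hostname" attribute in my CIB below, which
seems to make each clone instance use a generated node_<N> name when
talking to drbd. If that's the culprit, one fix would be to drop the
attribute, along these lines (a sketch, using the nvpair id from my
CIB):

  cibadmin -D -X '<nvpair id="ia-drbd0-2" name="clone_overrides_hostname" value="yes"/>'

The other option would be renaming the drbd.conf host sections to
match; see my note after the drbd.conf snippet below.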
cib.xml
<cib generated="false" admin_epoch="0" have_quorum="true"
ignore_dtd="false" num_peers="0" cib_feature_revision="2.0" epoch="14"
num_updates="1" cib-last-written="Mon Feb 4 15:45:54 2008"
ccm_transition="1">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<attributes>
<nvpair id="cib-bootstrap-options-dc-version"
name="dc-version" value="2.1.3-node:
552305612591183b1628baa5bc6e903e0f1e26a3"/>
<nvpair id="cib-bootstrap-options-last-lrm-refresh"
name="last-lrm-refresh" value="1202136349"/>
</attributes>
</cluster_property_set>
</crm_config>
<nodes>
<node id="20f292a2-876b-4b71-a3c1-5802d4af9b2d"
uname="nfs_server2.prodea.local.lab" type="normal">
<instance_attributes
id="nodes-20f292a2-876b-4b71-a3c1-5802d4af9b2d">
<attributes>
<nvpair id="standby-20f292a2-876b-4b71-a3c1-5802d4af9b2d"
name="standby" value="off"/>
</attributes>
</instance_attributes>
</node>
<node id="1d040f02-a506-4c46-b661-319c5e024e10"
uname="nfs_server1.prodea.local.lab" type="normal"/>
</nodes>
<resources>
<master_slave id="ms-drbd0">
<meta_attributes id="ma-ms-drbd0">
<attributes>
<nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
<nvpair id="ma-ms-drbd0-2" name="clone_node_max"
value="1"/>
<nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
<nvpair id="ma-ms-drbd0-4" name="master_node_max"
value="1"/>
<nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
<nvpair id="ma-ms-drbd0-6" name="globally_unique"
value="false"/>
<nvpair id="ma-ms-drbd0-7" name="target_role"
value="stopped"/>
</attributes>
</meta_attributes>
<primitive class="ocf" provider="heartbeat" type="drbd"
id="drbd0">
<instance_attributes id="ia-drbd0">
<attributes>
<nvpair name="drbd_resource" id="ia-drbd0-1" value="r0"/>
<nvpair id="ia-drbd0-2" name="clone_overrides_hostname"
value="yes"/>
<nvpair id="drbd0:0_target_role" name="target_role"
value="started"/>
</attributes>
</instance_attributes>
</primitive>
</master_slave>
<primitive class="ocf" provider="heartbeat" type="Filesystem"
id="fs0">
<meta_attributes id="ma-fs0">
<attributes>
<nvpair name="target_role" id="ma-fs0-1" value="stopped"/>
</attributes>
</meta_attributes>
<instance_attributes id="ia-fs0">
<attributes>
<nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
<nvpair id="ia-fs0-2" name="directory"
value="/mnt/share1"/>
<nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
</attributes>
</instance_attributes>
</primitive>
<primitive class="ocf" provider="heartbeat" type="IPaddr"
id="ip0">
<instance_attributes id="ia-ip0">
<attributes>
<nvpair id="ia-ip0-1" name="ip" value="172.24.1.167"/>
</attributes>
</instance_attributes>
</primitive>
</resources>
<constraints>
<rsc_location id="location-ip0" rsc="ip0">
<rule id="ip0-rule-1" score="-INFINITY">
<expression id="exp-ip0-1" value="a" attribute="site"
operation="eq"/>
</rule>
</rsc_location>
<rsc_order id="order_drbd0_ip0" to="ip0" from="ms-drbd0"/>
<rsc_order id="drbd0_before_fs0" from="fs0" action="start"
to="ms-drbd0" to_action="promote"/>
<rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master"
from="fs0" score="infinity"/>
<rsc_colocation id="colo_drbd0_ip0" to="ip0" from="drbd0:0"
score="infinity"/>
</constraints>
</configuration>
</cib>
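One other thing I notice while pasting this: the ms-drbd0 meta
attributes set target_role="stopped" while the drbd0 primitive's
instance attributes set target_role="started". I assume the conflicting
values don't help. Once the hostname problem is fixed, I intend to set
the master resource to started, maybe along these lines (a sketch; I
believe cibadmin can update a nested nvpair by its id, but I haven't
verified the exact invocation):

  cibadmin -M -X '<nvpair id="ma-ms-drbd0-7" name="target_role" value="started"/>'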
drbd.conf
resource r0 {
protocol C;
. . .
on nfs_server1.prodea.local.lab {
device /dev/drbd0;
disk /dev/sdc1;
address 172.24.1.160:7788;
meta-disk /dev/sdb1[0];
}
on nfs_server2.prodea.local.lab {
device /dev/drbd0;
disk /dev/sdc1;
address 172.24.1.159:7788;
meta-disk /dev/sdb1[0];
}
}
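For completeness: if I'm reading the RA's error correctly, keeping
clone_overrides_hostname="yes" would instead require drbd.conf host
sections named after the generated node names, something like this
untested sketch:

  on node_0 {
    device /dev/drbd0;
    disk /dev/sdc1;
    address 172.24.1.160:7788;
    meta-disk /dev/sdb1[0];
  }
  on node_1 {
    device /dev/drbd0;
    disk /dev/sdc1;
    address 172.24.1.159:7788;
    meta-disk /dev/sdb1[0];
  }

That seems fragile to me, though, so I'd rather drop the attribute and
keep the real hostnames.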
Michael Toler
System Test Engineer
Prodea Systems, Inc.
214-278-1834 (office)
972-816-7790 (mobile)