Hi,

On Fri, Oct 05, 2007 at 12:51:36PM -0500, Alejandro Rios Peña wrote:
> Hello.
>
> DRBD partition can't be properly configured by heartbeat on master node,
> causing filesystem mounting to fail.
>
> How can I tell heartbeat to do a ''drbdadm primary all'' or
> something?

You can't and you shouldn't need to.

> Doing it by hand and mounting the filesystem works without
> problem.
>
> Maybe I configured something wrong, could somebody help me pointing it out?
>
> I'm using:
> heartbeat 2.0.7-2

You could upgrade to 2.1.2.

> drbd 0.7.21
> drbdlinks 1.07
>
> ---------------------------------------------------------------
> "crm_mon -n -1 -r " shows:
>
> ============
> Last updated: Fri Oct 5 12:26:57 2007
> Current DC: hal-9000 (9f275006-1303-4c16-b5c6-3f52c4ed0865)
> 2 Nodes configured.
> 4 Resources configured.
> ============
>
> Node: hal-9000 (9f275006-1303-4c16-b5c6-3f52c4ed0865): online
>     r0:1 (heartbeat::ocf:drbd)
> Node: flexo (d66a3fcc-da10-4807-afec-950187f4c084): online
>     IPaddr_192_168_0_101 (heartbeat::ocf:IPaddr)
>     samba_3 (lsb:samba)
>     r0:0 (heartbeat::ocf:drbd)
>
> Inactive resources:
>     fs0 (heartbeat::ocf:Filesystem): Stopped
> ---------------------------------------------------------------
>
> Here's relevant part of syslog (extended debug output is at
> http://paste.debian.net/38881):
>
> [snip]

From where is this log (looks as if it's from flexo)? I don't see
any problems here (beware: I don't have much experience with
drbd). The other log (from the DC node) could be more interesting.
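
For reference, a hedged sketch of how the fs0 constraints are usually tied
to the master role of a master_slave resource under the v2 CRM. This is an
assumption, based on the Linux-HA DRBD v2 how-to rather than on anything in
the log: the to_action and to_role attributes are not in your cib.xml, and
the exact spelling of the role value should be checked against the DTD
shipped with your heartbeat version. The ids are taken from your
configuration:

```xml
<!-- Sketch only, not tested on this cluster. -->
<!-- Start fs0 only after ms-r0 has been promoted, not merely started. -->
<rsc_order id="r0_before_fs0" from="fs0" action="start"
    to="ms-r0" to_action="promote"/>
<!-- Place fs0 only on the node where ms-r0 runs in the master role. -->
<rsc_colocation id="fs0_on_r0" from="fs0" to="ms-r0"
    to_role="master" score="INFINITY"/>
```

With constraints along these lines the CRM is expected to promote r0 on one
node before it tries the mount, which is why a manual "drbdadm primary all"
should not be needed.
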
Thanks,

Dejan

> Oct 5 11:45:46 localhost crmd: [7163]: info: do_lrm_rsc_op:lrm.c
> Performing op monitor on r0:1 (interval=0ms,
> key=3:190e3e58-5139-4b1f-a9a1-6faa888e5adc)
> Oct 5 11:45:46 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 156692 bytes (was 154400)
> Oct 5 11:45:46 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 1760, call:2): 0.17.585 -> 0.17.586 (ok)
> Oct 5 11:45:46 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 156588 bytes (was 156692)
> Oct 5 11:45:46 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 7163, call:28): 0.17.586 -> 0.17.587 (ok)
> Oct 5 11:45:46 localhost drbd[7513]: [7520]: DEBUG: r0: Calling
> /sbin/drbdadm -c /etc/drbd.conf state r0
> Oct 5 11:45:46 localhost drbd[7513]: [7524]: DEBUG: r0: Exit code 0
> Oct 5 11:45:46 localhost drbd[7513]: [7525]: DEBUG: r0: Command output:
> Secondary/Secondary
> Oct 5 11:45:46 localhost drbd[7513]: [7533]: DEBUG: r0: Calling
> /sbin/drbdadm -c /etc/drbd.conf cstate r0
> Oct 5 11:45:46 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 158880 bytes (was 156588)
> Oct 5 11:45:46 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 1715, call:36): 0.17.587 -> 0.17.588 (ok)
> Oct 5 11:45:46 localhost cib: [7539]: info: write_cib_contents:io.c
> Wrote version 0.17.588 of the CIB to disk (digest:
> d794dd901148deb7f0af574270f73c00)
> Oct 5 11:45:46 localhost drbd[7513]: [7544]: DEBUG: r0: Exit code 0
> Oct 5 11:45:46 localhost drbd[7513]: [7545]: DEBUG: r0: Command output:
> Connected
> Oct 5 11:45:46 localhost drbd[7513]: [7546]: DEBUG: r0 status:
> Secondary/Secondary Secondary Secondary Connected
> Oct 5 11:45:46 localhost crmd: [7163]: info: process_lrm_event:lrm.c
> LRM operation (13) monitor_0 on r0:1 complete
> Oct 5 11:45:48 localhost crmd: [7163]: info: do_lrm_rsc_op:lrm.c
> Performing op stop on r0:1 (interval=0ms,
> key=4:190e3e58-5139-4b1f-a9a1-6faa888e5adc)
> Oct 5 11:45:48 localhost drbd[7547]: [7554]: DEBUG: r0: Calling
> /sbin/drbdadm -c /etc/drbd.conf state r0
> Oct 5 11:45:48 localhost drbd[7547]: [7558]: DEBUG: r0: Exit code 0
> Oct 5 11:45:48 localhost drbd[7547]: [7559]: DEBUG: r0: Command output:
> Secondary/Secondary
> Oct 5 11:45:48 localhost drbd[7547]: [7567]: DEBUG: r0: Calling
> /sbin/drbdadm -c /etc/drbd.conf cstate r0
> Oct 5 11:45:48 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 158776 bytes (was 158880)
> Oct 5 11:45:48 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 7163, call:29): 0.17.588 -> 0.17.589 (ok)
> Oct 5 11:45:48 localhost cib: [7572]: info: write_cib_contents:io.c
> Wrote version 0.17.589 of the CIB to disk (digest:
> 18831e94c442d16a54c1a9659314d39e)
> Oct 5 11:45:48 localhost drbd[7547]: [7584]: DEBUG: r0: Exit code 0
> Oct 5 11:45:48 localhost drbd[7547]: [7585]: DEBUG: r0: Command output:
> Connected
> Oct 5 11:45:48 localhost drbd[7547]: [7586]: DEBUG: r0 status:
> Secondary/Secondary Secondary Secondary Connected
> Oct 5 11:45:48 localhost drbd[7547]: [7587]: DEBUG: r0: Calling
> /sbin/drbdadm -c /etc/drbd.conf down r0
> Oct 5 11:45:48 localhost kernel: drbd0: drbdsetup [7590]: cstate
> Connected --> Unconnected
> Oct 5 11:45:48 localhost kernel: drbd0: drbd0_receiver [7508]: cstate
> Unconnected --> BrokenPipe
> Oct 5 11:45:48 localhost kernel: drbd0: short read expecting header on
> sock: r=-512
> Oct 5 11:45:48 localhost kernel: drbd0: worker terminated
> Oct 5 11:45:48 localhost kernel: drbd0: asender terminated
> Oct 5 11:45:48 localhost kernel: drbd0: drbd0_receiver [7508]: cstate
> BrokenPipe --> StandAlone
> Oct 5 11:45:48 localhost kernel: drbd0: Connection lost.
> Oct 5 11:45:48 localhost kernel: drbd0: receiver terminated
> Oct 5 11:45:48 localhost kernel: drbd0: drbdsetup [7590]: cstate
> StandAlone --> StandAlone
> Oct 5 11:45:48 localhost kernel: drbd0: drbdsetup [7590]: cstate
> StandAlone --> Unconfigured
> Oct 5 11:45:48 localhost kernel: drbd0: worker terminated
> Oct 5 11:45:48 localhost drbd[7547]: [7592]: DEBUG: r0: Exit code 0
> Oct 5 11:45:48 localhost drbd[7547]: [7593]: DEBUG: r0: Command output:
> Oct 5 11:45:48 localhost lrmd: [7160]: info: RA output: (r0:1:stop:stdout)
> Oct 5 11:45:48 localhost drbd[7547]: [7594]: DEBUG: r0 stop: drbdadm
> down succeeded.
> Oct 5 11:45:48 localhost crmd: [7163]: info: process_lrm_event:lrm.c
> LRM operation (14) stop_0 on r0:1 complete
> Oct 5 11:45:49 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 161068 bytes (was 158776)
> Oct 5 11:45:49 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 1715, call:38): 0.17.589 -> 0.17.590 (ok)
> Oct 5 11:45:49 localhost cib: [7598]: info: write_cib_contents:io.c
> Wrote version 0.17.590 of the CIB to disk (digest:
> 0b9ae0f4610a1af2e1c8a843608192e5)
> Oct 5 11:45:49 localhost crmd: [7163]: info: do_lrm_rsc_op:lrm.c
> Performing op start on fs0 (interval=0ms,
> key=5:190e3e58-5139-4b1f-a9a1-6faa888e5adc)
> Oct 5 11:45:49 localhost Filesystem[7599]: [7605]: INFO: Running start
> for /dev/drbd0 on /shared
> Oct 5 11:45:49 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 163360 bytes (was 161068)
> Oct 5 11:45:49 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 7163, call:30): 0.17.590 -> 0.17.591 (ok)
> Oct 5 11:45:49 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 165652 bytes (was 163360)
> Oct 5 11:45:49 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 1715, call:40): 0.17.591 -> 0.17.592 (ok)
> Oct 5 11:45:49 localhost cib: [7611]: info: write_cib_contents:io.c
> Wrote version 0.17.592 of the CIB to disk (digest:
> 7750e155947fcba58272b13b572530cf)
> Oct 5 11:45:49 localhost Filesystem[7599]: [7615]: ERROR: Couldn't
> mount filesystem /dev/drbd0 on /shared
> Oct 5 11:45:49 localhost crmd: [7163]: WARN: process_lrm_event:lrm.c
> LRM operation (15) start_0 on fs0 Error: (1) unknown error
> Oct 5 11:45:49 localhost lrmd: [7160]: info: RA output:
> (fs0:start:stderr) BLKFLSBUF: Inappropriate ioctl for device mount:
> block device /dev/drbd0 is write-protected, mounting read-only mount:
> /dev/drbd0 already mounted or /shared busy
> Oct 5 11:45:51 localhost crmd: [7163]: info: do_lrm_rsc_op:lrm.c
> Performing op stop on fs0 (interval=0ms,
> key=6:190e3e58-5139-4b1f-a9a1-6faa888e5adc)
> Oct 5 11:45:51 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 167944 bytes (was 165652)
> Oct 5 11:45:51 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 7163, call:31): 0.17.592 -> 0.17.593 (ok)
> Oct 5 11:45:51 localhost cib: [7619]: info: write_cib_contents:io.c
> Wrote version 0.17.593 of the CIB to disk (digest:
> 69c7df548815877a47420fece9960bf1)
> Oct 5 11:45:51 localhost Filesystem[7616]: [7623]: INFO: Running stop
> for /dev/drbd0 on /shared
> Oct 5 11:45:51 localhost lrmd: [7160]: info: RA output:
> (fs0:stop:stderr) BLKFLSBUF: Inappropriate ioctl for device
> Oct 5 11:45:51 localhost crmd: [7163]: info: process_lrm_event:lrm.c
> LRM operation (16) stop_0 on fs0 complete
> Oct 5 11:45:53 localhost cib: [7159]: info: activateCibXml:io.c CIB
> size is 170236 bytes (was 167944)
> Oct 5 11:45:53 localhost cib: [7159]: info: cib_diff_notify:notify.c
> Update (client: 7163, call:32): 0.17.593 -> 0.17.594 (ok)
> Oct 5 11:45:53 localhost cib: [7633]: info: write_cib_contents:io.c
> Wrote version 0.17.594 of the CIB to disk (digest:
> 885fea462a3e6c8e73f0e1d62973f572)
>
>
> ---------------------------------------------------------------
> Here's my cib.xml file:
> <cib admin_epoch="0" have_quorum="true" num_peers="2"
>     cib_feature_revision="1.3" ccm_transition="2" generated="true"
>     dc_uuid="9f275006-1303-4c16-b5c6-3f52c4ed0865" epoch="17"
>     num_updates="594" cib-last-written="Fri Oct 5 11:45:53 2007">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <attributes>
>           <nvpair id="cib-bootstrap-options-symmetric_cluster"
>               name="symmetric_cluster" value="true"/>
>           <nvpair id="cib-bootstrap-options-no_quorum_policy"
>               name="no_quorum_policy" value="stop"/>
>           <nvpair
>               id="cib-bootstrap-options-default_resource_stickiness"
>               name="default_resource_stickiness" value="0"/>
>           <nvpair
>               id="cib-bootstrap-options-default_resource_failure_stickiness"
>               name="default_resource_failure_stickiness" value="0"/>
>           <nvpair id="cib-bootstrap-options-stonith_enabled"
>               name="stonith_enabled" value="false"/>
>           <nvpair id="cib-bootstrap-options-stonith_action"
>               name="stonith_action" value="reboot"/>
>           <nvpair id="cib-bootstrap-options-stop_orphan_resources"
>               name="stop_orphan_resources" value="true"/>
>           <nvpair id="cib-bootstrap-options-stop_orphan_actions"
>               name="stop_orphan_actions" value="true"/>
>           <nvpair id="cib-bootstrap-options-remove_after_stop"
>               name="remove_after_stop" value="false"/>
>           <nvpair id="cib-bootstrap-options-short_resource_names"
>               name="short_resource_names" value="true"/>
>           <nvpair id="cib-bootstrap-options-transition_idle_timeout"
>               name="transition_idle_timeout" value="5min"/>
>           <nvpair id="cib-bootstrap-options-default_action_timeout"
>               name="default_action_timeout" value="5s"/>
>           <nvpair id="cib-bootstrap-options-is_managed_default"
>               name="is_managed_default" value="true"/>
>         </attributes>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node uname="hal-9000" type="normal"
>           id="9f275006-1303-4c16-b5c6-3f52c4ed0865">
>         <instance_attributes
>             id="master-9f275006-1303-4c16-b5c6-3f52c4ed0865">
>           <attributes/>
>         </instance_attributes>
>       </node>
>       <node uname="flexo" type="normal"
>           id="d66a3fcc-da10-4807-afec-950187f4c084">
>         <instance_attributes
>             id="master-d66a3fcc-da10-4807-afec-950187f4c084">
>           <attributes/>
>         </instance_attributes>
>       </node>
>     </nodes>
>     <resources>
>       <group id="group_1">
>         <primitive class="ocf" id="IPaddr_192_168_0_101"
>             provider="heartbeat" type="IPaddr">
>           <instance_attributes id="IPaddr_192_168_0_101_inst_attr">
>             <attributes>
>               <nvpair id="IPaddr_192_168_0_101_attr_0" name="ip"
>                   value="192.168.0.101"/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
>         <primitive class="lsb" id="samba_3" provider="heartbeat"
>             type="samba">
>           <operations>
>             <op id="samba_1_start" name="stop" timeout="5"/>
>             <op id="samba_2_stop" name="start" timeout="5s"/>
>           </operations>
>         </primitive>
>       </group>
>       <master_slave id="ms-r0">
>         <meta_attributes id="ma-ms-r0">
>           <attributes>
>             <nvpair id="ma-ms-r0-1" name="clone_max" value="2"/>
>             <nvpair id="ma-ms-r0-2" name="clone_node_max" value="1"/>
>             <nvpair id="ma-ms-r0-3" name="master_max" value="1"/>
>             <nvpair id="ma-ms-r0-4" name="master_node_max" value="1"/>
>             <nvpair id="ma-ms-r0-5" name="notify" value="yes"/>
>             <nvpair id="ma-ms-r0-6" name="globally_unique" value="false"/>
>             <nvpair id="ma-ms-r0-7" name="target_role" value="#default"/>
>           </attributes>
>         </meta_attributes>
>         <primitive id="r0" class="ocf" provider="heartbeat" type="drbd">
>           <instance_attributes id="ia-r0">
>             <attributes>
>               <nvpair id="ia-r0-1" name="drbd_resource" value="r0"/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
>       </master_slave>
>       <primitive class="ocf" provider="heartbeat" type="Filesystem"
>           id="fs0">
>         <meta_attributes id="ma-fs0">
>           <attributes>
>             <nvpair name="target_role" id="ma-fs0-1" value="#default"/>
>           </attributes>
>         </meta_attributes>
>         <instance_attributes id="ia-fs0">
>           <attributes>
>             <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
>             <nvpair id="ia-fs0-2" name="directory" value="/shared"/>
>             <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>     </resources>
>     <constraints>
>       <rsc_location id="rsc_location_r0" rsc="ms-r0">
>         <rule id="prefered_location_r0" score="100">
>           <expression attribute="#uname" id="prefered_location_r0_expr"
>               operation="eq" value="flexo"/>
>         </rule>
>       </rsc_location>
>       <rsc_order id="r0_before_fs0" from="fs0" action="start" to="ms-r0"/>
>       <rsc_colocation id="fs0_on_r0" to="ms-r0" from="fs0"
>           score="INFINITY"/>
>       <rsc_location id="rsc_location_group_1" rsc="group_1">
>         <rule id="prefered_location_group_1" score="100">
>           <expression attribute="#uname"
>               id="prefered_location_group_1_expr" operation="eq" value="flexo"/>
>         </rule>
>       </rsc_location>
>     </constraints>
>   </configuration>
> </cib>
> ---------------------------------------------------------------
> drbd.conf:
>
> resource r0 {
>   protocol C;
>   incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
>
>   startup {
>     degr-wfc-timeout 120; # 2 minutes.
>   }
>
>   disk {
>     on-io-error detach;
>   }
>
>   net {
>     # TODO: Should these timeouts be relative to some heartbeat settings?
>     # timeout 60; # 6 seconds (unit = 0.1 seconds)
>     # connect-int 10; # 10 seconds (unit = 1 second)
>     # ping-int 10; # 10 seconds (unit = 1 second)
>     on-disconnect reconnect;
>   }
>
>   syncer {
>     rate 10M;
>     group 1;
>     al-extents 257;
>   }
>
>   on flexo {
>     device /dev/drbd0;
>     disk /dev/sda4;
>     address 192.168.0.105:7788;
>     meta-disk internal;
>   }
>
>   on hal-9000 {
>     device /dev/drbd0;
>     disk /dev/sda7;
>     address 192.168.0.100:7788;
>     meta-disk internal;
>   }
> }
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
