Hi, I built a two-node test cluster: Ubuntu 10.04.3 LTS with ppa:ubuntu-ha-maintainers/ppa,
corosync 1.4.2, pacemaker 1.1.6.

primitive clvm ocf:lvm2:clvmd \
        params daemon_timeout="30" \
        operations $id="clvm-operations" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="0" timeout="20" start-delay="0" \
        meta target-role="started"
primitive data ocf:heartbeat:LVM \
        params volgrpname="data" \
        operations $id="data-operations" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="30" \
        op monitor interval="10" timeout="120" start-delay="0" \
        op methods interval="0" timeout="5" \
        meta target-role="started"
primitive dlm ocf:pacemaker:controld \
        operations $id="dlm-operations" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="10" timeout="20" start-delay="0" \
        meta target-role="started"
primitive fs ocf:heartbeat:Filesystem \
        params device="/dev/data/test" directory="/data/test" fstype="ocfs2" \
        operations $id="fs-operations" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60" \
        op monitor interval="120" timeout="40" start-delay="0" \
        op notify interval="0" timeout="60" \
        meta target-role="started"
primitive o2cb ocf:pacemaker:o2cb \
        operations $id="o2cb-operations" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="0" timeout="20" start-delay="0" \
        meta target-role="started"
primitive res_DRBD ocf:linbit:drbd \
        params drbd_resource="r0" \
        operations $id="res_DRBD-operations" \
        op start interval="0" timeout="240" \
        op promote interval="0" timeout="90" \
        op demote interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="30" timeout="20" start-delay="1min" \
        op notify interval="0" timeout="90" \
        meta target-role="started"
group dlm-clvm dlm clvm
ms ms_DRBD res_DRBD \
        meta master-max="2" clone-max="2" notify="true" interleave="true"
clone clone_data data \
        meta clone-max="2" ordered="true" interleave="true"
clone dlm-clvm-clone dlm-clvm \
        meta interleave="true" ordered="true"
clone fs-clone fs \
        meta clone-max="2" ordered="true" interleave="true"
clone o2cb-clone o2cb \
        meta clone-max="2" interleave="true"
colocation col_data_clvm-dlm-clone inf: clone_data dlm-clvm-clone
colocation col_fs_o2cb inf: fs-clone o2cb-clone
colocation col_ms_DRBD_dlm-clvm-clone inf: dlm-clvm-clone ms_DRBD:Master
colocation col_o2cb_dlm-clvm inf: o2cb-clone dlm-clvm-clone
order ord_data_after_clvm-dlm-clone inf: dlm-clvm-clone clone_data
order ord_ms_DRBD_dlm-clvm-clone inf: ms_DRBD:promote dlm-clvm-clone:start
order ord_o2cb_after_dlm-clvm 0: dlm-clvm-clone o2cb-clone
order ord_o2cb_fs inf: o2cb-clone fs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1323246238" \
        default-resource-stickiness="1000"

The problem occurs when I restart corosync or reboot a cluster node. All resources are stopped except for the drbd resource. Then the system hangs for a long time.

corosync.log:

ubuntu0 crmd: [926]: info: do_state_transition: (Re)Issuing shutdown request now that we are the DC
ubuntu0 crmd: [926]: info: do_state_transition: Starting PEngine Recheck Timer
ubuntu0 crmd: [926]: info: do_shutdown_req: Sending shutdown request to DC: ubuntu0
ubuntu0 crmd: [926]: info: handle_shutdown_request: Creating shutdown request for ubuntu0 (state=S_IDLE)
corosync [pcmk ] notice: pcmk_shutdown: Still waiting for crmd (pid=926, seq=6) to terminate...
(the last line repeats until the timeout)

I tested the same config with Debian 6.0.3.
There the reboot works: in the first step the drbd resource is demoted to secondary, and then it is stopped. Is this a known problem?

Thank you for your help.

Regards,
Erik
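P.S. As a hedged workaround sketch (assuming the crm shell shipped with pacemaker 1.1.6 and the node name ubuntu0 from the log above), I can avoid the hang by draining the node first, so Pacemaker demotes and stops the DRBD master before corosync itself shuts down:

```shell
# Hedged workaround sketch, not a confirmed fix: put the node in standby
# first so Pacemaker demotes/stops all resources (including the DRBD
# master) before corosync exits.
crm node standby ubuntu0        # ask Pacemaker to stop everything on this node
crm_mon -1                      # repeat until ubuntu0 shows no running resources
cat /proc/drbd                  # DRBD should now report role Secondary
/etc/init.d/corosync stop       # corosync has nothing left to wait for
crm node online ubuntu0         # after the restart, take the node out of standby
```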
_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
