I forgot to mention that this pair of nodes worked before, and I can still run "crm configure show". Here is the configuration:

node $id="bc6bf61d-6b5f-4307-85f3-bf7bb11531bb" arsvr2 \
        attributes standby="off"
node $id="bf0e7394-9684-42b9-893b-5a9a6ecddd7e" arsvr1 \
        attributes standby="off"
primitive apache2 lsb:apache2 \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="120" start-delay="15" \
        meta target-role="Started"
primitive drbd_mysql ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15s"
primitive drbd_webfs ocf:linbit:drbd \
        params drbd_resource="r1" \
        op monitor interval="15s" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/r0" directory="/var/lib/mysql" fstype="ext4" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="120" \
        meta target-role="Started"
primitive fs_webfs ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/r1" directory="/srv" fstype="ext4" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="120" \
        meta target-role="Started"
primitive ip1 ocf:heartbeat:IPaddr2 \
        params ip="138.214.240.193" nic="eth0" \
        op monitor interval="5s" \
        meta target-role="Started"
primitive ip1arp ocf:heartbeat:SendArp \
        params ip="138.214.240.193" nic="eth0" \
        meta target-role="Started"
primitive mysql ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" user="mysql" group="mysql" log="/var/log/mysql.log" pid="/var/lib/mysql/mysqld.pid" datadir="/var/lib/mysql" socket="/var/run/mysqld/mysqld.sock" \
        op monitor interval="30s" timeout="30s" \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120" \
        meta target-role="Started"
group MySQLDB fs_mysql mysql \
        meta target-role="Started"
group WebServices ip1 ip1arp fs_webfs apache2 \
        meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_webfs drbd_webfs \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
colocation apache2_with_ip inf: apache2 ip1
colocation apache2_with_mysql inf: apache2 ms_drbd_mysql:Master
colocation apache2_with_webfs inf: apache2 ms_drbd_webfs:Master
colocation fs_on_drbd inf: fs_mysql ms_drbd_mysql:Master
colocation ip_with_ip_arp inf: ip1 ip1arp
colocation mysql_on_drbd inf: MySQLDB ms_drbd_mysql:Master
colocation mysql_with_ip inf: MySQLDB ip1
colocation webfs_on_drbd inf: fs_webfs ms_drbd_webfs:Master
order apache2-after-arp inf: ip1arp:start apache2:start
order arp-after-ip inf: ip1:start ip1arp:start
order fs-mysql-after-drbd inf: ms_drbd_mysql:promote fs_mysql:start
order fs-webfs-after-drbd inf: ms_drbd_webfs:promote fs_webfs:start
order ip-after-mysql inf: mysql:start ip1:start
order mysql-after-fs-mysql inf: fs_mysql:start mysql:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
        cluster-infrastructure="Heartbeat" \
        expected-quorum-votes="1" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Liang Ma
Contractuel | Consultant | SED Systems Inc.
Ground Systems Analyst
Agence spatiale canadienne | Canadian Space Agency
6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
Courriel/E-mail : [liang...@space.gc.ca]
Site web/Web site : [www.space.gc.ca ]

-----Original Message-----
From: Ma, Liang
Sent: February 9, 2011 9:59 AM
To: 'The Pacemaker cluster resource manager'
Subject: Could not connect to the CIB: Remote node did not respond

Hi There,

After a network and power shutdown, my LAMP cluster servers were totally screwed up.
Now crm status gives me:

crm status
============
Last updated: Wed Feb 9 09:44:17 2011
Stack: Heartbeat
Current DC: arsvr2 (bc6bf61d-6b5f-4307-85f3-bf7bb11531bb) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 1 expected votes
4 Resources configured.
============

Online: [ arsvr1 arsvr2 ]

None of the resources come up. First I found a split brain on the DRBD disks. I fixed that, and the DRBD disks are now healthy; I can mount them manually without problems. However, if I try to start a resource, edit the CIB, or even run a query, I get errors like the following:

crm resource start fs_mysql
Call cib_replace failed (-41): Remote node did not respond
<null>

crm configure edit
Could not connect to the CIB: Remote node did not respond
ERROR: creating tmp shadow __crmshell.2540 failed

cibadmin -Q
Call cib_query failed (-41): Remote node did not respond
<null>

Any idea what I can do to bring the cluster back?

Thank you,
Liang Ma

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
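
For anyone who wants to script a check around this state, here is a minimal sketch that pulls a few fields out of the `crm status` text quoted in the original message. The function name `summarize_status` and the dictionary keys are hypothetical, chosen only for illustration; the sample text is copied from the message, and the parsing assumes the 1.0.x header layout shown there. It only reads saved output and does not talk to the cluster.

```python
import re

# Text copied from the `crm status` output quoted in the message above.
STATUS = """\
============
Last updated: Wed Feb 9 09:44:17 2011
Stack: Heartbeat
Current DC: arsvr2 (bc6bf61d-6b5f-4307-85f3-bf7bb11531bb) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 1 expected votes
4 Resources configured.
============

Online: [ arsvr1 arsvr2 ]
"""


def summarize_status(text):
    """Hypothetical helper: extract the DC, online nodes and resource count."""
    dc = re.search(r"^Current DC:\s+(\S+)", text, re.MULTILINE)
    online = re.search(r"^Online:\s+\[\s*(.*?)\s*\]", text, re.MULTILINE)
    configured = re.search(r"^(\d+) Resources configured", text, re.MULTILINE)
    return {
        "dc": dc.group(1) if dc else None,
        "online": online.group(1).split() if online else [],
        "resources_configured": int(configured.group(1)) if configured else None,
    }


if __name__ == "__main__":
    print(summarize_status(STATUS))
```

In the situation described, both nodes are reported online and four resources are configured, yet no resource lines follow the "Online:" line, which matches the observation that nothing comes up.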