Alan Robertson wrote:
> Martijn Grendelman wrote:
>> Hi,
>>
>> I am trying to build a 2-node cluster serving DRBD+NFS, among other
>> things. It has been operational on Debian Sarge, with Heartbeat 1.2, but
>> recently, both machines were upgraded to Debian Etch, and today I
>> upgraded Heartbeat to 2.0.7. I maintained the R1 style configuration.
>> Heartbeat is running in an active/passive fashion.
[snip]
> We run /etc/init.d/nfs-kernel-server status before starting it. If it
> says OK or running, then we don't start it because it's already running.
>
> See http://linux-ha.org/HeartbeatResourceAgent

Thank you for the information.

There is one other problem that I haven't been able to solve, and I hope someone can help me with that too. Sometimes Heartbeat tries to take over a resource group that it is already running:

[EMAIL PROTECTED]:~> cl_status rscstatus
all

[EMAIL PROTECTED]:~> cl_status rscstatus
none

Now, when I shut down or reboot Vodka, I would expect nothing much to happen in the cluster, but instead, Heartbeat on Whisky, the node that's already running things, says:

May 7 17:21:34 whisky mach_down[11872]: [11888]: info: Taking over resource group 213.207.104.20
May 7 17:21:34 whisky ResourceManager[11889]: [11897]: info: Acquiring resource group: vodka 213.207.104.20 ipvsadm mon drbddisk::all Filesystem::/dev/drbd0::/extra1::ext3 nfs-kernel-server Delay::3::0 IPaddr::10.50.1.20/32/eth0 mysql

and it starts running init scripts with the 'start' argument. This is bound to fail, so:

May 7 17:21:34 whisky ResourceManager[11889]: [12047]: debug: Starting /etc/init.d/mon start
May 7 17:21:34 whisky ResourceManager[11889]: [12052]: debug: /etc/init.d/mon start done. RC=1
May 7 17:21:34 whisky ResourceManager[11889]: [12053]: ERROR: Return code 1 from /etc/init.d/mon
May 7 17:21:34 whisky ResourceManager[11889]: [12054]: CRIT: Giving up resources due to failure of mon
May 7 17:21:34 whisky ResourceManager[11889]: [12055]: info: Releasing resource group: vodka 213.207.104.20 ipvsadm mon drbddisk::all Filesystem::/dev/drbd0::/extra1::ext3 nfs-kernel-server Delay::3::0 IPaddr::10.50.1.20/32/eth0 mysql

... and down goes my entire cluster!!!

Why does Heartbeat want to start a resource group that it already runs?

Any help is most welcome!
Best regards,
Martijn Grendelman
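
The check Alan describes in the quoted reply (run the init script with `status`, and skip `start` if it says OK or running) can be sketched roughly as follows. This is an illustration only, not Heartbeat's actual ResourceManager code: the function names `is_running` and `start_if_needed` are hypothetical, and treating "not running" as stopped is an assumption about what a status script might print.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from Heartbeat itself.

# Return 0 if the given status text reports the resource as active.
# Assumption: "not running" means stopped, so it is checked first;
# otherwise any text containing "OK" or "running" counts as active.
is_running() {
    case "$1" in
        *"not running"*) return 1 ;;
        *OK*|*running*)  return 0 ;;
        *)               return 1 ;;
    esac
}

# Call "start" only when "status" does not report the resource as active.
# $1 is a command (for example an init script path) that accepts the
# arguments "status" and "start".
start_if_needed() {
    script="$1"
    status_output="$("$script" status 2>&1)"
    if is_running "$status_output"; then
        echo "skip: $script already running"
        return 0
    fi
    "$script" start
}
```

With a script such as /etc/init.d/mon, `start_if_needed /etc/init.d/mon` would return 0 without calling `start` whenever `status` prints OK or running. If the script's `status` output contains neither word while the service is up, `start` is called anyway and may fail, as in the RC=1 case in the log above.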
