Alan Robertson wrote:
> Martijn Grendelman wrote:
>> Hi,
>>
>> I am trying to build a 2-node cluster serving DRBD+NFS, among other
>> things. It has been operational on Debian Sarge, with Heartbeat 1.2, but
>> recently, both machines were upgraded to Debian Etch, and today I
>> upgraded Heartbeat to 2.0.7. I maintained the R1 style configuration.
>> Heartbeat is running in an active/passive fashion.

[snip]

> We run /etc/init.d/nfs-kernel-server status before starting it.  If it
> says OK or running, then we don't start it because it's already running.
> 
> See  http://linux-ha.org/HeartbeatResourceAgent
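
The check described in the quoted reply could be sketched roughly as below. This is only an illustration, not Heartbeat's actual code: the function names, the injectable `runner` parameter and the "not running"/"stopped" strings are assumptions made for the example; only the "OK"/"running" test comes from the reply above.

```python
import re
import subprocess

# Assumption: typical wording for a stopped service; checked first because
# "not running" also contains the word "running".
_NOT_RUNNING = re.compile(r"\b(not running|stopped)\b")
# From the quoted reply: output saying "OK" or "running" means already started.
_RUNNING = re.compile(r"\b(ok|running)\b")


def looks_running(status_output):
    """Return True if the status output suggests the service is already up."""
    text = status_output.lower()
    if _NOT_RUNNING.search(text):
        return False
    return bool(_RUNNING.search(text))


def run_script(script, action):
    """Run an init script with one argument; return (exit code, output)."""
    proc = subprocess.run(
        [script, action],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def start_if_needed(script, runner=run_script):
    """Ask for status first and only call 'start' if the service is not up."""
    _, output = runner(script, "status")
    if looks_running(output):
        return 0  # already running: report success without starting again
    rc, _ = runner(script, "start")
    return rc
```

With a check like this, calling `start_if_needed("/etc/init.d/nfs-kernel-server")` on a node that already runs the service would return 0 without invoking `start` a second time.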

Thank you for the information.

There is one other problem that I haven't been able to solve, and I hope
someone can help me with that too.

Sometimes Heartbeat tries to take over a resource group that it is
already running:

[EMAIL PROTECTED]:~> cl_status rscstatus
all

[EMAIL PROTECTED]:~> cl_status rscstatus
none

Now, when I shut down or reboot Vodka, I would expect nothing much to
happen in the cluster, but instead, Heartbeat on Whisky, the node that's
already running things, says:

May  7 17:21:34 whisky mach_down[11872]: [11888]: info: Taking over resource group 213.207.104.20
May  7 17:21:34 whisky ResourceManager[11889]: [11897]: info: Acquiring resource group: vodka 213.207.104.20 ipvsadm mon drbddisk::all Filesystem::/dev/drbd0::/extra1::ext3 nfs-kernel-server Delay::3::0 IPaddr::10.50.1.20/32/eth0 mysql

and it starts running init scripts with the 'start' argument. Since the
services are already running, this is bound to fail, so:

May  7 17:21:34 whisky ResourceManager[11889]: [12047]: debug: Starting /etc/init.d/mon  start
May  7 17:21:34 whisky ResourceManager[11889]: [12052]: debug: /etc/init.d/mon  start done. RC=1
May  7 17:21:34 whisky ResourceManager[11889]: [12053]: ERROR: Return code 1 from /etc/init.d/mon
May  7 17:21:34 whisky ResourceManager[11889]: [12054]: CRIT: Giving up resources due to failure of mon
May  7 17:21:34 whisky ResourceManager[11889]: [12055]: info: Releasing resource group: vodka 213.207.104.20 ipvsadm mon drbddisk::all Filesystem::/dev/drbd0::/extra1::ext3 nfs-kernel-server Delay::3::0 IPaddr::10.50.1.20/32/eth0 mysql

... and down goes my entire cluster!!!
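
To make the failure mode in the log easier to follow, here is a deliberately simplified model of what the messages suggest: the resources of a group are started one after another, and a single non-zero return code makes the node give up the whole group. This is a sketch inferred from the log output only, not Heartbeat's real implementation; `acquire_group` and the `start` callback are hypothetical names.

```python
def acquire_group(resources, start):
    """Simplified model of acquiring a resource group.

    resources: resource names in the order they are started.
    start:     callable taking a resource name and returning an exit code.

    Returns (held, failed): the resources held afterwards, and the name of
    the resource whose start failed (None if every start succeeded).
    """
    started = []
    for resource in resources:
        rc = start(resource)
        if rc != 0:
            # Mirrors "Giving up resources due to failure of ...": one
            # failing start means the whole group is released.
            return [], resource
        started.append(resource)
    return started, None


# Assumption for illustration: 'mon' returns 1 on 'start' because it is
# already running, while every other resource starts cleanly.
group = ["213.207.104.20", "ipvsadm", "mon", "drbddisk::all"]
held, failed = acquire_group(group, lambda r: 1 if r == "mon" else 0)
print(held, failed)
```

In this model, a single init script that treats "already running" as an error on `start` is enough to make the node drop everything it holds.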

Why does Heartbeat want to start a resource group that it already runs?
Any help is most welcome!

Best regards,

Martijn Grendelman

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
