On Wed, Aug 08, 2007 at 01:07:19PM +0200, Andreas Kurz wrote:
> Hello all,
> 
> I am running a two-node test cluster (heartbeat 2.1.2) using pingd as
> an OCF resource and encountered the following behaviour in my
> configuration:
> 
> - I disabled clusterwide resource monitoring to restart heartbeat on
> one node, because lrmd was not working as expected

What was it doing?

> - "/etc/init.d/heartbeat stop" hung indefinitely, so I killed all
> heartbeat processes and the second node stonithed the other as
> expected; the resources were not started on the second node because
> they were unmanaged

There must have been a reason for that. Logs and the CIB should
provide more details.

> - when the first node was up again and integrated again in the cluster
> I reenabled clusterwide resource monitoring

What do you mean by "clusterwide resource monitoring"?

> - now the resources were all started on the second node, with its
> higher weight because of the already running pingd and its
> score_attributes

If pingd was running on all nodes then the resources should have
moved to their preferred node.

> Now my question is: Is it possible to configure heartbeat to always
> wait for all pingd clone instances to be started before the
> calculation of the scores for other resources (where a constraint with
> a pingd score_attribute exists)?

This is an interesting question. If I understood it correctly, you
are talking about the delay between pingd starting and pingd
updating its attributes. Since there is no way to tell how long a
program (in this case pingd) will need to gather the data
necessary to update the attributes, it wouldn't make sense to wait
for the update. However, once that update changes the CIB, the CRM
will recalculate the scores and move resources if appropriate.
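
To illustrate that recalculation, here is a toy sketch in Python. It
is not the CRM's actual placement algorithm: the node names, the
scores, and the place() helper are all made up for illustration. It
only assumes that a node's total score is its base score plus the
value of its pingd attribute, and that a node whose attribute has not
been set yet gets nothing extra.

```python
# Toy model only: NOT the CRM's real placement algorithm.
# Node names, scores, and place() are hypothetical.

def place(base_scores, pingd_attrs):
    """Return (winning node, per-node totals).

    Total score = base score + pingd attribute value. A node whose
    pingd attribute is not set yet contributes no extra score.
    """
    totals = {
        node: base + pingd_attrs.get(node, 0)
        for node, base in base_scores.items()
    }
    winner = max(totals, key=totals.get)
    return winner, totals


# Assumed setup: node1 is the preferred node for the resources.
base_scores = {"node1": 100, "node2": 0}

# node1 has just rejoined; only pingd on node2 has set its attribute.
pingd_attrs = {"node2": 200}
before, _ = place(base_scores, pingd_attrs)

# pingd on node1 updates its attribute, the CIB changes,
# and the scores are calculated again.
pingd_attrs["node1"] = 200
after, _ = place(base_scores, pingd_attrs)

print(before, "->", after)
```

In this model the resources land on node2 first and move to node1 as
soon as node1's attribute shows up, with no need to wait for pingd
before the first calculation.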

> The only idea I had was to start pingd from ha.cf or to stop pingd
> also on the second node before reenabling the resource monitoring to
> allow a "clean" resource placing.

But why didn't pingd run on the first node? Shouldn't it run if the
node is eligible to run the resources? Isn't that the point of it
after all, to establish that the node is connected?

Cheers,

Dejan

> Regards,
> Andreas


> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems