On 8/8/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 08, 2007 at 01:07:19PM +0200, Andreas Kurz wrote:
> > Hello all,
> >
> > I am running a two-node test cluster (heartbeat 2.1.2) using pingd as
> > an OCF resource and encountered the following behaviour in my
> > configuration:
> >
> > - I disabled clusterwide resource monitoring to restart heartbeat
> > on one node, because lrmd was not working as expected
>
> What was it doing?

I know what it was not doing ... executing monitors, and a monitor
action initiated by the DC (after a 'crm_resource -P') timed out.

> > - "/etc/init.d/heartbeat stop" hung indefinitely so I killed all
> > heartbeat processes and the second node stonithed the other as
> > expected, the resources were not started on the second node because
> > they were unmanaged
>
> There must have been a reason for that. Logs and the CIB should
> provide more details.

I will start an extra thread for this and the lrmd respawn problem.

> > - when the first node was up again and integrated again in the cluster
> > I reenabled clusterwide resource monitoring
>
> What do you mean by "clusterwide resource monitoring"?

Setting 'is-managed-default' in the crm_config section.

> > - now the resources were all started on the second node, with its
> > higher weight because of the already running pingd and its
> > score_attributes
>
> If pingd was running on all nodes then the resources should have
> moved to their preferred node.

It was only running on the second node, because the first node was
stonithed.

> > Now my question is: Is it possible to configure heartbeat to always
> > wait for all pingd clone-instances to be started before the
> > calculation of the scores for other resources (where a constraint with
> > a pingd score_attribute exists)?
>
> This is an interesting question: if I got it right, you are
> talking about the delay between pingd being started and updating
> the attributes. Since it is not possible to establish how much
> it would take for the program (in this case pingd) to obtain data
> necessary to update the attributes it wouldn't make sense to wait
> for the update. However, once the CIB changes through that
> update, the CRM will recalculate scores and move resources if
> appropriate.

I am talking about the behaviour of the CRM when 'is-managed-default'
is re-enabled in a cluster where pingd is not running everywhere, so
that some nodes have a pingd node attribute and some do not. As pingd
is a resource that influences the score of nodes when placing other
resources, I think it would be nice to have pingd started on all nodes
_before_ all other resources ... when there are constraints including
pingd score_attributes.

> > The only idea I had was to start pingd from ha.cf or to stop pingd
> > also on the second node before reenabling the resource monitoring to
> > allow a "clean" resource placing.
>
> But why didn't pingd run on the other first node? Shouldn't it
> run if the node is eligible to run the resources? Isn't that the
> point of it after all, to establish that the node is connected?

Yes of course ... but as described above the first node was stonithed,
and with 'is-managed-default' disabled no resource, and so no pingd,
was started after the reboot.

To sum it up, my question is: if pingd is used to influence the score
of nodes in resource placement decisions, and the CRM finds nodes that
have no pingd attribute (not 0 but undefined), wouldn't it be a nice
feature to start or restart pingd on those nodes (the ones involved in
the placement decisions)?

Regards,
Andreas
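
To illustrate the placement effect described in this message, here is a
toy Python model. It is only a sketch, not the CRM's actual scoring
code: the node names, base scores, pingd values, and all three function
names are hypothetical, and it assumes that a node's pingd attribute is
simply added to its base score, with an undefined attribute contributing
nothing.

```python
from typing import Dict, List, Optional


def placement_scores(base: Dict[str, int],
                     pingd: Dict[str, Optional[int]]) -> Dict[str, int]:
    """Toy model: add each node's pingd attribute to its base score.

    A node whose pingd attribute is undefined (None) gets no bonus.
    This is an assumption made for illustration only.
    """
    scores = {}
    for node, score in base.items():
        value = pingd.get(node)
        scores[node] = score + (value if value is not None else 0)
    return scores


def nodes_missing_pingd(base: Dict[str, int],
                        pingd: Dict[str, Optional[int]]) -> List[str]:
    """Return the candidate nodes that have no pingd attribute at all."""
    return sorted(node for node in base if pingd.get(node) is None)


def preferred_node(scores: Dict[str, int]) -> str:
    """Pick the highest-scoring node; ties go to the alphabetically first."""
    return max(sorted(scores), key=lambda node: scores[node])


# Hypothetical numbers: node1 is the preferred node for the resources.
base = {"node1": 100, "node2": 50}

# After the reboot pingd runs only on node2, so node1 has no attribute.
partial = {"node1": None, "node2": 100}
# Once pingd has started and reported on both nodes.
full = {"node1": 100, "node2": 100}

print(nodes_missing_pingd(base, partial))
print(preferred_node(placement_scores(base, partial)))
print(preferred_node(placement_scores(base, full)))
```

In this model the resources land on node2 while node1 has no attribute,
and move back to node1 only once its attribute is defined. The first
function call lists the nodes on which pingd would need to be started
before a "clean" placement is possible.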
pe-input-8.bz2
Description: BZip2 compressed data
