HA Version 2.1.4, CRM Version 2.0 (CIB feature set 2.0) node: aa909246edb386137b986c5773344b98c6969999

Okay, so you should not hit the group bug I was referring to.

I think that may be where the disconnect is happening. I ran ptest -Ls as you suggested and found that for my service, the score on the primary node is 0 and the score on the secondary node is -1000000. Could that be because I am also setting the default-resource-stickiness to -INFINITY, as I interpreted the FAQ to suggest?

It is very strange, actually; I appear to have two sets of scores. ptest -Ls output gives me two sets of values appended to each other. Here is my simplification of the output exactly as it appears:

group_1 allocation score on secondary: 0
group_1 allocation score on primary: 0
IP address allocation score on secondary: 0
IP address allocation score on primary: 100
DRBD allocation score on secondary: 0
DRBD allocation score on primary: 0
Filesystem allocation score on secondary: 0
Filesystem allocation score on primary: 0
PostgreSQL allocation score on secondary: 0
PostgreSQL allocation score on primary: 0
Kamailio (my custom service) allocation score on secondary: 0
Kamailio (my custom service) allocation score on primary: 0
IP address allocation score on secondary: 0
IP address allocation score on primary: 100
DRBD allocation score on secondary: -1000000
DRBD allocation score on primary: 0
Filesystem allocation score on secondary: -1000000
Filesystem allocation score on primary: 0
PostgreSQL allocation score on secondary: -1000000
PostgreSQL allocation score on primary: 0
Kamailio (my custom service) allocation score on secondary: -1000000
Kamailio (my custom service) allocation score on primary: 0

This output and the config you pasted below are not from the same cluster, or at least not from the same configuration: the names are not identical and the values don't match.

Please get that straightened out first.

If this is a test cluster, you can start over from a clean state like this:

# stop heartbeat
/etc/init.d/heartbeat stop
# wait until all nodes shut down
# remove any old xml config
rm /var/lib/heartbeat/crm/*
# then start all nodes again
/etc/init.d/heartbeat start

It never hurts to attach your entire config; maybe something else is causing this.

cibadmin -Ql > my.xml

OK. Here is something else very strange: when I look at my cib.xml, I see the -INFINITY values I've assigned for default-resource-stickiness and default-resource-failure-stickiness. But when I look at the output of cibadmin -Q, I see them set to 0. What gives?

cib.xml: http://pastebin.com/f610c9a0a

<nvpair id="cib-bootstrap-options-default-resource-stickiness" name="default-resource-stickiness" value="-INFINITY"/>

That's probably not what you want. It would mean every successfully started resource receives a score of -INFINITY, and since a resource with a negative score is not allowed to run on a node, it would be stopped again. I imagine that would end up in a stop/start loop (although I didn't try :).
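As a sketch of a saner setting (the value 100 here is just an example; pick whatever mild preference for staying put suits you), the nvpair would instead look like:

```xml
<nvpair id="cib-bootstrap-options-default-resource-stickiness"
        name="default-resource-stickiness" value="100"/>
```

A small positive stickiness means a running resource slightly prefers its current node, without pinning it there against stronger constraints.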

cibadmin -Q: http://pastebin.com/f21748793
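To compare the two views side by side (a quick sanity check, assuming the usual on-disk path /var/lib/heartbeat/crm/cib.xml), something like:

```shell
# What the running CRM believes:
cibadmin -Q | grep default-resource-stickiness
# What is written on disk:
grep default-resource-stickiness /var/lib/heartbeat/crm/cib.xml
```

If the two disagree, the live CIB is the one the policy engine actually uses; the on-disk file may simply be stale or have been edited by hand without being reloaded.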
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
