Re: [Linux-HA] Heartbeat v2 CIB/API questions

Andrew Beekhof Mon, 17 Dec 2007 23:53:13 -0800


On Dec 17, 2007, at 11:28 PM, Scott Mann wrote:

On Mon 12/17/2007 1:36 AM, Andrew Beekhof said:

On Dec 14, 2007, at 6:31 PM, Scott Mann wrote:


On Fri 12/14/2007 1:04 AM, Andrew Beekhof said:

On Dec 14, 2007, at 12:12 AM, Scott Mann wrote:


On Thu 12/13/2007 3:09 PM, Andrew Beekhof said:

On Dec 13, 2007, at 8:11 PM, Scott Mann wrote:

I'm seeing about a 2.5-minute delay between the time that heartbeat
starts and the time that the IP address comes up on eth0:0 (if it
were 5 minutes, I'd at least have a clue).


It depends on your configured deadtime, IIRC.
What does ha.cf look like?


Here's my ha.cf:

logfacility     local0
keepalive 2
deadtime 30
warntime 10
initdead 120


120 - that's 2 of your 2.5 minutes right there


Ah, interesting. So, in v2 (due to autojoin, perhaps?), initdead
causes a delay in startup, whereas in v1 mode it doesn't. Very good
to know.

It should in both, I'd have thought...

When are you measuring from?


OK. More details.

First, in both cases I am running 2.1.2-24.1.

In the case of v1 mode, my ha.cf file looks identical to the one I
sent, except for the fact that I specify the two nodes (no autojoin)
AND crm is off.

The haresources file has one line with the "preferred" node and the
IP address to manage.

Heartbeat is started with the init script (/etc/init.d/heartbeat)
and then another init script is run that starts my API application.
In v1 mode, I can start my API application as soon as the init script
completes and everything works as expected.

In v2 mode, I cannot start the API app as soon as the heartbeat init
script completes: I get a "Cannot signon" message because my app
cannot connect to heartbeat. Only after the election completes and
the resource is "started" am I able to connect to heartbeat via the
API, which as you pointed out is delayed by initdead.
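
Until the cause of the delay is found, a client started right after the
init script could simply retry the signon and record how long it took.
The following is a minimal sketch of that idea, not code from this
thread: wait_for_signon, try_signon, and the default timeout and
interval are all hypothetical names and values, and try_signon stands
in for whatever actually performs the signon in a given client.

```python
import time
from typing import Callable, Optional


def wait_for_signon(
    try_signon: Callable[[], bool],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Retry a signon attempt until it succeeds or the timeout expires.

    try_signon is a hypothetical callable that returns True once the
    client is connected. Returns the number of seconds waited before
    the first success, or None if timeout_s passed without one.
    """
    start = clock()
    while True:
        if try_signon():
            return clock() - start
        # Give up if the next attempt would start after the deadline.
        if clock() - start + interval_s > timeout_s:
            return None
        sleep(interval_s)
```

The returned value also answers the "when are you measuring from?"
question for the client side: it is the time from the first attempt to
the first successful signon, which can be compared against initdead
and deadtime in ha.cf.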


That's really strange.

In order for the election to take place a number of components have
to be signed into heartbeat... so I have no idea why your app can't.
Especially since nothing the CRM does (having elections or starting
resources) should influence your ability to sign in.

Unless the resource is an IP and you're using it to connect to the
cluster in some way?


The resource is an IP, but I'm using signon (hb->llc_ops->signon).
The heartbeat API I wrote doesn't really depend on the IP resource,
it just wants to monitor it

In v2 mode you can't monitor the resource using the HA API... only
via the CIB.

(and a few other things) and pass messages back and forth. But it is
the signon that fails until everything is up and running. It's not
just my API, by the way; no other client can sign on either (e.g.,
cl_status).



I am concluding that in v1 mode, since the nodes are known, there's
no need to wait out the initdead time. Whereas in v2 mode with
autojoin any, the initdead wait time is consumed because there may be
another node joining. Is that right?


It's possible.  I don't know how that code works.
Have you tried v2 without autojoin?

Yes. That's what I tried to explain above. In v1 mode with specific
nodes, I can signon right away. In v2 with autojoin, it takes 2.5
minutes.

It's still not clear to me that you've tried the third option...
"v2 mode with specific nodes"

Now that I had the chance,
even changing initdead does not modify this delay in v2 with autojoin.
I need to start tracing through the code and find out where it delays.
This will probably be a more useful approach. I might get into that
this week, but I expect it will more likely be after the first of the
year. It is on my TODO list anyway, I just have other things I have to
get done right now. In any case, if I find something useful, I'll post
it here.


cool

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
