On Fri 12/14/2007 1:04 AM, Andrew Beekhof said:
>On Dec 14, 2007, at 12:12 AM, Scott Mann wrote:
>
>> On Thu 12/13/2007 3:09 PM, Andrew Beekhof said:
>>
>>> On Dec 13, 2007, at 8:11 PM, Scott Mann wrote:
>>
>>>>>>> I'm seeing about a 2.5minute delay between the time that heartbeat
>>>>>>> starts and the time that the IP address comes up on eth0:0 (if it
>>>>>>> were 5minutes, I'd at least have a clue).
>>>>>
>>>>> i depends on your configured deadtime IIRC.
>>>>> what does ha.cf look like?
>>>>
>>>> Here's my ha.cf:
>>>>
>>>> logfacility local0
>>>> keepalive 2
>>>> deadtime 30
>>>> warntime 10
>>>> initdead 120
>>>
>>> 120 - that's 2 of your 2.5 minutes right there
>>
>> Ah, interesting. So, in v2 (due to autojoin, perhaps?), initdead causes a
>> delay in startup, whereas in v1 mode it doesn't. Very good to know.
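[Editor's note: for readers unfamiliar with these directives, here is the same ha.cf fragment annotated with the standard Heartbeat meaning of each timing parameter. The values are from the thread; the comments are an editorial summary, not part of the original message.]

```
logfacility local0   # log through syslog, facility local0
keepalive 2          # send a heartbeat packet every 2 seconds
warntime 10          # log a "late heartbeat" warning after 10s of silence
deadtime 30          # declare a peer node dead after 30s of silence
initdead 120         # at startup, allow 120s before declaring peers dead
                     # (covers slow boots; this is the 2 minutes at issue)
```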
>should do in both i'd have thought...
>when are you measuring from?

OK. More details. First, in both cases I am running 2.1.2-24.1.

In v1 mode, my ha.cf file is identical to the one I sent, except that I specify the two nodes (no autojoin) AND crm is off. The haresources file has one line with the "preferred" node and the IP address to manage. Heartbeat is started with the init script (/etc/init.d/heartbeat), and then another init script starts my API application. In v1 mode, I can start my API application as soon as the init script completes, and everything works as expected.

In v2 mode, I cannot start the API app as soon as the heartbeat init script completes: I get a "Cannot signon" message because my app cannot connect to heartbeat. Only after the election completes and the resource is "started" am I able to connect to heartbeat via the API, which, as you pointed out, is delayed by initdead.

I am concluding that in v1 mode, since the nodes are known, there's no need to wait out the initdead time, whereas in v2 mode with autojoin any, the full initdead wait is consumed because another node may still be joining. Is that right? Perhaps there is a way to control this with a quorum size in v2? Or is there a behavioral bug in v1?

Anyway, in both of these cases, both nodes come up at about the same time and recognize each other very quickly. In other words, this isn't about testing fail-over, etc.

Thanks, again, for all your help. Let me know if you need anything else; I'll be happy to return the favor.

Scott
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
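[Editor's note: one way to sidestep the "Cannot signon" race described above, regardless of which mode causes the delay, is to have the startup script poll until the heartbeat API accepts connections rather than failing on the first attempt. A minimal sketch in Python; `connect` here is a hypothetical stand-in for whatever signon call the API app actually makes, not part of the Heartbeat API itself.]

```python
import time

def wait_for_signon(connect, timeout=180.0, interval=2.0):
    """Poll connect() until it succeeds or the timeout expires.

    connect  -- callable returning True once signon succeeds
                (e.g. a wrapper around the app's heartbeat signon call
                that catches the "Cannot signon" failure and returns False)
    timeout  -- total seconds to keep trying; 180s comfortably covers
                the 120s initdead window plus the election
    interval -- seconds to sleep between attempts

    Returns True on success, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if connect():
            return True
        time.sleep(interval)
    return False
```

The app's init script could call this before proceeding, so the same script works in both v1 and v2 mode: in v1 it succeeds on the first try, in v2 it simply waits out the election.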
