I don't have autojoin in my ha.cf, and I believe it defaults to
"autojoin none", so that wouldn't explain why heartbeat keeps waiting
after all nodes have joined.

In /var/log/messages I can see crmd waiting out my 900-second
initdead:

2010-01-11T13:51:15.428916-05:00 crmd: [4273]: info: do_started: The local CRM is operational
2010-01-11T13:51:15.428924-05:00 crmd: [4273]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
2010-01-11T14:06:15.964307-05:00 crmd: [4273]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
2010-01-11T14:06:15.964337-05:00 crmd: [4273]: WARN: do_log: [[FSA]] Input I_DC_TIMEOUT from crm_timer_popped() received in state (S_PENDING)
2010-01-11T14:06:15.964348-05:00 crmd: [4273]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]

I am using "Version 2 Resource Manager".  I didn't previously realize
this was the last version before the split.

I am also using DRBD, and yesterday I discovered that its
wait-for-connection timeout (wfc-timeout) works the way I had hoped
initdead would.  With DRBD started before heartbeat in the startup
sequence, it turns out I don't really need initdead after all.

Thanks,
David


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Dejan
Muhamedagic
Sent: Tuesday, January 12, 2010 3:51 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] heartbeat waits for initdead even after all
nodes have joined

Hi,

On Mon, Jan 11, 2010 at 03:21:05PM -0500, David Sickmiller wrote:
> Hi,
>
> I was hoping to configure my 2-node cluster to start as soon as both
> nodes were present but wait up to 15 minutes if the other node was
> missing upon system startup.  In my case, a delay of several minutes is
> better than a split-brain scenario.  The Linux-HA documentation says
> "The initdead parameter is used to set the time that it takes to declare
> a cluster node dead when Heartbeat is first started.", so I figured I
> could just set "initdead 900" in ha.cf.  Unfortunately, heartbeat seems
> to be waiting for the entire initdead time interval regardless of
> whether all the nodes are present.
>
> Does this match others' experiences?  Is there a different setting that
> could accomplish my objective?
>
> It seems like the documentation would be more accurate if it said "The
> initdead parameter is used to set the time that heartbeat waits before
> starting any resources, which allows time for additional nodes to join."

If you have autojoin set to "any".

> However, I would much prefer that Linux-HA behaved according to the
> original documentation.
>
> I'm using Heartbeat 2.1.4 on RHEL 5.4.

Please switch to Pacemaker/heartbeat or Pacemaker/corosync. Or
are you using v1/haresources?

Thanks,

Dejan
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
