Hi,

On Wed, Jan 13, 2010 at 07:55:28PM -0500, David Sickmiller wrote:
> I don't have autojoin in my ha.cf, and I believe it defaults to
> "autojoin none", so that wouldn't explain why heartbeat keeps waiting
> after all nodes have joined.

True. That should be fixed. Can you please open a bugzilla for
this issue?

> I can see in /var/log/messages where crmd is doing the waiting for my
> 900-second initdead:
> 
> 2010-01-11T13:51:15.428916-05:00 crmd: [4273]: info: do_started: The
> local CRM is operational
> 2010-01-11T13:51:15.428924-05:00 crmd: [4273]: info:
> do_state_transition: State transition S_STARTING -> S_PENDING [
> input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
> 2010-01-11T14:06:15.964307-05:00 crmd: [4273]: info: crm_timer_popped:
> Election Trigger (I_DC_TIMEOUT) just popped!
> 2010-01-11T14:06:15.964337-05:00 crmd: [4273]: WARN: do_log: [[FSA]]
> Input I_DC_TIMEOUT from crm_timer_popped() received in state (S_PENDING)

Well, this looks like another timeout, specific to crmd
(election). You can probably find it in /usr/lib*/heartbeat/crmd
metadata.
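
The length of the wait can be checked from the two timestamps in the
log quoted above. A minimal sketch in Python; the variable names are
only illustrative, and the timestamps are copied from the log lines:

```python
from datetime import datetime

# Timestamps copied from the quoted crmd log lines.
entered_pending = datetime.fromisoformat("2010-01-11T13:51:15.428924-05:00")
timer_popped = datetime.fromisoformat("2010-01-11T14:06:15.964307-05:00")

# Seconds crmd spent in S_PENDING before the Election Trigger popped.
waited = (timer_popped - entered_pending).total_seconds()
print(f"crmd waited {waited:.3f} s in S_PENDING")
```

That comes to roughly 900.5 seconds, which matches the 900-second
figure, whichever setting is actually driving the timer.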

Thanks,

Dejan

> 2010-01-11T14:06:15.964348-05:00 crmd: [4273]: info:
> do_state_transition: State transition S_PENDING -> S_ELECTION [
> input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> 
> I am using "Version 2 Resource Manager".  I didn't previously realize
> this was the last version before the split.
> 
> I am also using DRBD, and yesterday I discovered that its
> wait-for-connection timeout (wfc-timeout) works as I had hoped initdead
> would, and by putting it before heartbeat in the startup sequence, it
> turns out I don't really need initdead after all.
> 
> Thanks,
> David
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Dejan
> Muhamedagic
> Sent: Tuesday, January 12, 2010 3:51 AM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] heartbeat waits for initdead even after all
> nodes have joined
> 
> Hi,
> 
> On Mon, Jan 11, 2010 at 03:21:05PM -0500, David Sickmiller wrote:
> > Hi,
> > 
> > I was hoping to configure my 2-node cluster to start as soon as both
> > nodes were present but wait up to 15 minutes if the other node was
> > missing upon system startup.  In my case, a delay of several minutes
> > is better than a split-brain scenario.  The Linux-HA documentation
> > says "The initdead parameter is used to set the time that it takes to
> > declare a cluster node dead when Heartbeat is first started.", so I
> > figured I could just set "initdead 900" in ha.cf.  Unfortunately,
> > heartbeat seems to be waiting for the entire initdead time interval
> > regardless of whether all the nodes are present.
> > 
> > Does this match others' experiences?  Is there a different setting
> > that could accomplish my objective?
> > 
> > It seems like the documentation would be more accurate if it said "The
> > initdead parameter is used to set the time that heartbeat waits before
> > starting any resources, which allows time for additional nodes to
> > join."
> 
> If you have autojoin set to "any".
> 
> > However, I would much prefer that Linux-HA behaved according to the
> > original documentation.
> > 
> > I'm using Heartbeat 2.1.4 on RHEL 5.4.
> 
> Please switch to Pacemaker/heartbeat or Pacemaker/corosync. Or
> are you using v1/haresources?
> 
> Thanks,
> 
> Dejan
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
