Hi,

On Wed, Jan 13, 2010 at 07:55:28PM -0500, David Sickmiller wrote:
> I don't have autojoin in my ha.cf, and I believe it defaults to
> "autojoin none", so that wouldn't explain why heartbeat keeps waiting
> after all nodes have joined.

True. That should be fixed. Can you please open a bugzilla for this
issue?

> I can see in /var/log/messages where crmd is doing the waiting for my
> 900-second initdead:
>
> 2010-01-11T13:51:15.428916-05:00 crmd: [4273]: info: do_started: The
> local CRM is operational
> 2010-01-11T13:51:15.428924-05:00 crmd: [4273]: info:
> do_state_transition: State transition S_STARTING -> S_PENDING [
> input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
> 2010-01-11T14:06:15.964307-05:00 crmd: [4273]: info: crm_timer_popped:
> Election Trigger (I_DC_TIMEOUT) just popped!
> 2010-01-11T14:06:15.964337-05:00 crmd: [4273]: WARN: do_log: [[FSA]]
> Input I_DC_TIMEOUT from crm_timer_popped() received in state (S_PENDING)

Well, this looks like another timeout, specific to crmd (election).
You can probably find it in /usr/lib*/heartbeat/crmd metadata.

Thanks,

Dejan

> 2010-01-11T14:06:15.964348-05:00 crmd: [4273]: info:
> do_state_transition: State transition S_PENDING -> S_ELECTION [
> input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>
> I am using "Version 2 Resource Manager". I didn't previously realize
> this was the last version before the split.
>
> I am also using DRBD, and yesterday I discovered that its
> wait-for-connection timeout (wfc-timeout) works as I had hoped initdead
> would, and by putting it before heartbeat in the startup sequence, it
> turns out I don't really need initdead after all.
>
> Thanks,
> David
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Dejan
> Muhamedagic
> Sent: Tuesday, January 12, 2010 3:51 AM
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] heartbeat waits for initdead even after all
> nodes have joined
>
> Hi,
>
> On Mon, Jan 11, 2010 at 03:21:05PM -0500, David Sickmiller wrote:
> > Hi,
> >
> > I was hoping to configure my 2-node cluster to start as soon as both
> > nodes were present but wait up to 15 minutes if the other node was
> > missing upon system startup. In my case, a delay of several minutes
> > is better than a split-brain scenario. The Linux-HA documentation
> > says "The initdead parameter is used to set the time that it takes to
> > declare a cluster node dead when Heartbeat is first started.", so I
> > figured I could just set "initdead 900" in ha.cf. Unfortunately,
> > heartbeat seems to be waiting for the entire initdead time interval
> > regardless of whether all the nodes are present.
> >
> > Does this match others' experiences? Is there a different setting
> > that could accomplish my objective?
> >
> > It seems like the documentation would be more accurate if it said "The
> > initdead parameter is used to set the time that heartbeat waits before
> > starting any resources, which allows time for additional nodes to
> > join."
>
> If you have autojoin set to "any".
>
> > However, I would much prefer that Linux-HA behaved according to the
> > original documentation.
> >
> > I'm using Heartbeat 2.1.4 on RHEL 5.4.
>
> Please switch to Pacemaker/heartbeat or Pacemaker/corosync. Or
> are you using v1/haresources?
>
> Thanks,
>
> Dejan
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
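
As a quick check on the crmd log excerpt quoted earlier in this thread, the gap between the transition into S_PENDING and the Election Trigger firing can be computed directly from the two timestamps. This is only a sketch: the helper names parse and wait_seconds are made up for illustration, and the script assumes Python 3.7 or later for datetime.fromisoformat. It does nothing beyond subtracting the two quoted timestamps.

```python
from datetime import datetime

# Timestamps copied from the two crmd log lines quoted in this thread:
# the S_STARTING -> S_PENDING transition and the Election Trigger pop.
ENTERED_PENDING = "2010-01-11T13:51:15.428924-05:00"
TIMER_POPPED = "2010-01-11T14:06:15.964307-05:00"


def parse(ts: str) -> datetime:
    # Hypothetical helper: fromisoformat handles the microseconds and
    # the -05:00 UTC offset used in these syslog timestamps.
    return datetime.fromisoformat(ts)


def wait_seconds(start: str, end: str) -> float:
    # Hypothetical helper: elapsed wall-clock time between two log lines.
    return (parse(end) - parse(start)).total_seconds()


elapsed = wait_seconds(ENTERED_PENDING, TIMER_POPPED)
print(f"crmd stayed in S_PENDING for {elapsed:.3f} s")
```

The result comes out at roughly 900.5 seconds, which is consistent with the 900-second value David set, though the logs alone do not show which timer setting crmd actually read.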
