On Tue, 10 Aug 2010, Igor Chudov wrote:
> On Tue, Aug 10, 2010 at 7:05 PM, David Lang
> wrote:
>> On Tue, 10 Aug 2010, Igor Chudov wrote:
>>
>>> On Tue, Aug 10, 2010 at 6:41 PM, David Lang
>>> wrote:
On Tue, 10 Aug 2010, Igor Chudov wrote:
As I noted in a prior e-mail, to work around i
On Tue, Aug 10, 2010 at 7:05 PM, David Lang
wrote:
> On Tue, 10 Aug 2010, Igor Chudov wrote:
>
>> On Tue, Aug 10, 2010 at 6:41 PM, David Lang
>> wrote:
>>> On Tue, 10 Aug 2010, Igor Chudov wrote:
>>>
Guys, I have a bit of clarification. In an attempt to avoid the timing
issues, an hour
On Tue, 10 Aug 2010, Igor Chudov wrote:
> On Tue, Aug 10, 2010 at 6:41 PM, David Lang
> wrote:
>> On Tue, 10 Aug 2010, Igor Chudov wrote:
>>
>>> Guys, I have a bit of clarification. In an attempt to avoid the timing
>>> issues, an hour ago I tried adding a configuration change to
>>> /etc/init.d/
On Tue, Aug 10, 2010 at 6:41 PM, David Lang
wrote:
> On Tue, 10 Aug 2010, Igor Chudov wrote:
>
>> Guys, I have a bit of clarification. In an attempt to avoid the timing
>> issues, an hour ago I tried adding a configuration change to
>> /etc/init.d/heartbeat to delay starting it by 2 minutes on one
On Tue, 10 Aug 2010, Igor Chudov wrote:
> Guys, I have a bit of clarification. In an attempt to avoid the timing
> issues, an hour ago I tried adding a configuration change to
> /etc/init.d/heartbeat to delay starting it by 2 minutes on one box. So
> logs with takeover succeeding, and heartbeat sh
Guys, I have a bit of clarification. In an attempt to avoid the timing
issues, an hour ago I tried adding a configuration change to
/etc/init.d/heartbeat to delay starting it by 2 minutes on one box. So
logs with takeover succeeding, and heartbeat shutting down are partly
an artifact of this change
actually, what catches my attention is a little before that
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: WARN: Shutdown delayed until
current resource activity finishes.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat shutdown in
progress. (1162)
Aug 10 17:49:08 pfs-srv3 heartbeat: [
David and Dmitri,
Here's one more try and one more set of log files. I now see that
heartbeat is shutting down, which is beyond what used to happen.
some interesting lines I saw:
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown
notice from 'pfs-srv3'.
Aug 10 17:49:08 pfs-srv3
On Tuesday 10 August 2010 17:19, Igor Chudov wrote:
> Guys, I just sent ha-log, ha.cf, haresources from both machines.
These look like shutdown logs, not startup logs.
FWIW here's what mine's like (sanitized):
** secondary **
heartbeat: [8356]: info: Configuration validated. Starting heartbeat 2.
On Tue, 10 Aug 2010, Igor Chudov wrote:
> Guys, I just sent ha-log, ha.cf, haresources from both machines.
>
> At this point, I of course greatly appreciate your help and your
> generous assistance.
>
> But I wonder if our attention is going in the wrong direction of "try
> this and try that".
>
> W
Guys, I just sent ha-log, ha.cf, haresources from both machines.
At this point, I of course greatly appreciate your help and your
generous assistance.
But I wonder if our attention is going in the wrong direction of "try
this and try that".
What if right now, I need to systematically understand wh
On Tue, Aug 10, 2010 at 3:25 PM, David Lang
wrote:
> could you re-post the files (log files, ha.cf and haresources from each box)
>
Log file from pfs-srv3
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: other_holds_resources: 0
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Received shutd
could you re-post the files (log files, ha.cf and haresources from each box)
David Lang
On Tue, 10 Aug 2010, Igor Chudov wrote:
> Date: Tue, 10 Aug 2010 15:23:44 -0500
> From: Igor Chudov
> Reply-To: General Linux-HA mailing list
> To: General Linux-HA mailing list
> Subject: Re: [Linux-HA] H
On Tue, Aug 10, 2010 at 2:59 PM, David Lang
wrote:
> Ok, just checking again, the two haresources files are truly identical.
>
> you didn't put different system names in the first line of each file or
> something like that? (this is a common mistake)
>
> I would also remove the second host from t
Ok, just checking again, the two haresources files are truly identical.
you didn't put different system names in the first line of each file or
something like that? (this is a common mistake)
I would also remove the second host from the haresources file. having it there
with no resources on it
On Tue, Aug 10, 2010 at 2:28 PM, David Lang
wrote:
> On Tue, 10 Aug 2010, Igor Chudov wrote:
>
>> Dmitri, you are right.
>>
>> In any case the name change did nothing.
>
> did it eliminate the error from the log? does the log say anything else after
> that point?
It eliminated the error from the
On Tue, 10 Aug 2010, Igor Chudov wrote:
> Dmitri, you are right.
>
> In any case the name change did nothing.
did it eliminate the error from the log? does the log say anything else after
that point?
David Lang
> They still refuse to take over when rebooted simultaneously.
>
> The symptoms
Dmitri, you are right.
In any case the name change did nothing.
They still refuse to take over when rebooted simultaneously.
The symptoms are the same as usual.
I am thinking, should I perhaps put a little statement in
/etc/init.d/heartbeat on one of the boxes and add "sleep 100" in it?
i
On Tuesday 10 August 2010 13:14, Igor Chudov wrote:
>
> Haresources refers to "drbddisk", however, the resource in
> /usr/lib/ocf/resource.d/heartbeat is called "drbd".
Heartbeat 2.1.4 on centos 5 comes with /etc/ha.d/resource.d/drbddisk. Looks
like the docs you read don't match the version you h
On Tue, Aug 10, 2010 at 1:08 PM, David Lang
wrote:
> On Tue, 10 Aug 2010, Igor Chudov wrote:
>
>> On Tue, Aug 10, 2010 at 12:51 PM, David Lang
>> wrote:
>>>
>>> one problem I see in ha-log-2.txt is the lines
>>>
>>> Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot
>>> locate
On Tue, 10 Aug 2010, Igor Chudov wrote:
On Tue, Aug 10, 2010 at 12:51 PM, David Lang
wrote:
one problem I see in ha-log-2.txt is the lines
Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot locate
resource script
Aug 10 10:38:06 pfs-srv4 req_resource[1236]: [1256]: debug:
On Tue, Aug 10, 2010 at 12:51 PM, David Lang
wrote:
> one problem I see in ha-log-2.txt is the lines
>
> Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot locate
> resource script
> Aug 10 10:38:06 pfs-srv4 req_resource[1236]: [1256]: debug: in
> /usr/share/heartbeat/req_reso
one problem I see in ha-log-2.txt is the lines
Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot locate
resource script
Aug 10 10:38:06 pfs-srv4 req_resource[1236]: [1256]: debug: in
/usr/share/heartbeat/req_resource
Aug 10 10:38:06 pfs-srv4 req_resource[1236]: [1258]: debug:
On Tue, Aug 10, 2010 at 10:21 AM, Pushkar Pradhan
wrote:
> David, I did a fresh restart today (without changing to mcast, yet, as
> I want to do one thing at a time).
>
> Again, neither server took over.
>
> Here's the ha-logs from them:
>
> http://igor.chudov.com/tmp/ha-log-1.txt
> http://igor.ch
From: linux-ha-boun...@lists.linux-ha.org on behalf of Igor Chudov
Sent: Tue 8/10/2010 6:50 AM
To: General Linux-HA mailing list
Cc: david.l...@digitalinsight.com
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH
machines are booted at the same time
On Mon, Aug 9, 2010 at 5:07 PM, David Lang
wrote:
> ha-log should give you a detailed picture of what each box is thinking as they
> start up. I've always been able to track down the problem with that info for my
> systems.
>
David, I did a fresh restart today (without changing to mcast, yet, as
I