On 4/22/2011 4:25 AM, SEILLIER Mathieu wrote:
Result of /usr/bin/cl_status nodestatus servappli01 command on servappli01 :
active
Result of /usr/bin/cl_status nodestatus servappli02 command on servappli01 :
dead
Result of /usr/bin/cl_status nodestatus servappli01 command on servappli02 :
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of mike
Sent: Tuesday, April 26, 2011 5:41 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] [Heartbeat] my VIP doesn't work :(
On 11-04-22 06:25 AM, SEILLIER Mathieu wrote:
Hi all,
First I'm french so sorry in advance for my English
Hi,
On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote:
Hello.
I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat
That version is way too old. You won't get help for it here.
Please upgrade to Pacemaker 1.0.10 or 1.1.5.
Thanks,
Dejan
Enterprise Linux Server
On 04/19/2011 12:46 PM, Dejan Muhamedagic wrote:
Hi,
On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote:
Hello.
I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat
That version is way too old. You won't get help for it here.
Please upgrade to Pacemaker 1.0.10 or
Hi,
On Tue, Apr 19, 2011 at 02:19:48PM +0200, Andrea Bertucci wrote:
On 04/19/2011 12:46 PM, Dejan Muhamedagic wrote:
Hi,
On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote:
Hello.
I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat
That version is way too
Yes...funny...
Andrea.
On 04/19/2011 02:23 PM, Dejan Muhamedagic wrote:
Hi.
The version I use is the newest I found distributed with CentOS rpm's (I
choose this for compatibility reason).
I also tried to recompile heartbeat v3 (I don't remember exact version)
and to use non official
On Tue, Apr 5, 2011 at 11:58 AM, Maxim Ianoglo dot...@gmail.com wrote:
Hello,
I have four servers in a HA cluster:
NodeA
NodeB
NodeC
NodeD
There are defined three groups of resources and one inline resource:
1. group_storage ( NFS VIP, NFS Server, DRBD )
2. group_apache_www (Domains VIPs
create a resource for the other script and then use a regular
ordering constraint to have it start before the VIP
On Thu, Jan 13, 2011 at 7:13 PM, maillis...@gmail.com wrote:
Sorry if this is a silly question. I've been reading the docs and I'm
a little confused.
I have a situation where I
Il giorno Mar 18 Gen 2011 12:13:15 CET, Erik Dobák ha scritto:
Hi people,
i got my active/passive cluster running. when i start the first node all
resources are started. but when i start the second node, all resources are
stopped on the first node and started on the second node. why? do i
I have set up cron jobs on both servers. I restart heartbeat at 22 hours on
one box and at 23 hours on another. It's been 4 days and so far, so good. I
will report more result. This could be an ugly solution to an ugly problem,
but workable.
i
On Tue, Jan 4, 2011 at 10:22 AM, Serge Dubrouski serge...@gmail.com wrote:
On Tue, Jan 4, 2011 at 9:14 AM, Igor Chudov ichu...@gmail.com wrote:
Serge, I am not sure of anything, but the self-communication is supposed
to
be taking place on a single crossover cable between second network
Igor Chudov wrote:
My second question is, can heartbeat be configured to restart itself in case
of such a failure.
Usually you can't have X restart itself after X dies. You need some kind
of Y.
If you're running snmpd, see if you can get proc to identify
heartbeat: master control process
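The point above about needing a separate "Y" to restart "X" can be sketched as a small external check, for instance one run from cron. This is only an illustration and was not posted in the thread: the PID file path and the start command are assumptions and will differ between distributions and Heartbeat builds.

```shell
#!/bin/sh
# Sketch of an external watchdog ("Y") for a daemon ("X").
# ASSUMPTIONS: the PID file path and the start command are placeholders,
# not values confirmed anywhere in this thread. Override them as needed.
PIDFILE="${PIDFILE:-/var/run/heartbeat.pid}"
START_CMD="${START_CMD:-/etc/init.d/heartbeat start}"

# Succeeds only if the PID file names a process that still exists.
# kill -0 sends no signal; it only checks that the PID can be signalled,
# so run this as root when the daemon runs as root.
is_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Start the daemon only when the liveness check fails.
watchdog() {
    if is_alive "$PIDFILE"; then
        echo "running"
    else
        echo "not running, restarting"
        $START_CMD
    fi
}
```

Calling watchdog every minute or so from cron would give the "Y" described above. A check like this only sees whether the process exists, so it would not catch a daemon that is alive but stuck.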
I have the same problem (on Ubuntu).
Very interested in an answer.
i
On Fri, Jan 7, 2011 at 5:12 AM, Daniel Krambrock ajs...@googlemail.com wrote:
hi there,
we have got an 12 node cluster for managing KVM based virtual machines.
we are using fedora 12 for the node systems with pacemaker
hi there,
i think we found the reason for the syntax error in ping RA:
the crash of heartbeat had produced a coredump
in /var/lib/heartbeat/cores/root , which is the working directory of
the ping RA. ping RA makes use of an unquoted * symbol:
score=`expr $active * $OCF_RESKEY_multiplier`
since
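The rest of that message is cut off here. As a separate illustration of the effect it describes, the following sketch reproduces the problem in a scratch directory: when the working directory contains a file, the shell expands the unquoted * to that file name before expr ever sees it. The values and the file name core.1234 are made up for the demonstration; only the expr line mirrors the one quoted above.

```shell
#!/bin/sh
# Hypothetical values; in the RA these come from the ping results and
# the resource parameters.
active=2
OCF_RESKEY_multiplier=3

# Scratch directory with one stray file, standing in for a core dump
# left in the RA's working directory (file name is made up).
workdir=$(mktemp -d)
touch "$workdir/core.1234"
cd "$workdir" || exit 1

# Unquoted: * is expanded to "core.1234", so expr receives
# "2 core.1234 3" and fails instead of multiplying.
bad=$(expr $active * $OCF_RESKEY_multiplier 2>&1) && bad_rc=0 || bad_rc=$?

# Escaped: expr receives a literal * and multiplies.
good=$(expr "$active" \* "$OCF_RESKEY_multiplier")

# Shell arithmetic avoids both expr and filename expansion.
also_good=$((active * OCF_RESKEY_multiplier))

echo "unquoted: exit status $bad_rc, output: $bad"
echo "escaped: $good, arithmetic: $also_good"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

In an empty directory the unquoted form happens to work, because a pattern that matches nothing is passed through unchanged in the default shell configuration. That would explain why the error only shows up after a core file appears.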
Very interested on SIGXCPU problem.
I cannot deploy my solution with it.
i
On Fri, Jan 7, 2011 at 11:41 AM, Daniel Krambrock ajs...@googlemail.com wrote:
hi there,
i think we found the reason for the syntax error in ping RA:
the crash of heartbeat had produced a coredump
in
Further reading indicates that heartbeat itself sets a limit for itself
every so often.
Then it exceeds the limit (probably due to a bug). I am sure that that's why
whoever wrote heartbeat set a cpu limit, instead of fixing their bugs.
Then it dies with SIGXCPU, leaving everything in an extremely
On 4 January 2011 13:47, Igor Chudov ichu...@gmail.com wrote:
Further reading indicates that heartbeat itself sets a limit for itself
every so often.
Then it exceeds the limit (probably due to a bug). I am sure that that's why
whoever wrote heartbeat set a cpu limit, instead of fixing their
Which OS?
Which version of Heartbeat?
heartbeat_pid - PID of which of Heartbeat processes? It has several.
On Tue, Jan 4, 2011 at 6:32 AM, Igor Chudov ichu...@gmail.com wrote:
A few weeks ago I reported that heartbeat died on one of the cluster machines,
due to SIGXCPU.
Well, it happened
Hi,
On Tue, Jan 04, 2011 at 07:47:10AM -0600, Igor Chudov wrote:
Further reading indicates that heartbeat itself sets a limit for itself
every so often.
True.
Then it exceeds the limit (probably due to a bug). I am sure that that's why
whoever wrote heartbeat set a cpu limit, instead of
Steve, here's some data.
The OS is Ubuntu 10.04.
~# apt-cache policy heartbeat
heartbeat:
Installed: 1:3.0.3-1ubuntu1
Candidate: 1:3.0.3-1ubuntu1
Version table:
*** 1:3.0.3-1ubuntu1 0
500 http://us.archive.ubuntu.com/ubuntu/ lucid/universe Packages
100 /var/lib/dpkg/status
On Tue, Jan 4, 2011 at 9:40 AM, Serge Dubrouski serge...@gmail.com wrote:
Which OS?
Ubuntu 10.04 Lucid.
Which version of Heartbeat?
3.0.3
~# apt-cache policy heartbeat
heartbeat:
Installed: 1:3.0.3-1ubuntu1
Candidate: 1:3.0.3-1ubuntu1
Version table:
*** 1:3.0.3-1ubuntu1 0
Are you sure that everything is all right with your network? It looks
like processes that are responsible for UDP communications are taking
too much of CPU time.
On Tue, Jan 4, 2011 at 8:47 AM, Igor Chudov ichu...@gmail.com wrote:
Steve, here's some data.
The OS is Ubuntu 10.04.
~# apt-cache
Serge, I am not sure of anything, but the self-communication is supposed to
be taking place on a single crossover cable between second network cards of
the servers. (eth1).
Igor
On Tue, Jan 4, 2011 at 10:06 AM, Serge Dubrouski serge...@gmail.com wrote:
Are you sure that everything is all right
On Tue, Jan 4, 2011 at 9:14 AM, Igor Chudov ichu...@gmail.com wrote:
Serge, I am not sure of anything, but the self-communication is supposed to
be taking place on a single crossover cable between second network cards of
the servers. (eth1).
Agree, yet something strange and pretty unique is
Igor Chudov wrote:
At this point I feel rather desperate. Perhaps I should give pacemaker
another go. I really have no idea and I am running out of options.
If all you need is a 2-node active-passive cluster, most (all?)
pacemaker features are useless for you. (Besides, one look at their
On Tue, Jan 4, 2011 at 1:29 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Igor Chudov wrote:
At this point I feel rather desperate. Perhaps I should give pacemaker
another go. I really have no idea and I am running out of options.
If all you need is a 2-node active-passive cluster, most
On Sun, Dec 26, 2010 at 08:56:13AM -0600, Igor Chudov wrote:
As you guys recall, I have set up a heartbeat/drbd based system to replace
an aging drbd solution.
While it sits there, it has not been activated.
I have noticed (due to some self checking scripts) that heartbeat died on
one
On Wed, 8 Dec 2010, sunitha kumar wrote:
Hi,
I am using:
heartbeat-3.0.0-33.2
pacemaker-mgmt-client-1.99.2-6.1
pacemaker-libs-1.0.5-4.1
pacemaker-1.0.5-4.1
pacemaker-mgmt-1.99.2-6.1
/etc/init.d/heartbeat stop hangs in /usr/lib/heartbeat/heartbeat -k
strace output on this shows it is
Vadym, etc.,
it is not just an issue on multicast setups - the same
thing happens if I set up an environment using unicasts ...
# cl_status hblinkstatus nodename ethn
returns up for every defined node and ethernet interface *except* on the
node you are running the
On Nov 10, 2010, at 10:51 AM, Frank Lazzarini wrote:
Correct me if I am wrong but you can't check the link of a node on that same
node ?
I sure hope not, why?
On Wed, Nov 10, 2010 at 4:16 PM, Pavlos Parissis
pavlos.paris...@gmail.com wrote:
I can confirm that on the same release,
I can confirm that on the same release, but I have no idea if it is normal
or not. Could be an expected behavior due to the use of multicast
[r...@node-01 tmp]# ./checkcl_status.pl
/usr/bin/cl_status hblinkstatus node-01 eth0
node-01 eth0: dead
/usr/bin/cl_status hblinkstatus node-01 eth1
node-01
Correct me if I am wrong but you can't check the link of a node on that same
node ?
On Wed, Nov 10, 2010 at 4:16 PM, Pavlos Parissis
pavlos.paris...@gmail.com wrote:
I can confirm that on the same release, but I have no idea if it is normal
or not. Could be an expected behavior due to the use
On Mon, Nov 08, 2010 at 05:51:08PM -0700, Alan Robertson wrote:
Quoting Vsevolod Katkov katkovsur...@yahoo.com:
Reporting an issue. Thank you very much for any feedback
today heartbeat process took all the CPU (99%-100%) and load went up.
it put this message to log repeating 14 times a
thank you Lars for the feedback
From: Lars Ellenberg lars.ellenb...@linbit.com
To: linux-ha@lists.linux-ha.org
Sent: Tue, November 9, 2010 5:49:54 AM
Subject: Re: [Linux-HA] heartbeat takes all cpu
On Mon, Nov 08, 2010 at 05:51:08PM -0700, Alan Robertson wrote
Quoting Vsevolod Katkov katkovsur...@yahoo.com:
Reporting an issue. Thank you very much for any feedback
today heartbeat process took all the CPU (99%-100%) and load went up.
it put this message to log repeating 14 times a second:
heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch
, 2010 4:51:08 PM
Subject: Re: [Linux-HA] heartbeat takes all cpu
Quoting Vsevolod Katkov katkovsur...@yahoo.com:
Reporting an issue. Thank you very much for any feedback
today heartbeat process took all the CPU (99%-100%) and load went up.
it put this message to log repeating 14 times a second
On Tue, Oct 19, 2010 at 11:04:10AM -0600, Serge Dubrouski wrote:
I see you could do it and now you are going to use Pacemaker all the
time in the future. Then I see no reason why others can't do it as
well taking into account that Heartbeat v1 is almost not supported and
definitely has no future
On Tue, Oct 19, 2010 at 12:11:03PM +0800, Linux Cook wrote:
hi!
I used the tarball package of postgresql and recompiled it. Postgres now
resides at /usr/local/pgsql and mounting /usr/local/pgsql/data into
/dev/drbd0.
However, heartbeat recognizes my Filesystem and IPaddr2 resources but not
On Wed, Oct 27, 2010 at 4:41 AM, Lars Ellenberg
lars.ellenb...@linbit.com wrote:
On Tue, Oct 19, 2010 at 11:04:10AM -0600, Serge Dubrouski wrote:
I see you could do it and now you are going to use Pacemaker all the
time in the future. Then I see no reason why others can't do it as
well taking
On 10/21/10 22:41, René Pilz wrote:
Hello,
I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64.
All seems working the right way, but I can't login to the server using
hb_gui because I can't connect to the server.
Password and username are correct but the mgmtd is not running.
If I try to
2010/10/25 Yan Gao y...@novell.com
On 10/21/10 22:41, René Pilz wrote:
Hello,
I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64.
All seems working the right way, but I can't login to the server using
hb_gui because I can't connect to the server.
Password and username are correct but
On 10/25/10 15:28, René Pilz wrote:
2010/10/25 Yan Gao y...@novell.com
On 10/21/10 22:41, René Pilz wrote:
Hello,
I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64.
All seems working the right way, but I can't login to the server using
hb_gui because I can't connect to the server.
On Fri, Oct 22, 2010 at 7:23 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Andrew Beekhof wrote:
OK, I'll post this and shut up.
Or are you really trying to claim that:
linuxha1 IPaddr::192.168.85.3 httpd smb
is fundamentally less complex than
primitive IP ocf:heartbeat:IPaddr
On Fri, Oct 22, 2010 at 7:32 PM, Greg Woods wo...@ucar.edu wrote:
On Fri, 2010-10-22 at 18:32 +0200, Andrew Beekhof wrote:
if you're just using v1 - thats not a cluster,
thats a prayer.
Then God must answer my prayers, because I have been using some simple
heartbeat v1/DRBD clusters for
On Wed, Oct 20, 2010 at 4:43 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Andrew Beekhof wrote:
On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wo...@ucar.edu wrote:
On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote:
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
On Wed, Oct 20, 2010 at 7:09 PM, Greg Woods wo...@ucar.edu wrote:
On Wed, 2010-10-20 at 08:13 +0200, Andrew Beekhof wrote:
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve?
I dispute that:
Andrew Beekhof wrote:
OK, I'll post this and shut up.
Or are you really trying to claim that:
linuxha1 IPaddr::192.168.85.3 httpd smb
is fundamentally less complex than
primitive IP ocf:heartbeat:IPaddr params ip=192.168.85.3
primitive http lsb:httpd
primitive samba
On Fri, 2010-10-22 at 18:32 +0200, Andrew Beekhof wrote:
if you're just using v1 - thats not a cluster,
thats a prayer.
Then God must answer my prayers, because I have been using some simple
heartbeat v1/DRBD clusters for YEARS, for critical services like DNS.
They have worked flawlessly and
On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wo...@ucar.edu wrote:
On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote:
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve?
I dispute
Andrew Beekhof wrote:
On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wo...@ucar.edu wrote:
On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote:
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
Um, maybe because heartbeat v1 has a much much much much less steep
learning
On Wed, 2010-10-20 at 08:13 +0200, Andrew Beekhof wrote:
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve?
I dispute that:
http://theclusterguy.clusterlabs.org/post/178680309/configuring-heartbeat-v1-was-so-simple
This addresses the fact that
On Wed, Oct 20, 2010 at 11:09 AM, Greg Woods wo...@ucar.edu wrote:
On Wed, 2010-10-20 at 08:13 +0200, Andrew Beekhof wrote:
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve?
I dispute that:
Michael Schwartzkopff escribió:
On Monday 18 October 2010 18:58:27 Jordi Moreno wrote:
Hi all,
I have searched for info about this issue with no success...
I have been using Heartbeat2 in various environments with no problems at
all. Recently, I started to try version 3.0.3 of Heartbeat. We
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
On Mon, Oct 18, 2010 at 10:11 PM, Linux Cook linuxc...@gmail.com wrote:
hi!
I used the tarball package of postgresql and recompiled it. Postgres now
resides at /usr/local/pgsql and mounting /usr/local/pgsql/data into
On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote:
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve? If you have a simple two-node cluster where one node is
just a hot spare, it is way
On Tue, Oct 19, 2010 at 10:44 AM, Greg Woods wo...@ucar.edu wrote:
On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote:
Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?
Um, maybe because heartbeat v1 has a much much much much less steep
learning curve? If you have a
Serge Dubrouski wrote:
I see you could do it and now you are going to use Pacemaker all the
time in the future. Then I see no reason why others can't do it as
well taking into account that Heartbeat v1 is almost not supported and
definitely has no future unless somebody will decide to fork it out
On Tue, Oct 19, 2010 at 12:41 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Serge Dubrouski wrote:
I see you could do it and now you are going to use Pacemaker all the
time in the future. Then I see no reason why others can't do it as
well taking into account that Heartbeat v1 is almost not
On Tue, Oct 19, 2010 at 1:49 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Serge Dubrouski wrote:
Ok. Please let's stop this useless holywar and try to help to solve
the original problem: why PostgreSQL doesn't want to start on
Heartbeat v1. I personally have no idea since I've never used
Serge Dubrouski wrote:
On Tue, Oct 19, 2010 at 1:49 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
The easiest fix was to create /drbdfs/pgsql with proper ownership and
symlink /var/lib/pgsql to it. Now that he's recompiled everything, who
knows.
Or manually mount /var/lib/pgsql/data and
On Tue, Oct 19, 2010 at 2:00 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Serge Dubrouski wrote:
On Tue, Oct 19, 2010 at 1:49 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu
wrote:
The easiest fix was to create /drbdfs/pgsql with proper ownership and
symlink /var/lib/pgsql to it. Now that he's
Serge Dubrouski wrote:
He's not using OCF. And that was the reason for my first question.
Right. Someone else mentioned ocf pgsql primitive.
In v1 case, /etc/init.d/postgresql -- recompiling from tarball typically
doesn't install /etc/init.d/ scripts, or installs the wrong one (e.g.
paths
On Friday 15 October 2010 09:28:36 Linux Cook wrote:
hi!
I've just setup heartbeat + drbd with postgresql. I'm mirroring /dev/drbd0
to /var/lib/postgresql. The problem is, postgresql service can't start
because everytime heartbeat mounts the /dev/drbd0 to /var/lib/postgresql,
it changes it
On Oct 15, 2010, at 3:28 AM, Linux Cook wrote:
hi!
I've just setup heartbeat + drbd with postgresql. I'm mirroring /dev/drbd0
to /var/lib/postgresql. The problem is, postgresql service can't start
because everytime heartbeat mounts the /dev/drbd0 to /var/lib/postgresql, it
changes it user
thanks michael and vadym,
Will try your suggestions and will let you know.
On Fri, Oct 15, 2010 at 4:46 AM, Vadym Chepkov vchep...@gmail.com wrote:
On Oct 15, 2010, at 3:28 AM, Linux Cook wrote:
hi!
I've just setup heartbeat + drbd with postgresql. I'm mirroring
/dev/drbd0
to
Now it is working; I guess I only put the wrong attribute name.
But for now I only need one pingd test. What if I need 2 different pingd
tests, with different scores?
How do I set up each constraint to have a different pingd score, as I only
set the attribute pingd?
Thanks
[]'sf.rique
On
Hi,
On Wed, Sep 22, 2010 at 04:24:01PM -0300, Henrique Fernandes wrote:
Look, I am not getting it to work.
I already made a pingd connectivity clone on all nodes. But now I cannot get
a location to work.
Can you help me?
I set a location constraint Conectivity and this has resource,
Hi,
On Tue, Sep 21, 2010 at 09:04:37AM +0200, RaSca wrote:
Hi all guys,
Yesterday I've finally finished and published the last article of the
Heartbeat/Pacemaker series. These are the links to the articles:
Il giorno Mer 22 Set 2010 13:44:07 CET, Dejan Muhamedagic ha scritto:
[...]
Didn't understand a thing, but looks great :)
LOL!
Just one note, it caught my attention, a node preference is
usually expressed in non-absolute terms and using a shorter
syntax:
location cli-prefer-share-a share-a
Hi,
On Wed, Sep 22, 2010 at 12:37:50AM -0300, Henrique Fernandes wrote:
When I set heartbeat to do bcast on one interface, it does OK.
But if I set it to do bcast on 2 interfaces, it also does OK. The problem is, I
want heartbeat to give up resources if it loses just one interface... it
Ok, thanks,
I am using heartbeat 2.1.3 on a CentOS release 5.5 (Final)
And I am using crm respawn; I guess it uses Pacemaker (crm_mon etc.).
So, I guess you are saying that it is possible to do what I want, using
a constraint.
Can you give me a little more information about the subject? Maybe somewhere
Hi,
On Wed, Sep 22, 2010 at 11:39:18AM -0300, Henrique Fernandes wrote:
Ok, thanks,
I am using heartbeat 2.1.3 on a CentOS release 5.5 (Final)
Oh, please upgrade as soon as you can, version 2.1.3 is quite old
and you won't get much support for it. Pacemaker is what used to
be heartbeat v2.
Here where I work, we try to use the distro release package, and the one on
CentOS is this version. =/
But thanks, I am going to read it now.
[]'sf.rique
On Wed, Sep 22, 2010 at 11:47 AM, Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi,
On Wed, Sep 22, 2010 at 11:39:18AM -0300, Henrique Fernandes
Look, I am not getting it to work.
I already made a pingd connectivity clone on all nodes. But now I cannot get
a location to work.
Can you help me?
I set a location constraint Conectivity and this has resource, my
loadbalancer group resources. The score is -infinity and the attribute is my
Oh, I think I got it now!
I guess I was setting the wrong name for the constraint; in the attribute I
was setting the pingd resource name, but it has to be only pingd, right?
At least is working now.
[]'sf.rique
On Wed, Sep 22, 2010 at 4:24 PM, Henrique Fernandes sf.ri...@gmail.com wrote:
Maybe i am just not seeing this but is there a way to have heartbeat
control 2 VIP's ?
I have an app that requires 2 IP's on a box and as these are now in a
HA pair i actually need heartbeat to control those 2 VIP's rather than
the usual one.
Am i missing a basic config item in the docs?
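No reply to this question appears in this listing. As a hedged sketch only, and assuming a v1-style setup: a haresources line names the preferred node followed by its resources (as in the "linuxha1 IPaddr::192.168.85.3 httpd smb" line quoted elsewhere on this page), and to my knowledge more than one IPaddr entry may appear on the same line. The node name and addresses below are placeholders, not taken from the poster's setup.

```shell
#!/bin/sh
# Hypothetical node name and documentation-range addresses;
# substitute the real ones.
primary_node="node1"
vip_a="192.0.2.10"
vip_b="192.0.2.11"

# One haresources line: the preferred node, then the resources it
# owns, listed in start order. Two IPaddr entries give two VIPs.
haresources_line="$primary_node IPaddr::$vip_a IPaddr::$vip_b"

# Print the line for review instead of writing to the live config.
echo "$haresources_line"
```

The same line would need to be present on both nodes, since the haresources files are expected to match across the pair.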
Sorry, my last attempts to grep for heartbeat from syslog failed to pick up
the ResourceManager stuff. Apparently Filesystem is returning a code of 2
rather than 0 for some reason that I can't identify since it works perfectly
when I run it manually:
Aug 27 03:57:27 indhlcvms1
Hi,
On Wed, Aug 25, 2010 at 05:10:07PM +0100, japc wrote:
Hi.
Multicast MAC addresses have the least significant bit of the most
significant byte set to 1. This should include 5 but it is missing
from the validation regexp on IPaddr2 (latest release available from the
site):
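The regexp itself is cut off in this excerpt and is not reproduced here. As a separate illustration of the rule stated above, the following sketch tests the lowest bit of the first octet directly rather than listing the allowed hex digits. The function name and the sample addresses are made up for the example, and it does no validation of the address format.

```shell
#!/bin/sh
# Succeeds if the colon-separated MAC address in $1 is a multicast
# address, i.e. the lowest bit of its first octet is 1.
# (Hypothetical helper; it assumes a well-formed address.)
is_multicast_mac() {
    first_octet=${1%%:*}
    [ $(( 0x$first_octet & 1 )) -eq 1 ]
}

# Sample addresses, chosen only for illustration.
for mac in 01:00:5e:00:00:01 05:00:00:00:00:01 00:16:3e:00:00:01; do
    if is_multicast_mac "$mac"; then
        echo "$mac multicast"
    else
        echo "$mac unicast"
    fi
done
```

A bit test like this accepts every odd first octet, including the 05 case mentioned above, so there is no digit list to keep complete.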
David Lang,
I found more interesting info in /var/log/ha-debug files. I am
attaching them as text. It is exciting, as it may offer us a
straightforward way to diagnose this problem.
On pfs-srv3 (main)
Aug 10 18:04:43 pfs-srv3 heartbeat: [1179]: info: Enabling logging daemon
Aug 10 18:04:43
wrote:
From: linux-ha-boun...@lists.linux-ha.org on behalf of Igor Chudov
Sent: Thu 8/5/2010 9:47 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH machines
are booted at the same time
On Thu, Aug 5, 2010
From: linux-ha-boun...@lists.linux-ha.org on behalf of Igor Chudov
Sent: Tue 8/10/2010 6:50 AM
To: General Linux-HA mailing list
Cc: david.l...@digitalinsight.com
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH
machines are booted at the same time
On Tue, Aug 10, 2010 at 10:21 AM, Pushkar Pradhan
push...@ipvideosys.com wrote:
David, I did a fresh restart today (without changing to mcast, yet, as
I want to do one thing at a time).
Again, neither server took over.
Here's the ha-logs from them:
http://igor.chudov.com/tmp/ha-log-1.txt
2010, Igor Chudov wrote:
Date: Tue, 10 Aug 2010 10:47:32 -0500
From: Igor Chudov ichu...@gmail.com
Reply-To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH
On Tue, Aug 10, 2010 at 12:51 PM, David Lang
david.l...@digitalinsight.com wrote:
one problem I see in ha-log-2.txt is the lines
Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot locate
resource script
Aug 10 10:38:06 pfs-srv4 req_resource[1236]: [1256]: debug: in
On Tue, 10 Aug 2010, Igor Chudov wrote:
On Tue, Aug 10, 2010 at 12:51 PM, David Lang
david.l...@digitalinsight.com wrote:
one problem I see in ha-log-2.txt is the lines
Aug 10 10:38:06 pfs-srv4 ResourceManager[1241]: [1253]: ERROR: Cannot locate
resource script
Aug 10 10:38:06 pfs-srv4
On Tue, Aug 10, 2010 at 1:08 PM, David Lang
david.l...@digitalinsight.com wrote:
On Tue, 10 Aug 2010, Igor Chudov wrote:
On Tue, Aug 10, 2010 at 12:51 PM, David Lang
david.l...@digitalinsight.com wrote:
one problem I see in ha-log-2.txt is the lines
Aug 10 10:38:06 pfs-srv4
On Tuesday 10 August 2010 13:14, Igor Chudov wrote:
Haresources refers to drbddisk, however, the resource in
/usr/lib/ocf/resource.d/heartbeat is called drbd.
Heartbeat 2.1.4 on centos 5 comes with /etc/ha.d/resource.d/drbddisk. Looks
like the docs you read don't match the version you have.
Dmitri, you are right.
In any case the name change did nothing.
They still refuse to take over when rebooted simultaneously.
The symptoms are the same as usual.
I am thinking, should I perhaps put a little statement in
/etc/init.d/heartbeat on one of the boxes and add sleep 100 in it?
i
On Tue, 10 Aug 2010, Igor Chudov wrote:
Dmitri, you are right.
In any case the name change did nothing.
did it eliminate the error from the log? does the log say anything else after
that point?
David Lang
They still refuse to take over when rebooted simultaneously.
The symptoms are
On Tue, Aug 10, 2010 at 2:28 PM, David Lang
david.l...@digitalinsight.com wrote:
On Tue, 10 Aug 2010, Igor Chudov wrote:
Dmitri, you are right.
In any case the name change did nothing.
did it eliminate the error from the log? does the log say anything else after
that point?
It eliminated
mailing list linux-ha@lists.linux-ha.org
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH
machines are booted at the same time
On Tue, Aug 10, 2010 at 2:59 PM, David Lang
david.l...@digitalinsight.com wrote:
Ok, just checking again, the two haresources files are truly identical
On Tue, Aug 10, 2010 at 3:25 PM, David Lang
david.l...@digitalinsight.com wrote:
could you re-post the files (log files, ha.cf and haresources from each box)
Log file from pfs-srv3
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: other_holds_resources: 0
Aug 10 17:08:28 pfs-srv3 heartbeat:
Guys, I just sent ha-log, ha.cf, haresources from both machines.
At this point, I of course greatly appreciate your help and your
generous assistance.
But I wonder if our attention is going in a wrong direction of try
this and try that.
What if right now, I need to systematically understand
On Tue, 10 Aug 2010, Igor Chudov wrote:
Guys, I just sent ha-log, ha.cf, haresources from both machines.
At this point, I of course greatly appreciate your help and your
generous assistance.
But I wonder if our attention is going in a wrong direction of try
this and try that.
What if
On Tuesday 10 August 2010 17:19, Igor Chudov wrote:
Guys, I just sent ha-log, ha.cf, haresources from both machines.
These look like shutdown logs, not startup logs.
FWIW here's what mine's like (sanitized):
** secondary **
heartbeat: [8356]: info: Configuration validated. Starting heartbeat
David and Dmitri,
Here's one more try and one more set of log files. I now see that
heartbeat is shutting down, which is beyond what used to happen.
some interesting lines I saw:
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown
notice from 'pfs-srv3'.
Aug 10 17:49:08 pfs-srv3
-To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
To: General Linux-HA mailing list linux-ha@lists.linux-ha.org
Subject: Re: [Linux-HA] Heartbeat does not take over if BOTH
machines are booted at the same time
David and Dmitri,
Here's one more try and one more set of log files. I now see
Guys, I have a bit of clarification. In an attempt to avoid the timing
issues, an hour ago I tried adding a configuration change to
/etc/init.d/heartbeat to delay starting it by 2 minutes on one box. So
logs with takeover succeeding, and heartbeat shutting down are partly
an artifact of this
On Tue, 10 Aug 2010, Igor Chudov wrote:
Guys, I have a bit of clarification. In an attempt to avoid the timing
issues, an hour ago I tried adding a configuration change to
/etc/init.d/heartbeat to delay starting it by 2 minutes on one box. So
logs with takeover succeeding, and heartbeat
On Tue, Aug 10, 2010 at 6:41 PM, David Lang
david.l...@digitalinsight.com wrote:
On Tue, 10 Aug 2010, Igor Chudov wrote:
Guys, I have a bit of clarification. In an attempt to avoid the timing
issues, an hour ago I tried adding a configuration change to
/etc/init.d/heartbeat to delay starting