Re: [Linux-HA] Heartbeat Failover Configuration Question

2012-04-23 Thread Net Warrior
True, but even with the most expensive software, like Veritas Cluster or Red Hat Cluster, I can configure how I want to fail over the resources (auto or manual); that's why I'm curious to accomplish the same here. Thanks for your time. Best Regards. 2012/4/23, David Coulson

Re: [Linux-HA] Heartbeat Failover Configuration Question

2012-04-23 Thread Andreas Kurz
On 04/23/2012 01:47 PM, Net Warrior wrote: True, but even with the most expensive software, like Veritas Cluster or Red Hat Cluster, I can configure how I want to fail over the resources (auto or manual); that's why I'm curious to accomplish the same here. with the help of the meat-ware

[Linux-HA] Heartbeat Failover Configuration Question

2012-04-22 Thread Net Warrior
Hi there, I configured heartbeat to fail over an IP address. If I, for example, shut down one node, the other takes over its IP address. So far so good. Now my question is whether there is a way to configure it not to fail over automatically and to have someone run the failover manually. can you provide

[Linux-HA] heartbeat bcast config

2012-04-04 Thread Douglas Pasqua
Hi Everyone, I have a beginner question. In the ha.cf of heartbeat, I set the interface for bcast to eth4 (on both servers of the cluster). My question: is it allowed to configure eth4 with a specific IP address on both servers, and use it to synchronize some files with rsync? This will

Re: [Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-22 Thread Efrat Lefeber
On Thu, Jan 19, 2012 at 02:18:53PM +, Efrat Lefeber wrote: Hi, I am using linux-ha heartbeat on a simple two-node cluster. For some reason which I can't figure out, the socket /var/run/heartbeat

Re: [Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-20 Thread Lars Ellenberg
On Thu, Jan 19, 2012 at 02:18:53PM +, Efrat Lefeber wrote: Hi, I am using linux-ha heartbeat on a simple two-node cluster. For some reason which I can't figure out, the socket /var/run/heartbeat/register is not created, though the directory /var/run/heartbeat/ exists: ll /var/run

[Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-19 Thread Efrat Lefeber
Hi, I am using linux-ha heartbeat on a simple two-node cluster. For some reason which I can't figure out, the socket /var/run/heartbeat/register is not created, though the directory /var/run/heartbeat/ exists: ll /var/run/heartbeat/ total 24 drwxr-x--- 6 hacluster haclient 4096 2012-01-19 14

Re: [Linux-HA] [Heartbeat][Pacemaker] VIP doesn't swith to other server

2011-11-18 Thread Andreas Kurz
Hello Mathieu, On 11/17/2011 07:22 PM, SEILLIER Mathieu wrote: Hi all, I have to use Heartbeat with Pacemaker for High Availability between 2 Tomcat 5.5 servers under Linux RedHat 5.4. The first server is active, the other one is passive. The master is called servappli01, with IP address

[Linux-HA] heartbeat and squid

2011-09-14 Thread Nicolas Repentin
Hi all, I've got a question about heartbeat. How can I do this: if squid stops or is killed on node1, how do I make node2 become master? Currently, node2 becomes master only when node1 is down, or the heartbeat service on node1 is down, but if I kill squid, nothing happens. I'm using CentOS 6 and last

Re: [Linux-HA] heartbeat and squid

2011-09-14 Thread Dejan Muhamedagic
Hi, On Thu, Sep 01, 2011 at 06:30:46PM +0200, Nicolas Repentin wrote: Hi all, I've got a question about heartbeat. How can I do this: if squid stops or is killed on node1, how do I make node2 become master? Currently, node2 becomes master only when node1 is down, or the heartbeat service on node1

Re: [Linux-HA] Heartbeat Restart is not same as Stop and Start

2011-08-04 Thread Rahul Kanna
Mike, I checked the permissions and those are fine. If you could please check the restart script I have given below: it does not touch the heartbeat lock file (*touch $LOCKDIR/$SUBSYS*) when heartbeat is restarted, and I guess that is a problem. Is it not? Btw, we have a product for some web

[Linux-HA] Heartbeat Restart is not same as Stop and Start

2011-08-03 Thread Rahul Kanna
Hi, Our system setup: Heartbeat 3.0.3; DRBD (to manage the file system; it is one of the resources managed by CRM); Red Hat Linux; Pacemaker. We have built an application on top of Linux-HA for users to configure the cluster by giving IP addresses of the nodes, do operations like Restart system, Change

Re: [Linux-HA] Heartbeat Restart is not same as Stop and Start

2011-08-03 Thread mike
Permission problem perhaps? Not really sure what you're doing but the fact that you have users configuring the cluster (why do you do this btw?) may be pointing to a permission issue. -mgb On 11-08-03 06:57 PM, Rahul Kanna wrote: Hi, Our system setup: Heartbeat 3.0.3 DRBD (to manage file

[Linux-HA] Heartbeat link recovery problems - ARP requests for broadcast IP

2011-08-02 Thread Klaus Darilion
Hi! The cluster consists of 2 nodes, db1 and db2, using Squeeze backports (heartbeat 1:3.0.4-1~bpo60+1). Heartbeat is configured to use all 3 available links: /etc/heartbeat/ha.cf:
    logfacility local0
    bcast eth0
    bcast eth1
    bcast eth2
    auto_failback on
    node db1
    node db2
    crm respawn
The network

Re: [Linux-HA] Heartbeat 3.0.3 stable version + RHEL 6.1: restart network will make heartbeat not send broadcasts

2011-07-18 Thread Ai Lei
Hi: I'm using the Heartbeat 3.0.3 stable version on the RHEL 6.1 x64 platform, and found the following issue: if I restart the network service, heartbeat no longer sends broadcast packets from port 694. That means this node never has a chance to join the HA cluster again unless heartbeat is restarted. Details for setting

Re: [Linux-HA] heartbeat three node configuration

2011-06-26 Thread Andrew Beekhof
On Thu, Jun 9, 2011 at 11:54 PM, Ricardo F ri...@hotmail.com wrote: What is the configuration to create a three-node cluster? Essentially you need Pacemaker on top. haresources-based clusters were only designed for 2 nodes. I have this, but the servers bring up the shared IP at the same time:

Re: [Linux-HA] heartbeat step down after split brain scenario

2011-06-20 Thread Jack Berg
Hi - thanks for the response. Dimitri Maziuk wrote: What do you mean by disconnecting: what's your failure scenario and how do you expect it to be handled? The disconnection is the loss of the intersite link which interrupts heartbeat comms. In this case it's expected that both

[Linux-HA] heartbeat step down after split brain scenario

2011-06-16 Thread Jack Berg
I have a two node cluster using heartbeat and haproxy. Unfortunately it is impossible to provide redundant heartbeat paths between the two nodes at different sites so it is possible for a failure to cause split brain. To evaluate the impact I tried disconnecting the two nodes and I found that

Re: [Linux-HA] heartbeat step down after split brain scenario

2011-06-16 Thread Dimitri Maziuk
On 06/16/2011 04:28 AM, Jack Berg wrote: I have a two node cluster using heartbeat and haproxy. Unfortunately it is impossible to provide redundant heartbeat paths between the two nodes at different sites so it is possible for a failure to cause split brain. To evaluate the impact I tried

[Linux-HA] heartbeat three node configuration

2011-06-09 Thread Ricardo F
What is the configuration to create a three-node cluster? I have this, but the servers bring up the shared IP at the same time: ha.cf
    logfacility local0
    keepalive 2
    deadtime 10
    warntime 5
    initdead 30
    auto_failback off
    ucast bond0 host1 host2 host3
    node host1
    node host2
    node host3
haresources
    host1

[Linux-HA] Heartbeat build from source problem

2011-06-07 Thread Michael Pelletier
Hello, I am building heartbeat (3.0.4) from source and I am running into this problem: (above output deleted)
    checking heartbeat/glue_config.h usability... yes
    checking heartbeat/glue_config.h presence... yes
    checking for heartbeat/glue_config.h... yes
    checking glue_config.h usability... yes

[Linux-HA] heartbeat and mysql cluster nodes

2011-05-25 Thread Calistus Che
Hi Guys, I have been struggling to set up 2 MySQL cluster nodes with load-balancing functionality (heartbeat) to no avail. Maybe you could help me sort out this mess :-) Here is my design:
    lb1 and mysql cluster management server = on one ubuntu server
    lb2 = on another ubuntu server
    db1 = cluster

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dejan Muhamedagic
Hi, On Mon, May 23, 2011 at 03:18:37PM -0700, Nulgor Wankevitch wrote: hi, heartbeat seems to be sending UDP on port 694 to the whole network segment, Do you use ucast or bcast? With the latter, which is broadcast, that is of course expected. If it happens with the former, then you must have

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
Hi, thanks for the reply. When I use ucast, things do not seem to work: the nodes are able to bring up the VIP but not any services. When using bcast, things seem to work correctly, but there is that broadcast problem. I would like to firewall the broadcast and isolate it to the local machine and 2nd

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dejan Muhamedagic
Hi, On Tue, May 24, 2011 at 02:12:12AM -0700, Nulgor Wankevitch wrote: Hi, thanks for the reply. When I use ucast, things do not seem to work: the nodes are able to bring up the VIP but not any services. When using bcast, things seem to work correctly Wow! You really do have gremlins somewhere.

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
ya, gremlins, very reassuring, thanks. On 5/24/2011 2:42 AM, Dejan Muhamedagic wrote: Hi, On Tue, May 24, 2011 at 02:12:12AM -0700, Nulgor Wankevitch wrote: Hi, thanks for the reply. When I use ucast, things do not seem to work: the nodes are able to bring up the VIP but not any services. When

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dimitri Maziuk
On 05/24/2011 05:48 AM, Nulgor Wankevitch wrote: ya, gremlins, very reassuring, thanks. If the broadcast packets from host A are seen by host B, and unicast packets from host A to host B are not seen by host B, then your universe is governed by laws of physics we here are completely unfamiliar

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
I think you guys might have jumped the gun on me, why would you assume it is not seen? I reported it will bring up the VIP but not the services. nulgor On 5/24/2011 9:37 AM, Dimitri Maziuk wrote: On 05/24/2011 05:48 AM, Nulgor Wankevitch wrote: ya, gremlins, very reassuring, thanks. If the

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dimitri Maziuk
On 05/24/2011 02:56 PM, Nulgor Wankevitch wrote: I think you guys might have jumped the gun on me, why would you assume it is not seen? I reported it will bring up the VIP but not the services. The only way I can vaguely imagine that possibly happening is if cib isn't propagated to the other

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
it seems like cib is on both nodes as I am able to view both from crm_mon and crm configure show shows the same info, am I correct? On 5/24/2011 2:02 PM, Dimitri Maziuk wrote: On 05/24/2011 02:56 PM, Nulgor Wankevitch wrote: I think you guys might have jumped the gun on me, why would you

Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Lars Ellenberg
On Tue, May 24, 2011 at 02:10:25PM -0700, Nulgor Wankevitch wrote: it seems like cib is on both nodes as I am able to view both from crm_mon and crm configure show shows the same info, am I correct? This does not lead anywhere. You complained that broadcast broadcasts. Well, that's the nature

[Linux-HA] heartbeat sends udp to whole network

2011-05-23 Thread Nulgor Wankevitch
hi, heartbeat seems to be sending UDP on port 694 to the whole network segment, not just the link host, and it is getting blocked by the firewall. How can I limit this? Firewall: *UDP_IN Blocked* IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:22:19:21:f1:75:08:00 SRC=192.168.1.190 DST=192.168.1.255 LEN=246 TOS=0x00

[Linux-HA] Heartbeat kills itself

2011-05-05 Thread Lacoco, Joshua
Hello, I have a 2 node cluster on RHEL 5.4. I am currently only running the heartbeat service on one node because the heartbeat service kills itself and I'm trying to avoid downtime/split brain issues. I've tried searching and I found posts that have similar problems. I am running heartbeat

Re: [Linux-HA] Heartbeat kills itself

2011-05-05 Thread Andrea Bertucci
On 05/05/2011 11:45 AM, Lacoco, Joshua wrote: Hello, I have a 2 node cluster on RHEL 5.4. I am currently only running the heartbeat service on one node because the heartbeat service kills itself and I'm trying to avoid downtime/split brain issues. I've tried searching and I found posts

[Linux-HA] [Heartbeat] my VIP doesn't work :(

2011-04-26 Thread SEILLIER Mathieu
Hi all, First, I'm French, so sorry in advance for my English... I have to use Heartbeat for High Availability between 2 Tomcat 5.5 servers under Linux RedHat 5.3. The first server is active, the other one is passive. The master is called servappli01, with IP address 186.20.100.40, the slave is

Re: [Linux-HA] [Heartbeat] my VIP doesn't work :(

2011-04-26 Thread mike
On 11-04-22 06:25 AM, SEILLIER Mathieu wrote: Hi all, First I'm french so sorry in advance for my English... I have to use Heartbeat for High Availability between 2 Tomcat 5.5 servers under Linux RedHat 5.3. The first server is active, the other one is passive. The master is called

Re: [Linux-HA] [Heartbeat] my VIP doesn't work :(

2011-04-26 Thread Amit Jathar
On 11-04-22 06:25 AM, SEILLIER Mathieu wrote: Hi all, First I'm french so sorry in advance for my English... I have to use Heartbeat for High Availability between 2 Tomcat 5.5 servers under Linux RedHat 5.3

Re: [Linux-HA] [Heartbeat] my VIP doesn't work :(

2011-04-26 Thread Dimitri Maziuk
On 4/22/2011 4:25 AM, SEILLIER Mathieu wrote:
    Result of /usr/bin/cl_status nodestatus servappli01 command on servappli01: active
    Result of /usr/bin/cl_status nodestatus servappli02 command on servappli01: dead
    Result of /usr/bin/cl_status nodestatus servappli01 command on servappli02:

Re: [Linux-HA] [Heartbeat] my VIP doesn't work :(

2011-04-26 Thread Dejan Muhamedagic
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of mike Sent: Tuesday, April 26, 2011 5:41 PM To: General Linux-HA mailing list Subject: Re: [Linux-HA] [Heartbeat] my VIP doesn't work :( On 11-04-22 06:25 AM, SEILLIER Mathieu wrote: Hi all, First I'm french so sorry in advance for my English

Re: [Linux-HA] Heartbeat restarts

2011-04-19 Thread Dejan Muhamedagic
Hi, On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote: Hello. I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat That version is way too old. You won't get help for it here. Please upgrade to Pacemaker 1.0.10 or 1.1.5. Thanks, Dejan Enterprise Linux Server

Re: [Linux-HA] Heartbeat restarts

2011-04-19 Thread Andrea Bertucci
On 04/19/2011 12:46 PM, Dejan Muhamedagic wrote: Hi, On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote: Hello. I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat That version is way too old. You won't get help for it here. Please upgrade to Pacemaker 1.0.10 or

Re: [Linux-HA] Heartbeat restarts

2011-04-19 Thread Dejan Muhamedagic
Hi, On Tue, Apr 19, 2011 at 02:19:48PM +0200, Andrea Bertucci wrote: On 04/19/2011 12:46 PM, Dejan Muhamedagic wrote: Hi, On Fri, Apr 08, 2011 at 11:39:49AM +0200, Andrea Bertucci wrote: Hello. I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat That version is way too

Re: [Linux-HA] Heartbeat restarts

2011-04-19 Thread Andrea Bertucci
Yes...funny... Andrea. On 04/19/2011 02:23 PM, Dejan Muhamedagic wrote: Hi. The version I use is the newest I found distributed with CentOS rpm's (I choose this for compatibility reason). I also tried to recompile heartbeat v3 (I don't remember exact version) and to use non official

[Linux-HA] Heartbeat crashed with core dump available

2011-04-17 Thread Mark Pentland
I am running a pacemaker/heartbeat cluster on Debian. Heartbeat is 3.0.4-1 (from wheezy) from my daemon.log Apr 17 17:07:07 s1 attrd: [2692]: info: ha_msg_dispatch: Lost connection to heartbeat service. Apr 17 17:07:07 s1 stonithd: [2691]: info: ha_msg_dispatch: Lost connection to heartbeat

[Linux-HA] Heartbeat restarts

2011-04-08 Thread Andrea Bertucci
Hello. I have a problem with heartbeat 2.1.3-3.el5 installed on a Red Hat Enterprise Linux Server release 5.4. My configuration is: - two nodes - two network cards - the nodes communicate with multicast messages (e.g. mcast directive in ha.cf) - two virtual IPs for the cluster (one for each

Re: [Linux-HA] heartbeat ordering

2011-04-06 Thread Andrew Beekhof
On Tue, Apr 5, 2011 at 11:58 AM, Maxim Ianoglo dot...@gmail.com wrote: Hello, I have four servers in a HA cluster: NodeA NodeB NodeC NodeD. Three groups of resources and one inline resource are defined: 1. group_storage ( NFS VIP, NFS Server, DRBD ) 2. group_apache_www (Domains VIPs

[Linux-HA] heartbeat ordering

2011-04-05 Thread Maxim Ianoglo
Hello, I have four servers in a HA cluster: NodeA NodeB NodeC NodeD. Three groups of resources and one inline resource are defined: 1. group_storage ( NFS VIP, NFS Server, DRBD ) 2. group_apache_www (Domains VIPs and Apache) 3. group_nginx_www (Static files with nginx) 4. inline_nfs_client (

[Linux-HA] Heartbeat start postgresql ??

2011-01-27 Thread Palaffre Michel
Hello, I'm on Debian. Heartbeat runs the script /etc/init.d/postgresql stop but does not run the script /etc/init.d/postgresql start. All other services start and stop fine. The postgresql service (/etc/init.d/postgresql start) is not launched by heartbeat. There is no error in

[Linux-HA] Heartbeat not start postgresql

2011-01-26 Thread Palaffre Michel
Hello, I am on Debian. Heartbeat runs the script /etc/init.d/postgresql stop but does not run the script /etc/init.d/postgresql start. All the other services start and stop fine. I am missing the postgresql service (/etc/init.d/postgresql start). There is nothing in the logs of

Re: [Linux-HA] Heartbeat and order of execution

2011-01-18 Thread Andrew Beekhof
Create a resource for the other script and then use a regular ordering constraint to have it start before the VIP. On Thu, Jan 13, 2011 at 7:13 PM, maillis...@gmail.com wrote: Sorry if this is a silly question. I've been reading the docs and I'm a little confused. I have a situation where I

[Linux-HA] heartbeat moves the resources when heartbeat starts on a second node

2011-01-18 Thread Erik Dobák
Hi people, I got my active/passive cluster running. When I start the first node, all resources are started. But when I start the second node, all resources are stopped on the first node and started on the second node. Why? Am I doing something wrong? crm(live)configure# show node

Re: [Linux-HA] heartbeat moves the resources when heartbeat starts on a second node

2011-01-18 Thread RaSca
On Tue 18 Jan 2011 12:13:15 CET, Erik Dobák wrote: Hi people, I got my active/passive cluster running. When I start the first node, all resources are started. But when I start the second node, all resources are stopped on the first node and started on the second node. Why? do i

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-18 Thread Igor Chudov
I have set up cron jobs on both servers. I restart heartbeat at 22 hours on one box and at 23 hours on another. It's been 4 days and so far, so good. I will report more results. This could be an ugly solution to an ugly problem, but workable. i

[Linux-HA] Heartbeat and order of execution

2011-01-13 Thread maillists0
Sorry if this is a silly question. I've been reading the docs and I'm a little confused. I have a situation where I want heartbeat to take down the VIP of a failed machine, then run a script and only take over the vip after the script has succeeded. How do I control the order of execution and

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-10 Thread Igor Chudov
On Tue, Jan 4, 2011 at 10:22 AM, Serge Dubrouski serge...@gmail.com wrote: On Tue, Jan 4, 2011 at 9:14 AM, Igor Chudov ichu...@gmail.com wrote: Serge, I am not sure of anything, but the self-communication is supposed to be taking place on a single crossover cable between second network

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-10 Thread Dimitri Maziuk
Igor Chudov wrote: My second question is, can heartbeat be configured to restart itself in case of such a failure. Usually you can't have X restart itself after X dies. You need some kind of Y. If you're running snmpd, see if you can get proc to identify heartbeat: master control process

[Linux-HA] Heartbeat dies with SIGXCPU, pacemaker ping RA syntax error

2011-01-07 Thread Daniel Krambrock
hi there, we have got a 12-node cluster for managing KVM-based virtual machines. We are using Fedora 12 for the node systems with pacemaker (pacemaker-1.0.7-1.fc12.x86_64) and heartbeat (heartbeat-3.0.0-0.7.0daab7da36a8.hg.fc12.x86_64). We had a crash of heartbeat with SIGXCPU Jan 2 01:21:11

Re: [Linux-HA] Heartbeat dies with SIGXCPU, pacemaker ping RA syntax error

2011-01-07 Thread Igor Chudov
I have the same problem (on Ubuntu). Very interested in an answer. i On Fri, Jan 7, 2011 at 5:12 AM, Daniel Krambrock ajs...@googlemail.com wrote: hi there, we have got a 12-node cluster for managing KVM-based virtual machines. We are using Fedora 12 for the node systems with pacemaker

Re: [Linux-HA] Heartbeat dies with SIGXCPU, pacemaker ping RA syntax error

2011-01-07 Thread Daniel Krambrock
hi there, I think we found the reason for the syntax error in the ping RA: the crash of heartbeat had produced a core dump in /var/lib/heartbeat/cores/root, which is the working directory of the ping RA. The ping RA makes use of an unquoted * symbol: score=`expr $active * $OCF_RESKEY_multiplier` since
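The effect described in this message can be reproduced outside the cluster. The following is a minimal sketch, not the ping RA itself: the variables active and multiplier, the scratch directory and the file name core.1234 are stand-ins chosen for illustration. It shows that an unquoted * is expanded by the shell to the file names in the working directory before expr ever sees it, and two ways to avoid that.

```shell
#!/bin/sh
# Sketch: why an unquoted * breaks expr once the working directory
# contains a file. 'active' and 'multiplier' are hypothetical stand-ins
# for the variables used in the resource agent.

active=3
multiplier=5

# Scratch directory holding a single file, mimicking a core dump left in
# the resource agent's working directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > core.1234

# Unquoted: the shell replaces * with "core.1234", so expr is called as
# "expr 3 core.1234 5" and fails with a syntax error.
if unquoted=$(expr $active * $multiplier 2>/dev/null); then
    unquoted_status=ok
else
    unquoted_status=failed
fi

# Escaped: expr receives a literal * and multiplies.
escaped=$(expr "$active" \* "$multiplier")

# POSIX shell arithmetic avoids both expr and pathname expansion.
arith=$((active * multiplier))

echo "unquoted: $unquoted_status"
echo "escaped:  $escaped"
echo "arith:    $arith"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

In an empty directory the unquoted form happens to work, because a pattern that matches nothing is passed through unchanged; that is presumably why the problem only showed up after the core dump appeared.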

Re: [Linux-HA] Heartbeat dies with SIGXCPU, pacemaker ping RA syntax error

2011-01-07 Thread Igor Chudov
Very interested in the SIGXCPU problem. I cannot deploy my solution with it. i On Fri, Jan 7, 2011 at 11:41 AM, Daniel Krambrock ajs...@googlemail.com wrote: hi there, I think we found the reason for the syntax error in the ping RA: the crash of heartbeat had produced a core dump in

[Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
A few weeks ago I reported that heartbeat died on one of the cluster machines, due to SIGXCPU. Well, it happened again. Heartbeat died, now both machines had the shared IP address up, what a god awful mess!!! Now they have split brain and the whole nine yards! I looked at

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Further reading indicates that heartbeat itself sets a limit for itself every so often. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat set a CPU limit, instead of fixing their bugs. Then it dies with SIGXCPU, leaving everything in an extremely

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Steve Davies
On 4 January 2011 13:47, Igor Chudov ichu...@gmail.com wrote: Further reading indicates that heartbeat itself sets a limit for itself every so often. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat set a CPU limit, instead of fixing their

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
Which OS? Which version of Heartbeat? heartbeat_pid - PID of which of the Heartbeat processes? It has several. On Tue, Jan 4, 2011 at 6:32 AM, Igor Chudov ichu...@gmail.com wrote: A few weeks ago I reported that heartbeat died on one of the cluster machines, due to SIGXCPU. Well, it happened

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Dejan Muhamedagic
Hi, On Tue, Jan 04, 2011 at 07:47:10AM -0600, Igor Chudov wrote: Further reading indicates that heartbeat itself sets a limit for itself every so often. True. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat set a CPU limit, instead of
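The mechanism discussed in this thread (a process exceeding a CPU-time limit and being killed with SIGXCPU) can be observed with plain shell tools. This is only a sketch of the general kernel behaviour, assuming a Linux system and a POSIX shell whose ulimit supports -S and -t; the 1-second limit is an arbitrary demonstration value and says nothing about the limit heartbeat sets for itself.

```shell
#!/bin/sh
# Sketch: a process that exceeds its soft CPU-time limit receives SIGXCPU.
# The 1-second value is arbitrary and chosen only for the demonstration.

(
    ulimit -S -t 1          # soft CPU limit of 1 second, this subshell only
    while :; do :; done     # burn CPU until the kernel intervenes
) 2>/dev/null
status=$?

# A shell reports death by signal as 128 + signal number.
if [ "$status" -gt 128 ]; then
    signum=$((status - 128))
    echo "killed by signal $signum ($(kill -l "$signum"))"
else
    echo "exited with status $status"
fi
```

On common Linux platforms SIGXCPU is signal 24, which matches the "killed by signal 24" wording in the heartbeat log message quoted in the related thread.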

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Steve, here's some data. The OS is Ubuntu 10.04.
    ~# apt-cache policy heartbeat
    heartbeat:
      Installed: 1:3.0.3-1ubuntu1
      Candidate: 1:3.0.3-1ubuntu1
      Version table:
     *** 1:3.0.3-1ubuntu1 0
            500 http://us.archive.ubuntu.com/ubuntu/ lucid/universe Packages
            100 /var/lib/dpkg/status

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
On Tue, Jan 4, 2011 at 9:40 AM, Serge Dubrouski serge...@gmail.com wrote: Which OS? Ubuntu 10.04 Lucid. Which version of Heartbeat? 3.0.3 ~# apt-cache policy heartbeat heartbeat: Installed: 1:3.0.3-1ubuntu1 Candidate: 1:3.0.3-1ubuntu1 Version table: *** 1:3.0.3-1ubuntu1 0

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
Are you sure that everything is all right with your network? It looks like processes that are responsible for UDP communications are taking too much of CPU time. On Tue, Jan 4, 2011 at 8:47 AM, Igor Chudov ichu...@gmail.com wrote: Steve, here's some data. The OS is Ubuntu 10.04. ~# apt-cache

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Serge, I am not sure of anything, but the self-communication is supposed to be taking place on a single crossover cable between second network cards of the servers. (eth1). Igor On Tue, Jan 4, 2011 at 10:06 AM, Serge Dubrouski serge...@gmail.com wrote: Are you sure that everything is all right

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
On Tue, Jan 4, 2011 at 9:14 AM, Igor Chudov ichu...@gmail.com wrote: Serge, I am not sure of anything, but the self-communication is supposed to be taking place on a single crossover cable between second network cards of the servers. (eth1). Agree, yet something strange and pretty unique is

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Dimitri Maziuk
Igor Chudov wrote: At this point I feel rather desperate. Perhaps I should give pacemaker another go. I really have no idea and I am running out of options. If all you need is a 2-node active-passive cluster, most (all?) pacemaker features are useless for you. (Besides, one look at their

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
On Tue, Jan 4, 2011 at 1:29 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: Igor Chudov wrote: At this point I feel rather desperate. Perhaps I should give pacemaker another go. I really have no idea and I am running out of options. If all you need is a 2-node active-passive cluster, most

Re: [Linux-HA] Heartbeat died: WARN: Managed HBREAD process 3279 killed by signal 24 [SIGXCPU - CPU limit exceeded].

2010-12-28 Thread Dejan Muhamedagic
On Sun, Dec 26, 2010 at 08:56:13AM -0600, Igor Chudov wrote: As you guys recall, I have set up a heartbeat/drbd based system to replace an aging drbd solution. While it sits there, it has not been activated. I have noticed (due to some self checking scripts) that heartbeat died on one

[Linux-HA] Heartbeat died: WARN: Managed HBREAD process 3279 killed by signal 24 [SIGXCPU - CPU limit exceeded].

2010-12-26 Thread Igor Chudov
As you guys recall, I have set up a heartbeat/drbd based system to replace an aging drbd solution. While it sits there, it has not been activated. I have noticed (due to some self checking scripts) that heartbeat died on one machine. Looking in logs, I found this in ha-log.2: Dec 13 17:13:14

[Linux-HA] heartbeat previous releases

2010-12-19 Thread battsetseg . m
Dears, Where can I find older releases of heartbeat? The site contains only version 3. Regards, Bagi ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also:

Re: [Linux-HA] heartbeat stop hangs

2010-12-09 Thread David Lang
On Wed, 8 Dec 2010, sunitha kumar wrote: Hi, I am using: heartbeat-3.0.0-33.2 pacemaker-mgmt-client-1.99.2-6.1 pacemaker-libs-1.0.5-4.1 pacemaker-1.0.5-4.1 pacemaker-mgmt-1.99.2-6.1 /etc/init.d/heartbeat stop hangs in /usr/lib/heartbeat/heartbeat -k strace output on this shows it is

[Linux-HA] heartbeat stop hangs

2010-12-08 Thread sunitha kumar
Hi, I am using: heartbeat-3.0.0-33.2 pacemaker-mgmt-client-1.99.2-6.1 pacemaker-libs-1.0.5-4.1 pacemaker-1.0.5-4.1 pacemaker-mgmt-1.99.2-6.1. /etc/init.d/heartbeat stop hangs in /usr/lib/heartbeat/heartbeat -k. strace output on this shows it is waiting for its children to exit, which are in turn

Re: [Linux-HA] heartbeat multicast status

2010-11-12 Thread Max
Vadym, etc., it is not just an issue on multicast setups - the same thing happens if I set up an environment using unicast ... # cl_status hblinkstatus nodename ethn returns up for every defined node and ethernet interface *except* on the node you are running the
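The per-node, per-interface check discussed in this thread can be scripted as a small loop around cl_status hblinkstatus. The following is a sketch only: check_hb_links and the CL_STATUS variable are hypothetical names introduced here, and it is not the checkhblinks or checkcl_status.pl script used by the posters. It relies on nothing beyond the cl_status hblinkstatus NODE INTERFACE call quoted above.

```shell
#!/bin/sh
# Sketch: print the heartbeat link status for each node on one interface.
# check_hb_links is a hypothetical helper; CL_STATUS can be overridden to
# point at another binary (or a stub when testing).

CL_STATUS=${CL_STATUS:-/usr/bin/cl_status}

# usage: check_hb_links IFACE NODE...
check_hb_links() {
    iface=$1
    shift
    for node in "$@"; do
        # Fall back to "unknown" if the command fails or prints nothing.
        state=$("$CL_STATUS" hblinkstatus "$node" "$iface" 2>/dev/null)
        echo "$node $iface: ${state:-unknown}"
    done
}

# Example call (node and interface names are placeholders):
# check_hb_links br0 xen-20 xen-21 xen-22
```

Given the behaviour reported in the thread, the line for the node the script runs on may read "dead" even when the cluster is healthy, so that entry should probably be interpreted with care.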

Re: [Linux-HA] heartbeat multicast status

2010-11-11 Thread Vadym Chepkov
On Nov 10, 2010, at 10:51 AM, Frank Lazzarini wrote: Correct me if I am wrong but you can't check the link of a node on that same node ? I sure hope not, why? On Wed, Nov 10, 2010 at 4:16 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote: I can confirm that on the same release,

[Linux-HA] heartbeat multicast status

2010-11-10 Thread Vadym Chepkov
Hi, I found an issue with cl_status/hblinkstatus when it is used with multicast. The cluster behaves as expected, but cl_status reports a dead link on the host where it runs: [r...@xen-20 ~]# checkhblinks xen-20 br0: dead xen-21 br0: up xen-22 br0: up [r...@xen-21 ~]# checkhbstatus xen-20

Re: [Linux-HA] heartbeat multicast status

2010-11-10 Thread Pavlos Parissis
I can confirm that on the same release, but I have no idea if it is normal or not. It could be expected behavior due to the use of multicast [r...@node-01 tmp]# ./checkcl_status.pl /usr/bin/cl_status hblinkstatus node-01 eth0 node-01 eth0: dead /usr/bin/cl_status hblinkstatus node-01 eth1 node-01

Re: [Linux-HA] heartbeat multicast status

2010-11-10 Thread Frank Lazzarini
Correct me if I am wrong but you can't check the link of a node on that same node ? On Wed, Nov 10, 2010 at 4:16 PM, Pavlos Parissis pavlos.paris...@gmail.com wrote: I can confirm that on the same release, but I have no idea if it is normal or not. It could be expected behavior due to the use

Re: [Linux-HA] heartbeat takes all cpu

2010-11-09 Thread Lars Ellenberg
On Mon, Nov 08, 2010 at 05:51:08PM -0700, Alan Robertson wrote: Quoting Vsevolod Katkov katkovsur...@yahoo.com: Reporting an issue. Thank you very much for any feedback today heartbeat process took all the CPU (99%-100%) and load went up. it put this message to log repeating 14 times a

Re: [Linux-HA] heartbeat takes all cpu

2010-11-09 Thread Vsevolod Katkov
thank you Lars for the feedback From: Lars Ellenberg lars.ellenb...@linbit.com To: linux-ha@lists.linux-ha.org Sent: Tue, November 9, 2010 5:49:54 AM Subject: Re: [Linux-HA] heartbeat takes all cpu On Mon, Nov 08, 2010 at 05:51:08PM -0700, Alan Robertson wrote

[Linux-HA] heartbeat takes all cpu

2010-11-08 Thread Vsevolod Katkov
Reporting an issue. Thank you very much for any feedback. Today the heartbeat process took all the CPU (99%-100%) and the load went up. It put this message in the log, repeating 14 times a second: heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch function for retransmit request took too long to

Re: [Linux-HA] heartbeat takes all cpu

2010-11-08 Thread Alan Robertson
Quoting Vsevolod Katkov katkovsur...@yahoo.com: Reporting an issue. Thank you very much for any feedback. Today the heartbeat process took all the CPU (99%-100%) and the load went up. It wrote this message to the log, repeating 14 times a second: heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch

Re: [Linux-HA] heartbeat takes all cpu

2010-11-08 Thread Vsevolod Katkov
, 2010 4:51:08 PM Subject: Re: [Linux-HA] heartbeat takes all cpu Quoting Vsevolod Katkov katkovsur...@yahoo.com: Reporting an issue. Thank you very much for any feedback. Today the heartbeat process took all the CPU (99%-100%) and the load went up. It wrote this message to the log, repeating 14 times a second

[Linux-HA] Heartbeat generates /var/log/ha-log unusually

2010-10-27 Thread Suphasit Phienpattanawit
Dear all, We have a problem with heartbeat 1.2.5_3 on FreeBSD 7.0 release p4. Normally, heartbeat writes around 20 KB-25 KB of log per day to /var/log/ha-log. But yesterday, heartbeat logged an unusual amount to /var/log/ha-log: more than 30 GB, all of it normal log messages. I am not sure

Re: [Linux-HA] heartbeat with postgresql

2010-10-27 Thread Lars Ellenberg
On Tue, Oct 19, 2010 at 11:04:10AM -0600, Serge Dubrouski wrote: I see you could do it and now you are going to use Pacemaker all the time in the future. Then I see no reason why others can't do it as well, taking into account that Heartbeat v1 is almost unsupported and definitely has no future

Re: [Linux-HA] heartbeat with postgresql

2010-10-27 Thread Lars Ellenberg
On Tue, Oct 19, 2010 at 12:11:03PM +0800, Linux Cook wrote: Hi! I used the tarball package of postgresql and recompiled it. Postgres now resides at /usr/local/pgsql, with /dev/drbd0 mounted at /usr/local/pgsql/data. However, heartbeat recognizes my Filesystem and IPaddr2 resources but not

Re: [Linux-HA] heartbeat with postgresql

2010-10-27 Thread Serge Dubrouski
On Wed, Oct 27, 2010 at 4:41 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Tue, Oct 19, 2010 at 11:04:10AM -0600, Serge Dubrouski wrote: I see you could do it and now you are going to use Pacemaker all the time in the future. Then I see no reason why others can't do it as well, taking

Re: [Linux-HA] Heartbeat 2.1.4 mgmtd doesn't start

2010-10-25 Thread Yan Gao
On 10/21/10 22:41, René Pilz wrote: Hello, I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64. Everything seems to work correctly, but I can't log in to the server using hb_gui because I can't connect to the server. The password and username are correct, but mgmtd is not running. If I try to

Re: [Linux-HA] Heartbeat 2.1.4 mgmtd doesn't start

2010-10-25 Thread René Pilz
2010/10/25 Yan Gao y...@novell.com On 10/21/10 22:41, René Pilz wrote: Hello, I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64. Everything seems to work correctly, but I can't log in to the server using hb_gui because I can't connect to the server. The password and username are correct, but

Re: [Linux-HA] Heartbeat 2.1.4 mgmtd doesn't start

2010-10-25 Thread Yan Gao
On 10/25/10 15:28, René Pilz wrote: 2010/10/25 Yan Gao y...@novell.com On 10/21/10 22:41, René Pilz wrote: Hello, I compiled heartbeat 2.1.4 on CentOS 4.4 x86_64. Everything seems to work correctly, but I can't log in to the server using hb_gui because I can't connect to the server.

Re: [Linux-HA] heartbeat with postgresql

2010-10-23 Thread Andrew Beekhof
On Fri, Oct 22, 2010 at 7:23 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: Andrew Beekhof wrote: OK, I'll post this and shut up. Or are you really trying to claim that:    linuxha1 IPaddr::192.168.85.3 httpd smb is fundamentally less complex than    primitive IP ocf:heartbeat:IPaddr

Re: [Linux-HA] heartbeat with postgresql

2010-10-23 Thread Andrew Beekhof
On Fri, Oct 22, 2010 at 7:32 PM, Greg Woods wo...@ucar.edu wrote: On Fri, 2010-10-22 at 18:32 +0200, Andrew Beekhof wrote: if you're just using v1 - that's not a cluster, that's a prayer. Then God must answer my prayers, because I have been using some simple heartbeat v1/DRBD clusters for

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Andrew Beekhof
On Wed, Oct 20, 2010 at 4:43 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: Andrew Beekhof wrote: On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wo...@ucar.edu wrote: On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote: Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker?

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Andrew Beekhof
On Wed, Oct 20, 2010 at 7:09 PM, Greg Woods wo...@ucar.edu wrote: On Wed, 2010-10-20 at 08:13 +0200, Andrew Beekhof wrote: Um, maybe because heartbeat v1 has a much much much much less steep learning curve? I dispute that:    

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Dimitri Maziuk
Andrew Beekhof wrote: OK, I'll post this and shut up. Or are you really trying to claim that: linuxha1 IPaddr::192.168.85.3 httpd smb is fundamentally less complex than primitive IP ocf:heartbeat:IPaddr params ip=192.168.85.3 primitive http lsb:httpd primitive samba

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Greg Woods
On Fri, 2010-10-22 at 18:32 +0200, Andrew Beekhof wrote: if you're just using v1 - that's not a cluster, that's a prayer. Then God must answer my prayers, because I have been using some simple heartbeat v1/DRBD clusters for YEARS, for critical services like DNS. They have worked flawlessly and
