Re: [Linux-ha-dev] ip rule/ip route and ocf:heartbeat:Route

2011-01-04 Thread Dejan Muhamedagic
On Mon, Jan 03, 2011 at 06:57:29PM +0100, Raoul Bhatia [IPAX] wrote: On 12/28/2010 07:00 PM, Dejan Muhamedagic wrote: what i'm thinking about: 1. issue /sbin/ip rule add from a.b.c.0/24 table 12 upon reboot/firewall setup (my current solution); or Probably the easiest way right
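
For reference, option 1 above could be scripted roughly as follows. This is only a sketch: rule_cmd is a hypothetical helper, and 192.0.2.0/24 is a documentation prefix standing in for the a.b.c.0/24 placeholder in the message. It prints the command rather than running it, because ip rule add needs root.

```shell
# Sketch only: SRC_NET is an assumed stand-in for a.b.c.0/24.
SRC_NET="192.0.2.0/24"
TABLE_ID=12   # routing table number quoted in the thread

# Hypothetical helper: build the policy-routing command as a string
# so it can be reviewed or logged before being run as root.
rule_cmd() {
    echo "/sbin/ip rule add from $1 table $2"
}

rule_cmd "$SRC_NET" "$TABLE_ID"
```

Running the printed command from a boot or firewall script matches the "upon reboot/firewall setup" approach described in the message.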

Re: [Linux-ha-dev] bashism in external/rackpdu stonith plugin

2011-01-04 Thread Dejan Muhamedagic
On Mon, Jan 03, 2011 at 06:55:21PM +0100, Raoul Bhatia [IPAX] wrote: On 12/29/2010 05:36 PM, Raoul Bhatia [IPAX] wrote: On 12/29/2010 04:12 PM, Dejan Muhamedagic wrote: What seems to me the best is to simply avoid the issue and do: local names names=`echo $snmp_result | cut -f2
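
The pattern suggested above, declaring the variable with local first and assigning on a separate line, might look like the sketch below. The function name, the sample input and the ":" delimiter are assumptions made for illustration; the original cut arguments are truncated in this snippet. The point is that a plain assignment is not word-split, whereas some shells may split the arguments of local when the value contains spaces.

```shell
# Sketch only: second_field and the ":" delimiter are hypothetical.
second_field() {
    local names                        # declare first ...
    names=`echo "$1" | cut -d: -f2`    # ... then assign on its own line
    echo "$names"
}

second_field "oid:outlet one:extra"
```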

Re: [Linux-ha-dev] Antwort: SAPInstance starts sapstartsrv in monitor

2011-01-04 Thread Dejan Muhamedagic
On Mon, Dec 27, 2010 at 12:29:10PM +0100, alexander.kra...@basf.com wrote: Hi, I was recently reviewing some logs of a SAP HA installation and found out that SAPInstance tries to start the service in the monitor action. That includes probes too. Further, if it finds a different

[Linux-ha-dev] SAPDatabase starts oracle listener in monitor

2011-01-04 Thread Dejan Muhamedagic
Hi, This is a similar issue we recently discussed about SAPInstance and perhaps it can be resolved in a similar way. That is to start the listener only in case there are already some oracle processes running and we want to do a better test. See the attached patch. Thanks, Dejan diff -r

Re: [Linux-HA] Is 'resource_set' still experimental?

2011-01-04 Thread Tobias Appel
On 12/28/2010 06:46 PM, Dejan Muhamedagic wrote: 40 order constraints? A big cluster. We have currently 40 VM's (XEN) on it. I can't put them in a group since they have to run independently and not necessarily on the same node(s). To make it worse I also have location constraints and

[Linux-HA] ha for 2 jboss instancies in an active/passive cluster

2011-01-04 Thread Erik Dobák
Hi, I am a complete newbie, but it seems that I will have to use Heartbeat to solve my problem. I have to install 2 JBoss instances on 1 server (multihoming/vertical cluster) and unfortunately the application inside JBoss can't be run in an active/active cluster. Now I have found this:

Re: [Linux-HA] Admin of heartbeat 2.13 on Debian Lenny is a PITA

2011-01-04 Thread Michael Schwartzkopff
On Tuesday 04 January 2011 12:31:14 Imran Chaudhry wrote: Hi List, Has anyone found a good solution to administering an established 2-node cluster running heartbeat 2.13 on Debian Lenny? No, since version 2.1.3 is extremely buggy. Please consider using pacemaker from the backports. -- Dr.

Re: [Linux-HA] Admin of heartbeat 2.13 on Debian Lenny is a PITA

2011-01-04 Thread Tobias Appel
On 01/04/2011 12:31 PM, Imran Chaudhry wrote: Hi List, Has anyone found a good solution to administering an established 2-node cluster running heartbeat 2.13 on Debian Lenny? I have 2.1.4 on RHEL5 still running. It also has the GUI (although it can be dangerous). b) Save the CIB XML,

[Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
A few weeks ago I reported that heartbeat died on one of the cluster machines, due to SIGXCPU. Well, it happened again. Heartbeat died, now both machines had the shared IP address up, what a god awful mess!!! Now they have split brain and the whole nine yards! I looked at

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Further reading indicates that heartbeat itself sets a limit for itself every so often. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat, set cpu limit, instead of fixing their bugs. Then it dies with SIGXCPU, leaving everything in an extremely
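
The mechanism described here can be reproduced outside heartbeat. The sketch below is not heartbeat's code; cpu_limited_status is a hypothetical helper that sets a soft CPU-time limit in a subshell and then burns CPU until the kernel delivers SIGXCPU. Only the soft limit is set, on the assumption that the hard limit is higher or unlimited; once the hard limit is reached the kernel sends SIGKILL instead.

```shell
# Sketch only: demonstrates RLIMIT_CPU and SIGXCPU, not heartbeat itself.
cpu_limited_status() {
    # Cap the subshell's soft CPU-time limit at $1 seconds, then spin.
    ( ulimit -S -t "$1"; while :; do :; done ) 2>/dev/null
    # A status above 128 means the subshell was killed by a signal
    # (128 + signal number; SIGXCPU is 24 on most Linux architectures).
    echo $?
}

cpu_limited_status 1
```

A process that does not handle SIGXCPU is terminated by default, which is consistent with the abrupt exit reported in this thread.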

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Steve Davies
On 4 January 2011 13:47, Igor Chudov ichu...@gmail.com wrote: Further reading indicates that heartbeat itself sets a limit for itself every so often. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat, set cpu limit, instead of fixing their

Re: [Linux-HA] Admin of heartbeat 2.13 on Debian Lenny is a PITA

2011-01-04 Thread Dejan Muhamedagic
On Tue, Jan 04, 2011 at 01:14:39PM +0100, Tobias Appel wrote: On 01/04/2011 12:31 PM, Imran Chaudhry wrote: Hi List, Has anyone found a good solution to administering an established 2-node cluster running heartbeat 2.13 on Debian Lenny? I have 2.1.4 on RHEL5 still running. It also has

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
Which OS? Which version of Heartbeat? heartbeat_pid - PID of which of Heartbeat processes? It has several. On Tue, Jan 4, 2011 at 6:32 AM, Igor Chudov ichu...@gmail.com wrote: A few weeks ago I reported that heartbeat died on one of the cluster machines, due to SIGXCPU. Well, it happened

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Dejan Muhamedagic
Hi, On Tue, Jan 04, 2011 at 07:47:10AM -0600, Igor Chudov wrote: Further reading indicates that heartbeat itself sets a limit for itself every so often. True. Then it exceeds the limit (probably due to a bug). I am sure that that's why whoever wrote heartbeat, set cpu limit, instead of

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Steve, here's some data. The OS is Ubuntu 10.04. ~# apt-cache policy heartbeat heartbeat: Installed: 1:3.0.3-1ubuntu1 Candidate: 1:3.0.3-1ubuntu1 Version table: *** 1:3.0.3-1ubuntu1 0 500 http://us.archive.ubuntu.com/ubuntu/ lucid/universe Packages 100 /var/lib/dpkg/status

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
On Tue, Jan 4, 2011 at 9:40 AM, Serge Dubrouski serge...@gmail.com wrote: Which OS? Ubuntu 10.04 Lucid. Which version of Heartbeat? 3.0.3 ~# apt-cache policy heartbeat heartbeat: Installed: 1:3.0.3-1ubuntu1 Candidate: 1:3.0.3-1ubuntu1 Version table: *** 1:3.0.3-1ubuntu1 0

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
Are you sure that everything is all right with your network? It looks like processes that are responsible for UDP communications are taking too much of CPU time. On Tue, Jan 4, 2011 at 8:47 AM, Igor Chudov ichu...@gmail.com wrote: Steve, here's some data. The OS is Ubuntu 10.04. ~# apt-cache

Re: [Linux-HA] Config sanity check

2011-01-04 Thread Dejan Muhamedagic
Hi, On Thu, Dec 30, 2010 at 08:56:09PM +, James Smith wrote: Hi, I've been hitting some problems with my drbd / iscsi-target clusters, resources dropping into FAILED (unmanaged) states etc. I'm after a bit of a sanity check on the config below. Firstly, I know the timeouts and

Re: [Linux-HA] Admin of heartbeat 2.13 on Debian Lenny is a PITA

2011-01-04 Thread Ryan Kish
Right now my only recourse is one of these options: a) Install Ubuntu 8.04 in a VM and use heartbeat-gui Have you installed the package heartbeat-2-gui? It provides /usr/lib/heartbeat-gui/haclient.py which is called heartbeat-gui in Ubuntu. -Ryan

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Igor Chudov
Serge, I am not sure of anything, but the self-communication is supposed to be taking place on a single crossover cable between second network cards of the servers. (eth1). Igor On Tue, Jan 4, 2011 at 10:06 AM, Serge Dubrouski serge...@gmail.com wrote: Are you sure that everything is all right

Re: [Linux-HA] pingd resource problem

2011-01-04 Thread Dejan Muhamedagic
Hi, On Thu, Dec 30, 2010 at 04:52:38PM +0100, Nico Faerber wrote: Salute I have some troubles setting up a pingd clone resource. I'm using pacemaker 1.0.8 with corosync 1.2.0 running on a ubuntu 10.04. after setting up the resource crm/configure/show gives this: primitive pingd

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
On Tue, Jan 4, 2011 at 9:14 AM, Igor Chudov ichu...@gmail.com wrote: Serge, I am not sure of anything, but the self-communication is supposed to be taking place on a single crossover cable between second network cards of the servers. (eth1). Agree, yet something strange and pretty unique is

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Dimitri Maziuk
Igor Chudov wrote: At this point I feel rather desperate. Perhaps I should give pacemaker another go. I really have no idea and I am running out of options. If all you need is a 2-node active-passive cluster, most (all?) pacemaker features are useless for you. (Besides, one look at their

Re: [Linux-HA] Heartbeat dies AGAIN with SIGXCPU, cluster screwed up again

2011-01-04 Thread Serge Dubrouski
On Tue, Jan 4, 2011 at 1:29 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: Igor Chudov wrote: At this point I feel rather desperate. Perhaps I should give pacemaker another go. I really have no idea and I am running out of options. If all you need is a 2-node active-passive cluster, most