Re: [Linux-HA] type: pseduo

2011-07-04 Thread Andrew Beekhof
On Mon, Jul 4, 2011 at 7:06 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi, found this syslog message on a SLES11 SP1 system: Jul  4 10:55:14 rksaph02 crmd: [11517]: WARN: print_elem:     [Action 83]: Pending (id: grp_t11_as2_stopped_0, type: pseduo, priority: 2570) I guess

Re: [Linux-HA] Q: ms-resources and grouping

2011-07-01 Thread Andrew Beekhof
On Thu, Jun 30, 2011 at 7:41 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi! I have a question: when I want to have a filesystem on a logical volume, where the VG is on a RAID1, I would typically have three resources to handle that. Now if I wish to have a clone or ms

Re: [Linux-HA] Help for the floating IPaddress monitoing

2011-07-01 Thread Andrew Beekhof
On Thu, Jun 30, 2011 at 3:47 PM, 徐斌 robin@163.com wrote: Hi Gent, I want to let the floating IP running again after restart the network. But I met the issue when I enable the monitoring for the floating ip (using ocf:heartbeat:IPaddr2). [root@master ~]# crm configure show ip2

Re: [Linux-HA] Question on order rule

2011-06-30 Thread Andrew Beekhof
On Thu, Jun 9, 2011 at 4:58 AM, Alessandro Iurlano alessandro.iurl...@gmail.com wrote: Hello. I'm trying to setup an highly available OpenVZ cluster. As OpenVZ only supports disk quota on ext3/4 local filesystems  (nfs and gfs/ocfs2 don't work), I have setup two iscsi volumes on an highly

Re: [Linux-HA] Web resource monitoring

2011-06-29 Thread Andrew Beekhof
On Mon, Jun 27, 2011 at 3:19 AM, Maxim Ianoglo dot...@gmail.com wrote: Hello, The http monitoring code should be split off from the apache RA. Then a simple stateless (see the Dummy RA for a sample) RA, say httpmon, can be created which would source the http monitoring. Patches accepted!
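
A rough sketch of the stateless "httpmon" idea mentioned in this thread, for illustration only. No such agent is shown in the thread, and real resource agents are normally shell scripts; Python is used here just to keep the example short. The names `monitor`, `http_fetch`, and the `expect` parameter are assumptions, as is the choice to report an unreachable server as "not running". The return values are the standard OCF exit codes.

```python
import urllib.error
import urllib.request

# Standard OCF resource agent exit codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def http_fetch(url, timeout=5.0):
    """Fetch a URL and return (status_code, body_text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def monitor(url, expect=None, fetch=http_fetch):
    """Hypothetical stateless monitor action: map an HTTP check to an OCF code.

    `fetch` is injectable so the logic can be exercised without a network.
    """
    try:
        status, body = fetch(url)
    except urllib.error.HTTPError:
        # The server answered, but with an HTTP error status.
        return OCF_ERR_GENERIC
    except OSError:
        # Connection refused, timeout, DNS failure (URLError is an OSError).
        return OCF_NOT_RUNNING
    if status != 200:
        return OCF_ERR_GENERIC
    if expect is not None and expect not in body:
        # Page was served but does not contain the expected text.
        return OCF_ERR_GENERIC
    return OCF_SUCCESS
```

Because the check keeps no state, start and stop would have nothing to manage; only the monitor result matters, which is the point of splitting it from the apache RA.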

Re: [Linux-HA] serial cable or ethnet cable for heartbeat, which one is better?

2011-06-26 Thread Andrew Beekhof
On Mon, Jun 27, 2011 at 3:03 AM, Hai Tao taoh...@hotmail.com wrote: Which one is better for heartbeat, a serial cable or a dedicated ethernet cable? Can the bandwidth of a serial cable be a bottleneck? If you're running pacemaker - yes. How much data is transferred on the heartbeat link?

Re: [Linux-HA] CIB process quits and could not connect to CRM

2011-06-26 Thread Andrew Beekhof
On Mon, May 16, 2011 at 11:36 PM, Mateusz Kalisiak mateusz.kalis...@gmail.com wrote: Hello, I'm struggling the same problem on RHEL 6. Does anyone have some idea of solving this out? Any help would be appreciated. You'd need to provide more details than that. Have you tried reading the logs?

Re: [Linux-HA] Best way for colocating resource on a dual primary drbd

2011-06-26 Thread Andrew Beekhof
On Mon, May 16, 2011 at 5:38 PM, RaSca ra...@miamammausalinux.org wrote: On Mon 16 May 2011 09:01:08 CET, Andrew Beekhof wrote: [...] Implicit that once the resource goes away it becomes slave? Pretty sure this is a bug in 1.0. Have you tried 1.1.5? Not yet, but so Andrew

Re: [Linux-HA] Colocation of VIP and httpd

2011-06-26 Thread Andrew Beekhof
On Wed, Jun 1, 2011 at 12:04 PM, 吴鸿宇 whyfo...@gmail.com wrote: Thank you for your reply. My requirement is like this: The httpd service runs on every node in the cluster and is monitored by watchdog. VIP only runs on one node at a time. Heartbeat will check the status of httpd on each node

Re: [Linux-HA] need help on email alerts

2011-06-26 Thread Andrew Beekhof
On Sun, Jun 5, 2011 at 8:53 PM, Amit Jathar amit.jat...@alepo.com wrote: Hi, I have configured email alerts for corosync as follows :- Crm configure show ---SNIP- primitive resMON ocf:pacemaker:ClusterMon \        operations $id=resMON-operations \        op monitor interval=180

Re: [Linux-HA] Virtual mysql cluster ip is not accessible on port 3306

2011-06-26 Thread Andrew Beekhof
On Thu, Jun 23, 2011 at 12:06 AM, Calistus Che calistus@gmail.com wrote: Hi Guys, could any one of you help me? I just set up a 2 lb (master and slave) and 2 mysql cluster nodes db1 and 2. The servers have 2 interfaces private and public and loadbalancing is running on the private

Re: [Linux-HA] Always Get a Billion Failed Actions

2011-06-26 Thread Andrew Beekhof
On Thu, Jun 16, 2011 at 8:38 PM, Robinson, Eric eric.robin...@psmnv.com wrote: crm_mon on my system displays a lot of failed actions, I guess because the init script for the resource is not fully lsb compliant? In any case, the resources seem to work okay and failover okay. How can I get rid

Re: [Linux-HA] crm_report versus hb_report

2011-06-26 Thread Andrew Beekhof
events.txt but perhaps it is a coincidence ... Any idea? Thanks Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 17/06/2011 09:31 Subject: Re: [Linux-HA] crm_report versus hb_report Sent by: linux-ha-boun

Re: [Linux-HA] heartbeat three node configuration

2011-06-26 Thread Andrew Beekhof
On Thu, Jun 9, 2011 at 11:54 PM, Ricardo F ri...@hotmail.com wrote: What is the configuration for create a three node cluster?, Essentially you need Pacemaker on top. haresources based clusters were only designed for 2-nodes. i have this but the servers bring-up the shared ip at same time:

Re: [Linux-HA] ClusterIP clone resource failover and migration issue

2011-06-26 Thread Andrew Beekhof
On Mon, Jun 6, 2011 at 10:22 PM, Randy Wilson randyedwil...@gmail.com wrote: Hi, I've setup two ClusterIP instances on a two node cluster using the below configuration: node node1.domain.com node node2.domain.com primitive clusterip_33 ocf:heartbeat:IPaddr2 \     params ip=xxx.xxx.xxx.33

Re: [Linux-HA] using the pacemaker logo for the xing group

2011-06-21 Thread Andrew Beekhof
On Tue, Jun 21, 2011 at 5:22 PM, Keisuke MORI keisuke.mori...@gmail.com wrote: Hi Erkan, As I've sent a personal email to you and as Ikeda-san already replied to you, Anybody may use the logo in conjunction with any Pacemaker / Linux-HA related projects. The logo is a contribution from the

Re: [Linux-HA] crm_report versus hb_report

2011-06-17 Thread Andrew Beekhof
On Fri, Jun 17, 2011 at 4:47 PM, alain.mou...@bull.net wrote: Hi, I just discovered that on RH6 there is no more hb_report; it has been removed from the cluster-glue rpm. Does the crm_report delivered in the pacemaker rpm give the same results as hb_report? Yes. It re-uses much of the same

Re: [Linux-HA] crm_report versus hb_report

2011-06-17 Thread Andrew Beekhof
On Fri, Jun 17, 2011 at 5:30 PM, Andrew Beekhof and...@beekhof.net wrote: On Fri, Jun 17, 2011 at 4:47 PM, alain.mou...@bull.net wrote: Hi, I just discovered that on RH6 there is no more hb_report; it has been removed from the cluster-glue rpm. Does the crm_report delivered in the pacemaker rpm give

Re: [Linux-HA] Status about the four stack options

2011-05-26 Thread Andrew Beekhof
On Tue, May 24, 2011 at 9:54 AM, alain.mou...@bull.net wrote: Hi Many thanks for this status. I suppose this is the same status on RHEL6 as Suse is likely to be in advance with regard to RHEL6 Pacemaker corosync evolutions ? This is not implied. On RHEL6, although Pacemaker is not

Re: [Linux-HA] ocfs2

2011-05-26 Thread Andrew Beekhof
On Tue, May 24, 2011 at 2:33 PM, Eric Warnke ewar...@albany.edu wrote: Fedora 14 lacks dlm-pcmk since it has been deprecated. Really frustrating as whatprovides shows a file but yum install says nothing to do without installing. Most of the existing quick start docs are therefor

Re: [Linux-HA] DO NOT start using heartbeat 2.x in crm mode, but just use Pacemaker, please! [was: managing resource httpd in heartbeat]

2011-05-19 Thread Andrew Beekhof
On Thu, May 19, 2011 at 12:36 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, May 18, 2011 at 05:21:35PM -0700, Vinay Nagrik wrote: Hello everybody, I am running Centos 5.2 with heartbeat 2.1.3 and we as a group run it on appliances and *it is readily not possible to suddenly

Re: [Linux-HA] Best way for colocating resource on a dual primary drbd

2011-05-16 Thread Andrew Beekhof
On Sat, May 14, 2011 at 9:31 AM, RaSca ra...@miamammausalinux.org wrote: On Fri 13 May 2011 16:09:14 CET, Viacheslav Biriukov wrote: In your case you have two DRBD masters, so I think it is not a good idea to create that colocation. Instead, you can set location directives

Re: [Linux-HA] Behaviour when rebooting inactive node

2011-05-10 Thread Andrew Beekhof
On Mon, May 9, 2011 at 2:38 PM, Nicolas Guenette nicol...@ho.reitmans.com wrote: Hello, I have a two node cluster and a question about the cluster's behaviour when I reboot the inactive node. The situation is this: if the resources are running on serverA and I reboot serverB, serverA

Re: [Linux-ha-dev] Filesystem ocf file

2011-05-06 Thread Andrew Beekhof
On Fri, May 6, 2011 at 9:37 AM, Florian Haas florian.h...@linbit.com wrote: On 2011-05-06 09:26, Darren Thompson wrote: Team I was reviewing some errors on a cluster mounted file-system that caused me to review the Filesystem ocf file. I notice that it uses an undeclared parameter of

Re: [Linux-ha-dev] [ha-wg] Cluster Stack - Ubuntu Developer Summit

2011-05-05 Thread Andrew Beekhof
On Thu, May 5, 2011 at 10:25 AM, Florian Haas florian.h...@linbit.com wrote: On 2011-04-26 19:33, Andres Rodriguez wrote: UDS' are open-to-public events, and I believe it would be great if upstream could participate and maybe even further the discussion about the Cluster Stack. For more

Re: [Linux-ha-dev] ACLs and privilege escalation (was Re: New OCF RA: symlink)

2011-05-05 Thread Andrew Beekhof
On Thu, May 5, 2011 at 9:09 AM, Florian Haas florian.h...@linbit.com wrote: Rather than going into ACLs in more detail, I wanted to highlight that however we limit access to the CIB, the resource agents still _execute_ as root, so we will always have what would normally be considered a

Re: [Linux-ha-dev] New OCF RA: symlink

2011-05-05 Thread Andrew Beekhof
On Wed, May 4, 2011 at 4:36 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote:  Services running under Pacemaker control are probably critical,  so a malicious person with even only stop access on the CIB  can do a DoS. I guess we have to assume people with any write access  at all to the CIB

Re: [Linux-HA] get haresources2cib.py

2011-05-04 Thread Andrew Beekhof
. arun On Mon, May 2, 2011 at 11:16 PM, Andrew Beekhof and...@beekhof.net wrote: On Mon, May 2, 2011 at 9:33 PM, Vinay Nagrik vnag...@gmail.com wrote: Thank you Andrew. Could you please tell me where to get the DTD for cib.xml and where from can I download crm shell. Both get installed

Re: [Linux-HA] Filesystem do not start on Pacemaker-Cluster

2011-05-04 Thread Andrew Beekhof
On Tue, May 3, 2011 at 10:39 AM, KoJack kojac...@web.de wrote: Hi, i was trying to set up a pacemaker cluster. After I added all resources, the filesystem will not start at one Node. crm_verify -L -V crm_verify[30068]: 2011/05/03_10:35:39 WARN: unpack_rsc_op: Processing failed op

Re: [Linux-HA] get haresources2cib.py

2011-05-03 Thread Andrew Beekhof
On Mon, May 2, 2011 at 12:56 AM, Andrew Beekhof and...@beekhof.net wrote: On Sun, May 1, 2011 at 9:26 PM, Vinay Nagrik vnag...@gmail.com wrote: Dear Andrew, I read your document clusters from scratch and found it very detailed.  It gave lots of information, but I was looking for creating

Re: [Linux-HA] Antw: Re: ocf:pacemaker:ping: dampen

2011-05-03 Thread Andrew Beekhof
On Mon, May 2, 2011 at 5:29 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, May 02, 2011 at 04:04:56PM +0200, Andrew Beekhof wrote: Still, we may get a spurious failover in this case: reachability:   +__ Node A monitoring

Re: [Linux-HA] get haresources2cib.py

2011-05-02 Thread Andrew Beekhof
30, 2011 at 1:32 AM, Andrew Beekhof and...@beekhof.net wrote: Forget the conversion. Use the crm shell to create one from scratch. And look for the clusters from scratch doc relevant to your version - its worth the read. On Sat, Apr 30, 2011 at 1:19 AM, Vinay Nagrik vnag...@gmail.com wrote

Re: [Linux-HA] Antw: Re: ocf:pacemaker:ping: dampen

2011-05-02 Thread Andrew Beekhof
On Mon, May 2, 2011 at 8:27 AM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Andrew Beekhof and...@beekhof.net wrote on 29.04.2011 at 09:31 in message BANLkTi=-ftyk9uxcgu0m2wqhquu_rt8...@mail.gmail.com: On Fri, Apr 29, 2011 at 9:27 AM, Dominik Klein d...@in-telegence.net wrote

Re: [Linux-HA] Antw: Re: ocf:pacemaker:ping: dampen

2011-05-02 Thread Andrew Beekhof
On Mon, May 2, 2011 at 3:51 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, May 02, 2011 at 01:20:16PM +0200, Andrew Beekhof wrote: On Mon, May 2, 2011 at 8:27 AM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Andrew Beekhof and...@beekhof.net wrote on 29.04.2011 at 09

Re: [Linux-HA] get haresources2cib.py

2011-04-30 Thread Andrew Beekhof
Forget the conversion. Use the crm shell to create one from scratch. And look for the clusters from scratch doc relevant to your version - its worth the read. On Sat, Apr 30, 2011 at 1:19 AM, Vinay Nagrik vnag...@gmail.com wrote: Hello Group, Kindly tell me where can I download

Re: [Linux-HA] ocf:pacemaker:ping: dampen

2011-04-29 Thread Andrew Beekhof
On Fri, Apr 29, 2011 at 9:27 AM, Dominik Klein d...@in-telegence.net wrote: It waits $dampen before changes are pushed to the CIB, so that occasional ICMP hiccups do not produce an unintended failover. At least that's my understanding. correcto
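
A toy model of the dampening behaviour described in this thread, offered as a sketch under stated assumptions rather than the actual Pacemaker code. The class name `DampenedAttribute` and its methods are invented for illustration, and time is passed in explicitly (in seconds) instead of using real timers. The idea shown: a changed value is held back for `dampen` seconds, and if connectivity recovers within that window the change never reaches the cluster.

```python
class DampenedAttribute:
    """Toy model of a dampened attribute update (not the real attrd code)."""

    def __init__(self, dampen, initial):
        self.dampen = dampen        # seconds to wait before committing a change
        self.committed = initial    # value the cluster (CIB) currently sees
        self.pending = None         # latest value waiting to be written
        self.deadline = None        # time at which the pending value is committed

    def flush(self, now):
        """Commit the pending value if its delay has expired; return the visible value."""
        if self.deadline is not None and now >= self.deadline:
            self.committed = self.pending
            self.pending = None
            self.deadline = None
        return self.committed

    def update(self, value, now):
        """Record a new measurement taken at time `now`."""
        self.flush(now)
        if value == self.committed:
            # Back to the known value before the delay ran out: drop the change.
            self.pending = None
            self.deadline = None
        elif self.pending is None:
            # First deviation: start the dampen delay.
            self.pending = value
            self.deadline = now + self.dampen
        else:
            # Further change inside the window: keep the original deadline.
            self.pending = value


attr = DampenedAttribute(dampen=5, initial=1)
attr.update(0, now=10)   # one ping lost
attr.update(1, now=13)   # ping answers again within the dampen window
print(attr.flush(20))    # the short outage was never committed
```

With a longer outage the pending value would be committed once the delay expires, which is when a failover could be triggered.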

Re: [Linux-HA] Pingd does not react as expected = split brain

2011-04-28 Thread Andrew Beekhof
On Wed, Apr 27, 2011 at 7:18 PM, Stallmann, Andreas astallm...@conet.de wrote: Hi Andrew, According to your configuration, it can be up to 60s before we'll detect a change in external connectivity. Thats plenty of time for the cluster to start resources. Maybe shortening the monitor

Re: [Linux-ha-dev] Translate crm_cli.txt to Japanese

2011-04-27 Thread Andrew Beekhof
On Wed, Apr 27, 2011 at 12:54 PM, Dejan Muhamedagic de...@suse.de wrote: Hi Junko-san, On Wed, Apr 27, 2011 at 06:42:52PM +0900, Junko IKEDA wrote: Hi, May I suggest that you go with the devel version, because crm_cli.txt was converted to crm.8.txt. There are not many textual changes,

Re: [Linux-ha-dev] Translate crm_cli.txt to Japanese

2011-04-27 Thread Andrew Beekhof
On Wed, Apr 27, 2011 at 3:47 PM, Dejan Muhamedagic de...@suse.de wrote: On Wed, Apr 27, 2011 at 02:01:40PM +0200, Andrew Beekhof wrote: On Wed, Apr 27, 2011 at 12:54 PM, Dejan Muhamedagic de...@suse.de wrote: Hi Junko-san, On Wed, Apr 27, 2011 at 06:42:52PM +0900, Junko IKEDA wrote: Hi

Re: [Linux-HA] Problem using Stonith external/ipmi device

2011-04-27 Thread Andrew Beekhof
On Tue, Apr 26, 2011 at 9:07 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Tue, Apr 19, 2011 at 02:46:06PM +0200, Andrew Beekhof wrote: On Tue, Apr 19, 2011 at 12:43 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Mon, Apr 11, 2011 at 09:41:12AM +0200, Andrew Beekhof wrote

Re: [Linux-HA] Pingd does not react as expected = split brain

2011-04-27 Thread Andrew Beekhof
On Wed, Apr 27, 2011 at 11:52 AM, Stallmann, Andreas astallm...@conet.de wrote: Hi Lars, Hi Lars! You are exercising complete cluster communication loss. Which is cluster split brain. Correct, yes. If you are specifically exercising cluster split brain, why are you surprised that you

Re: [Linux-ha-dev] Bug in crm shell or pengine

2011-04-19 Thread Andrew Beekhof
On Mon, Apr 18, 2011 at 11:38 PM, Serge Dubrouski serge...@gmail.com wrote: Ok, I've read the documentation. It's not a bug, it's a feature :-) Might be nice if the shell could somehow prevent such configs, but it would be non-trivial to implement. On Mon, Apr 18, 2011 at 3:01 PM, Serge

Re: [Linux-HA] Problem using Stonith external/ipmi device

2011-04-19 Thread Andrew Beekhof
On Tue, Apr 19, 2011 at 12:43 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Mon, Apr 11, 2011 at 09:41:12AM +0200, Andrew Beekhof wrote: On Fri, Apr 8, 2011 at 11:07 AM, Matthew Richardson m.richard...@ed.ac.uk wrote: On 07/04/11 16:36, Dejan Muhamedagic wrote: For whatever

Re: [Linux-HA] How can I add more options to current reset, on, off options?

2011-04-18 Thread Andrew Beekhof
On Sun, Apr 17, 2011 at 11:23 PM, Avestan babak_khoram...@hotmail.com wrote: Hello, I am using STONITH Device AP9225EXP with AP9617 Network Management card. I have generated my own patch to change the apcmaster.c file to work with my setup. The stonith appears to allow only three commands

Re: [Linux-HA] Shutdown Escalation

2011-04-16 Thread Andrew Beekhof
On Sat, Apr 16, 2011 at 12:30 PM, yash er.bhara...@yahoo.in wrote: yash er.bharat09 at yahoo.in writes: Andrew Beekhof andrew at beekhof.net writes: I am facing problem during heartbeat stop command as it hangs and returns after long 20 min, through google i came to know about shutdown

Re: [Linux-ha-dev] Dovecot OCF Resource Agent

2011-04-15 Thread Andrew Beekhof
On Fri, Apr 15, 2011 at 12:53 PM, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote: On 04/15/2011 11:10 AM, jer...@intuxicated.org wrote: Yes, it does the same thing but contains some additional features, like logging into a mailbox. first of all, i do not know how the others think about a ocf ra

Re: [Linux-HA] Shutdown Escalation

2011-04-15 Thread Andrew Beekhof
On Fri, Apr 15, 2011 at 8:44 AM, yash er.bhara...@yahoo.in wrote: Hello list, I am facing problem during heartbeat stop command as it hangs and returns after long 20 min, through google i came to know about shutdown escalation parameter of crmd but when i try reducing this parameter it  do

Re: [Linux-HA] question about lsb init script

2011-04-14 Thread Andrew Beekhof
Probably not lsb compliant. http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html On Wed, Apr 13, 2011 at 10:58 PM, Gerry Kernan gerry.ker...@infinityit.ie wrote: Hi I am setting up a asterisk HA solution using a redfone device for the PRI lines. To start

Re: [Linux-HA] question about lsb init script

2011-04-14 Thread Andrew Beekhof
On Thu, Apr 14, 2011 at 11:18 AM, Gerry Kernan gerry.ker...@infinityit.ie wrote: Andrew, Thanks, I've  do some checking and it doesn't appear to be. Can I add a resource that runs a command and doesn't look for a status for the resource. No. The status operation is required to be implemented

Re: [Linux-HA] Resource-Group won't start - crm_mon does not react - no failures shown

2011-04-13 Thread Andrew Beekhof
On Wed, Apr 13, 2011 at 12:23 AM, Stallmann, Andreas astallm...@conet.de wrote: Hi! We've got a pretty straightforward and easy configuration: Corosync 1.2.1 / Pacemaker 2.0.0 on OpenSuSE 11.3 running DRBD (M/S), Ping (clone), and a resource-group, containing a shared IP, tomcat and mysql

Re: [Linux-HA] Filesystem thinks it is run as a clone

2011-04-13 Thread Andrew Beekhof
On Tue, Apr 12, 2011 at 5:17 PM, Christoph Bartoschek bartosc...@or.uni-bonn.de wrote: Hi, today we tested some NFS cluster scenarios and the first test failed. The first test was to put the current master node into standby. Stopping the services worked but then starting it on the other node

Re: [Linux-HA] Filesystem thinks it is run as a clone

2011-04-13 Thread Andrew Beekhof
On Wed, Apr 13, 2011 at 10:57 AM, Christoph Bartoschek bartosc...@or.uni-bonn.de wrote: On 13.04.2011 08:26, Andrew Beekhof wrote: On Tue, Apr 12, 2011 at 5:17 PM, Christoph Bartoschek bartosc...@or.uni-bonn.de wrote: Hi, today we tested some NFS cluster scenarios and the first test

Re: [Linux-ha-dev] Resource agent implementing SPC-3 Persistent Reservations (contribution from Evgeny Nifontov)

2011-04-12 Thread Andrew Beekhof
Awesome. I was wondering if someone would ever write one of these :) On Tue, Apr 12, 2011 at 10:29 AM, Florian Haas florian.h...@linbit.com wrote: Hi everyone, Evgeny Nifontov has started to implement sg_persist, a resource agent managing SPC-3 Persistent Reservations (PRs) using the

Re: [Linux-HA] Problem using Stonith external/ipmi device

2011-04-11 Thread Andrew Beekhof
On Fri, Apr 8, 2011 at 11:07 AM, Matthew Richardson m.richard...@ed.ac.uk wrote: On 07/04/11 16:36, Dejan Muhamedagic wrote: For whatever reason stonith-ng doesn't think that stonithipmidisk1 can manage this node. Which version of Pacemaker do you run? Perhaps this has been fixed in the

Re: [Linux-HA] HA software download

2011-04-07 Thread Andrew Beekhof
On Wed, Apr 6, 2011 at 1:28 PM, Ajaykumar Narayanaswamy ajaykumar_narayanasw...@mindtree.com wrote: Hi All, I would like to know whether Linux OS has any inbuilt HA/Failover software or should we procure some third-party HA s/w. I came to know about heartbeat package which is an Open source

Re: [Linux-HA] When is the next release for resource agents?

2011-04-07 Thread Andrew Beekhof
On Wed, Apr 6, 2011 at 4:55 PM, Serge Dubrouski serge...@gmail.com wrote: Hello - When is the next release for resource agents? Agents that come with resource-agents-1.0.3-2.6.el5 from the clusterlabs repository are very outdated. pgsql is at least one year old or so. in most cases there's not

Re: [Linux-HA] HA software download

2011-04-07 Thread Andrew Beekhof
. Could you please throw some light on this query??? No. I've never run an ldap server. Sorry. Thx for lending help.. Regards, Ajaykumar -Original Message- From: Ajaykumar Narayanaswamy Sent: Thursday, April 07, 2011 12:41 PM To: 'Andrew Beekhof' Subject: RE: [Linux-HA] HA

Re: [Linux-HA] heartbeat ordering

2011-04-06 Thread Andrew Beekhof
On Tue, Apr 5, 2011 at 11:58 AM, Maxim Ianoglo dot...@gmail.com wrote: Hello, I have four servers in an HA cluster: NodeA, NodeB, NodeC, NodeD. Three groups of resources and one inline resource are defined: 1. group_storage (NFS VIP, NFS Server, DRBD) 2. group_apache_www (Domains VIPs

Re: [Linux-HA] why Cluster restarts A, before starting B on surviving node.

2011-04-06 Thread Andrew Beekhof
I meant in the form of a hb_report which contains the necessary logs and status information necessary to diagnose your issue. On Mon, Apr 4, 2011 at 12:11 PM, Muhammad Sharfuddin m.sharfud...@nds.com.pk wrote: On Mon, 2011-04-04 at 10:42 +0200, Andrew Beekhof wrote: On Thu, Mar 24, 2011 at 7

Re: [Linux-HA] crm commands : how to reduce the delay between two commands

2011-04-06 Thread Andrew Beekhof
On Fri, Mar 25, 2011 at 2:07 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi, I tried but it does not work : crm_resource -r resname -p target-role -v started because it adds a target-role=started as params whereis I already have a meta target-role=Stopped so resource does not start. So I

Re: [Linux-HA] DRBD and pacemaker interaction

2011-04-05 Thread Andrew Beekhof
On Mon, Apr 4, 2011 at 10:14 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, Apr 04, 2011 at 09:43:27AM +0200, Andrew Beekhof wrote: I am missing the state: running degraded or suboptimal. Yep, degraded is not a state available for pacemaker. Pacemaker cannot do much about

Re: [Linux-HA] DRBD and pacemaker interaction

2011-04-05 Thread Andrew Beekhof
On Tue, Apr 5, 2011 at 9:42 AM, Christoph Bartoschek bartosc...@or.uni-bonn.de wrote: On 04.04.2011 22:14, Lars Ellenberg wrote: On Mon, Apr 04, 2011 at 09:43:27AM +0200, Andrew Beekhof wrote: I am missing the state: running degraded or suboptimal. Yep, degraded is not a state available

Re: [Linux-HA] DRBD and pacemaker interaction

2011-04-04 Thread Andrew Beekhof
: On 01.04.2011 10:27, Andrew Beekhof wrote: On Sat, Mar 26, 2011 at 12:10 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Fri, Mar 25, 2011 at 06:18:07PM +0100, Christoph Bartoschek wrote: I am missing the state: running degraded or suboptimal. Yep, degraded is not a state available

Re: [Linux-HA] NFS cluster after node crash

2011-04-04 Thread Andrew Beekhof
On Thu, Mar 24, 2011 at 9:58 PM, Christoph Bartoschek bartosc...@or.uni-bonn.de wrote: It seems as if the g_nfs service is stopped on the surviving node when the other one comes up again. To me it looks like the service gets stopped after it fails: p_exportfs_root:0_monitor_3

Re: [Linux-HA] Stonith resource appears to be active on 2 nodes ...

2011-04-04 Thread Andrew Beekhof
On Mon, Apr 4, 2011 at 9:03 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi, I got this error : 1301591983 2011 Mar 31 19:19:43 berlin5 daemon err crm_resource [36968]: ERROR: native_add_running: Resource stonith::fence_ipmilan:restofenceberlin4 appears to be active on 2 nodes. 1301591983

Re: [Linux-HA] why Cluster restarts A, before starting B on surviving node.

2011-04-04 Thread Andrew Beekhof
On Thu, Mar 24, 2011 at 7:42 PM, Muhammad Sharfuddin m.sharfud...@nds.com.pk wrote: we have two resources A and B Cluster starts A on node1, and B on node2, while failover node for A is node2 and failover node for B is node1 B cant start without A, so I have following location rules:        

Re: [Linux-HA] update question

2011-04-01 Thread Andrew Beekhof
On Mon, Mar 28, 2011 at 9:08 PM, Miles Fidelman mfidel...@meetinghouse.net wrote: Hi Folks, I'm getting ready to upgrade a 2-node HA cluster from Debian Etch to Squeeze.  I'd very much appreciate any suggestions regarding gotchas to avoid, and so forth. Basic current configuration: - 2

Re: [Linux-HA] DRBD and pacemaker interaction

2011-04-01 Thread Andrew Beekhof
On Fri, Apr 1, 2011 at 4:38 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Fri, Apr 01, 2011 at 11:35:19AM +0200, Christoph Bartoschek wrote: On 01.04.2011 11:27, Florian Haas wrote: On 2011-04-01 10:49, Christoph Bartoschek wrote: On 01.04.2011 10:27, Andrew Beekhof wrote

Re: [Linux-HA] Sort of crm commandes but off line ?

2011-03-25 Thread Andrew Beekhof
On Thu, Mar 24, 2011 at 2:32 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi, Ok I think my question was not clear : in fact, the pb is not to do or not ssh node crm ... , the pb is just to know the hostname of the node to ssh it , in another way than parsing the cib.xml to know which other

Re: [Linux-HA] comments required on location rules

2011-03-25 Thread Andrew Beekhof
should work. depends what stickiness value you're using On Sat, Mar 19, 2011 at 11:48 AM, Muhammad Sharfuddin m.sharfud...@nds.com.pk wrote: we have two resource groups 'grp-SAPDatabase' and 'grp-SAPInstances' To better utilize our both machines, I want to run the 'grp-SAPDatabase' and

Re: [Linux-HA] need help to configure the fence_ifmib for stonith

2011-03-17 Thread Andrew Beekhof
On Thu, Mar 17, 2011 at 11:25 AM, Amit Jathar amit.jat...@alepo.com wrote: Hi, I would like to try the fence_ifmib as the fencing agent. I can see it is present in my machine. [root@OEL6_VIP_1 fence]# ls /usr/sbin/fence_ifmib /usr/sbin/fence_ifmib Also, I can see some python scripts

Re: [Linux-ha-dev] new resource agents repository commit policy

2011-03-14 Thread Andrew Beekhof
On Mon, Mar 14, 2011 at 6:07 PM, Dejan Muhamedagic de...@suse.de wrote: Hello everybody, It's time to figure out how to maintain the new Resource Agents repository. Fabio and I already discussed this a bit in IRC. There are two options: a) everybody gets an account at github.com and commit

Re: [Linux-HA] question on Creating an Active/Passive iSCSI configuration

2011-03-14 Thread Andrew Beekhof
On Fri, Mar 11, 2011 at 7:23 PM, Randy Katz rk...@simplicityhosting.com wrote: On 3/11/2011 3:29 AM, Dejan Muhamedagic wrote: Hi, On Fri, Mar 11, 2011 at 01:36:25AM -0800, Randy Katz wrote: On 3/11/2011 12:50 AM, RaSca wrote: On Fri 11 Mar 2011 07:32:32 CET, Randy Katz wrote: ps

Re: [Linux-HA] GFS2 mounting hangs

2011-03-11 Thread Andrew Beekhof
On Thu, Mar 10, 2011 at 5:53 PM, Jonathan Schaeffer jonathan.schaef...@univ-brest.fr wrote: Hi, I'm trying to setup a pacemaker cluster based on DRBD Active/Active and GFS2. Everything is working fine on normal startup. But when I try to mess around with the cluster, I come across

Re: [Linux-HA] Active/active cluster with connectivity check

2011-03-11 Thread Andrew Beekhof
On Thu, Mar 10, 2011 at 3:54 PM, Artur linu...@netdirect.fr wrote: Hello, I'm currently switching to Heartbeat (3.0.3) and Pacemaker (1.0.9.1) on Debian Squeeze with CRM/CIB setup. This is the first time i try to configure it so please be kind with a newbie. :) I would like to setup an

Re: [Linux-HA] resource not restarted due to score value

2011-03-11 Thread Andrew Beekhof
Message- From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof Sent: Monday, February 07, 2011 8:46 AM To: General Linux-HA mailing list Subject: Re: [Linux-HA] resource not restarted due to score value On Fri, Feb 4, 2011 at 12

Re: [Linux-HA] Not able to stop services in group individually.

2011-03-09 Thread Andrew Beekhof
On Wed, Mar 2, 2011 at 9:31 AM, Caspar Smit c.s...@truebit.nl wrote: Hi, I have the following (simple) configuration: primitive iscsi0 ocf:heartbeat:iscsi \ params portal=172.20.250.5 target=iqn.2010-10.nl.nas:nas.storage0 primitive iscsi1 ocf:heartbeat:iscsi \ params

Re: [Linux-HA] Load CRM-Konfiguration from file

2011-03-09 Thread Andrew Beekhof
try: crm configure < filename On Wed, Mar 9, 2011 at 4:34 PM, Stallmann, Andreas astallm...@conet.de wrote: Hi there, is it possible to exchange a complete CIB with another CIB? The background is that we have to roll out the same cluster in different customer environments with

Re: [Linux-HA] Ping resource goes down and never comes up

2011-03-09 Thread Andrew Beekhof
On Thu, Feb 17, 2011 at 10:00 PM, RaSca ra...@miamammausalinux.org wrote: Hi all, is it possible that a ping_clone goes down on a node because there is no connectivity and never comes up again when the connectivity returns? The ping and clone resource is declared like this: primitive ping

Re: [Linux-HA] after configuring dlm resource, pacemaker on cman stack fails

2011-03-09 Thread Andrew Beekhof
Check the correct daemon is being started (looks like its still starting the pacemaker specific one). Check what happens when you start the daemon manually. On Fri, Feb 25, 2011 at 9:10 AM, Pieter Baele pieter.ba...@gmail.com wrote: I try to get clvmd (+cmirror) running on top of pacemaker -

Re: [Linux-HA] Looking for a suitable Stonith Solution

2011-03-03 Thread Andrew Beekhof
On Wed, Mar 2, 2011 at 9:05 AM, Stallmann, Andreas astallm...@conet.de wrote: Hi Andrew, If suicide is no supported fencing option, why is it still included with stonith? Left over from heartbeat v1 days I guess. Could also be a testing-only device like ssh. www.clusterlabs.org tells me,

Re: [Linux-HA] Looking for a suitable Stonith Solution

2011-02-28 Thread Andrew Beekhof
On Fri, Feb 25, 2011 at 12:51 PM, Stallmann, Andreas astallm...@conet.de wrote: Hi! I've combined both your answers into one mail; I hope that's all right with you. For now, I need an interim solution, which is, as of now, stonith via suicide. Doesn't work as suicide is not considered

Re: [Linux-ha-dev] new resource agents repository

2011-02-24 Thread Andrew Beekhof
On Thu, Feb 24, 2011 at 4:10 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Thu, Feb 24, 2011 at 03:56:27PM +0100, Andrew Beekhof wrote: On Thu, Feb 24, 2011 at 2:59 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hello, There is a new repository for Resource Agents which contains

Re: [Linux-HA] Problems starting apache

2011-02-24 Thread Andrew Beekhof
On Thu, Feb 24, 2011 at 3:31 PM, Dan Frincu df.clus...@gmail.com wrote: Hi, On 02/24/2011 04:24 PM, Stallmann, Andreas wrote: Hi! First: I set up my configuration anew, and it works. I didn't change that much, just set the monitor-action differently from before. Instead of:

Re: [Linux-HA] Looking for a suitable Stonith Solution

2011-02-24 Thread Andrew Beekhof
On Thu, Feb 24, 2011 at 2:49 PM, Stallmann, Andreas astallm...@conet.de wrote: Hi! TNX for your answer. We will switch to sbd after the shared storage has been set up. For now, I need an interim solution, which is, as of now, stonith via suicide. Doesn't work as suicide is not considered

Re: [Linux-HA] CLVM cmirror using Pacemaker DLM integration on rhel 6

2011-02-21 Thread Andrew Beekhof
On Thu, Feb 17, 2011 at 3:34 PM, Pieter Baele pieter.ba...@gmail.com wrote: Hi, In our latest cluster experiments we are trying to set up Pacemaker with CLVM mirroring on RHEL 6.0. I added a DLM resource, but when I try to add clvm in crm, I get the following error: crm(live)configure# primitive

Re: [Linux-ha-dev] New ocft config file for IBM db2 resource agent

2011-02-15 Thread Andrew Beekhof
On Tue, Feb 15, 2011 at 10:50 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Holger, On Tue, Feb 15, 2011 at 09:49:07AM +0100, Holger Teutsch wrote: Hi, please find enclosed an ocft config for db2 for review and inclusion into the project if appropriate. Wonderful! This is the first

Re: [Linux-ha-dev] [PATCH] manage PostgreSQL 9.0 streaming replication using Master/Slave

2011-02-14 Thread Andrew Beekhof
On Mon, Feb 14, 2011 at 8:46 PM, Serge Dubrouski serge...@gmail.com wrote: On Mon, Feb 14, 2011 at 1:28 AM, Takatoshi MATSUO matsuo@gmail.com wrote: Ideally demote operation should stop a master node and then restart it in hot-standby mode. It's up to administrator to make sure that no

Re: [Linux-HA] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data

2011-02-14 Thread Andrew Beekhof
On Fri, Feb 4, 2011 at 11:23 AM, Brett Delle Grazie brett.dellegra...@gmail.com wrote: Hi, Apologies for cross-posting but I'm not sure where this problem resides. I'm running: corosync-1.2.7-1.1.el5.x86_64 corosynclib-1.2.7-1.1.el5.x86_64 cluster-glue-1.0.6-1.6.el5.x86_64

Re: [Linux-HA] A bunch of thoughts/questions about heartbeat network(s)

2011-02-14 Thread Andrew Beekhof
On Tue, Jan 25, 2011 at 8:15 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi, A bunch of thoughts/questions about heartbeat network(s): In the following, when I talk about two heartbeat networks, I'm talking about two physically different networks set in the corosync.conf as two different

Re: [Linux-HA] One-Node-Cluster

2011-02-14 Thread Andrew Beekhof
On Mon, Feb 14, 2011 at 12:40 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Andrew Beekhof and...@beekhof.net wrote on 14.02.2011 at 10:08 in message aanlktinuc9_oqpwjubxrdmqkncqvnqx68a_1kbqss...@mail.gmail.com: [...] The log just keeps on saying: Feb  8 16:01:03 dmcs2

Re: [Linux-HA] One-Node-Cluster

2011-02-14 Thread Andrew Beekhof
On Tue, Feb 15, 2011 at 6:08 AM, Alan Robertson al...@unix.sh wrote: On 02/14/2011 04:45 AM, Andrew Beekhof wrote: On Mon, Feb 14, 2011 at 12:40 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Andrew Beekhof and...@beekhof.net wrote on 14.02.2011 at 10:08 in message

Re: [Linux-ha-dev] ocft: status vs. monitor

2011-02-13 Thread Andrew Beekhof
On Sun, Feb 13, 2011 at 11:01 AM, Holger Teutsch holger.teut...@web.de wrote: Hi, to my knowledge OCF *requires* a method monitor while status is optional (or what is it really for? heritage, compatibility, ...) Shouldn't the ocft configs check for status? Yes, unless it's trying to talk to

Re: [Linux-HA] Documenting ocf:pacemaker:ping

2011-02-10 Thread Andrew Beekhof
On Thu, Feb 10, 2011 at 9:14 AM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi! I'm getting into Linux-HA, and it seems the documentation was made with a very hot needle. For example, ocf:pacemaker:ping has the following documentation (crm(live)ra# info ocf:ping in SLES11

Re: [Linux-HA] Documenting ocf:pacemaker:ping

2011-02-10 Thread Andrew Beekhof
On Thu, Feb 10, 2011 at 12:28 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Thu, Feb 10, 2011 at 09:29:41AM +0100, Andrew Beekhof wrote: On Thu, Feb 10, 2011 at 9:14 AM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi! I'm getting into Linux-HA, and it seems

Re: [Linux-ha-dev] New master/slave resource agent for DB2 databases in HADR (High Availability Disaster Recovery) mode

2011-02-09 Thread Andrew Beekhof
On Wed, Feb 9, 2011 at 12:15 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Wed, Feb 09, 2011 at 12:06:04PM +0100, Florian Haas wrote: On 2011-02-09 11:56, Dejan Muhamedagic wrote: It is plugin compatible to the old version of the agent. Great! Unfortunately, we can't replace the old

Re: [Linux-ha-dev] New master/slave resource agent for DB2 databases in HADR (High Availability Disaster Recovery) mode

2011-02-09 Thread Andrew Beekhof
On Wed, Feb 9, 2011 at 2:17 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Andrew, On Wed, Feb 09, 2011 at 01:33:03PM +0100, Andrew Beekhof wrote: On Wed, Feb 9, 2011 at 12:15 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Wed, Feb 09, 2011 at 12:06:04PM +0100, Florian Haas wrote

Re: [Linux-ha-dev] New master/slave resource agent for DB2 databases in HADR (High Availability Disaster Recovery) mode

2011-02-09 Thread Andrew Beekhof
On Wed, Feb 9, 2011 at 3:35 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Feb 09, 2011 at 02:43:17PM +0100, Andrew Beekhof wrote: Are you going to change the name of every agent that gets a rewrite?    IPaddr2-ng-ng-again-and-one-more-plus-one I don't think it is going

Re: [Linux-HA] how to configure heartbeat for polling MySql server?

2011-02-09 Thread Andrew Beekhof
On Wed, Feb 2, 2011 at 5:28 PM, Danilo danilo.abbasci...@gmail.com wrote: On 02/01/2011 09:32 AM, Cristian Mammoli - Apra Sistemi wrote: On 01/31/2011 10:08 AM, Danilo Abbasciano wrote: If the running node is rebooted the cluster works and the service will be switched to the other node. But if I

Re: [Linux-HA] resource stickness question

2011-02-09 Thread Andrew Beekhof
On Wed, Feb 9, 2011 at 9:05 AM, Erik Dobák erik.do...@gmail.com wrote: I have 2 nodes which are running 2 resource groups in an active/passive cluster. My goal was to run 1 resource active on lc-cl1 and the other resource on node lc-cl2. This is how I configured it: group bamcluster ipaddr2
