Re: [ClusterLabs] Cluster node getting stopped from other node(resending mail)

2015-07-01 Thread Ken Gaillot
On 06/30/2015 11:30 PM, Arjun Pandey wrote: Hi I am running a 2 node cluster with this config on centos 6.5/6.6 Master/Slave Set: foo-master [foo] Masters: [ messi ] Stopped: [ronaldo ] eth1-CP(ocf::pw:IPaddr): Started messi eth2-UP(ocf::pw:IPaddr): Started

Re: [ClusterLabs] Resource stop when another resource run on that node

2015-07-01 Thread Ken Gaillot
On 07/01/2015 01:18 AM, John Gogu wrote: Hello, this is what I have set up but is now working 100%: Online: [ node01hb0 node02hb0 ] Full list of resources: IP1_Vir (ocf::heartbeat:IPaddr): Started node01hb0 IP2_Vir (ocf::heartbeat:IPaddr): Started node02hb0

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
On 07/01/2015 08:57 AM, alex austin wrote: I have now configured stonith-enabled=true. What device should I use for fencing given the fact that it's a virtual machine but I don't have access to its configuration. would fence_pcmk do? if so, what parameters should I configure for it to work

Re: [ClusterLabs] Pacemaker failover failure

2015-07-01 Thread Ken Gaillot
corrupt your data. On Wed, Jul 1, 2015 at 3:25 PM, Ken Gaillot kgail...@redhat.com wrote: On 07/01/2015 08:57 AM, alex austin wrote: I have now configured stonith-enabled=true. What device should I use for fencing given the fact that it's a virtual machine but I don't have access to its

Re: [ClusterLabs] cib state is now lost

2015-08-12 Thread Ken Gaillot
:0 in libc-2.12.so[7f41c365e000+18a000] On Mon, Aug 10, 2015 at 9:54 AM, Ken Gaillot kgail...@redhat.com wrote: On 08/09/2015 02:27 PM, David Neudorfer wrote: Where can I dig deeper to figure out why cib keeps terminating? selinux and iptables are both disabled and I've have debug

Re: [ClusterLabs] CRM managing ADSL connection; failure not handled

2015-08-24 Thread Ken Gaillot
On 08/24/2015 04:52 AM, Andrei Borzenkov wrote: 24.08.2015 12:35, Tom Yates wrote: I've got a failover firewall pair where the external interface is ADSL; that is, PPPoE. I've defined the service thus: primitive ExternalIP lsb:hb-adsl-helper \ op monitor interval=60s and in

Re: [ClusterLabs] starting of resources

2015-08-11 Thread Ken Gaillot
On 08/11/2015 02:12 AM, Vijay Partha wrote: After you start pacemaker and then type pcs status, we get the output that there are nodes online and the list of resources are empty. We then add resources to the nodes. Now what i want is after starting pacemaker can i get some resources to be

Re: [ClusterLabs] Delayed first monitoring

2015-08-12 Thread Ken Gaillot
On 08/12/2015 10:45 AM, Miloš Kozák wrote: Thank you for your answer, but. 1) This sounds ok, but in other words it means the first delayed check is not possible to be done. 2) Start of init script? I follow lsb scripts from distribution, so there is not way to change them (I can change

Re: [ClusterLabs] apache services

2015-08-05 Thread Ken Gaillot
On 08/05/2015 04:05 AM, Vijay Partha wrote: Hi, I need to run apache service on both the nodes in a cluster. httpd is listening in port 80 on first node and httpd is listening to port 81 on the second. I am not able to add these instances separately rather both of them are starting on the

[ClusterLabs] Running pacemaker 1.1.13 with legacy plugin or heartbeat

2015-08-05 Thread Ken Gaillot
would recommend instead using the upstream master branch as of at least commit 0f8059e. That branch includes an overhaul of the affected code area, as well as other bug fixes. -- Ken Gaillot kgail...@redhat.com ___ Users mailing list: Users@clusterlabs.org

Re: [ClusterLabs] Corosync GitHub vs. dev list

2015-08-25 Thread Ken Gaillot
On 08/25/2015 05:20 AM, Ferenc Wagner wrote: Hi, Since Corosync is hosted on GitHub, I wonder if it's enough to submit pull requests/issues/patch comments there to get the developers' attention, or should I also post to develop...@clusterlabs.org? GitHub is good for patches, and when you

Re: [ClusterLabs] Cluster monitoring

2015-10-21 Thread Ken Gaillot
On 10/21/2015 08:24 AM, Michael Schwartzkopff wrote: > On Wednesday, 21 October 2015, 18:50:15, Arjun Pandey wrote: >> Hi folks >> >> I had a question on monitoring of cluster events. Based on the >> documentation it seems that cluster monitor is the only method >> of monitoring the cluster

Re: [ClusterLabs] difference between OCF return codes for monitor action

2015-10-21 Thread Ken Gaillot
On 10/21/2015 07:44 AM, Kostiantyn Ponomarenko wrote: > Hi, > > What is the difference between "OCF_ERR_GENERIC" and "OCF_NOT_RUNNING" > return codes in "monitor" action from the Pacemaker's point of view? > > I was looking here >

Re: [ClusterLabs] VIP monitoring failing with Timed Out error

2015-10-28 Thread Ken Gaillot
On 10/28/2015 03:51 AM, Pritam Kharat wrote: > Hi All, > > I am facing one issue in my two node HA. When I stop pacemaker on ACTIVE > node, it takes more time to stop and by this time VIP migration with other > resources migration fails to STANDBY node. (I have seen same issue in > ACTIVE node

Re: [ClusterLabs] பதில்: Re: crm_mon memory leak

2015-11-09 Thread Ken Gaillot
report any failures for the ClusterMon resource? I doubt it is the issue in this case, but ClusterMon resources should not be cloned or duplicated, because it does not monitor the health of one node but of the entire cluster. > -Original Message- > From: Ken Gaillot [mailto:kga

Re: [ClusterLabs] Loadbalancing using Pacemaker

2015-11-09 Thread Ken Gaillot
On 11/08/2015 11:29 AM, didier tanti wrote: > Thank You Michael, > In fact I spent some more time looking at documentation and indeed Pacemaker > is only used for resource control and management. To have my HA solution I > will need to use Corosync directly as well. The OpenAIS API is pretty well

Re: [ClusterLabs] move service basing on both connection status and hostname

2015-11-09 Thread Ken Gaillot
On 11/09/2015 10:02 AM, Stefano Sasso wrote: > Hi Guys, > I am having some troubles with the location constraint. > > In particular, what I want to achieve, is to run my service on a host; if > the ip interconnection fails I want to migrate it to another host, but on > IP connectivity

Re: [ClusterLabs] Howto use ocf:heartbeat:nginx check level > 0

2015-11-09 Thread Ken Gaillot
On 11/08/2015 04:46 AM, user.clusterlabs@siimnet.dk wrote: > >> On 8. nov. 2015, at 10.26, user.clusterlabs@siimnet.dk wrote: >> >> Setting up my first pacemaker cluster, I’m trying to grasp howto make >> ocf:heartbeat:nginx monitor with check levels > 0. >> >> Got this so far: >> >>

Re: [ClusterLabs] பதில்: Re: crm_mon memory leak

2015-11-02 Thread Ken Gaillot
EL 6.5. I am not sure what's the plan for RHEL 6.7 and 7.1. > > Thanks, > Karthik. > > -----Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com] > Sent: 02 November 2015 21:04 > To: Karthikeyan Ramasamy; users@clusterlabs.org > Subject: Re: பதில்: Re: [Cl

Re: [ClusterLabs] பதில்: Re: crm_mon memory leak

2015-11-02 Thread Ken Gaillot
r response. > > Thanks, > Karthik. > > > Ken Gaillot wrote > > On 10/30/2015 05:29 AM, Karthikeyan Ramasamy wrote: >> Dear Pacemaker support, >> We are using pacemaker1.1.10-14 to implement a service management framework, >> with high ava

Re: [ClusterLabs] Pacemaker process 10-15% CPU

2015-11-02 Thread Ken Gaillot
sort of loop. > Thanks, > Karthik. > -Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com] > Sent: 31 October 2015 03:33 > To: users@clusterlabs.org > Subject: Re: [ClusterLabs] Pacemaker process 10-15% CPU > > On 10/30/2015 05:14 AM, Karthikeyan

Re: [ClusterLabs] Pacemaker process 10-15% CPU

2015-10-30 Thread Ken Gaillot
On 10/30/2015 05:14 AM, Karthikeyan Ramasamy wrote: > Hello, > We are using Pacemaker to manage the services that run on a node, as part > of a service management framework, and manage the nodes running the services > as a cluster. One service will be running as 1+1 and other services will be

Re: [ClusterLabs] crm_mon memory leak

2015-10-30 Thread Ken Gaillot
On 10/30/2015 05:29 AM, Karthikeyan Ramasamy wrote: > Dear Pacemaker support, > We are using pacemaker1.1.10-14 to implement a service management framework, > with high availability on the road-map. This pacemaker version was available > through redhat for our environments > > We are running

Re: [ClusterLabs] Resources not starting some times after node reboot

2015-10-30 Thread Ken Gaillot
On 10/29/2015 12:42 PM, Pritam Kharat wrote: > Hi All, > > I have single node with 5 resources running on it. When I rebooted node > sometimes I saw resources in stopped state though node comes online. > > When looked in to the logs, one difference found in success and failure > case is, when >

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Ken Gaillot
On 11/04/2015 12:55 PM, Digimer wrote: > On 04/11/15 01:50 PM, Radoslaw Garbacz wrote: >> Hi, >> >> I have a cluster of 32 nodes, and after some tuning was able to have it >> started and running, > > This is not supported by RH for a reason; it's hard to get the timing > right. SUSE supports up

Re: [ClusterLabs] Multiple OpenSIPS services on one cluster

2015-11-03 Thread Ken Gaillot
On 11/03/2015 05:38 AM, Nuno Pereira wrote: >> -Original message- >> From: Ken Gaillot [mailto:kgail...@redhat.com] >> Sent: Monday, 2 November 2015 19:53 >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] Multiple OpenSIPS services

Re: [ClusterLabs] Multiple OpenSIPS services on one cluster

2015-11-03 Thread Ken Gaillot
On 11/03/2015 01:40 PM, Nuno Pereira wrote: >> -Original message- >> From: Ken Gaillot [mailto:kgail...@redhat.com] >> Sent: Tuesday, 3 November 2015 18:02 >> To: Nuno Pereira; 'Cluster Labs - All topics related to open-source > clustering

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/03/2015 11:10 PM, Jim Van Oosten wrote: > > > I am getting a compile error when building Pacemaker on Linux version > 2.6.32-431.el6.x86_64. > > The build commands: > > git clone git://github.com/ClusterLabs/pacemaker.git > cd pacemaker > ./autogen.sh && ./configure --prefix=/usr

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/04/2015 09:31 AM, Ken Gaillot wrote: > On 11/03/2015 11:10 PM, Jim Van Oosten wrote: >> >> >> I am getting a compile error when building Pacemaker on Linux version >> 2.6.32-431.el6.x86_64. >> >> The build commands: >> >> git clone git://g

Re: [ClusterLabs] multiple action= lines sent to STDIN of fencing agents - why?

2015-10-15 Thread Ken Gaillot
On 10/15/2015 06:25 AM, Adam Spiers wrote: > I inserted some debugging into fencing.py and found that stonithd > sends stuff like this to STDIN of the fencing agents it forks: > > action=list > param1=value1 > param2=value2 > param3=value3 > action=list > > where paramX and
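For readers unfamiliar with the format quoted above: fence agents receive their options as key=value lines on standard input. The following is a minimal sketch, not the code in fencing.py, of a parser in which a later duplicate key simply overrides an earlier one, so a repeated action= line with the same value is harmless. The function name parse_fence_options and the skipping of blank and comment lines are assumptions made for illustration.

```python
import io


def parse_fence_options(stream):
    """Parse key=value lines from a stream (hypothetical helper).

    A later occurrence of a key overrides an earlier one, so a repeated
    "action=list" line leaves the result unchanged.
    """
    options = {}
    for raw in stream:
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value
    return options


# Input shaped like the example quoted in the message above.
sample = io.StringIO(
    "action=list\n"
    "param1=value1\n"
    "param2=value2\n"
    "param3=value3\n"
    "action=list\n"
)
opts = parse_fence_options(sample)
```

With a parser of this shape, the duplicated action= line only matters if the two occurrences carry different values, in which case the last one wins.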

Re: [ClusterLabs] Corosync+Pacemaker error during failover

2015-10-08 Thread Ken Gaillot
On 10/08/2015 10:16 AM, priyanka wrote: > Hi, > > We are trying to build a HA setup for our servers using DRBD + Corosync > + pacemaker stack. > > Attached is the configuration file for corosync/pacemaker and drbd. A few things I noticed: * Don't set become-primary-on in the DRBD configuration

Re: [ClusterLabs] Stopped node detection.

2015-10-16 Thread Ken Gaillot
On 10/15/2015 03:55 PM, Vallevand, Mark K wrote: > Ubuntu 12.04 LTS > pacemaker 1.1.10 > cman 3.1.7 > corosync 1.4.6 > > If my cluster has no resources, it seems like it takes 20s for a stopped node > to be detected. Is the value really 20s and is it a parameter that can be > adjusted? The

Re: [ClusterLabs] Coming in 1.1.14: remapping sequential reboots to all-off-then-all-on

2015-10-19 Thread Ken Gaillot
On 10/19/2015 11:42 AM, Digimer wrote: > On 19/10/15 12:34 PM, Ken Gaillot wrote: >> Pacemaker supports fencing "topologies", allowing multiple fencing >> devices to be used (in conjunction or as fallbacks) when a node needs to >> be fenced. >> >> How

[ClusterLabs] Coming in 1.1.14: remapping sequential reboots to all-off-then-all-on

2015-10-19 Thread Ken Gaillot
ot_timeout). The new code knows to skip the "on" step if the fence agent has automatic unfencing (because it will happen when the node rejoins the cluster). This allows fence_scsi to work with this feature. -- Ken Gaillot <kgail...@redhat.com>

Re: [ClusterLabs] group resources not grouped ?!?

2015-10-07 Thread Ken Gaillot
On 10/07/2015 09:12 AM, zulucloud wrote: > Hi, > i got a problem i don't understand, maybe someone can give me a hint. > > My 2-node cluster (named ali and baba) is configured to run mysql, an IP > for mysql and the filesystem resource (on drbd master) together as a > GROUP. After doing some

Re: [ClusterLabs] Antw: Monitoring Op for LVM - Excessive Logging

2015-10-09 Thread Ken Gaillot
On 10/09/2015 08:06 AM, Ulrich Windl wrote: Jorge Fábregas wrote on 09.10.2015 at 14:20 in message <5617b10f.1060...@gmail.com>: >> Hi, >> >> Is there a way to stop the excessive logging produced by the LVM monitor >> operation? I got it set at the default

Re: [ClusterLabs] CRM managing ADSL connection; failure not handled

2015-08-27 Thread Ken Gaillot
thanks to all for help with this. thanks also to those who have suggested i rewrite this as an OCF agent (especially to ken gaillot who was kind enough to point me to documentation); i will look at that if time permits.

Re: [ClusterLabs] resource-stickiness

2015-08-27 Thread Ken Gaillot
On 08/27/2015 02:42 AM, Rakovec Jost wrote: Hi it doesn't work as I expected, I change name to: location loc-aapche-sles1 aapche role=Started 10: sles1 but after I manual move resource via HAWK to other node it auto add this line: location cli-prefer-aapche aapche role=Started

Re: [ClusterLabs] HA Cluster and Fencing

2015-09-03 Thread Ken Gaillot
On 09/03/2015 11:44 AM, Streeter, Michelle N wrote: > I was trying to get a HA Cluster working but it was not failing over. In > past posts, someone kept asking me to get the fencing working and make it a > priority. So I finally got the fencing to work with VBox. And the fail over >

Re: [ClusterLabs] resource-stickiness

2015-09-02 Thread Ken Gaillot
be removed automatically because neither the > cluster nor the tool knows when you no longer prefer the resource to be > at the new location. You have to tell it. > > If you have resource-stickiness, you can "unmove" as soon as the move is > done, and the resource

Re: [ClusterLabs] Adding and removing a node dyamically

2015-10-02 Thread Ken Gaillot
On 10/02/2015 05:36 AM, Vijay Partha wrote: > could someone help me out with this please? i am making use of cman and > pacemaker. pcs cluster node add/remove is not working as it throws > pcsd service is not running on . pcs relies on pcsd running on all nodes. Make sure pcs is installed on

Re: [ClusterLabs] Antw: Re: Antw: Need bash instead of /bin/sh

2015-09-23 Thread Ken Gaillot
On 09/23/2015 08:38 AM, Ulrich Windl wrote: Vladislav Bogdanov wrote on 23.09.2015 at 15:24 in message <5602a808.1090...@hoster-ok.com>: >> 23.09.2015 15:42, dan wrote: >>> On Wed 2015-09-23 at 14:08 +0200, Ulrich Windl wrote: >>> dan

Re: [ClusterLabs] Coming in 1.1.14: Fencing topology based on node attribute

2015-09-24 Thread Ken Gaillot
;1" On 09/09/2015 07:20 AM, Andrew Beekhof wrote: > >> On 9 Sep 2015, at 7:45 pm, Kristoffer Grönlund <kgronl...@suse.com> wrote: >> >> Hi, >> >> Ken Gaillot <kgail...@redhat.com> writes: >> >>> Pacemaker's upstream master branch has

Re: [ClusterLabs] Virtual Machines with USB Dongle

2015-09-25 Thread Ken Gaillot
On 09/25/2015 01:40 PM, J. Echter wrote: > Hi, > > what would you do if you have to run a machine which needs a usb dongle > / usb gsm modem to operate properly. > > If this machine switches to another node, the usb thing doesn't move around. > > Any hint on such a case? > > Thanks > > Juergen

Re: [ClusterLabs] Help needed getting DRBD cluster working

2015-10-05 Thread Ken Gaillot
On 10/05/2015 08:09 AM, Gordon Ross wrote: > I’m trying to setup a simple DRBD cluster using Ubuntu 14.04 LTS using > Pacemaker & Corosync. My problem is getting the resource to startup. > > I’ve setup the DRBD aspect fine. Checking /proc/drbd I can see that my test > DRBD device is all synced

Re: [ClusterLabs] Help needed getting DRBD cluster working

2015-10-06 Thread Ken Gaillot
On 10/06/2015 09:38 AM, Gordon Ross wrote: > On 5 Oct 2015, at 15:05, Ken Gaillot <kgail...@redhat.com> wrote: >> >> The "rc=6" in the failed actions means the resource's Pacemaker >> configuration is invalid. (For OCF return codes, see >> http://clust
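As a quick reference for the "rc=6" mentioned above, here is a small sketch that maps numeric OCF resource agent return codes to their symbolic names. The numbers and names are the standard OCF ones (6 is OCF_ERR_CONFIGURED, 7 is OCF_NOT_RUNNING); the dictionary and the helper describe_rc are illustrative names, not part of Pacemaker or the resource-agents package.

```python
# Standard OCF resource agent return codes (numeric value -> symbolic name).
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}


def describe_rc(rc):
    """Return the symbolic name for an OCF return code (hypothetical helper)."""
    return OCF_RETURN_CODES.get(rc, "unknown (%d)" % rc)


# The value reported in the failed actions discussed above.
name_for_rc6 = describe_rc(6)
```

A lookup like this can help when reading failed-action lines in status output, where only the number is shown.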

Re: [ClusterLabs] How I can contribute the code and TR fix for default resouce agent?

2015-12-08 Thread Ken Gaillot
On 12/07/2015 01:13 PM, Xiaohua Wang wrote: > Hi Friends, > Since our product is using the Pacemaker and related Resource Agent based on > RHEL 6.5. > We found some bugs and already fixed them. So we want to contribute the code > fixing? > How can we do it ? > > Best Regards > Xiaohua Wang Hi,

Re: [ClusterLabs] Running 'pcs status' cmd on remote node

2015-12-02 Thread Ken Gaillot
On 12/02/2015 06:21 AM, Simon Lawrence wrote: > > In my 2 node test cluster, one node is a physical server (running > Pacemaker 1.1.13), the other is a VM on that server, configured as a > Pacemaker remote node (v1.1.13). > > I get the correct output if I run crm_mon & pcs config on the remote >

Re: [ClusterLabs] Pacemaker crash and fencing failure

2015-11-30 Thread Ken Gaillot
On 11/20/2015 06:38 PM, Brian Campbell wrote: > I've been trying to debug and do a root cause analysis for a cascading > series of failures that a customer hit a couple of days ago, that > caused their filesystem to be unavailable for a couple of hours. > > The original failure was in our own

Re: [ClusterLabs] Resources suddenly get target-role="stopped"

2015-12-04 Thread Ken Gaillot
On 12/04/2015 10:22 AM, Klechomir wrote: > Hi list, > My issue is the following: > > I have very stable cluster, using Corosync 2.1.0.26 and Pacemaker 1.1.8 > (observed the same problem with Corosync 2.3.5 & Pacemaker 1.1.13-rc3) > > Bumped on this issue when started playing with VirtualDomain

[ClusterLabs] Pacemaker 1.1.14 - Release Candidate (try it out!)

2015-12-08 Thread Ken Gaillot
and test the new release. We do many regression tests and simulations, but we can't cover all possible use cases, so your feedback is important and appreciated. (You may notice we're starting with rc2; rc1 was released, but had a compilation issue in some cases.) -- Ken Gaillot <kgail...@redhat.

Re: [ClusterLabs] Help required for N+1 redundancy setup

2015-12-03 Thread Ken Gaillot
subset of OpenAIS, optimized for use with Pacemaker. Corosync 2 is now the preferred membership layer for Pacemaker for most uses, though other layers are still supported. > Thanks. > > On Tue, Dec 1, 2015 at 9:04 PM, Ken Gaillot <kgail...@redhat.com> wrote: > >> On 12/01/2015 0

Re: [ClusterLabs] Stack: unknown and all nodes offline

2015-12-10 Thread Ken Gaillot
On 12/10/2015 01:14 PM, Louis Munro wrote: > I can now answer parts of my own question. > > > My config was missing the quorum configuration: > > quorum { > # Enable and configure quorum subsystem (default: off) > # see also corosync.conf.5 and votequorum.5 > provider:

Re: [ClusterLabs] Stack: unknown and all nodes offline

2015-12-10 Thread Ken Gaillot
On 12/10/2015 12:45 PM, Louis Munro wrote: > Hello all, > > I am trying to get a Corosync 2 cluster going on CentOS 6.7 but I am running > in a bit of a problem with either Corosync or Pacemaker. > crm reports that all my nodes are offline and the stack is unknown (I am not > sure if that is

Re: [ClusterLabs] Early VM resource migration

2015-12-16 Thread Ken Gaillot
On 12/16/2015 10:30 AM, Klechomir wrote: > On 16.12.2015 17:52, Ken Gaillot wrote: >> On 12/16/2015 02:09 AM, Klechomir wrote: >>> Hi list, >>> I have a cluster with VM resources on a cloned active-active storage. >>> >>> VirtualDomain resource mi

Re: [ClusterLabs] Pacemaker documentation license clarification

2015-12-14 Thread Ken Gaillot
On 12/13/2015 06:56 PM, Ferenc Wagner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> On 12/11/2015 10:07 AM, Ferenc Wagner wrote: >> >>> [...] the "Legal Notice" >>> section of the generated Publican documentation (for example

Re: [ClusterLabs] successful ipmi stonith still times out

2015-12-17 Thread Ken Gaillot
On 12/17/2015 10:32 AM, Ron Kerry wrote: > I have a customer (running SLE 11 SP4 HAE) who is seeing the following > stonith behavior running the ipmi stonith plugin. > > Dec 15 14:21:43 test4 pengine[24002]: warning: pe_fence_node: Node > test3 will be fenced because termination was requested >

Re: [ClusterLabs] Early VM resource migration

2015-12-16 Thread Ken Gaillot
On 12/16/2015 02:09 AM, Klechomir wrote: > Hi list, > I have a cluster with VM resources on a cloned active-active storage. > > VirtualDomain resource migrates properly during failover (node standby), > but tries to migrate back too early, during failback, ignoring the > "order" constraint,

Re: [ClusterLabs] [Pacemaker] Beginner | Resources stuck unloading

2015-12-16 Thread Ken Gaillot
On 12/14/2015 12:18 AM, Tyler Hampton wrote: > Hi! > > I'm currently trying to semi-follow Sebastien Han's blog post on > implementing HA with Ceph rbd volumes and I am hitting some walls. The > difference between what I'm trying to do and the blog post is that I'm > trying to implement an

[ClusterLabs] Pacemaker 1.1.14 - Release Candidate 3

2015-12-14 Thread Ken Gaillot
a regression preventing crm_mon from being run as a daemon. Everyone is encouraged to download, compile and test the new release. Your feedback is important and appreciated. -- Ken Gaillot <kgail...@redhat.com>

Re: [ClusterLabs] Pacemaker documentation license clarification

2015-12-11 Thread Ken Gaillot
On 12/11/2015 10:07 AM, Ferenc Wagner wrote: > Hi, > > We're packaging Pacemaker for Debian and this requires a clear picture > of all licenses relevant to the package. The software part is clearly > under GPL-2+ and LGPL-2+, which is fine. However, the "Legal Notice" > section of the generated

Re: [ClusterLabs] [OCF] Pacemaker reports a multi-state clone resource instance as running while it is not in fact

2016-01-04 Thread Ken Gaillot
On 01/04/2016 09:25 AM, Bogdan Dobrelya wrote: > On 04.01.2016 15:50, Bogdan Dobrelya wrote: >> So far so bad. >> I made a dummy OCF script [0] to simulate an example >> promote/demote/notify failure mode for a multistate clone resource which >> is very similar to the one I reported originally.

Re: [ClusterLabs] IPaddr2 cluster-ip restarts on all nodes after failover

2016-01-06 Thread Ken Gaillot
On 01/06/2016 02:40 PM, Joakim Hansson wrote: > Hi list! > I'm running a 3-node vm-cluster in which all the nodes run Tomcat (Solr) > from the same disk using GFS2. > On top of this I use IPaddr2-clone for cluster-ip and loadbalancing between > all the nodes. > > Everything works fine, except

Re: [ClusterLabs] [Q] Pacemaker: Kamailio resource agent

2016-01-08 Thread Ken Gaillot
On 12/26/2015 05:27 AM, Sebish wrote: > Hello to all ha users, > > first of all thanks for you work @ mailinglist, pacemaker and ras! > > I have an issue with the kamailio resource agent > > (ra) and it would be

Re: [ClusterLabs] Help required for N+1 redundancy setup

2016-01-08 Thread Ken Gaillot
lable? > Thanks. > > > On Fri, Jan 8, 2016 at 9:30 PM, Ken Gaillot <kgail...@redhat.com> wrote: > >> On 01/08/2016 06:55 AM, Nikhil Utane wrote: >>> Would like to validate my final config. >>> >>> As I mentioned earlier, I will be havi

Re: [ClusterLabs] [Q] Check on application layer (kamailio, openhab)

2015-12-21 Thread Ken Gaillot
On 12/19/2015 10:21 AM, Sebish wrote: > Dear all ha-list members, > > I am trying to setup two availability checks on application layer using > heartbeat and pacemaker. > To be more concrete I need 1 resource agent (ra) for openHAB and 1 for > Kamailio SIP Proxy. > > *My setup: > * > >+

Re: [ClusterLabs] Anyone successfully install Pacemaker/Corosync on Freebsd?

2015-12-21 Thread Ken Gaillot
On 12/19/2015 04:56 PM, mike wrote: > Hi All, > > just curious if anyone has had any luck at one point installing > Pacemaker and Corosync on FreeBSD. I have to install from source of > course and I've run into an issue when running ./configure while trying > to install Corosync. The process

Re: [ClusterLabs] master/slave resource agent without demote

2015-11-30 Thread Ken Gaillot
On 11/25/2015 10:57 PM, Waldemar Brodkorb wrote: > Hi, > Andrei Borzenkov wrote, > >> On Tue, Nov 24, 2015 at 5:19 PM, Waldemar Brodkorb >> wrote: >>> Hi, >>> >>> we are using a derivate of the Tomcat OCF script. >>> Our web application needs to be promoted (via a wget

Re: [ClusterLabs] Help required for N+1 redundancy setup

2015-12-01 Thread Ken Gaillot
On 12/01/2015 05:31 AM, Nikhil Utane wrote: > Hi, > > I am evaluating whether it is feasible to use Pacemaker + Corosync to add > support for clustering/redundancy into our product. Most definitely > Our objectives: > 1) Support N+1 redundancy. i,e. N Active and (up to) 1 Standby. You can do

Re: [ClusterLabs] start service after filesystemressource

2015-11-20 Thread Ken Gaillot
On 11/20/2015 07:38 AM, haseni...@gmx.de wrote: > Hi, > I want to start several services after the drbd resource and the filesystem > are > available. This is my current configuration: > node $id="184548773" host-1 \ > attributes standby="on" > node $id="184548774" host-2 \ >

Re: [ClusterLabs] Wait until resource is really ready before moving clusterip

2016-01-12 Thread Ken Gaillot
On 01/12/2016 07:57 AM, Kristoffer Grönlund wrote: > Joakim Hansson writes: > >> Hi! >> I have a cluster running tomcat which in turn run solr. >> I use three nodes with loadbalancing via ipaddr2. >> The thing is, when tomcat is started on a node it takes about 2

Re: [ClusterLabs] dovecot RA

2016-06-08 Thread Ken Gaillot
On 06/08/2016 10:11 AM, Dmitri Maziuk wrote: > On 2016-06-08 09:11, Ken Gaillot wrote: >> On 06/08/2016 03:26 AM, Jan Pokorný wrote: > >>> Pacemaker can drive systemd-managed services for quite some time. >> >> This is as easy as changing lsb:dovecot to system

Re: [ClusterLabs] Different pacemaker versions split cluster

2016-06-08 Thread Ken Gaillot
familiar with any relevant changes between 2.3.3 and 2.3.5, so I'm not sure what's going wrong. > > > On Monday, 6 June 2016 17:30, Ken Gaillot <kgail...@redhat.com> > wrote: > On 05/30/2016 01:14 PM, DacioMF wrote: >> Hi, >> >> I had 4 nodes w

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Ken Gaillot
On 06/05/2016 07:27 PM, Andrew Beekhof wrote: > On Sat, Jun 4, 2016 at 12:16 AM, Ken Gaillot <kgail...@redhat.com> wrote: >> On 06/02/2016 08:01 PM, Andrew Beekhof wrote: >>> On Fri, May 20, 2016 at 1:53 AM, Ken Gaillot <kgail...@redhat.com> wrote: >>>>

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Ken Gaillot
On 06/06/2016 12:25 PM, Vladislav Bogdanov wrote: > 06.06.2016 19:39, Ken Gaillot wrote: >> On 06/05/2016 07:27 PM, Andrew Beekhof wrote: >>> On Sat, Jun 4, 2016 at 12:16 AM, Ken Gaillot <kgail...@redhat.com> >>> wrote: >>>> On 06/02/2016 08:01 PM, And

Re: [ClusterLabs] Pacemaker reload Master/Slave resource

2016-06-06 Thread Ken Gaillot
On 05/20/2016 06:20 AM, Felix Zachlod (Lists) wrote: > version 1.1.13-10.el7_2.2-44eb2dd > > Hello! > > I am currently developing a master/slave resource agent. So far it is working > just fine, but this resource agent implements reload() and this does not work > as expected when running as

Re: [ClusterLabs] Creating a rule based on whether a quorum exists

2016-06-06 Thread Ken Gaillot
On 05/30/2016 08:13 AM, Les Green wrote: > Hi All, > > I have a two-node cluster with no-quorum-policy=ignore and an external > ping responder to try to determine if a node has its network down (it's > the dead one), or if the other node is really dead.. > > The ping helps to determine who the

Re: [ClusterLabs] Different pacemaker versions split cluster

2016-06-06 Thread Ken Gaillot
On 05/30/2016 01:14 PM, DacioMF wrote: > Hi, > > I had 4 nodes with Ubuntu 14.04LTS in my cluster and all of them worked well. > I need to upgrade all my cluster nodes to Ubuntu 16.04LTS without stopping my > resources. Two nodes have been updated to 16.04 and the two others remain > with 14.04. The

Re: [ClusterLabs] Pacemaker 1.1.15 - Release Candidate 4

2016-06-12 Thread Ken Gaillot
On 06/12/2016 07:28 AM, Ferenc Wágner wrote: > Ken Gaillot <kgail...@redhat.com> writes: > >> With this release candidate, we now provide three sample alert scripts >> to use with the new alerts feature, installed in the >> /usr/share/pacemaker/alerts directory.

[ClusterLabs] Pacemaker 1.1.15 - Release Candidate 4

2016-06-10 Thread Ken Gaillot
encouraged to download, compile and test the new release. Your feedback is important and appreciated. This is most likely very close to the final 1.1.15 release. -- Ken Gaillot <kgail...@redhat.com>

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-03 Thread Ken Gaillot
On 06/02/2016 08:01 PM, Andrew Beekhof wrote: > On Fri, May 20, 2016 at 1:53 AM, Ken Gaillot <kgail...@redhat.com> wrote: >> A recent thread discussed a proposed new feature, a new environment >> variable that would be passed to resource agents, indicating whether a

Re: [ClusterLabs] Processing failed op monitor for WebSite on node1: not running (7)

2016-06-14 Thread Ken Gaillot
On 06/14/2016 03:10 AM, Jeremy Voisin wrote: > Hi all, > > > > We actually have a 2 nodes cluster with corosync and pacemaker for > httpd. We have 2 VIP configured. > > > > Since we’ve added ModSecurity 2.9, httpd restart is very slow. So I > increased the start / stop timeout. But

Re: [ClusterLabs] Apache Active Active Balancer without FileSystem Cluster

2016-06-13 Thread Ken Gaillot
On 06/13/2016 08:06 AM, Klaus Wenninger wrote: > On 06/13/2016 02:33 PM, alan john wrote: >> Dear All, >> >> I am trying to set up an Apache active-active cluster. I do not wish >> to have a common file system for both nodes. However I do not like >> to have pcs/corosync to start or stop

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Ken Gaillot
On 06/06/2016 05:45 PM, Adam Spiers wrote: > Adam Spiers <aspi...@suse.com> wrote: >> Andrew Beekhof <abeek...@redhat.com> wrote: >>> On Tue, Jun 7, 2016 at 8:29 AM, Adam Spiers <aspi...@suse.com> wrote: >>>> Ken Gaillot <kgail...@redhat.com

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Ken Gaillot
On 06/06/2016 03:30 PM, Vladislav Bogdanov wrote: > 06.06.2016 22:43, Ken Gaillot wrote: >> On 06/06/2016 12:25 PM, Vladislav Bogdanov wrote: >>> 06.06.2016 19:39, Ken Gaillot wrote: >>>> On 06/05/2016 07:27 PM, Andrew Beekhof wrote: >>>>> On Sat

Re: [ClusterLabs] Minimum configuration for dynamically adding a node to a cluster

2016-06-08 Thread Ken Gaillot
On 06/08/2016 06:54 AM, Jehan-Guillaume de Rorthais wrote: > > > On 8 June 2016 13:36:03 GMT+02:00, Nikhil Utane > wrote: >> Hi, >> >> Would like to know the best and easiest way to add a new node to an >> already >> running cluster. >> >> Our limitation: >> 1)

Re: [ClusterLabs] dovecot RA

2016-06-08 Thread Ken Gaillot
On 06/08/2016 03:26 AM, Jan Pokorný wrote: > On 07/06/16 14:48 -0500, Dimitri Maziuk wrote: >> next question: I'm on centos 7 and there's no more /etc/init.d/> anything>. With lennartware spreading, is there a coherent plan to deal >> with former LSB agents? > > Pacemaker can drive

Re: [ClusterLabs] dovecot RA

2016-06-08 Thread Ken Gaillot
On 06/08/2016 09:11 AM, Ken Gaillot wrote: > On 06/08/2016 03:26 AM, Jan Pokorný wrote: >> On 07/06/16 14:48 -0500, Dimitri Maziuk wrote: >>> next question: I'm on centos 7 and there's no more /etc/init.d/<anything>. With lennartware spreading, is there a coherent p

Re: [ClusterLabs] pacemaker_remoted XML parse error

2016-06-08 Thread Ken Gaillot
On 06/08/2016 06:14 AM, Narayanamoorthy Srinivasan wrote: > I have a pacemaker cluster with two pacemaker remote nodes. Recently the > remote nodes started throwing below errors and SDB started self-fencing. > Appreciate if someone throws light on what could be the issue and the fix. > > OS -

Re: [ClusterLabs] newbie questions

2016-05-31 Thread Ken Gaillot
On 05/31/2016 03:59 PM, Jay Scott wrote: > Greetings, > > Cluster newbie > Centos 7 > trying to follow the "Clusters from Scratch" intro. > 2 nodes (yeah, I know, but I'm just learning) > > [root@smoking ~]# pcs status > Cluster name: > Last updated: Tue May 31 15:32:18 2016 Last change:

Re: [ClusterLabs] Antw: RES: Performance of a mirrored LV (cLVM) with OCFS: Attempt to monitor it

2016-05-27 Thread Ken Gaillot
On 05/27/2016 12:58 AM, Ulrich Windl wrote: > Hi! > > Thanks for this info. We actually run the "noop" scheduler for the SAN > storage (as per manufacturer's recommendation), because one "disk" is actually > spread over up to 40 disks. > Other settings we changed were: > queue/rotational:0 >

[ClusterLabs] Pacemaker 1.1.15 - Release Candidate 3

2016-05-27 Thread Ken Gaillot
re release candidate, with the final release in mid- to late June. -- Ken Gaillot <kgail...@redhat.com> ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http

Re: [ClusterLabs] Antw: Re: Q: status section of CIB: "last_0" IDs and "queue-time"

2016-06-02 Thread Ken Gaillot
On 06/02/2016 01:07 AM, Ulrich Windl wrote: >>>> Ken Gaillot <kgail...@redhat.com> schrieb am 01.06.2016 um 16:14 in >>>> Nachricht > <574eede2.1090...@redhat.com>: >> On 06/01/2016 06:14 AM, Ulrich Windl wrote: >>> Hello! >>>

[ClusterLabs] FYI: Alert script permissions

2016-06-01 Thread Ken Gaillot
as the ability to execute the script itself. The new approach has obvious security benefits but may be less convenient in some cases. If there is a need, we may add the ability to configure an alert script's run-as user in a future version. -- Ken Gaillot <kgail...@redhat.

[ClusterLabs] FYI: ocf:pacemaker:controld issue in rc3

2016-06-01 Thread Ken Gaillot
to use it, you must compile the entire pacemaker package, not just grab the agent. Anyone not using the controld agent is still encouraged to download and test rc3, which has many improvements and is fairly close to what the final release will be. -- Ken Gaillot <kgail...@redhat.

Re: [ClusterLabs] IPaddr2 failed to start

2016-06-22 Thread Ken Gaillot
On 06/22/2016 06:46 AM, wd wrote: > if [ X`uname -s` != "XLinux" ]; then > ocf_log err "IPaddr2 only supported Linux." > exit $OCF_ERR_INSTALLED > fi > > Do you run on a linux? what is 'uname -s' command returned? It could also return "not installed" if

Re: [ClusterLabs] Node is silently unfenced if transition is very long

2016-06-21 Thread Ken Gaillot
On 06/17/2016 07:05 AM, Vladislav Bogdanov wrote: > 03.05.2016 01:14, Ken Gaillot wrote: >> On 04/19/2016 10:47 AM, Vladislav Bogdanov wrote: >>> Hi, >>> >>> Just found an issue with node is silently unfenced. >>> >>> That is quite large setup

Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Ken Gaillot
On 06/20/2016 11:33 PM, Nikhil Utane wrote: > Let me give the full picture about our solution. It will then make it > easy to have the discussion. > > We are looking at providing N + 1 Redundancy to our application servers, > i.e. 1 standby for up to N active (currently N<=5). Each server will

[ClusterLabs] Pacemaker 1.1.15 released

2016-06-21 Thread Ken Gaillot
code to this release, including Andrew Beekhof, Bin Liu, Christian Schneider, Christoph Berg, David Shane Holden, Ferenc Wágner, Gao Yan, Hideo Yamauchi, Jan Pokorný, Ken Gaillot, Klaus Wenninger, Kostiantyn Ponomarenko, Kristoffer Grönlund, Lars Ellenberg, Michal Koutný, Nakahira Kazutomo, Oyvi

Re: [ClusterLabs] restarting pacemakerd

2016-06-20 Thread Ken Gaillot
On 06/18/2016 05:15 AM, Ferenc Wágner wrote: > Hi, > > Could somebody please elaborate a little why the pacemaker systemd > service file contains "Restart=on-failure"? I mean that a failed node > gets fenced anyway, so most of the time this would be a futile effort. > On the other hand, one
