Re: [ClusterLabs] Pacemaker/Corosync on FreeBSD

2017-12-04 Thread Jan Pokorný
Hello Alberto, On 04/12/17 16:12 -0400, Alberto Mijares wrote: > At this point, I need to know if someone is using pacemaker/corosync > on FreeBSD. Is it a problem with crmsh only? Well, it's enough to have a look at who develops this high-level tooling (crm, pcs) and you'll figure out t

Re: [ClusterLabs] systemd's TasksMax and pacemaker

2017-12-02 Thread Jan Pokorný
On 15/11/17 11:16 +0100, Jan Pokorný wrote: > On 14/11/17 15:07 -0600, Ken Gaillot wrote: >> It is conceivable in a large cluster that Pacemaker could exceed >> this limit > > [of 512 or 4915 tasks allowed per service process tree, possibly > overridden with systemd-syst

[ClusterLabs] Should pacemaker pursue its own and corosync's instant resurrection if either dies? (Was: Is corosync supposed to be restarted if it dies?)

2017-12-02 Thread Jan Pokorný
On 30/11/17 11:00 +0300, Andrei Borzenkov wrote: > On Thu, Nov 30, 2017 at 12:42 AM, Jan Pokorný wrote: >> On 29/11/17 22:00 +0100, Jan Pokorný wrote: >>> On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: >>>> I'm not sure what is expected outcome, but pacema

Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-29 Thread Jan Pokorný
On 29/11/17 22:00 +0100, Jan Pokorný wrote: > On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: >> 28.11.2017 13:01, Jan Pokorný wrote: >>> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: >>>> Sent from iPhone >>>> >>>>> 27 Nov 2017

Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-29 Thread Jan Pokorný
On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: > 28.11.2017 13:01, Jan Pokorný wrote: >> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: >>> Sent from iPhone >>> >>>> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote: >>>> >>

Re: [ClusterLabs] Is corosync supposed to be restarted if it dies?

2017-11-28 Thread Jan Pokorný
On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: > Sent from iPhone > >> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote: >> >> Andrei Borzenkov writes: >> >>> 25.11.2017 10:05, Andrei Borzenkov wrote: >>> In one of the guides, the suggested procedure to simulate split brain was to kill >>

Re: [ClusterLabs] pcs create master/slave resource doesn't work

2017-11-27 Thread Jan Pokorný
On 27/11/17 12:07 -0600, Ken Gaillot wrote: > On Fri, 2017-11-24 at 18:00 +0800, Hui Xiang wrote: >>   Very appreciated on your help, I am getting further more, but still >> it looks very strange. >> >> 1. To use "debug-promote", I upgrade pacemaker from 1.12 to 1.16, pcs >> to 0.9.160. >> >> 2.

Re: [ClusterLabs] pcs create master/slave resource doesn't work

2017-11-23 Thread Jan Pokorný
On 23/11/17 23:52 +0800, Hui Xiang wrote: > I am working on HA with 3-nodes, which has below configurations: > > """ > pcs resource create ovndb_servers ocf:ovn:ovndb-servers \ > master_ip=168.254.101.2 \ > op monitor interval="10s" \ > op monitor interval="11s" role=Master > pcs resource ma

Re: [ClusterLabs] systemd's TasksMax and pacemaker

2017-11-15 Thread Jan Pokorný
On 14/11/17 15:07 -0600, Ken Gaillot wrote: > It is conceivable in a large cluster that Pacemaker could exceed > this limit [of 512 or 4915 tasks allowed per service process tree, possibly overridden with systemd-system.conf(5) configuration], > so we are now recommending that users set TasksMax=

Re: [ClusterLabs] [Announce] clufter v0.77.0 released

2017-11-10 Thread Jan Pokorný
On 10/11/17 23:25 +0100, Jan Pokorný wrote: > - bug fixes: > [...] > . all commands having sequence of pcs commands on the output, > hence getting post-processed (line-wrapped and generally > prettified) with the aim to get them human-friendly, might > previousl

[ClusterLabs] [Announce] clufter v0.77.0 released

2017-11-10 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.77.0 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Issue in starting Pacemaker Virtual IP in RHEL 7

2017-11-09 Thread Jan Pokorný
On 06/11/17 10:43 +, Somanath Jeeva wrote: > I am using a two node pacemaker cluster with teaming enabled. The cluster has > > 1. Two team interfaces with different subnets. > > 2. The team1 has a NFS VIP plumbed to it. > > 3. The VirtualIP from pacemaker is configured to p

Re: [ClusterLabs] are there equivalent restful apis for crm commands

2017-11-07 Thread Jan Pokorný
On 07/11/17 09:14 -0600, Ken Gaillot wrote: > On Tue, 2017-11-07 at 16:06 +0800, he.hailo...@zte.com.cn wrote: >> For some purpose, I have to acquire some info within the docker >> container that usually can be made by executing crm commands on the >> host. Importing this tool into the container ma

Re: [ClusterLabs] MYSQL data on DRBD

2017-10-25 Thread Jan Pokorný
On 25/10/17 12:45 +0300, Антон Сацкий wrote: > Digimer, yes, you are right, I need to run MySQL on one server > but somehow when I run > pcs resource create MYSQL lsb:mysql > pacemaker is also trying to start mysql on a backup server Not sure I follow properly, but you likely don't want to keep pacemak

Re: [ClusterLabs] Antw: Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Jan Pokorný
On 19/09/17 09:13 +0200, Ulrich Windl wrote: Ken Gaillot wrote on 18.09.2017 at 19:48 in message > <1505756918.5541.4.ca...@redhat.com>: >> As discussed at the recent ClusterLabs Summit, I plan to start the >> release cycle for Pacemaker 1.1.18 soon. >> >> There will be the usual b

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Jan Pokorný
On 19/09/17 08:45 +0200, Klaus Wenninger wrote: > We could as well deprecate use of CRM_NOTIFY_* in alert-agents. > Just don't know an easy way of writing out a deprecation warning > upon a script using one of these. Rename to CRM_NOTIFY_DEPRECATED_* to allow emergency sed-based re-enabling in scr

Re: [ClusterLabs] Friday morning lake swim challenge (Was: Clusterlabs Summit 2017 is on!)

2017-09-07 Thread Jan Pokorný
On 07/09/17 18:34 +0200, Jan Pokorný wrote: > On 07/09/17 00:32 +0200, Jan Pokorný wrote: >> Side question: is anyone up for the early Friday morning lake swim >> challenge? There're at least two contestants at this point :) > > If you haven't left the city yet

[ClusterLabs] Friday morning lake swim challenge (Was: Clusterlabs Summit 2017 is on!)

2017-09-07 Thread Jan Pokorný
On 07/09/17 00:32 +0200, Jan Pokorný wrote: > Side question: is anyone up for the early Friday morning lake swim > challenge? There're at least two contestants at this point :) If you haven't left the city yet and wanna try, see you at 7 am around the entrance of Sorat Saxx Hot

Re: [ClusterLabs] [ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-09-06 Thread Jan Pokorný
On 24/07/17 16:59 +0200, Jan Pokorný wrote: > On 23/07/17 12:32 +0100, Adam Spiers wrote: >> Jan Pokorný wrote: >>> So, going to attend summit and want your key signed while reciprocally >>> spreading the web of trust? >>> Awesome, let's reuse the steps f

Re: [ClusterLabs] Clusterlabs Summit 2017 is on!

2017-09-06 Thread Jan Pokorný
On 06/09/17 09:45 +0200, Digimer wrote: > This is, by far, the largest summit yet. Very stoked! > > Fellow attendees; Feel free to share your pictures! > > https://photos.app.goo.gl/vBeC83yWTZcxRgux1 "Wow! Such HA. Very cluster" :) Side question: is anyone up for the early Friday morning lake

Re: [ClusterLabs] Antw: Is there a way to ignore a single monitoring timeout

2017-09-01 Thread Jan Pokorný
On 01/09/17 09:48 +0300, Klechomir wrote: > I have cases, when for an unknown reason a single monitoring request > never returns result. > So having bigger timeouts doesn't resolve this problem. If I get you right, the pain point here is a command called by the resource agents during monitor opera

Re: [ClusterLabs] - webdav/davfs

2017-08-22 Thread Jan Pokorný
Hello Philipp, [first of all, I've noticed you are practising a pretty bad habit of starting a new topic/thread by simply responding to an existing one, hence distorting the clear thread overview of the exchanges going on for some of us ... please stop that, there's nothing to be afraid of going f

Re: [ClusterLabs] Antw: Re: big trouble with a DRBD resource

2017-08-22 Thread Jan Pokorný
On 08/08/17 09:42 -0500, Ken Gaillot wrote: > On Tue, 2017-08-08 at 10:18 +0200, Ulrich Windl wrote: > Ken Gaillot wrote on 07.08.2017 at 22:26 in > message >> <1502137587.5788.83.ca...@redhat.com>: >> >> [...] >>> Unmanaging doesn't stop monitoring a resource, it only prevents start

Re: [ClusterLabs] nginx resource - how to reload config or do a config test

2017-08-10 Thread Jan Pokorný
On 07/08/17 11:53 -0500, Ken Gaillot wrote: > On Mon, 2017-08-07 at 16:32 +0200, Przemyslaw Kulczycki wrote: >> 2) How do I do an nginx config test with the clustered resource? >> >> >> I know I can do a "nginx -t", but is there an option to do it using >> pacemaker/pcs commands on both nodes? >

Re: [ClusterLabs] Fwd: Multi cluster

2017-08-05 Thread Jan Pokorný
On 05/08/17 00:10 +0200, Jan Pokorný wrote: > [addendum inline] And some more... > On 04/08/17 18:35 +0200, Jan Pokorný wrote: >> On 03/08/17 20:37 +0530, sharafraz khan wrote: >>> I am new to clustering so please ignore if my question sounds silly, I have >>> a

Re: [ClusterLabs] Fwd: Multi cluster

2017-08-04 Thread Jan Pokorný
[addendum inline] On 04/08/17 18:35 +0200, Jan Pokorný wrote: > On 03/08/17 20:37 +0530, sharafraz khan wrote: >> I am new to clustering so please ignore if my question sounds silly, I have >> a requirement wherein I need to create a cluster for an ERP application with >> apache

Re: [ClusterLabs] Fwd: Multi cluster

2017-08-04 Thread Jan Pokorný
On 03/08/17 20:37 +0530, sharafraz khan wrote: > I am new to clustering so please ignore if my question sounds silly, I have > a requirement wherein I need to create a cluster for an ERP application with > apache, VIP component, below is the scenario > > We have 5 Sites, > 1. DC > 2. Site A > 3. Site B

Re: [ClusterLabs] Notification agent and Notification recipients

2017-08-04 Thread Jan Pokorný
On 04/08/17 11:06 +0530, Sriram wrote: > Any idea what could have gone wrong or if there are other ways to achieve > the same ? Sriram, I have just answered in the original thread. Note that it's that part of the year where vacations are quite common, so even if you are eager to know the answer,

Re: [ClusterLabs] Notification agent and Notification recipients

2017-08-04 Thread Jan Pokorný
On 03/08/17 12:31 +0530, Sriram wrote: > We have a four node cluster (1 active : 3 standby) in our lab for a > particular service. If the active node goes down, one of the three standby > node becomes active. Now there will be (1 active : 2 standby : 1 offline). > > Is there any way where this n

Re: [ClusterLabs] [ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-24 Thread Jan Pokorný
On 23/07/17 12:32 +0100, Adam Spiers wrote: > Jan Pokorný wrote: >> So, going to attend summit and want your key signed while reciprocally >> spreading the web of trust? >> Awesome, let's reuse the steps from the last time: >> >> Once you have a key pair

Re: [ClusterLabs] epic fail

2017-07-24 Thread Jan Pokorný
On 23/07/17 14:40 +0200, Valentin Vidic wrote: > On Sun, Jul 23, 2017 at 07:27:03AM -0500, Dmitri Maziuk wrote: >> So yesterday I ran yum update that pulled in the new pacemaker and tried to >> restart it. The node went into its usual "can't unmount drbd because kernel >> is using it" and got stonit

[ClusterLabs] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-21 Thread Jan Pokorný
Hello cluster masters :-) as there's a little less than 7 weeks left to "The Summit" meetup, it's about time to get the ball rolling so we can voluntarily augment the digital trust amongst us, the attendees, on an OpenPGP basis. Doing that, we'll actually establish a traditi

Re: [ClusterLabs] pcs: how to properly unset a value for resource/stonith? [Was: (no subject)]

2017-07-20 Thread Jan Pokorný
Hello ArekW, first of all, gentle reminder to always set the subject for the posts to the list (or, as a rule of thumb, in any email-based conversation). On 20/07/17 08:43 +0200, Klaus Wenninger wrote: > On 07/20/2017 07:21 AM, ArekW wrote: >> Hi, How to properly unset a value with pcs? Set to fa

Re: [ClusterLabs] fence_ipmilan power_timeout ValueError

2017-07-14 Thread Jan Pokorný
On 13/07/17 09:11 -0400, Ron Kerry wrote: > I have a customer who recently tried to create fence-ipmilan > resources with a power_timeout parameters set to 60s. His reasoning > was that the parameter is set to be a string value, so he expected > the 's' modifier on the 60 value to be interpreted co

Re: [ClusterLabs] Pacemaker 1.1.17 released

2017-07-07 Thread Jan Pokorný
On 06/07/17 14:54 -0500, Ken Gaillot wrote: > ClusterLabs is proud to announce the latest release of the Pacemaker > cluster resource manager, version 1.1.17. The source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.17 Congratulations on this leap i

Re: [ClusterLabs] clusterlabs.org now supports https :-)

2017-06-29 Thread Jan Pokorný
On 26/06/17 11:04 -0500, Ken Gaillot wrote: > Thanks to the wonderful service provided by Let's Encrypt[1], we now > have an SSL certificate for the ClusterLabs websites. You can use the > websites with secure encryption by starting the URL with "https", for > example: > >https://www.clusterla

Re: [ClusterLabs] ClusterIP won't return to recovered node

2017-06-28 Thread Jan Pokorný
On 27/06/17 15:15 -0400, Dan Ragle wrote: > All that said, and for anyone interested, here's the recipe I tried that > appears to work well. After setting it up this way, I was able to > standby/unstandby each of the nodes in turn with the clones consistently > re-splitting after each unstandby (an

Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-26 Thread Jan Pokorný
[Hui, no need to address us individually along with the list, we are both subscribed to it since around the beginning] On 26/06/17 16:10 +0800, Hui Xiang wrote: > Thanks guys!! > > @Ken > I did "ifconfig ethx down" to make the cluster interface down. That's what I suspected and what I tried to s

Re: [ClusterLabs] Notifications on changes in clustered LVM

2017-06-23 Thread Jan Pokorný
On 20/06/17 10:59 +0200, Ferenc Wágner wrote: > Digimer writes: > >> On 19/06/17 11:40 PM, Andrei Borzenkov wrote: >>> udev events are sent over netlink, not D-Bus. >> >> I've not used that before. Any docs on how to listen for those events, >> by chance? If nothing off hand, don't worry, I can

Re: [ClusterLabs] vip is not removed after node lost connection with the other two nodes

2017-06-23 Thread Jan Pokorný
On 23/06/17 08:48 -0500, Ken Gaillot wrote: > On 06/22/2017 09:44 PM, Hui Xiang wrote: >> I have set up 3 nodes (node-1, node-2, node-3) as controller nodes; a >> VIP is selected by pacemaker between them. After manually bringing the >> management interface down on node-1 (used by corosync) but still

Re: [ClusterLabs] ocf_take_lock is NOT actually safe to use

2017-06-22 Thread Jan Pokorný
On 21/06/17 16:40 +0200, Lars Ellenberg wrote: > Repost to a wider audience, to raise awareness for this. Appreciated, Lars. Adding developers ML for possibly even larger outreach. > ocf_take_lock may or may not be better than nothing. > > It at least "annotates" that the author would like to pr

Re: [ClusterLabs] time taken for pcs resource create for IPAddr resource.

2017-06-14 Thread Jan Pokorný
On 14/06/17 21:52 +0200, Jan Pokorný wrote: > On 15/06/17 00:12 +0530, ashutosh tiwari wrote: >> Also wanted to know whether there is a way to add multiple resources in one go, >> presently we are doing the resource addition serially. >> >> the command we use to create re

Re: [ClusterLabs] time taken for pcs resource create for IPAddr resource.

2017-06-14 Thread Jan Pokorný
On 15/06/17 00:12 +0530, ashutosh tiwari wrote: > Also wanted to know whether there is a way to add multiple resources in one go, > presently we are doing the resource addition serially. > > the command we use to create resources is > "pcs resource create resname ocf:abc:IPaddr ip=x.x.x.x cidr_netmask=

[ClusterLabs] [Announce] clufter v0.76.0 released

2017-06-06 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.76.0 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

[ClusterLabs] [Announce] clufter v0.75.0 released

2017-05-26 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.75.0 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Fwd: Unable to start cluster (Pacemaker/Corosync)

2017-05-09 Thread Jan Pokorný
On 09/05/17 09:51 -0500, Ken Gaillot wrote: > On 05/09/2017 02:44 AM, Handra Cs wrote: >> I am currently trying to configure Pacemaker/Corosync. I managed to >> install the required packages for the cluster configuration, however I >> could not start the cluster service. Based on the log file, ther

Re: [ClusterLabs] Fraud Detection Check?

2017-05-01 Thread Jan Pokorný
On 13/04/17 10:30 -0500, Dmitri Maziuk wrote: > On 2017-04-13 01:39, Jan Pokorný wrote: > >> After a bit of a search, the best practice at the list server seems to >> be: >> >>> [...] if you change the message (eg, by adding a list signature or >>> by

Re: [ClusterLabs] [ClusterLabs Developers] checking all procs on system enough during stop action?

2017-04-24 Thread Jan Pokorný
On 24/04/17 17:32 +0200, Jehan-Guillaume de Rorthais wrote: > On Mon, 24 Apr 2017 17:08:15 +0200 > Lars Ellenberg wrote: > >> On Mon, Apr 24, 2017 at 04:34:07PM +0200, Jehan-Guillaume de Rorthais wrote: >>> Hi all, >>> >>> In the PostgreSQL Automatic Failover (PAF) project, one of most frequent

[ClusterLabs] Upcoming "compact containers handling" feature discussion (Was: Coming in Pacemaker 1.1.17: container bundles)

2017-04-19 Thread Jan Pokorný
On 18/04/17 15:52 +0200, Jan Pokorný wrote: > On 17/04/17 09:51 -0500, Ken Gaillot wrote: >> On 04/13/2017 07:04 AM, Jan Pokorný wrote: >>> On 03/04/17 09:47 -0500, Ken Gaillot wrote: >>>> With a group, you could reproduce most of this functionality, though it >

Re: [ClusterLabs] Coming in Pacemaker 1.1.17: container bundles

2017-04-18 Thread Jan Pokorný
On 17/04/17 09:51 -0500, Ken Gaillot wrote: > On 04/13/2017 07:04 AM, Jan Pokorný wrote: >> On 03/04/17 09:47 -0500, Ken Gaillot wrote: >>> On 04/03/2017 02:12 AM, Ulrich Windl wrote: >>>>>>> Ken Gaillot wrote on 01.04.2017 at 00:43 in >>>>>

Re: [ClusterLabs] Coming in Pacemaker 1.1.17: container bundles

2017-04-13 Thread Jan Pokorný
On 03/04/17 09:47 -0500, Ken Gaillot wrote: > On 04/03/2017 02:12 AM, Ulrich Windl wrote: > Ken Gaillot wrote on 01.04.2017 at 00:43 in > message >> <981d420d-73b2-3f24-a67c-e9c66dafb...@redhat.com>: >> >> [...] >>> Pacemaker 1.1.17 introduces a new type of resource: the "bundle". A

Re: [ClusterLabs] Fraud Detection Check?

2017-04-12 Thread Jan Pokorný
On 13/04/17 08:21 +0200, Jan Pokorný wrote: > On 12/04/17 17:16 -0500, Dimitri Maziuk wrote: >> On 04/12/2017 04:36 PM, Jan Pokorný wrote: >> >>> Eric, as of now, to get rid of the fraud warnings, it's primarily your >>> emailing software that needs to

Re: [ClusterLabs] Fraud Detection Check?

2017-04-12 Thread Jan Pokorný
On 12/04/17 17:16 -0500, Dimitri Maziuk wrote: > On 04/12/2017 04:36 PM, Jan Pokorný wrote: > >> Eric, as of now, to get rid of the fraud warnings, it's primarily your >> emailing software that needs to be taught to be less picky either when >> sending, i.e., als

Re: [ClusterLabs] Fraud Detection Check?

2017-04-12 Thread Jan Pokorný
On 11/04/17 09:08 +0200, Jan Pokorný wrote: > On 07/04/17 18:32 +, Eric Robinson wrote: >> What the heck, ClusterLabs? Why does your system keep tagging my >> emails as potential fraud? You guys got a thing against Office 365? > > Do I understand it correctly that

Re: [ClusterLabs] Fraud Detection Check?

2017-04-11 Thread Jan Pokorný
On 07/04/17 18:32 +, Eric Robinson wrote: > What the heck, ClusterLabs? Why does your system keep tagging my > emails as potential fraud? You guys got a thing against Office 365? Do I understand it correctly that your Office 365 interface is making these accusations? I can imagine that's

[ClusterLabs] [Announce] clufter v0.70.0 released

2017-03-22 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.70.0 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] FenceAgentAPI

2017-03-07 Thread Jan Pokorný
On 06/03/17 17:12 -0500, Digimer wrote: > The old FenceAgentAPI document on fedorahosted is gone now that fedora > hosted is closed. So I created a copy on the clusterlabs wiki: > > http://wiki.clusterlabs.org/wiki/FenceAgentAPI Note that just few days ago I've announced that the page has moved

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-02-15 Thread Jan Pokorný
On 15/02/17 18:04 +0100, Jan Pokorný wrote: > On 15/02/17 15:13 +, Christine Caulfield wrote: >> On 15/02/17 14:50, Jan Friesse wrote: >>>> Hi all, >>>> >>>> Corosync Cluster Engine, version '2.3.4' >>>> Copyright (c) 2006

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-02-15 Thread Jan Pokorný
On 15/02/17 15:13 +, Christine Caulfield wrote: > On 15/02/17 14:50, Jan Friesse wrote: >>> Hi all, >>> >>> Corosync Cluster Engine, version '2.3.4' >>> Copyright (c) 2006-2009 Red Hat, Inc. >>> >>> Today I found corosync consuming 100% cpu. Strace showed following: >>> >>> write(7, "\v\0\0\

Re: [ClusterLabs] pcsd 99% CPU

2017-02-06 Thread Jan Pokorný
On 03/02/17 16:08 -0500, Scott Greenlese wrote: > Over the past few days, I noticed that pcsd and ruby process is pegged at > 99% CPU, and commands such as pcs status pcsd take up to 5 minutes to > complete. > On all active cluster nodes, top shows: > > PID USER PR NI VIRT RE

Re: [ClusterLabs] Antw: Re: lrmd segfault

2017-01-31 Thread Jan Pokorný
On 31/01/17 15:04 +0100, Jan Pokorný wrote: > On 31/01/17 10:16 +0100, Ulrich Windl wrote: >>>> Kristoffer Grönlund wrote on 31.01.2017 at 07:34 >>>> in message <87mve768lx@suse.com>: >> >> [...] >>> Just from looking at the core dump

Re: [ClusterLabs] Antw: Re: lrmd segfault

2017-01-31 Thread Jan Pokorný
On 31/01/17 10:16 +0100, Ulrich Windl wrote: >>> Kristoffer Grönlund wrote on 31.01.2017 at 07:34 >>> in message <87mve768lx@suse.com>: > > [...] >> Just from looking at the core dump, it looks like your processor doesn't >> support the SSE extensions used by the newer version of the code

[ClusterLabs] [Announce] clufter v0.59.8 released

2017-01-18 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.59.8 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Log rotation for /var/log/debug-pcmk.log

2017-01-12 Thread Jan Pokorný
On 11/01/17 18:46 +, Andrew Nagy wrote: > I hope this is the right place > We have pacemaker running (on at least one) RHEL 7.2 server with a > huge (8+ GB) /var/log/debug-pcmk.log file. No one else here knows > anything about pacemaker and the person who built/configured the > server is un

Re: [ClusterLabs] New ClusterLabs logo unveiled :-)

2016-12-22 Thread Jan Pokorný
On 22/12/16 12:06 -0600, Ken Gaillot wrote: > ClusterLabs is happy to unveil its new logo! Many thanks to the > designer, Kristoffer Grönlund, who graciously > donated the clever approach. Nice, and congratulations, Krig, for the logo escalation :) (Still looking forward to seeing the animated v

Re: [ClusterLabs] [ClusterLabs Developers] Help! Can pacemaker launch resource from new network namespace automatically

2016-12-22 Thread Jan Pokorný
[forwarding to users list as it seems a better audience to me] On 22/12/16 05:08 +0800, Hao QingFeng wrote: > I am newbie for pacemaker and using it to manage resource haproxy on ubuntu > 16.04. > > I met a problem that haproxy can't start listening for some services > in vip because the related

Re: [ClusterLabs] [Announce] clufter v0.59.7 released

2016-12-15 Thread Jan Pokorný
On 12/12/16 21:36 +0100, Jan Pokorný wrote: > Changelog highlights for v0.59.7 (also available as a tag message): > > - bug fix release (bash completion + shebangs, regard resource-agents version) > - bug fixes: > . output of {ccs,pcs}2pcscmd commands could previously confuse us

[ClusterLabs] [Announce] clufter v0.59.7 released

2016-12-12 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.59.7 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Error performing operation: Argument list too long

2016-12-06 Thread Jan Pokorný
On 06/12/16 09:44 -0600, Ken Gaillot wrote: > On 12/05/2016 02:29 PM, Shane Lawrence wrote: >> I'm experiencing a strange issue with pacemaker. It is unable to check >> the status of a systemd resource. >> >> systemctl shows that the service crashed: >> [root@xx ~]# systemctl status rsyslo

Re: [ClusterLabs] standby and unstandby commands

2016-12-01 Thread Jan Pokorný
On 29/11/16 08:24 +0100, Ulrich Windl wrote: > Some servers first fork and exit, virtually being successful > immediately, while the child (the "server loop") could die a moment > later. Finding the perfect time to wait for the server dying is kind > of black magic ;-) That's exactly why I think t

[ClusterLabs] @ClusterLabs/devel COPR with new pacemaker (Was: Pacemaker 1.1.16 released)

2016-12-01 Thread Jan Pokorný
On 30/11/16 14:05 -0600, Ken Gaillot wrote: > ClusterLabs is proud to announce the latest release of the Pacemaker > cluster resource manager, version 1.1.16. The source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16 In the same vein as recent lib

Re: [ClusterLabs] ocf:heartbeat:IPaddr2 - Different network segment

2016-12-01 Thread Jan Pokorný
On 30/11/16 12:10 -0300, Ronny Machado C. wrote: > soon I'll have to configure an Apache web server cluster which publishes a > VIP via ocf:heartbeat:IPaddr2, thing is, the two nodes live in different > sites and in different IP segments...any advice on how to > use ocf:heartbeat:IPaddr2 ip=x.x.

Re: [ClusterLabs] Get rid of reload altogether

2016-11-30 Thread Jan Pokorný
On 28/11/16 09:44 +0530, Nikhil Utane wrote: > I understand the whole concept of reload and how to define parameters with > unique=0 so that pacemaker can call the reload operation of the OCF script > instead of stopping and starting the resource. > Now my problem is that I have 100s of parameters

[ClusterLabs] @ClusterLabs/devel COPR with new libqb (Was: libqb 1.0.1 release)

2016-11-24 Thread Jan Pokorný
On 24/11/16 10:42 +, Christine Caulfield wrote: > I am very pleased to announce the 1.0.1 release of libqb For instant tryout on Fedora/EL-based distros, there is already a habitual COPR build. But this time around, I'd like to introduce some advancements in the process... * * * First, we n

Re: [ClusterLabs] Antw: Locate resource with functioning member of clone set?

2016-11-23 Thread Jan Pokorný
On 18/11/16 08:22 +0100, Ulrich Windl wrote: >> 1) is there a way to set up a "kill script", such that before trying to >> launch a new copy of a process, pacemaker will run this script, which would >> be responsible for making sure that there are no other instances of the >> process running? >>

Re: [ClusterLabs] iSCSI on ZFS on DRBD

2016-11-23 Thread Jan Pokorný
On 22/11/16 17:28 +, Jason A Ramsey wrote: > The way that Pacemaker interacts with services is using resource > agents. These resource agents are bash scripts that you can modify > to your heart’s content to do the things you want to do. > > [...] > > Just take a peek at the resource agent fi

Re: [ClusterLabs] Authoritative corosync's location

2016-11-07 Thread Jan Pokorný
On 22/09/16 09:05 +0200, Jan Friesse wrote: > Jan Pokorný wrote: >> On 21/09/16 09:16 +0200, Jan Friesse wrote: >>> Thomas Lamprecht wrote: >>>> I have also another, organizational question. I saw on the GitHub page from >>>> corosync that pull

Re: [ClusterLabs] Pacemaker 1.1.16 - Release Candidate 1

2016-11-07 Thread Jan Pokorný
On 03/11/16 11:08 -0500, Ken Gaillot wrote: > ClusterLabs is happy to announce the first release candidate for > Pacemaker version 1.1.16. Source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1 > > [...] As usual, there are COPR builds (using

Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1

2016-11-04 Thread Jan Pokorný
On 04/11/16 08:29 +0100, Ulrich Windl wrote: > Ken Gaillot wrote on 03.11.2016 at 17:08 in > message <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>: >> ClusterLabs is happy to announce the first release candidate for >> Pacemaker version 1.1.16. Source code is available at: >> >> https://g

[ClusterLabs] [SECURITY] CVE-2016-7035 - pacemaker - improper IPC guarding

2016-11-03 Thread Jan Pokorný
The following issue is being publicly disclosed today; more information regarding the release process will arrive later today. This is also an opportunity to announce the http://clusterlabs.org/wiki/Security page, which was introduced to help keep track of security issues (any fellow project is welcome

Re: [ClusterLabs] pacemaker compile error

2016-10-31 Thread Jan Pokorný
On 27/10/16 19:13 +0200, ferdinando wrote: > i'm trying to install pacemaker cluster in a testing environment (2 nodes > Fedora release 24, 4.7.9-200.fc24.x86_64) > i have problems compiling last commit > (19c0d74717fb1e9701d51b206823a3386a114caa) of pacemaker. > chunk of log: > > [...] > > CC

[ClusterLabs] [Announce] clufter v0.59.6 released

2016-10-20 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.59.6 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Live migration problem

2016-10-06 Thread Jan Pokorný
On 05/10/16 13:02 -0400, Digimer wrote: > I just spent a fair bit of time debugging a weird error, and now that > I've solved it, I wanted to share it on the list so that it is archived. > With luck, it will save someone else some heartache. No replies are > expected. :) > > Environment: > * Anv

Re: [ClusterLabs] Pacemaker remote - invalid message detected, endian mismatch

2016-09-30 Thread Jan Pokorný
On 30/09/16 11:28 -0500, Radoslaw Garbacz wrote: > I have posted a question about this error attached to another thread, but > because it was old and there is no answer I thought it could have been > missed, so I am sorry for repeating it. > > Regarding the problem. > I have a cluster, and when th

Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Jan Pokorný
Hello, On 29/09/16 12:41 -0400, Christopher Harvey wrote: > I think something is failing at the execvp() level. I'm seeing > useful looking trace logs in the code, but can't enable them right > now. I have: > PCMK_debug=yes > PCMK_logfile=/tmp/pacemaker.log > PCMK_logpriority=debug > PCMK_trace_

Re: [ClusterLabs] Pacemaker quorum behavior

2016-09-29 Thread Jan Pokorný
On 28/09/16 16:30 -0400, Scott Greenlese wrote: > Also, I have tried simulating a failed cluster node (to trigger a > STONITH action) by killing the corosync daemon on one node, but all > that does is respawn the daemon ... causing a temporary / transient > failure condition, and no fence takes p

Re: [ClusterLabs] Failed to retrieve meta-data for custom ocf resource

2016-09-29 Thread Jan Pokorný
On 28/09/16 16:55 -0500, Ken Gaillot wrote: > On 09/28/2016 04:04 PM, Christopher Harvey wrote: >> My corosync/pacemaker logs are seeing a bunch of messages like the >> following: >> >> Sep 22 14:50:36 [1346] node-132-60 crmd: info: >> action_synced_wait: Managed MsgBB-Active_meta-da

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-22 Thread Jan Pokorný
On 21/09/16 10:51 +1000, Andrew Beekhof wrote: > On Wed, Sep 21, 2016 at 6:25 AM, Ken Gaillot wrote: >> Our first proposed approach would add a new hard-fail-threshold >> operation property. If specified, the cluster would first try restarting >> the resource on the same node, > > > Well, just a

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-22 Thread Jan Pokorný
On 22/09/16 08:42 +0200, Kristoffer Grönlund wrote: > Ken Gaillot writes: > >> I'm not saying it's a bad idea, just that it's more complicated than it >> first sounds, so it's worth thinking through the implications. > > Thinking about it and looking at how complicated it gets, maybe what > you'

[ClusterLabs] Authoritative corosync's location (Was: corosync-quorum tool, output name key on Name column if set?)

2016-09-21 Thread Jan Pokorný
On 21/09/16 09:16 +0200, Jan Friesse wrote: > Thomas Lamprecht napsal(a): >> I have also another, organizational question. I saw on the GitHub page from >> corosync that pull request there are preferred, and also that the > > True At this point, it's worth noting that ClusterLabs/corosync is curr

Re: [ClusterLabs] [rgmanager] Recovering a failed (but running) server in rgmanager

2016-09-19 Thread Jan Pokorný
On 18/09/16 15:37 -0400, Digimer wrote: > If, for example, a server's definition file is corrupted while the > server is running, rgmanager will put the server into a 'failed' state. > That's fine and fair. Please, be more precise. Is it "vm" resource agent that you are talking about, hence ser

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Jan Pokorný
On 19/09/16 10:18 +, Auer, Jens wrote: > Ok, after reading the log files again I found > > Sep 19 10:03:45 MDA1PFP-S01 crmd[7797]: notice: Initiating action 3: stop > mda-ip_stop_0 on MDA1PFP-PCS01 (local) > Sep 19 10:03:45 MDA1PFP-S01 crmd[7797]: notice: > MDA1PFP-PCS01-mda-ip_monitor_10

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Jan Pokorný
On 19/09/16 09:15 +, Auer, Jens wrote: > After the restart ifconfig still shows the device bond0 to be not RUNNING: > MDA1PFP-S01 09:07:54 2127 0 ~ # ifconfig > bond0: flags=5123 mtu 1500 > inet 192.168.120.20 netmask 255.255.255.255 broadcast 0.0.0.0 > ether a6:17:2c:2a:72:f

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-16 Thread Jan Pokorný
On 16/09/16 11:01 -0500, Ken Gaillot wrote: > On 09/16/2016 10:43 AM, Auer, Jens wrote: >> thanks for the help. >> >>> I'm not sure what you mean by "the device the virtual ip is attached >>> to", but a separate question is why the resource agent reported that >>> restarting the IP was successful,

Re: [ClusterLabs] [rgmanager] generic 'initscript' resource agent that passes arguments?

2016-09-16 Thread Jan Pokorný
On 13/09/16 17:33 -0400, berg...@merctech.com wrote: > In the message dated: Tue, 06 Sep 2016 20:36:53 +0200, > The pithy ruminations from Jan Pokorný on > passes arguments?> were: > > => On 29/08/16 13:41 -0400, berg...@merctech.com wrote: > => > I've got a number of scripts th

Re: [ClusterLabs] Pacemaker quorum behavior

2016-09-09 Thread Jan Pokorný
On 09/09/16 14:13 -0400, Scott Greenlese wrote: > You had mentioned this command: > > pstree -p | grep -A5 $(pidof -x pcs) > > I'm not quite sure what the $(pidof -x pcs) represents?? This is a "command substitution" shell construct (new, blessed form of `backtick` notation) that in this particu
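As a minimal sketch of the command substitution construct mentioned in the reply above: the example below uses `echo 1234` as a stand-in for `pidof -x pcs` (an assumption made only so that it runs on a machine without pcs installed), and the variable names are purely illustrative.

```shell
# $(...) runs the inner command and substitutes its standard output
# into the outer command line; `...` is the older notation for the same thing.
new_form=$(echo 1234)   # stand-in for: pidof -x pcs (hypothetical PID)
old_form=`echo 1234`    # legacy backtick form, equivalent here

# The substituted text can then be used like any other argument,
# e.g. as the pattern handed to grep in the original pipeline.
printf 'new: %s, old: %s\n' "$new_form" "$old_form"
```

In the original pipeline, the text substituted this way becomes the pattern that `grep -A5` searches for in the `pstree -p` output.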

Re: [ClusterLabs] Pacemaker migration - how to?

2016-09-08 Thread Jan Pokorný
On 08/09/16 14:44 +, Nurit Vilosny wrote: > I have a very basic question that I couldn't find an answer for. > I am using the pacemaker to control a 3 nodes cluster, with > a private application that works in an active - standby - standby > mode. > My node have priorities in which is better to

Re: [ClusterLabs] Pacemaker quorum behavior

2016-09-08 Thread Jan Pokorný
On 08/09/16 10:20 -0400, Scott Greenlese wrote: > Correction... > > When I stopped pacemaker/corosync on the four (powered on / active) > cluster node hosts, I was having an issue with the gentle method of > stopping the cluster (pcs cluster stop --all), Can you elaborate on what went wrong with

Re: [ClusterLabs] [rgmanager] generic 'initscript' resource agent that passes arguments?

2016-09-06 Thread Jan Pokorný
On 29/08/16 13:41 -0400, berg...@merctech.com wrote: > I've got a number of scripts that are based on LSB compliant scripts, > but which also accept arguments & values. For example, a script to manage > multiple virtual machines has a command-line in the form: > > vbox_init --vmname $VMNAME

Re: [ClusterLabs] fence_apc delay?

2016-09-06 Thread Jan Pokorný
On 06/09/16 10:35 -0500, Ken Gaillot wrote: > On 09/06/2016 10:20 AM, Dan Swartzendruber wrote: >> On 2016-09-06 10:59, Ken Gaillot wrote: >> >> [snip] >> >>> I thought power-wait was intended for this situation, where the node's >>> power supply can survive a brief outage, so a delay is needed t
