Re: [ClusterLabs] Antw: Announcing hawk-apiserver, now in ClusterLabs

2019-02-13 Thread Adam Spiers
Ulrich Windl wrote: Hello! I'd like to comment as an "old" SuSE customer: I'm amazed that lighttpd is dropped in favor of some new go application: SuSE now has a base system that needs (correct me if I'm wrong): shell, perl, python, java, go, ruby, ...? Sorry for the off-topic nitpick, but

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-06-14 Thread Adam Spiers
Jan Pokorný wrote: On 31/05/18 14:48 +0200, Jan Pokorný wrote: I am soliciting feedback on these CIB features related questions, please reply (preferably on-list so we have the shared collective knowledge) if at least one of the questions is answered positively in your case (just tick the

Re: [ClusterLabs] Ansible role to configure Pacemaker

2018-06-07 Thread Adam Spiers
Jan Pokorný wrote: While I see why Ansible is compelling, I feel it's important to challenge this trend of trying to bend/rebrand _machine-local configuration management tool_ as _distributed system management tool_ (pacemaker is distributed application/framework of sorts), which Ansible alone

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-03-29 Thread Adam Spiers
Kristoffer Gronlund wrote: Ken Gaillot writes: Hi all, Andrew Beekhof brought up a potential change to help with reading Pacemaker logs. Great idea! [snipped] Better to do it now rather than later. I vote in favor of changing the names. Yes,

Re: [ClusterLabs] Misunderstanding or bug in crm_simulate output

2018-01-18 Thread Adam Spiers
Jehan-Guillaume de Rorthais wrote: Hi list, I was explaining how to use crm_simulate to a colleague when he pointed to me a non expected and buggy output. [snipped] Probably related: https://bugs.clusterlabs.org/show_bug.cgi?id=5294

Re: [ClusterLabs] Antw: Feedback wanted: changing "master/slave" terminology

2018-01-17 Thread Adam Spiers
Ulrich Windl wrote: Ken Gaillot wrote on 16.01.2018 at 23:33 in message <1516142036.5604.3.ca...@redhat.com>: As we look to release Pacemaker 2.0 and (separately) update the OCF standard, this is a good time to revisit the

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: Coming in Pacemaker 2.0.0: /var/log/pacemaker/pacemaker.log

2018-01-15 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: On Mon, 2018-01-15 at 12:40 +, Adam Spiers wrote: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote: But for a general solution, do you think it's more clean to have the same directory with identical properties in multiple packages

Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: Coming in Pacemaker 2.0.0: /var/log/pacemaker/pacemaker.log

2018-01-15 Thread Adam Spiers
Ulrich Windl wrote: Vladislav Bogdanov wrote: 15.01.2018 11:23, Ulrich Windl wrote: Vladislav Bogdanov wrote: 11.01.2018 18:39, Ken Gaillot wrote: [...] I thought one option aired at the summit to address

Re: [ClusterLabs] Coming in Pacemaker 2.0.0: /var/log/pacemaker/pacemaker.log

2018-01-10 Thread Adam Spiers
Ken Gaillot wrote: The initial proposal, after discussion at last year's summit, was to use /var/log/cluster/pacemaker.log instead. That turned out to be slightly problematic: it broke some regression tests in a way that wasn't easily fixable, and more significantly, it

Re: [ClusterLabs] low-cost ways to make Pacemaker more usable?

2017-12-07 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: On Thu, 2017-12-07 at 17:15 +, Adam Spiers wrote: For example, making a few of the most crucial existing log messages less cryptic could maybe go a long way. Or if "dumbing down" log messages would make life harder for developers

[ClusterLabs] low-cost ways to make Pacemaker more usable?

2017-12-07 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: On Thu, 2017-12-07 at 12:13 +, Adam Spiers wrote: https://gocardless.com/blog/incident-review-api-and-dashboard-outage-on-10th-october/ It's a great write-up, although a little frustrating that it is still not fully understood why a -inf colo

[ClusterLabs] interesting blog on Pacemaker-related outage

2017-12-07 Thread Adam Spiers
https://gocardless.com/blog/incident-review-api-and-dashboard-outage-on-10th-october/ It's a great write-up, although a little frustrating that it is still not fully understood why a -inf colocation failed whereas a +inf succeeded. (I actually have a vague memory of discovering something very

Re: [ClusterLabs] questions about startup fencing

2017-12-06 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: On Thu, 2017-11-30 at 11:58 +, Adam Spiers wrote: Ken Gaillot <kgail...@redhat.com> wrote: On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote: [snipped] Let's suppose further that the cluster configuration is such that no statef

Re: [ClusterLabs] questions about startup fencing

2017-11-30 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: On Wed, 2017-11-29 at 14:22 +, Adam Spiers wrote: Hi all, A colleague has been valiantly trying to help me belatedly learn about the intricacies of startup fencing, but I'm still not fully understanding some of the finer points of the beh

Re: [ClusterLabs] questions about startup fencing

2017-11-29 Thread Adam Spiers
Kristoffer Gronlund <kgronl...@suse.com> wrote: Adam Spiers <aspi...@suse.com> writes: Kristoffer Gronlund <kgronl...@suse.com> wrote: Adam Spiers <aspi...@suse.com> writes: - The whole cluster is shut down cleanly. - The whole cluster is then started up agai

Re: [ClusterLabs] questions about startup fencing

2017-11-29 Thread Adam Spiers
Klaus Wenninger <kwenn...@redhat.com> wrote: On 11/29/2017 04:23 PM, Kristoffer Grönlund wrote: Adam Spiers <aspi...@suse.com> writes: - The whole cluster is shut down cleanly. - The whole cluster is then started up again. (Side question: what happens if the last node

Re: [ClusterLabs] questions about startup fencing

2017-11-29 Thread Adam Spiers
Kristoffer Gronlund <kgronl...@suse.com> wrote: Adam Spiers <aspi...@suse.com> writes: - The whole cluster is shut down cleanly. - The whole cluster is then started up again. (Side question: what happens if the last node to shut down is not the first to start up? How will

Re: [ClusterLabs] Where to Find pcs and pcsd for OpenSUSE LEAP 4.23

2017-11-10 Thread Adam Spiers
Eric Robinson wrote: Which aspects of its constraints handling do you like, and why? I'm curious, since I wasn't aware that it was significantly different from crmsh in this respect. Well, to be fair, in the past I have always configured my clusters by using 'crm

Re: [ClusterLabs] Where to Find pcs and pcsd for OpenSUSE LEAP 4.23

2017-11-07 Thread Adam Spiers
Eric Robinson wrote: Thanks much. I am experienced with crmsh because I have been using it for years, but I recently tried pcs and I really like the way it handles constraints. Which aspects of its constraints handling do you like, and why? I'm curious, since I

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Adam Spiers
Kai Dupke wrote: On 09/21/2017 04:42 PM, Ken Gaillot wrote: Yes, the FAQ needs an overhaul as well -- all the Pacemaker-specific questions should be moved to a separate Pacemaker FAQ, and the top FAQ should just have questions about ClusterLabs plus links to project FAQs Can

Re: [ClusterLabs] Clusterlabs Summit: Expect rain tomorrow

2017-09-05 Thread Adam Spiers
Kristoffer Gronlund wrote: > Hey everyone! > > I am going to try to be at the event area at 8 in the morning tomorrow, > and I wouldn't recommend showing up earlier than that. The doors will > probably be locked. The summit itself is scheduled to start at 9. > >

Re: [ClusterLabs] [ClusterLabs Developers] [HA/ClusterLabs Summit] Key-Signing Party, 2017 Edition

2017-07-23 Thread Adam Spiers
Hi Jan :-) Jan Pokorný wrote: Hello cluster masters :-) as there's little less than 7 weeks left to "The Summit" meetup (), it's about time to get the ball rolling so we can voluntarily augment the digital trust amongst us the attendees, on

Re: [ClusterLabs] Clusterlabs Summit 2017 (Nuremberg, 6-7 September) - Hotels and Topics

2017-05-03 Thread Adam Spiers
Kristoffer Gronlund <kgronl...@suse.com> wrote: > Hi everyone! > > Here's a quick update on the summit happening at the SUSE office in > Nuremberg on September 6-7. [snipped] > I am also happy to say that Adam Spiers from the SUSE Cloud team will be > attending the sum

Re: [ClusterLabs] Pacemaker 1.1.16 - Release Candidate 1

2016-11-03 Thread Adam Spiers
Klaus Wenninger <kwenn...@redhat.com> wrote: > On 11/03/2016 05:28 PM, Adam Spiers wrote: > > Ken Gaillot <kgail...@redhat.com> wrote: > >> ClusterLabs is happy to announce the first release candidate for > >> Pacemaker version 1.1.16. Source code is ava

Re: [ClusterLabs] Pacemaker 1.1.16 - Release Candidate 1

2016-11-03 Thread Adam Spiers
Ken Gaillot wrote: > ClusterLabs is happy to announce the first release candidate for > Pacemaker version 1.1.16. Source code is available at: > > https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1 > > The most significant enhancements in this

Re: [ClusterLabs] Doing reload right

2016-07-21 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 07/20/2016 07:32 PM, Andrew Beekhof wrote: > > On Thu, Jul 21, 2016 at 2:47 AM, Adam Spiers <aspi...@suse.com> wrote: > >> Ken Gaillot <kgail...@redhat.com> wrote: > >>> Hello all, > >>> >

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-25 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 06/24/2016 05:41 AM, Adam Spiers wrote: > > Andrew Beekhof <abeek...@redhat.com> wrote: > >> On Fri, Jun 24, 2016 at 1:01 AM, Adam Spiers <aspi...@suse.com> wrote: > >>> Andrew Beekhof <abeek...@

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-23 Thread Adam Spiers
Andrew Beekhof <abeek...@redhat.com> wrote: > On Wed, Jun 15, 2016 at 10:42 PM, Adam Spiers <aspi...@suse.com> wrote: > > Andrew Beekhof <abeek...@redhat.com> wrote: > >> On Mon, Jun 13, 2016 at 9:34 PM, Adam Spiers <aspi...@suse.com> wrote: >

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-15 Thread Adam Spiers
Andrew Beekhof <abeek...@redhat.com> wrote: > On Mon, Jun 13, 2016 at 9:34 PM, Adam Spiers <aspi...@suse.com> wrote: > > Andrew Beekhof <abeek...@redhat.com> wrote: > >> On Wed, Jun 8, 2016 at 6:23 PM, Adam Spiers <aspi...@suse.com> wrote: >

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-13 Thread Adam Spiers
Andrew Beekhof <abeek...@redhat.com> wrote: > On Wed, Jun 8, 2016 at 6:23 PM, Adam Spiers <aspi...@suse.com> wrote: > > Andrew Beekhof <abeek...@redhat.com> wrote: > >> On Wed, Jun 8, 2016 at 12:11 AM, Adam Spiers <aspi...@suse.com> wrote: >

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-07 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 06/06/2016 05:45 PM, Adam Spiers wrote: > > Adam Spiers <aspi...@suse.com> wrote: > >> Andrew Beekhof <abeek...@redhat.com> wrote: > >>> On Tue, Jun 7, 2016 at 8:29 AM, Adam Spiers <aspi...@suse.

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Adam Spiers
Adam Spiers <aspi...@suse.com> wrote: > Andrew Beekhof <abeek...@redhat.com> wrote: > > On Tue, Jun 7, 2016 at 8:29 AM, Adam Spiers <aspi...@suse.com> wrote: > > > Ken Gaillot <kgail...@redhat.com> wrote: > > >> My main question is how useful

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Adam Spiers
Andrew Beekhof <abeek...@redhat.com> wrote: > On Tue, Jun 7, 2016 at 8:29 AM, Adam Spiers <aspi...@suse.com> wrote: > > Ken Gaillot <kgail...@redhat.com> wrote: > >> On 06/02/2016 08:01 PM, Andrew Beekhof wrote: > >> > On Fri, May 20, 2016 at

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-06-06 Thread Adam Spiers
Ken Gaillot wrote: > On 06/02/2016 08:01 PM, Andrew Beekhof wrote: > > On Fri, May 20, 2016 at 1:53 AM, Ken Gaillot wrote: > >> A recent thread discussed a proposed new feature, a new environment > >> variable that would be passed to resource agents,

Re: [ClusterLabs] Antw: Re: FR: send failcount to OCF RA start/stop actions

2016-05-23 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 05/20/2016 10:40 AM, Adam Spiers wrote: > > Ken Gaillot <kgail...@redhat.com> wrote: > >> Just musing a bit ... on-fail + migration-threshold could have been > >> designed to be more flexible: > >> >

Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Adam Spiers
Klaus Wenninger wrote: > On 05/20/2016 08:39 AM, Ulrich Windl wrote: > Jehan-Guillaume de Rorthais wrote on 19.05.2016 at > 21:29 in > > message <20160519212947.6cc0fd7b@firost>: > > [...] > >> I was thinking of a use case where a graceful

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Adam Spiers
Ken Gaillot wrote: > A recent thread discussed a proposed new feature, a new environment > variable that would be passed to resource agents, indicating whether a > stop action was part of a recovery. > > Since that thread was long and covered a lot of topics, I'm starting a

Re: [ClusterLabs] Antw: Re: FR: send failcount to OCF RA start/stop actions

2016-05-20 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 05/12/2016 06:21 AM, Adam Spiers wrote: > > Ken Gaillot <kgail...@redhat.com> wrote: > >> On 05/10/2016 02:29 AM, Ulrich Windl wrote: > >>>> Here is what I'm testing currently: > >>>> >

Re: [ClusterLabs] Pacemaker with Zookeeper??

2016-05-17 Thread Adam Spiers
Bogdan Dobrelya wrote: > On 05/16/2016 09:23 AM, Jan Friesse wrote: > >> Hi, > >> > >> I have an idea: use Pacemaker with Zookeeper (instead of Corosync). Is > >> it possible? > >> Is there any examination about that? > > Indeed, would be *great* to have a Pacemaker based

Re: [ClusterLabs] Antw: Re: FR: send failcount to OCF RA start/stop actions

2016-05-12 Thread Adam Spiers
Hi Ken, Firstly thanks a lot not just for working on this, but also for being so proactive in discussing the details. A perfect example of OpenStack's "Open Design" philosophy in action :-) Ken Gaillot wrote: > On 05/10/2016 02:29 AM, Ulrich Windl wrote: > Ken Gaillot

[ClusterLabs] FR: send failcount to OCF RA start/stop actions

2016-05-04 Thread Adam Spiers
Hi all, As discussed with Ken and Andrew at the OpenStack summit last week, we would like Pacemaker to be extended to export the current failcount as an environment variable to OCF RA scripts when they are invoked with 'start' or 'stop' actions. This would mean that if you have

Re: [ClusterLabs] Coming in 1.1.15: Event-driven alerts

2016-04-21 Thread Adam Spiers
Ken Gaillot wrote: > Hello everybody, > > The release cycle for 1.1.15 will be started soon (hopefully tomorrow)! > > The most prominent feature will be Klaus Wenninger's new implementation > of event-driven alerts -- the ability to call scripts whenever > interesting

Re: [ClusterLabs] HA meetup at OpenStack Summit

2016-04-21 Thread Adam Spiers
ace > (vendor booths). I'll put a ClusterLabs sign on the table to help people > find it. > > On 04/14/2016 09:53 AM, Adam Spiers wrote: > > Ken Gaillot <kgail...@redhat.com> wrote: > >> Hi everybody, > >> > >> The upcoming OpenStack Summit is

Re: [ClusterLabs] service flap as nodes join and leave

2016-04-14 Thread Adam Spiers
Ken Gaillot wrote: > On 04/14/2016 09:33 AM, Christopher Harvey wrote: > > MsgBB-Active is a dummy resource that simply returns OCF_SUCCESS on > > every operation and logs to a file. > > That's a common mistake, and will confuse the cluster. The cluster > checks the status

Re: [ClusterLabs] HA meetup at OpenStack Summit

2016-04-14 Thread Adam Spiers
Ken Gaillot wrote: > Hi everybody, > > The upcoming OpenStack Summit is April 25-29 in Austin, Texas (US). Some > regular ClusterLabs contributors are going, so I was wondering if anyone > would like to do an informal meetup sometime during the summit. > > It looks like the

Re: [ClusterLabs] IPMI working but evacuations don't work

2016-03-31 Thread Adam Spiers
Digimer wrote: > On 31/03/16 02:26 AM, Moiz Arif wrote: > > Hi, > > > > I am working on VM evacuations and i have noticed that when my compute > > node's network is disconnected there is call from STONITH to fence the > > node and my node gets rebooted. But the VMs are not

Re: [ClusterLabs] Set "start-failure-is-fatal=false" on only one resource?

2016-03-24 Thread Adam Spiers
Sam Gardner wrote: > I'm having some trouble on a few of my clusters in which the DRBD Slave > resource does not want to come up after a reboot until I manually run > resource cleanup. > > Setting 'start-failure-is-fatal=false' as a global cluster property and a >

Re: [ClusterLabs] documentation on STONITH with remote nodes?

2016-03-14 Thread Adam Spiers
Ken Gaillot <kgail...@redhat.com> wrote: > On 03/12/2016 05:07 AM, Adam Spiers wrote: > > Is there any documentation on how STONITH works on remote nodes? I > > couldn't find any on clusterlabs.org, and it's conspicuously missing > > from: > > > > http://

[ClusterLabs] documentation on STONITH with remote nodes?

2016-03-12 Thread Adam Spiers
Is there any documentation on how STONITH works on remote nodes? I couldn't find any on clusterlabs.org, and it's conspicuously missing from: http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/ I'm guessing the answer is more or less "it works exactly the same as for

Re: [ClusterLabs] Coming in Pacemaker 1.1.15: graceful Pacemaker Remote node stops

2016-02-19 Thread Adam Spiers
Ken Gaillot wrote: > Pacemaker's upstream master branch has a new feature that will be part > of the eventual 1.1.15 release. [snipped] > This new feature makes updates of Pacemaker Remote nodes more similar to > that of cluster nodes -- simply stop cluster services (in

[ClusterLabs] [ANNOUNCE] [HA] new #openstack-ha IRC channel on FreeNode

2015-10-22 Thread Adam Spiers
[cross-posting to several lists; please trim the recipients list before replying!] Hi all, After discussion with members of the openstack-infra team, I registered new FreeNode IRC channel #openstack-ha. Discussion on all aspects of OpenStack High Availability is welcome in this channel.

[ClusterLabs] [ANNOUNCE] [HA] [Pacemaker] new, maintained openstack-resource-agents repository

2015-10-21 Thread Adam Spiers
[cross-posting to openstack-dev and pacemaker user lists; please consider trimming the recipients list if your reply is not relevant to both communities] Hi all, Back in June I proposed moving the well-used but no longer maintained https://github.com/madkiss/openstack-resource-agents/ repository

Re: [ClusterLabs] Question about fence-agents-compute

2015-10-12 Thread Adam Spiers
Kazunori INOUE wrote: > [VM_db0101]# export OS_USERNAME=demo ; export OS_PASSWORD=demo ; > export OS_AUTH_URL=http://10.0.2.11:5000/v2.0 ; export > OS_TENANT_NAME=demo > [VM_db0101]# nova list > +--+---++ (snip) > | ID