Re: [openstack-dev] [Fuel] Waiting for Haproxy backends

2014-11-18 Thread Andrew Beekhof
Hi Everyone, I was reading the blueprints mentioned here and thought I'd take the opportunity to introduce myself and ask a few questions. For those that don't recognise my name, Pacemaker is my baby - so I take a keen interest in helping people have a good experience with it :) A couple of items

Re: [openstack-dev] [Fuel] Waiting for Haproxy backends

2014-11-19 Thread Andrew Beekhof
On 20 Nov 2014, at 6:55 am, Sergii Golovatiuk sgolovat...@mirantis.com wrote: Hi crew, Please see my inline comments. Hi Everyone, I was reading the blueprints mentioned here and thought I'd take the opportunity to introduce myself and ask a few questions. For those that don't

Re: [openstack-dev] [nova] Host health monitoring

2015-01-11 Thread Andrew Beekhof
On 9 Jan 2015, at 5:37 am, Joe Gordon joe.gord...@gmail.com wrote: On Sun, Jan 4, 2015 at 7:08 PM, Andrew Beekhof abeek...@redhat.com wrote: On 9 Dec 2014, at 1:20 am, Roman Dobosz roman.dob...@intel.com wrote: On Wed, 3 Dec 2014 08:44:57 +0100 Roman Dobosz roman.dob

Re: [openstack-dev] [nova] Host health monitoring

2015-01-04 Thread Andrew Beekhof
On 9 Dec 2014, at 1:20 am, Roman Dobosz roman.dob...@intel.com wrote: On Wed, 3 Dec 2014 08:44:57 +0100 Roman Dobosz roman.dob...@intel.com wrote: I've just started to work on the topic of detecting whether a host is alive or not:

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-07 Thread Andrew Beekhof
On 5 May 2015, at 1:19 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: Thank you Andrew. on 2015/05/05 08:03, Andrew Beekhof wrote: On 28 Apr 2015, at 11:15 pm, Bogdan Dobrelya bdobre...@mirantis.com wrote: Hello, Hello, Zhou I am using Fuel 6.0.1 and find that RabbitMQ

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-07 Thread Andrew Beekhof
On 5 May 2015, at 9:30 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: Thank you Andrew. Sorry for misspelling your name in the previous email. on 2015/05/05 14:25, Andrew Beekhof wrote: On 5 May 2015, at 2:31 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: Thank you

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-07 Thread Andrew Beekhof
On 5 May 2015, at 7:52 pm, Bogdan Dobrelya bdobre...@mirantis.com wrote: On 05.05.2015 04:32, Andrew Beekhof wrote: [snip] Technically it calculates an ordered graph of actions that need to be performed for a set of related resources. You can see an example of the kinds of graphs

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-04 Thread Andrew Beekhof
On 28 Apr 2015, at 11:15 pm, Bogdan Dobrelya bdobre...@mirantis.com wrote: Hello, Hello, Zhou I am using Fuel 6.0.1 and find that RabbitMQ recovery time is long after power failure. I have a running HA environment, then I reset power of all the machines at the same time. I observe that

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-05 Thread Andrew Beekhof
On 5 May 2015, at 2:31 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: Thank you Bogdan for clearing the pacemaker promotion process for me. on 2015/05/05 10:32, Andrew Beekhof wrote: On 29 Apr 2015, at 5:38 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: [snip

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-04 Thread Andrew Beekhof
On 29 Apr 2015, at 5:38 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: [snip] Batch is a pacemaker concept I found when I was reading its documentation and code. There is a batch-limit: 30 in the output of pcs property list --all. The pacemaker official documentation

Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

2015-05-19 Thread Andrew Beekhof
On 20 May 2015, at 6:05 am, Andrew Woodward xar...@gmail.com wrote: On Thu, May 7, 2015 at 5:01 PM Andrew Beekhof abeek...@redhat.com wrote: On 5 May 2015, at 1:19 pm, Zhou Zheng Sheng / 周征晟 zhengsh...@awcloud.com wrote: Thank you Andrew. on 2015/05/05 08:03, Andrew Beekhof

Re: [openstack-dev] [Keystone][Fernet] HA SQL backend for Fernet keys

2015-08-04 Thread Andrew Beekhof
On 3 Aug 2015, at 8:02 pm, Sergii Golovatiuk sgolovat...@mirantis.com wrote: Hi, I agree with Bogdan that key rotation procedure should be part of HA solution. These things don’t usually have to be an either/or situation. Why not create one script that does the work and can be called

Re: [openstack-dev] [Cinder] A possible solution for HA Active-Active

2015-08-07 Thread Andrew Beekhof
On 5 Aug 2015, at 1:34 am, Joshua Harlow harlo...@outlook.com wrote: Philipp Marek wrote: If we end up using a DLM then we have to detect when the connection to the DLM is lost on a node and stop all ongoing operations to prevent data corruption. It may not be trivial to do, but we will

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2015-11-11 Thread Andrew Beekhof
> On 11 Nov 2015, at 6:26 PM, bdobre...@mirantis.com wrote: > > Thank you Andrew. > Answers below. > >>> > Sounds interesting, can you give any comment about how it differs to the > other[i] upstream agent? > Am I right that this one is effectively A/P and won't function without some > kind of

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2015-11-09 Thread Andrew Beekhof
> On 23 Oct 2015, at 7:01 PM, Bogdan Dobrelya wrote: > > Hello. > I'm glad to announce that the pacemaker OCF resource agent for the > rabbitmq clustering, which was born in the Fuel project initially, is now > available and maintained upstream! It will be shipped with the

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2015-11-12 Thread Andrew Beekhof
esource to restart > http://bugs.clusterlabs.org/show_bug.cgi?id=5243) . It appears I misunderstood your bug the first time around :-( Do you still have logs of this occurring? > I now remember, why we did notify errors - for error logging, I guess. > > > On Thu, Nov 12, 2015 a

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2015-11-15 Thread Andrew Beekhof
ode and the default check returning SUCCESS code. You will find > that it is restarting only after 2 consecutive failures of non-zero level > check. Ack. I’ve asked some people to look into it. > > On Thu, Nov 12, 2015 at 10:58 PM, Andrew Beekhof <abeek...@redhat

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2015-11-11 Thread Andrew Beekhof
the subsequent monitor operation to notice the error state. I guess that would work but you might be waiting a while for it to notice. > > On Wed, Nov 11, 2015 at 2:12 PM, Andrew Beekhof <abeek...@redhat.com> wrote: > > > On 11 Nov 2015, at 6:26 PM, bdobre...@mirantis.co

Re: [openstack-dev] [Fuel] HA cluster disk monitoring, failover and recovery

2015-11-17 Thread Andrew Beekhof
> On 18 Nov 2015, at 4:52 AM, Alex Schultz wrote: > > On Tue, Nov 17, 2015 at 11:12 AM, Vladimir Kuklin > wrote: >> Bogdan >> >> I think we should firstly check whether attribute deletion leads to node >> starting its services or not. From what I

Re: [openstack-dev] [HA][RabbitMQ][messaging][Pacemaker][operators] Improved OCF resource agent for dynamic active-active mirrored clustering

2016-03-18 Thread Andrew Beekhof
On Tue, Feb 16, 2016 at 2:58 AM, Bogdan Dobrelya <bdobre...@mirantis.com> wrote: > Hello! > A quick status update inline: > [snip] > So, what's next? > > - I'm open for merging both [5], [6] of the existing OCF RA solutions, > as it was proposed by Andrew Be

Re: [openstack-dev] [TripleO][Heat][Kolla][Magnum] The zen of Heat, containers, and the future of TripleO

2016-04-03 Thread Andrew Beekhof
On Tue, Mar 29, 2016 at 6:02 AM, Dan Prince wrote: [...] > That said regardless of what we eventually do with Pacemaker or Puppet > it should be feasible for them both to co-exist. The key thing to keep in mind if you're using Puppet to build a cluster is that if you're