[Openstack-operators] revamped ops meetup day 2

2018-09-10 Thread Chris Morgan
Hi All,
  We (the ops meetups team) received several additional suggestions for ops
meetup sessions, so we've attempted to revamp day 2 to fit them in. Please see:

https://docs.google.com/spreadsheets/d/1EUSYMs3GfglnD8yfFaAXWhLe0F5y9hCUKqCYe0Vp1oA/edit#gid=981527336

Given the timing, we'll attempt to confirm the rest of the day starting at
9am over coffee. If you're moderating something tomorrow, please check the
adjusted times. If something doesn't work for you, we'll try to swap
sessions to make it work.

Cheers
Chris, Erik, Sean

-- 
Chris Morgan 
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [upgrade] request for pre-upgrade check for db purge

2018-09-10 Thread Matt Riedemann
I created a nova bug [1] to track a request that came up in the upgrades 
SIG room at the PTG today [2] and would like to see if there is any 
feedback from other operators/developers that weren't part of the 
discussion.


The basic problem is that failing to archive/purge deleted records* from 
the database can make upgrades much slower during schema migrations. 
Anecdotes from the room mentioned that it can be literally impossible to 
complete upgrades for keystone and heat in certain scenarios if you 
don't purge the database first.


The request was to add a configurable limit to each service, which the 
service's pre-upgrade check routine [3] would check, warning if the 
number of records to purge exceeds that limit.


For example, the nova-status upgrade check could warn if there are over 
10 deleted records total across all cells databases. Maybe cinder 
would have something similar for deleted volumes. Keystone could have 
something for revoked tokens.
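
As a rough illustration of the count-based check described above, here is a
minimal sketch. It is not nova-status code: the function names, the
`instances` schema, the in-memory SQLite databases standing in for cell
databases, and the default limit are all assumptions made for the example.

```python
import sqlite3

# Hypothetical default; the thread leaves open what a sane value would be.
DEFAULT_MAX_DELETED = 1000


def check_deleted_records(conns, max_deleted=DEFAULT_MAX_DELETED):
    """Return ("warning" or "success", total) across all given databases."""
    total = 0
    for conn in conns:
        # Assumption: soft-deleted rows carry a non-zero `deleted` value.
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM instances WHERE deleted != 0"
        ).fetchone()
        total += count
    status = "warning" if total > max_deleted else "success"
    return status, total


def make_cell(num_live, num_deleted):
    """Build a throwaway in-memory database standing in for one cell."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances (id INTEGER PRIMARY KEY, deleted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO instances (deleted) VALUES (?)",
        [(0,)] * num_live + [(1,)] * num_deleted,
    )
    return conn


if __name__ == "__main__":
    cells = [make_cell(5, 3), make_cell(2, 4)]
    print(check_deleted_records(cells, max_deleted=5))
```

The check only warns rather than fails, which matches the request: the
operator decides whether to archive/purge before upgrading.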


Another idea in the room was flagging on records over a certain age 
limit. For example, if there are deleted instances in nova that were 
deleted >1 year ago.
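
The age-based variant could look something like the sketch below. Again this
is only an illustration under stated assumptions: the `deleted_at` column
holding ISO-8601 text, the helper names, and the 365-day default are
hypothetical, and SQLite stands in for the real database.

```python
import sqlite3
from datetime import datetime, timedelta


def count_old_deleted(conn, now, max_age_days=365):
    """Count soft-deleted rows whose deletion is older than max_age_days."""
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    # ISO-8601 strings in the same format sort chronologically, so a plain
    # string comparison is enough for this sketch.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM instances "
        "WHERE deleted != 0 AND deleted_at < ?",
        (cutoff,),
    ).fetchone()
    return count


def make_db(rows):
    """rows: iterable of (deleted_flag, deleted_at datetime or None)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(id INTEGER PRIMARY KEY, deleted INTEGER, deleted_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO instances (deleted, deleted_at) VALUES (?, ?)",
        [(flag, ts.isoformat() if ts else None) for flag, ts in rows],
    )
    return conn
```

A pre-upgrade check could then warn whenever `count_old_deleted(...)` returns
a non-zero value, independently of any total-count limit.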


How do people feel about this? It seems pretty straightforward to me. 
If people are generally in favor of this, then the question is what 
would be sane defaults - or should we not assume a default and force 
operators to opt into this?


* nova delete doesn't actually delete the record from the instances 
table; it flips a value to hide it. You have to archive/purge those 
records to get them out of the main table.


[1] https://bugs.launchpad.net/nova/+bug/1791824
[2] https://etherpad.openstack.org/p/upgrade-sig-ptg-stein
[3] https://governance.openstack.org/tc/goals/stein/upgrade-checkers.html

--

Thanks,

Matt



Re: [Openstack-operators] Cinder HA with zookeeper or redis?

2018-09-10 Thread Jay S Bryant

James,

Sorry, I forgot to include the link to our HA documentation in the 
earlier e-mail: 
https://docs.openstack.org/cinder/latest/contributor/high_availability.html


Jay


On 9/10/2018 3:39 PM, James Penick wrote:

Ah ok, so this is a case of no one's documented it, but it's doable.

If anyone out there has done it we'd be happy to take your notes! 
Otherwise we'll figure it out and upstream the process.


thanks!
-James

On Mon, Sep 10, 2018 at 2:18 PM Jay S Bryant wrote:



On 9/9/2018 10:58 PM, Adam Spiers wrote:
> Hi James,
>
> James Penick wrote:
>> Hey folks,
>> Does anyone have experience using zookeeper or redis to handle HA
>> failover
>> in cinder clusters?
>
> I'm guessing you mean failover of an active/passive cinder-volume
> service?
>
>> I know there's docs on pacemaker, however we already
>> have the other two installed and don't want to add yet another
>> component to
>> package and maintain in our clusters.
>
> I'm afraid I don't, but if you make any progress on this, please let
> me know as it would be great to document:
>
>  - how this would work
>  - any pros and cons vs. Pacemaker
>
> and maybe I can help with that.
>
> One particular question: if the node running the service becomes
> unreachable, is it safe to fail it over straight away, or is fencing
> required first?  (I'm pretty sure I've asked this same question
> before, but I can't remember the answer - sorry!)
James,

I echo Adam's input.  I have only heard of people implementing with
pacemaker but there is no reason that this couldn't be tried with other
HA solutions.

If you are able to try it and document it would be great to add
documentation here:  [1]

Also, Gorka Eguileor is a good contact as he has been doing much of the
work on HA Cinder though his focus is on Active/Active HA.

Let us know if you have any further questions.

Thanks!
Jay





Re: [Openstack-operators] Cinder HA with zookeeper or redis?

2018-09-10 Thread James Penick
Ah ok, so this is a case of no one's documented it, but it's doable.

If anyone out there has done it we'd be happy to take your notes! Otherwise
we'll figure it out and upstream the process.

thanks!
-James

On Mon, Sep 10, 2018 at 2:18 PM Jay S Bryant wrote:

>
> On 9/9/2018 10:58 PM, Adam Spiers wrote:
> > Hi James,
> >
> > James Penick wrote:
> >> Hey folks,
> >> Does anyone have experience using zookeeper or redis to handle HA
> >> failover
> >> in cinder clusters?
> >
> > I'm guessing you mean failover of an active/passive cinder-volume
> > service?
> >
> >> I know there's docs on pacemaker, however we already
> >> have the other two installed and don't want to add yet another
> >> component to
> >> package and maintain in our clusters.
> >
> > I'm afraid I don't, but if you make any progress on this, please let
> > me know as it would be great to document:
> >
> >  - how this would work
> >  - any pros and cons vs. Pacemaker
> >
> > and maybe I can help with that.
> >
> > One particular question: if the node running the service becomes
> > unreachable, is it safe to fail it over straight away, or is fencing
> > required first?  (I'm pretty sure I've asked this same question
> > before, but I can't remember the answer - sorry!)
> James,
>
> I echo Adam's input.  I have only heard of people implementing with
> pacemaker but there is no reason that this couldn't be tried with other
> HA solutions.
>
> If you are able to try it and document it would be great to add
> documentation here:  [1]
>
> Also, Gorka Eguileor is a good contact as he has been doing much of the
> work on HA Cinder though his focus is on Active/Active HA.
>
> Let us know if you have any further questions.
>
> Thanks!
> Jay
>
>


Re: [Openstack-operators] Cinder HA with zookeeper or redis?

2018-09-10 Thread Melvin Hillsman
Additionally, if you need resources to test this against, OpenLab is a
great option. https://openlabtesting.org provides more info, and you can
file a request directly at
https://github.com/theopenlab/resource-requests/issues/new without having
to go through the site.

On Mon, Sep 10, 2018 at 3:19 PM Jay S Bryant wrote:

>
> On 9/9/2018 10:58 PM, Adam Spiers wrote:
> > Hi James,
> >
> > James Penick wrote:
> >> Hey folks,
> >> Does anyone have experience using zookeeper or redis to handle HA
> >> failover
> >> in cinder clusters?
> >
> > I'm guessing you mean failover of an active/passive cinder-volume
> > service?
> >
> >> I know there's docs on pacemaker, however we already
> >> have the other two installed and don't want to add yet another
> >> component to
> >> package and maintain in our clusters.
> >
> > I'm afraid I don't, but if you make any progress on this, please let
> > me know as it would be great to document:
> >
> >  - how this would work
> >  - any pros and cons vs. Pacemaker
> >
> > and maybe I can help with that.
> >
> > One particular question: if the node running the service becomes
> > unreachable, is it safe to fail it over straight away, or is fencing
> > required first?  (I'm pretty sure I've asked this same question
> > before, but I can't remember the answer - sorry!)
> James,
>
> I echo Adam's input.  I have only heard of people implementing with
> pacemaker but there is no reason that this couldn't be tried with other
> HA solutions.
>
> If you are able to try it and document it would be great to add
> documentation here:  [1]
>
> Also, Gorka Eguileor is a good contact as he has been doing much of the
> work on HA Cinder though his focus is on Active/Active HA.
>
> Let us know if you have any further questions.
>
> Thanks!
> Jay
>
>
>


-- 
Kind regards,

Melvin Hillsman
mrhills...@gmail.com
mobile: (832) 264-2646


Re: [Openstack-operators] Cinder HA with zookeeper or redis?

2018-09-10 Thread Jay S Bryant


On 9/9/2018 10:58 PM, Adam Spiers wrote:
> Hi James,
>
> James Penick wrote:
>> Hey folks,
>> Does anyone have experience using zookeeper or redis to handle HA
>> failover in cinder clusters?
>
> I'm guessing you mean failover of an active/passive cinder-volume
> service?
>
>> I know there's docs on pacemaker, however we already
>> have the other two installed and don't want to add yet another
>> component to package and maintain in our clusters.
>
> I'm afraid I don't, but if you make any progress on this, please let
> me know as it would be great to document:
>
>  - how this would work
>  - any pros and cons vs. Pacemaker
>
> and maybe I can help with that.
>
> One particular question: if the node running the service becomes
> unreachable, is it safe to fail it over straight away, or is fencing
> required first?  (I'm pretty sure I've asked this same question
> before, but I can't remember the answer - sorry!)

James,

I echo Adam's input.  I have only heard of people implementing with 
pacemaker but there is no reason that this couldn't be tried with other 
HA solutions.


If you are able to try it and document it, it would be great to add 
documentation here:  [1]


Also, Gorka Eguileor is a good contact as he has been doing much of the 
work on HA Cinder though his focus is on Active/Active HA.


Let us know if you have any further questions.

Thanks!
Jay

