Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-10-27 Thread Bashmakov, Alexander
Hi Jay,

Thanks for the explanation. While I agree that there is a distinction between a 
distributed architecture like Nova and a centralized one like Glance, I would 
respectfully disagree with the statement that Glance cannot participate in 
rolling upgrades in a very similar fashion. We are currently working on a 
rolling upgrade POC in Glance (https://review.openstack.org/331740/). To date, 
we've successfully been able to run through a simple scenario with two Glance 
nodes running Newton and Ocata code base respectively. The latter introduces 
schema changes which are reconciled in the DB via a two-way trigger.
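
To make this concrete, here is a minimal sketch of the kind of two-way
trigger involved, assuming (purely for illustration) an is_public ->
visibility change; the real DDL lives in the review above, and an analogous
BEFORE UPDATE trigger is implied:

    import sqlalchemy as sa

    # Hypothetical connection URL; only the trigger DDL below matters.
    engine = sa.create_engine("mysql+pymysql://glance:secret@localhost/glance")

    # Newton writers only know is_public; Ocata writers only set visibility.
    # The trigger fills in whichever column the writer left NULL, so rows
    # written by either release are readable by both.
    SYNC_TRIGGER = """
    CREATE TRIGGER images_sync_visibility BEFORE INSERT ON images
    FOR EACH ROW
    BEGIN
        IF NEW.visibility IS NULL THEN
            SET NEW.visibility = IF(NEW.is_public, 'public', 'private');
        END IF;
        IF NEW.is_public IS NULL THEN
            SET NEW.is_public = (NEW.visibility = 'public');
        END IF;
    END
    """

    with engine.begin() as conn:
        conn.execute(sa.text(SYNC_TRIGGER))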

Regards,
Alex

-----Original Message-----
From: Jay Pipes [mailto:jaypi...@gmail.com] 
Sent: Friday, October 14, 2016 1:56 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: 
database triggers and oslo.versionedobjects

Alex, so sorry for the long delayed response! :( This just crept to the back of 
my inbox unfortunately. Answer inline...

On 09/14/2016 07:24 PM, Bashmakov, Alexander wrote:
>> Glance and Keystone do not participate in a rolling upgrade, because 
>> Keystone and Glance do not have a distributed component architecture. 
>> Online data migrations will reduce total downtime experienced during 
>> an *overall upgrade procedure* for an OpenStack cloud, but Nova, 
>> Neutron and Cinder are the only parts of OpenStack that are going to 
>> participate in a rolling upgrade because they are the services that 
>> are distributed across all the many compute nodes.
>
> Hi Jay, I'd like to better understand why your definition of rolling 
> upgrades excludes Glance and Keystone? Granted they don't run multiple 
> disparate components over distributed systems, however, they can still 
> run the same service on multiple distributed nodes. So a rolling 
> upgrade can still be applied on a large cloud that has, for instance 
> 50 Glance nodes.

If you've seen a cloud with 50 Glance nodes, I would be astonished :) That 
said, the number 50 doesn't really have to do with my definition of rolling... 
lemme explain.

The primary thing that, to me at least, differentiates rolling upgrades of 
distributed software is that different nodes can contain multiple versions of 
the software and continue to communicate with other nodes in the system without 
issue.

In the case of Glance, you cannot have different versions of the Glance service 
running simultaneously within an environment, because those Glance services 
each directly interface with the Glance database and therefore expect the 
Glance DB schema to look a particular way for a specific version of the Glance 
service software.

In contrast, Nova's distributed service nodes -- the nova-compute services and 
(mostly) the nova-api services -- do *not* talk directly to the Nova database. If 
those services need to get or set data in the database, they communicate with 
the nova-conductor services which are responsible for translating (called 
back-versioning) the most updated object model schema that matches the Nova 
database to the schema that the calling node understands. This means that Nova 
deployers can update the Nova database schema and not have to at the same time 
update the software on the distributed compute nodes. In this way deployers can 
"roll out" an upgrade of the Nova software across many hundreds of compute 
nodes over an extended period of time without needing to restart/upgrade 
services all at once.

Hope this clarifies things.

Best,
-jay

p.s. I see various information on the web referring to "rolling updates" 
or "rolling releases" as simply the process of continuously applying new 
versions of software to a deployment. This is decidedly *not* what I refer to 
as a "rolling upgrade". Perhaps we should invent a different term from "rolling 
upgrade" to refer to the attributes involved in being able to run multiple 
versions of distributed software with no impact on the control plane? Is that 
what folks call a "partial upgrade"? Not sure...

> In this case different versions of the same service will run on different
> nodes simultaneously. Regards, Alex





Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-10-17 Thread Michał Dulko
On 10/16/2016 11:52 AM, Duncan Thomas wrote:
> On 14 October 2016 at 23:55, Jay Pipes wrote:
>
> The primary thing that, to me at least, differentiates rolling
> upgrades of distributed software is that different nodes can
> contain multiple versions of the software and continue to
> communicate with other nodes in the system without issue.
>
> In the case of Glance, you cannot have different versions of the
> Glance service running simultaneously within an environment,
> because those Glance services each directly interface with the
> Glance database and therefore expect the Glance DB schema to look
> a particular way for a specific version of the Glance service
> software.
>
>
> Cinder services can run N+-1 versions in a mixed manner, all talking
> to the  same database, no conductor required.

That's true, keeping DB schema compatible is possible even if all the
services are talking to the DB, but this requires very careful
programming (and reviewing) of the DB migrations. That's why our DB
migrations guidelines are quite complicated [1]. I think conductor-based
design is superior but definitely isn't the only way.

[1]
http://docs.openstack.org/developer/cinder/devref/rolling.upgrades.html#database-schema-and-data-migrations
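
To illustrate, the "careful programming" mostly reduces to keeping every
migration additive, shaped like this sketch (Alembic syntax for brevity and
a hypothetical column; cinder itself used sqlalchemy-migrate at the time):

    from alembic import op
    import sqlalchemy as sa

    def upgrade():
        # Additive and nullable: old services never notice the column,
        # and new services treat NULL as "fall back to the old behaviour".
        op.add_column('volumes',
                      sa.Column('provider_id', sa.String(255), nullable=True))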



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-10-16 Thread Duncan Thomas
On 14 October 2016 at 23:55, Jay Pipes  wrote:

> The primary thing that, to me at least, differentiates rolling upgrades of
> distributed software is that different nodes can contain multiple versions
> of the software and continue to communicate with other nodes in the system
> without issue.
>
> In the case of Glance, you cannot have different versions of the Glance
> service running simultaneously within an environment, because those Glance
> services each directly interface with the Glance database and therefore
> expect the Glance DB schema to look a particular way for a specific version
> of the Glance service software.
>

Cinder services can run N+-1 versions in a mixed manner, all talking to the
 same database, no conductor required.



-- 
Duncan Thomas


Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-10-15 Thread Clint Byrum
Excerpts from Jay Pipes's message of 2016-10-14 16:55:39 -0400:
> Alex, so sorry for the long delayed response! :( This just crept to
> the back of my inbox unfortunately. Answer inline...
> 
> On 09/14/2016 07:24 PM, Bashmakov, Alexander wrote:
> >> Glance and Keystone do not participate in a rolling upgrade,
> >> because Keystone and Glance do not have a distributed component
> >> architecture. Online data migrations will reduce total downtime
> >> experienced during an *overall upgrade procedure* for an OpenStack
> >> cloud, but Nova, Neutron and Cinder are the only parts of OpenStack
> >> that are going to participate in a rolling upgrade because they are
> >> the services that are distributed across all the many compute
> >> nodes.
> >
> > Hi Jay, I'd like to better understand why your definition of rolling
> > upgrades excludes Glance and Keystone? Granted they don't run
> > multiple disparate components over distributed systems, however, they
> > can still run the same service on multiple distributed nodes. So a
> > rolling upgrade can still be applied on a large cloud that has, for
> > instance 50 Glance nodes.
> 
> If you've seen a cloud with 50 Glance nodes, I would be astonished :) 
> That said, the number 50 doesn't really have to do with my definition of 
> rolling... lemme explain.
> 
> The primary thing that, to me at least, differentiates rolling upgrades 
> of distributed software is that different nodes can contain multiple 
> versions of the software and continue to communicate with other nodes in 
> the system without issue.
> 

Databases are often (mis)used to communicate.

> In the case of Glance, you cannot have different versions of the Glance 
> service running simultaneously within an environment, because those 
> Glance services each directly interface with the Glance database and 
> therefore expect the Glance DB schema to look a particular way for a 
> specific version of the Glance service software.
> 

That's not a constraint of Glance, but a constraint of the way Glance
has been interfacing with the database. The argument of the thread was
that one can make schema changes in such a way that one can have
multiple versions of the same component running during an update.

> In contrast, Nova's distributed service nodes -- the nova-compute 
> services and (mostly) the nova-api services -- do *not* talk directly to 
> the Nova database. If those services need to get or set data in the 
> database, they communicate with the nova-conductor services which are 
> responsible for translating (called back-versioning) the most updated 
> object model schema that matches the Nova database to the schema that 
> the calling node understands. This means that Nova deployers can update 
> the Nova database schema and not have to at the same time update the 
> software on the distributed compute nodes. In this way deployers can 
> "roll out" an upgrade of the Nova software across many hundreds of 
> compute nodes over an extended period of time without needing to 
> restart/upgrade services all at once.
> 
> Hope this clarifies things.
> 

It clarifies your thinking, so thanks for that. However, I'm not so sure
there's any difference between components that are the same software,
and components that are different software, if they end up interacting
anyway because one version can write and read data that another version
also writes and reads.

What I think is important is understanding the interfaces, and how they
can be tested to ensure that rolling/partial/0-downtime updates can be
done safely.



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-10-14 Thread Jay Pipes

Alex, so sorry for the long delayed response! :( This just crept to
the back of my inbox unfortunately. Answer inline...

On 09/14/2016 07:24 PM, Bashmakov, Alexander wrote:

>> Glance and Keystone do not participate in a rolling upgrade,
>> because Keystone and Glance do not have a distributed component
>> architecture. Online data migrations will reduce total downtime
>> experienced during an *overall upgrade procedure* for an OpenStack
>> cloud, but Nova, Neutron and Cinder are the only parts of OpenStack
>> that are going to participate in a rolling upgrade because they are
>> the services that are distributed across all the many compute
>> nodes.
>
> Hi Jay, I'd like to better understand why your definition of rolling
> upgrades excludes Glance and Keystone? Granted they don't run
> multiple disparate components over distributed systems, however, they
> can still run the same service on multiple distributed nodes. So a
> rolling upgrade can still be applied on a large cloud that has, for
> instance 50 Glance nodes.


If you've seen a cloud with 50 Glance nodes, I would be astonished :) 
That said, the number 50 doesn't really have to do with my definition of 
rolling... lemme explain.


The primary thing that, to me at least, differentiates rolling upgrades 
of distributed software is that different nodes can contain multiple 
versions of the software and continue to communicate with other nodes in 
the system without issue.


In the case of Glance, you cannot have different versions of the Glance 
service running simultaneously within an environment, because those 
Glance services each directly interface with the Glance database and 
therefore expect the Glance DB schema to look a particular way for a 
specific version of the Glance service software.


In contrast, Nova's distributed service nodes -- the nova-compute 
services and (mostly) the nova-api services -- do *not* talk directly to 
the Nova database. If those services need to get or set data in the 
database, they communicate with the nova-conductor services which are 
responsible for translating (called back-versioning) the most updated 
object model schema that matches the Nova database to the schema that 
the calling node understands. This means that Nova deployers can update 
the Nova database schema and not have to at the same time update the 
software on the distributed compute nodes. In this way deployers can 
"roll out" an upgrade of the Nova software across many hundreds of 
compute nodes over an extended period of time without needing to 
restart/upgrade services all at once.


Hope this clarifies things.

Best,
-jay

p.s. I see various information on the web referring to "rolling updates" 
or "rolling releases" as simply the process of continuously applying new 
versions of software to a deployment. This is decidedly *not* what I 
refer to as a "rolling upgrade". Perhaps we should invent a different 
term from "rolling upgrade" to refer to the attributes involved in being 
able to run multiple versions of distributed software with no impact on 
the control plane? Is that what folks call a "partial upgrade"? Not sure...


> In this case different versions of the same service will run on
> different nodes simultaneously.
>
> Regards,
> Alex






Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-14 Thread Clint Byrum
Excerpts from Henry Nash's message of 2016-09-15 00:29:44 +0100:
> Jay,
> 
> I agree with your distinction - and when I am referring to rolling upgrades 
> for keystone I am referring to when you are running a cluster of keystones 
> (for performance and/or redundancy), and you want to roll the upgrade across 
> the cluster without creating downtime of the overall keystone service. Such a 
> keystone cluster deployment will be common in large clouds - and prior to 
> Newton, keystone did not support such a rolling upgrade (you had to take all 
> the nodes down, upgrade the DB and then boot them all back up). In order to 
> support such a rolling upgrade you either need to have code that can work on 
> different DB versions (either explicitly or via versioned objects), or you 
> hide the schema changes by “data synchronisation via Triggers”, which is 
> where this whole thread came from.
> 

It doesn't always need to be explicit or through versioned objects. One
can often manipulate the schema and even migrate data without disturbing
old code.



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-14 Thread Henry Nash
Jay,

I agree with your distinction - and when I am referring to rolling upgrades for 
keystone I am referring to when you are running a cluster of keystones (for 
performance and/or redundancy), and you want to roll the upgrade across the 
cluster without creating downtime of the overall keystone service. Such a 
keystone cluster deployment will be common in large clouds - and prior to 
Newton, keystone did not support such a rolling upgrade (you had to take all 
the nodes down, upgrade the DB and then boot them all back up). In order to 
support such a rolling upgrade you either need to have code that can work on 
different DB versions (either explicitly or via versioned objects), or you hide 
the schema changes by “data synchronisation via Triggers”, which is where this 
whole thread came from.

Henry
> On 14 Sep 2016, at 23:08, Jay Pipes  wrote:
> 
> On 09/01/2016 05:29 AM, Henry Nash wrote:
>> So as the person who drove the rolling upgrade requirements into
>> keystone in this cycle (because we have real customers that need it),
>> and having first written the keystone upgrade process to be
>> “versioned object ready” (because I assumed we would do this the same
>> as everyone else), and subsequently re-written it to be “DB Trigger
>> ready”…and written migration scripts for both these cases for the (in
>> fact very minor) DB changes that keystone has in Newton…I guess I
>> should also weigh in here :-)
> 
> Sorry for delayed response. PTO and all... I'd just like to make a 
> clarification here. Henry, you are not referring to *rolling upgrades* but 
> rather *online database migrations*. There's an important distinction between 
> the two concepts.
> 
> Online schema migrations, as discussed in this thread, are all about 
> minimizing the time that a database server is locked or otherwise busy 
> performing the tasks of changing SQL schemas and moving the underlying stored 
> data from their old location/name to their new location/name. As noted in 
> this thread, there's numerous ways of reducing the downtime experienced 
> during these data and schema migrations.
> 
> Rolling upgrades are not the same thing, however. What rolling upgrades refer 
> to is the ability of a *distributed system* to have its distributed component 
> services running different versions of the software and still be able to 
> communicate with the other components of the system. This time period during 
> which the components of the distributed system may run different versions of 
> the software may be quite lengthy (days or weeks long). The "rolling" part of 
> "rolling upgrade" refers to the fact that in a distributed system of 
> thousands of components or nodes, the upgraded software must be "rolled out" 
> to those thousands of nodes over a period of time.
> 
> Glance and Keystone do not participate in a rolling upgrade, because Keystone 
> and Glance do not have a distributed component architecture. Online data 
> migrations will reduce total downtime experienced during an *overall upgrade 
> procedure* for an OpenStack cloud, but Nova, Neutron and Cinder are the only 
> parts of OpenStack that are going to participate in a rolling upgrade because 
> they are the services that are distributed across all the many compute nodes.
> 
> Best,
> -jay
> 




Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-14 Thread Bashmakov, Alexander
> Glance and Keystone do not participate in a rolling upgrade, because
> Keystone and Glance do not have a distributed component architecture.
> Online data migrations will reduce total downtime experienced during an
> *overall upgrade procedure* for an OpenStack cloud, but Nova, Neutron and
> Cinder are the only parts of OpenStack that are going to participate in a 
> rolling
> upgrade because they are the services that are distributed across all the
> many compute nodes.

Hi Jay,
I'd like to better understand why your definition of rolling upgrades excludes 
Glance and Keystone? Granted they don't run multiple disparate components over 
distributed systems, however, they can still run the same service on multiple 
distributed nodes. So a rolling upgrade can still be applied on a large cloud 
that has, for instance 50 Glance nodes.  In this case different versions of the 
same service will run on different nodes simultaneously.
Regards,
Alex


Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-14 Thread Jay Pipes

On 09/01/2016 05:29 AM, Henry Nash wrote:

> So as the person who drove the rolling upgrade requirements into
> keystone in this cycle (because we have real customers that need it),
> and having first written the keystone upgrade process to be
> “versioned object ready” (because I assumed we would do this the same
> as everyone else), and subsequently re-written it to be “DB Trigger
> ready”…and written migration scripts for both these cases for the (in
> fact very minor) DB changes that keystone has in Newton…I guess I
> should also weigh in here :-)


Sorry for delayed response. PTO and all... I'd just like to make a 
clarification here. Henry, you are not referring to *rolling upgrades* 
but rather *online database migrations*. There's an important 
distinction between the two concepts.


Online schema migrations, as discussed in this thread, are all about 
minimizing the time that a database server is locked or otherwise busy 
performing the tasks of changing SQL schemas and moving the underlying 
stored data from their old location/name to their new location/name. As 
noted in this thread, there's numerous ways of reducing the downtime 
experienced during these data and schema migrations.


Rolling upgrades are not the same thing, however. What rolling upgrades 
refer to is the ability of a *distributed system* to have its 
distributed component services running different versions of the 
software and still be able to communicate with the other components of 
the system. This time period during which the components of the 
distributed system may run different versions of the software may be 
quite lengthy (days or weeks long). The "rolling" part of "rolling 
upgrade" refers to the fact that in a distributed system of thousands of 
components or nodes, the upgraded software must be "rolled out" to those 
thousands of nodes over a period of time.


Glance and Keystone do not participate in a rolling upgrade, because 
Keystone and Glance do not have a distributed component architecture. 
Online data migrations will reduce total downtime experienced during an 
*overall upgrade procedure* for an OpenStack cloud, but Nova, Neutron 
and Cinder are the only parts of OpenStack that are going to participate 
in a rolling upgrade because they are the services that are distributed 
across all the many compute nodes.


Best,
-jay



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-04 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-09-02 17:58:42 -0400:
> 
> On 09/02/2016 01:53 PM, Doug Hellmann wrote:
> > Excerpts from Thierry Carrez's message of 2016-09-02 12:15:33 +0200:
> >> Sean Dague wrote:
> >>> Putting DB trigger failure analysis into the toolkit required to manage
> >>> an upgrade failure is a really high bar for new ops.
> >>
> >> I agree with Sean: increasing the variety of technologies used increases
> >> the system complexity, which in turn requires more skills to fully
> >> understand and maintain operationally. It should only be done as a last
> >> resort, with pros and cons carefully weighted. We really should involve
> >> operators in this discussion to get the full picture of arguments for
> >> and against.
> >>
> >
> > Yes, I would like to understand better what aspect of the approach
> > taken elsewhere is leading to the keystone team exploring other
> > options. So far I'm not seeing much upside to being different, and I'm
> > hearing a lot of cons.
> 
> I continue to maintain that the problems themselves being discussed at 
> https://review.openstack.org/#/c/331740/ are different than what has 
> been discussed in detail before.   To be "not different", this spec 
> would need to no longer discuss the concept of "we need N to be reading 
> from and writing to the old column to be compatible with N-1 as shown in 
> the below diagram...Once all the N-1 services are upgraded, N services 
> should be moved out of compatibility mode to use the new column. ". 
> To my knowledge, there are no examples of code in OpenStack that 
> straddles table and column changes directly in the SQL access layer as 
> this document describes.There's still a handful of folks including 
> myself that think this is a new kind of awkwardness we've not had to 
> deal with yet.   My only ideas on how to reduce it is to put the N-1/N 
> differences on the write side, not the read side, and triggers are *not* 
> the only way to do it.   But if "being different" means, "doing it on 
> the write side", then it seems like that overall concept is being 
> vetoed.  Which I actually appreciate knowing up front before I spend a 
> lot of time on it.
> 

The example for glance shows where two entirely new objects have been
created for the database (community and shared images). The compatibility
mode flag in config is cool, I think operators deal with things like
that all the time, like when a new API version arrives and they might
not be ready to support it. I'd hope that having it turned off would
also restrict the API microversion if such a thing exists so that the
community/shared image types aren't allowed yet. This seems straight
forward, and I feel like the spec was good except for the addition of
extra layers.

In this case, I'd just create the new column nullable, and maintain
both.

* Add visibility column to schema (in spec, 'glance-manage db_expand')

* upgrade all API nodes

* run the migration code to resolve the null visibility columns
  (traditional "glance-manage db_migrate")

* advance compatibility mode to lowest commit that exists running
  against DB

* set visibility to be not null (I think this would be 'glance-manage
  db_contract latest_commit_desired')
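
Sketching the schema half of those steps (Alembic-style for brevity; the
function names map loosely onto the hypothetical glance-manage phases
above):

    from alembic import op
    import sqlalchemy as sa

    def expand():
        # Step one: additive and nullable, so unupgraded API nodes keep
        # writing rows exactly as before.
        op.add_column('images',
                      sa.Column('visibility', sa.String(20), nullable=True))

    def migrate(connection):
        # Step three: backfill the NULLs left behind by old writers.
        connection.execute(sa.text(
            "UPDATE images SET visibility = "
            "CASE WHEN is_public THEN 'public' ELSE 'private' END "
            "WHERE visibility IS NULL"))

    def contract():
        # Step five: only once every running writer populates visibility.
        op.alter_column('images', 'visibility',
                        existing_type=sa.String(20), nullable=False)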

Where, in this scheme, do triggers come in?



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-02 Thread Steve Martinelli
> On 09/02/2016 01:53 PM, Doug Hellmann wrote:
>> Excerpts from Thierry Carrez's message of 2016-09-02 12:15:33 +0200:
>>> I agree with Sean: increasing the variety of technologies used increases
>>> the system complexity, which in turn requires more skills to fully
>>> understand and maintain operationally. It should only be done as a last
>>> resort, with pros and cons carefully weighted. We really should involve
>>> operators in this discussion to get the full picture of arguments for
>>> and against.
Two quick remarks about involving operators. First, see Matt Fischer's
reply to this thread; we have a great operator-developer relationship with
Matt (he was one of the first folks looking at Fernet tokens), and he
volunteered to try out any triggers we write on his MySQL Galera cluster.
Secondly, the use of triggers was brought up at the OpenStack Ansible
midcycle, where several operators were present, and as I understand it,
they felt positive about the idea.


Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-02 Thread Mike Bayer



On 09/02/2016 01:53 PM, Doug Hellmann wrote:

> Excerpts from Thierry Carrez's message of 2016-09-02 12:15:33 +0200:
>> Sean Dague wrote:
>>> Putting DB trigger failure analysis into the toolkit required to manage
>>> an upgrade failure is a really high bar for new ops.
>>
>> I agree with Sean: increasing the variety of technologies used increases
>> the system complexity, which in turn requires more skills to fully
>> understand and maintain operationally. It should only be done as a last
>> resort, with pros and cons carefully weighted. We really should involve
>> operators in this discussion to get the full picture of arguments for
>> and against.
>
> Yes, I would like to understand better what aspect of the approach
> taken elsewhere is leading to the keystone team exploring other
> options. So far I'm not seeing much upside to being different, and I'm
> hearing a lot of cons.


I continue to maintain that the problems themselves being discussed at 
https://review.openstack.org/#/c/331740/ are different than what has 
been discussed in detail before.   To be "not different", this spec 
would need to no longer discuss the concept of "we need N to be reading 
from and writing to the old column to be compatible with N-1 as shown in 
the below diagram...Once all the N-1 services are upgraded, N services 
should be moved out of compatibility mode to use the new column. ". 
To my knowledge, there are no examples of code in OpenStack that 
straddles table and column changes directly in the SQL access layer as 
this document describes.There's still a handful of folks including 
myself that think this is a new kind of awkwardness we've not had to 
deal with yet.   My only idea on how to reduce it is to put the N-1/N 
differences on the write side, not the read side, and triggers are *not* 
the only way to do it.   But if "being different" means, "doing it on 
the write side", then it seems like that overall concept is being 
vetoed.  Which I actually appreciate knowing up front before I spend a 
lot of time on it.





Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-02 Thread Doug Hellmann
Excerpts from Thierry Carrez's message of 2016-09-02 12:15:33 +0200:
> Sean Dague wrote:
> > Putting DB trigger failure analysis into the toolkit required to manage
> > an upgrade failure is a really high bar for new ops.
> 
> I agree with Sean: increasing the variety of technologies used increases
> the system complexity, which in turn requires more skills to fully
> understand and maintain operationally. It should only be done as a last
> resort, with pros and cons carefully weighted. We really should involve
> operators in this discussion to get the full picture of arguments for
> and against.
> 

Yes, I would like to understand better what aspect of the approach
taken elsewhere is leading to the keystone team exploring other
options. So far I'm not seeing much upside to being different, and I'm
hearing a lot of cons.

Doug



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-02 Thread Thierry Carrez
Sean Dague wrote:
> Putting DB trigger failure analysis into the toolkit required to manage
> an upgrade failure is a really high bar for new ops.

I agree with Sean: increasing the variety of technologies used increases
the system complexity, which in turn requires more skills to fully
understand and maintain operationally. It should only be done as a last
resort, with pros and cons carefully weighted. We really should involve
operators in this discussion to get the full picture of arguments for
and against.

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Michael Bayer
On Thursday, September 1, 2016, Jeremy Stanley  wrote:

>
> I don't read that at all as suggesting "the problem is solved, go
> away" but rather "help us make it better for everyone, don't just
> take one project off in a new direction and leave the others
> behind."


I can clarify.  I don't work directly on glance or keystone, I do oslo.db,
sqlalchemy, and alembic development.   If it's decided that the approach is
"no special technique, just query more columns and tables in your data
access layer and straddle across API versions", that does not indicate any
new patterns or tools in Oslo or further up, hence "solved" in that the
techniques are already available.  If OTOH we are getting into triggers or
this idea I have to do Python level translation events at the write side,
that indicates the need for new library features and patterns.

I've been tasked with being ready to assist Nova and Neutron with online
migrations for over a year.   Other than helping Neutron get
expand/contract going, I've not been involved at all, and not with anything
related to data migrations.   There hasn't been any need.





Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Clint Byrum
Excerpts from Robert Collins's message of 2016-09-01 20:45:22 +1200:
> On 31 August 2016 at 01:57, Clint Byrum  wrote:
> >
> >
> > It's simple, these are the holy SQL schema commandments:
> >
> > Don't delete columns, ignore them.
> > Don't change columns, create new ones.
> > When you create a column, give it a default that makes sense.
> 
> I'm sure you're aware of this but I think its worth clarifying for non
> DBAish folk: non-NULL values can change a DDL statements execution
> time from O(1) to O(N) depending on the DB in use. E.g. for Postgres
> DDL requires an exclusive table lock, and adding a column with any
> non-NULL value (including constants) requires calculating a new value
> for every row, vs just updating the metadata - see
> https://www.postgresql.org/docs/9.5/static/sql-altertable.html
> """
> When a column is added with ADD COLUMN, all existing rows in the table
> are initialized with the column's default value (NULL if no DEFAULT
> clause is specified). If there is no DEFAULT clause, this is merely a
> metadata change and does not require any immediate update of the
> table's data; the added NULL values are supplied on readout, instead.
> """
> 

InnoDB (via MySQL) has no such restrictions for online DDL:

https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html#innodb-online-ddl-summary-grid

Basically what the link above says is that anything except these
operations can be done without locking up the table:

- Fulltext index creation
- Change column data type
- Convert or specify column character sets

Specifically, defaults are only ever stored in the rows if they're
changed. The current default is kept in the table definition, so the
rows end up with NULL physically unless the default is changed. An alter
that does a default change is just like a big update to set the current
NULLs to the old default.
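
Concretely, the "safe" additive change looks something like this sketch
(hypothetical table and column; asking for INPLACE/NONE makes MySQL fail
fast rather than silently fall back to a blocking table copy):

    import sqlalchemy as sa

    engine = sa.create_engine("mysql+pymysql://user:secret@localhost/cinder")

    with engine.begin() as conn:
        conn.execute(sa.text(
            "ALTER TABLE volumes "
            "ADD COLUMN replication_status VARCHAR(36) NULL, "
            "ALGORITHM=INPLACE, LOCK=NONE"))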

> > Do not add new foreign key constraints.
> 
> What's the reason for this - if it's to avoid exclusive locks, I'd
> note that the other rules above don't avoid exclusive locks - again,
> DB specific, and for better or worse we are now testing on multiple DB
> engines via 3rd party testing.
> 
> https://dev.launchpad.net/Database/LivePatching has some info from our
> experience doing online and very fast offline patches in Launchpad.
> 

The reason is to avoid the old code running into new restrictions. If
you add a FK constraint to an existing table, old code will insert into
it and fail because it doesn't add the FK rows needed.
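
In miniature, with hypothetical tables:

    import sqlalchemy as sa

    engine = sa.create_engine("mysql+pymysql://user:secret@localhost/demo")

    # The N release adds a constraint the N-1 code has never heard of.
    with engine.begin() as conn:
        conn.execute(sa.text(
            "ALTER TABLE snapshots ADD CONSTRAINT fk_snapshot_policy "
            "FOREIGN KEY (policy_id) REFERENCES policies (id)"))

    # An unupgraded service then does what it has always done, and now
    # gets an IntegrityError: it never wrote the parent row in policies
    # that the new constraint demands.
    with engine.begin() as conn:
        conn.execute(sa.text(
            "INSERT INTO snapshots (id, policy_id) VALUES (1, 42)"))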



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Jeremy Stanley
On 2016-09-01 10:39:09 -0400 (-0400), Mike Bayer wrote:
> On 08/31/2016 06:18 PM, Monty Taylor wrote:
[...]
> > OpenStack is One Project
> >
> > Nova and Neutron have an approach for this. It may or may not be
> > ideal - but it exists right now. While it can be satisfying to
> > discount the existing approach and write a new one, I do not
> > believe that is in the best interests of OpenStack as a whole.
> > To diverge in _keystone_ - which is one of the few projects that
> > must exist in every OpenStack install - when there exists an
> > approach in the two other most commonly deployed projects - is
> > such a terrible example of the problems inherent in Conway's Law
> > that it makes me want to push up a proposal to dissolve all of
> > the individual project teams and merge all of the repos into a
> > single repo.
[...]
> The "be more similar" argument would be the only one you have to
> make. It basically says, "problem X is 'solved', other approaches
> are now unnecessary". I'm skeptical that I am reading that
> correctly. I have another approach to the issue of "rolling
> upgrades where we really need to translate at the SQL layer" that
> is in some ways similar to what triggers do, but entirely within
> the abstraction layer that you so appropriately appreciate :). I
> have a binary decision to make here, "do i work on this new idea
> that Glance has already expressed an interest in and Keystone
> might like also? Or do I not, because this problem is solved?". I
> have other projects to work on, so it's not like I'm looking for
> more. It's just I'd like to see Glance and others have their
> rolling upgrades problem solved, at least with the benefit of a
> fixed and predictable pattern, rather than every schema change
> being an ongoing seat-of-the-pants type of operation as it is
> right now.
[...]

You (presumably accidentally) snipped the next paragraph of context,
which started out:

> > Make the oslo libraries Nova and Neutron are using better. Work
> > with the Nova and Neutron teams on a consolidated approach.
[...]

I don't read that at all as suggesting "the problem is solved, go
away" but rather "help us make it better for everyone, don't just
take one project off in a new direction and leave the others
behind."
-- 
Jeremy Stanley



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Mike Bayer



On 09/01/2016 11:52 AM, Dan Smith wrote:


> The indirection service is really unrelated to this discussion, IMHO. If
> you take RPC out of the picture, all you have left is a
> direct-to-the-database facade to handle the fact that schema has
> expanded underneath you. As Clint (et al) have said -- designing the
> application to expect schema expansion (and avoiding unnecessary
> contraction) is the key here.


pretty much.  there's no fixed pattern in how to do these.  Every 
version of a data access API will be weighed down with baggage from the 
previous version and an inability to take full advantage of new 
improvements until the next release, and background migrations are 
complicated by the old application undoing their work.  Even small 
migrations mean all these issues have to be considered each time on a 
case-by-case basis.   These are the problems people are hoping to 
improve upon if possible.   The spec at 
https://review.openstack.org/#/c/331740/ is discussing these issues in 
detail and is the first such specification I've seen that tries to get 
into it at this level.







Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Dan Smith
> So that is fine.  However, correct me if I'm wrong but you're 
> proposing just that these projects migrate to also use a new service 
> layer with oslo.versionedobjects, because IIUC Nova/Neutron's 
> approach is dependent on that area of indirection being present. 
> Otherwise, if you meant something like, "use an approach that's kind 
> of like what Nova does w/ versionedobjects but without actually 
> having to use versionedobjects", that still sounds like, "come up 
> with a new idea".

If you don't need the RPC bits, versionedobjects is nothing more than an
object facade for you to insulate your upper layers from such change.
Writing your facade using versionedobjects just means inheriting from a
superclass that does a bunch of stuff you don't need. So I would not say
that taking the same general approach without that inheritance is "come
up with a new idea".

Using triggers and magic to solve this instead of an application-level
facade is a substantially different approach to the problem.

> I suppose if you're thinking more at the macro level, where "current
>  approach" means "do whatever you have to on the app side", then your
>  position is consistent, but I think there's still a lot of
> confusion in that area when the indirection of a versioned service
> layer is not present. It gets into the SQL nastiness I was discussing
> w/ Clint and I don't see anyone doing anything like that yet.

The indirection service is really unrelated to this discussion, IMHO. If
you take RPC out of the picture, all you have left is a
direct-to-the-database facade to handle the fact that schema has
expanded underneath you. As Clint (et al) have said -- designing the
application to expect schema expansion (and avoiding unnecessary
contraction) is the key here.

--Dan



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Sean Dague
On 09/01/2016 09:45 AM, David Stanek wrote:
> On Thu, Aug 25 at 13:13 -0400, Steve Martinelli wrote:
>> The keystone team is pursuing a trigger-based approach to support rolling,
>> zero-downtime upgrades. The proposed operator experience is documented here:
>>
>>   http://docs.openstack.org/developer/keystone/upgrading.html
>>
> 
> I wanted to mention a few things. One of the reasons I suggested this
> approach for keystone is that I've had success in the past using a
> combination of triggers and code to do live, online migrations. Many
> times using completely different schemas.
> 
> In keystone we are just talking about some simple data transformations
> between columns and things like that. The triggers themselves shouldn't
> get too complicated. If there are cases where triggers won't work, then
> we won't force them. (A current example of this is encrypting
> credentials.)
> 
> The online migrations are not required. Operators can still go the old
> route and db_sync while others help test out the cutting edge features.
> 
> The triggers are not there during the entire lifecycle of the
> application. The expand phase adds them and the contract removes them.

But you did that for an application where you were on call to handle any
issues, and you knew the data somewhat in advance.

In OpenStack this code would get committed. It would get executed 12 to
18 months later (the average current OpenStack level at the ops meetup
was Kilo/Liberty). It would be executed by people far away, possibly
running in different locales, without an idea about what's in the data set.

Part of OpenStack being a successful open source project is that the
mean expertise of our operators will keep decreasing over time. It will
be deployed and maintained by less and less skilled operators in each
release, because it will be deployed and maintained by more total
operators each release.

Putting DB trigger failure analysis into the toolkit required to manage
an upgrade failure is a really high bar for new ops.

-Sean

-- 
Sean Dague
http://dague.net



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Mike Bayer



On 09/01/2016 08:29 AM, Henry Nash wrote:


> From a purely keystone perspective, my gut feeling is that actually the
> trigger approach is likely to lead to a more robust, not less, solution - due
> to the fact that we solve the very specific problems of a given migration
> (i.e. need to keep column A in sync with Column B) for a short period of time,
> right at the point of pain, with well established techniques - albeit they be
> complex ones that need experienced coders in those techniques.


this is really the same philosophy I'm going for, that is, make a schema 
migration, then accompany it by a data migration, and then you're done. 
The rest of the world need not be concerned.


It's not as much about "triggers" as it is, "handle the data difference 
on the write side, not the read side".  That is, writing data to a SQL 
database is squeezed through exactly three very boring forms of 
statement, the INSERT, UPDATE, and DELETE.   These are easy to intercept 
in the database, and since we use an abstraction like SQLAlchemy they 
are easy to intercept in the application layer too (foreshadowing). 
  When you put it on the read side, reading is of course (mostly) 
through just one statement, the SELECT, but it is a crazy beast in 
practice and it is all over the place in an unlimited number of forms.
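
For example (foreshadowed above), a sketch of that application-level
interception with plain SQLAlchemy ORM events, on a hypothetical and
deliberately simplified model:

    from sqlalchemy import Boolean, Column, Integer, String, event
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Image(Base):
        __tablename__ = 'images'
        id = Column(Integer, primary_key=True)
        is_public = Column(Boolean)        # old representation
        visibility = Column(String(20))    # new representation

    # Intercept two of the three boring statements; whichever column the
    # calling code populated, the other is filled in before the flush.
    @event.listens_for(Image, 'before_insert')
    @event.listens_for(Image, 'before_update')
    def _sync_visibility(mapper, connection, target):
        if target.visibility is None and target.is_public is not None:
            target.visibility = 'public' if target.is_public else 'private'
        elif target.is_public is None and target.visibility is not None:
            target.is_public = (target.visibility == 'public')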


If you can get your migrations to be, hey, we can just read JSON records 
from version 1.0 of the service and pump them into version 2.0, then 
you're doing read-side, but you've solved the problem at the service 
layer.  This only works for those situations where it "works", and the 
dual-layer service architecture has to be feasibly present as well.
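
In miniature, with hypothetical payloads:

    def upgrade_record(record):
        # Accept a 1.0 record at the 2.0 service boundary and translate
        # it once, so nothing below this line knows 1.0 ever existed.
        if record.get('version') == '1.0':
            record = dict(record)
            record['version'] = '2.0'
            record['visibility'] = ('public' if record.pop('is_public')
                                    else 'private')
        return record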




Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Mike Bayer



On 08/31/2016 06:18 PM, Monty Taylor wrote:


> I said this the other day in the IRC channel, and I'm going to say it
> again here. I'm going to do it as bluntly as I can - please keep in
> mind that I respect all of the humans involved.
>
> I think this is a monstrously terrible idea.
>
> There are MANY reasons for this - but I'm going to limit myself to two.
>
> OpenStack is One Project
>
> Nova and Neutron have an approach for this. It may or may not be ideal -
> but it exists right now. While it can be satisfying to discount the
> existing approach and write a new one, I do not believe that is in the
> best interests of OpenStack as a whole. To diverge in _keystone_ - which
> is one of the few projects that must exist in every OpenStack install -
> when there exists an approach in the two other most commonly deployed
> projects - is such a terrible example of the problems inherent in
> Conway's Law that it makes me want to push up a proposal to dissolve all
> of the individual project teams and merge all of the repos into a single
> repo.


So that is fine.  However, correct me if I'm wrong but you're proposing 
just that these projects migrate to also use a new service layer with 
oslo.versionedobjects, because IIUC Nova/Neutron's approach is dependent 
on that area of indirection being present. Otherwise, if you meant 
something like, "use an approach that's kind of like what Nova does w/ 
versionedobjects but without actually having to use versionedobjects", 
that still sounds like, "come up with a new idea".


I suppose if you're thinking more at the macro level, where "current 
approach" means "do whatever you have to on the app side", then your 
position is consistent, but I think there's still a lot of confusion in 
that area when the indirection of a versioned service layer is not 
present.   It gets into the SQL nastiness I was discussing w/ Clint and 
I don't see anyone doing anything like that yet.


Triggers aside since it clearly is "triggering" (ahem) allergic 
reactions, what's the approach when new approaches are devised that are 
alternatives to what "exists right now"?   E.g. I have yet another 
proposal in the works that allows for SQL-level translations but runs in 
the Python application space and does not use triggers.  Should I stop 
right now because Nova/Neutron already have a system that's "good 
enough"?This would be fine.  I find it uncomfortable working in this 
ambiguous space where some projects rightly proclaim they've solved a 
problem, and others continue to disregard that and plow forward with 
other approaches without a universally accepted reason why the current 
solution is not feasible.





> BUT - I also don't think it's a good technical solution. That isn't
> because triggers don't work in MySQL (they do) - but because we've spent
> the last six years explicitly NOT writing raw SQL. We've chosen an
> abstraction layer (SQLAlchemy) which does its job well.


There's a canard in there which is that all along I've been proposing to 
start adding systems to oslo.db to help produce and maintain triggers 
which certainly would have among its goals that consuming projects 
wouldn't be writing raw SQL.  That part of the discomfort is more 
manageable than Clint's, which is that he doesn't want the database 
doing things with the data other than storing it, and I totally know 
where he's coming from on that.


The "be more similar" argument would be the only one you have to make. 
It basically says, "problem X is 'solved', other approaches are now 
unnecessary".   I'm skeptical that I am reading that correctly.  I have 
another approach to the issue of "rolling upgrades where we really need 
to translate at the SQL layer" that is in some ways similar to what 
triggers do, but entirely within the abstraction layer that you so 
appropriately appreciate :).   I have a binary decision to make here, 
"do i work on this new idea that Glance has already expressed an 
interest in and Keystone might like also? Or do I not, because this 
problem is solved?".   I have other projects to work on, so it's not 
like I'm looking for more.   It's just I'd like to see Glance and others 
have their rolling upgrades problem solved, at least with the benefit of 
a fixed and predictable pattern, rather than every schema change being 
an ongoing seat-of-the-pants type of operation as it is right now.


> Finally, it's a known and accepted pattern in large
> scale MySQL shops ... Roll out a new version of the app code which
> understands both the old and the new schema version, then roll out a
> no-downtime additive schema change to the database, then have the app
> layer process and handle on the fly transformation if needed.



Right, as I've mentioned previously, I only take issue with the 
"monolithic app code that speaks both versions of the schema" part. 
Assuming there's no layer of service indirection where migration issues 
can be finessed outside of the SQL interaction layer, it means every 
migration has to be considered on a case-by-case basis.

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread David Stanek
On Wed, Aug 31 at 17:18 -0500, Monty Taylor wrote:
> 
> Nova and Neutron have an approach for this. It may or may not be ideal -
> but it exists right now. While it can be satisfying to discount the
> existing approach and write a new one, I do not believe that is in the
> best interests of OpenStack as a whole. To diverge in _keystone_ - which
> is one of the few projects that must exist in every OpenStack install -
> when there exists an approach in the two other most commonly deployed
> projects - is such a terrible example of the problems inherent in
> Conway's Law that it makes me want to push up a proposal to dissolve all
> of the individual project teams and merge all of the repos into a single
> repo.

That's a bit overly dramatic. I think having some innovation is a good
thing. Specifically in this case where our needs appear to be a little
simpler than those of nova.

> 
> Make the oslo libraries Nova and Neutron are using better. Work with the
> Nova and Neutron teams on a consolidated approach. We need to be driving
> more towards an OpenStack that behaves as if it wasn't written by
> warring factions of developers who barely communicate.

I believe we tried to keep with the same expand/migrate/contract
patterns. Sure our implementation differs, but I don't see operators
caring about that as long as it works.

> 
> Even if the idea was one I thought was good technically, the above would
> still trump that. Work with Nova and Neutron. Be more similar.
> 
> PLEASE
> 
> BUT - I also don't think it's a good technical solution. That isn't
> because triggers don't work in MySQL (they do) - but because we've spent
> the last six years explicitly NOT writing raw SQL. We've chosen an
> abstraction layer (SQLAlchemy) which does its job well.
> 
> IF this were going to be accompanied by a corresponding shift in
> approach to not support any backends by MySQL and to start writing our
> database interactions directly in SQL in ALL of our projects - I could
> MAYBE be convinced. Even then I think doing it in triggers is the wrong
> place to put logic.
> 
> "Database triggers are obviously a new challenge for developers to
> write, honestly challenging to debug (being side effects), and are made
> even more difficult by having to hand write triggers for MySQL,
> PostgreSQL, and SQLite independently (SQLAlchemy offers no assistance in
> this case)"
> 
> If you look at:
> 
> https://review.openstack.org/#/c/355618/40/keystone/common/sql/expand_repo/versions/002_add_key_hash_and_encrypted_blob_to_credential.py
> 
> You will see the three different SQL dialects this requires. Not only that,
> but also some of the more esoteric corners of those backends. We can barely get
> _indexes_ right in our database layers ... now we think we're going to
> get triggers right? Consistently? And handle things like Galera?
> 
> The other option is app level, which is what nova and neutron are doing.
> It's a good option, because it puts the logic in python, which is a
> thing we have 2500 developers fairly well versed in. It's also scalable,
> as the things executing whatever the logic is are themselves a scale-out
> set of servers. Finally, it's a known and accepted pattern in large
> scale MySQL shops ... Roll out a new version of the app code which
> understands both the old and the new schema version, then roll out a
> no-downtime additive schema change to the database, then have the app
> layer process and handle on the fly transformation if needed.

I've done both types of migrations in the past, but with one important
exception. We could roll out our application on Tuesday and then the
cleanup on Thursday. We didn't carry baggage for 6 months to a year. My
fear with keystone is that we'd slow development even more by adding
more cruft and cruft on top of cruft.

> 
> SO ...
> 
> Just do what Nova and Neutron are doing - and if it's not good enough,
> fix it. Having some projects use triggers and other projects not use
> triggers is one of the more epically crazypants things I've heard around
> here ... and I lived through the twisted/eventlet argument.

-- 
David Stanek
web: http://dstanek.com
blog: http://traceback.org



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread David Stanek
On Thu, Aug 25 at 13:13 -0400, Steve Martinelli wrote:
> The keystone team is pursuing a trigger-based approach to support rolling,
> zero-downtime upgrades. The proposed operator experience is documented here:
> 
>   http://docs.openstack.org/developer/keystone/upgrading.html
> 

I wanted to mention a few things. One of the reasons I suggested this
approach for keystone is that I've had success in the past using a
combination of triggers and code to do live, online migrations. Many
times using completely different schemas.

In keystone we are just talking about some simple data transformations
between columns and things like that. The triggers themselves shouldn't
get too complicated. If there are cases where triggers won't work, then
we won't force them. (A current example of this is encrypting
credentials.)

The online migrations are not required. Operators can still go the old
route and db_sync while others help test out the cutting edge features.

The triggers are not there during the entire lifecycle of the
application. The expand phase adds them and the contract removes them.
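
To make that concrete, here is a minimal sketch of what an expand-phase
trigger and its contract-phase removal could look like on MySQL (the
user table and the old_name/new_name columns are invented for
illustration; this is not the actual keystone migration):

import sqlalchemy

# Expand phase: mirror writes from the old column into the new one so
# that old and new code can run side by side. A BEFORE UPDATE twin
# would be added the same way.
SYNC_TRIGGER = (
    "CREATE TRIGGER user_insert_sync BEFORE INSERT ON user "
    "FOR EACH ROW SET NEW.new_name = NEW.old_name"
)

def expand(engine):
    with engine.begin() as conn:
        conn.execute(sqlalchemy.text(SYNC_TRIGGER))

def contract(engine):
    # Once no pre-upgrade code remains, the trigger goes away again.
    with engine.begin() as conn:
        conn.execute(sqlalchemy.text("DROP TRIGGER IF EXISTS user_insert_sync"))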

-- 
David Stanek
web: http://dstanek.com
blog: http://traceback.org



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Henry Nash
So as the person who drove the rolling upgrade requirements into keystone in 
this cycle (because we have real customers that need it), and having first 
written the keystone upgrade process to be “versioned object ready” (because I 
assumed we would do this the same as everyone else), and subsequently 
re-written it to be “DB Trigger ready”…and written migration scripts for both 
these cases for the (in fact very minor) DB changes that keystone has in 
Newton…I guess I should also weigh in here :-)

For me, the argument comes down to:

a) Is the pain that needs to be cured by the rolling upgrade requirement broadly 
in the same place in the various projects (i.e. nova, glance, keystone etc.)? 
If it is, then working towards a common solution is always preferable (whatever 
that solution is)
b) I would characterise the difference between the trigger approach, the 
versioned objects approach and the “in-app” approach as: do we want a small 
amount of very nasty complexity, or that complexity spread out so that it is 
not as bad, but over a broader area? Probably fewer people can (successfully) 
write the nasty, complex trigger code than can write, say, the “do it all in 
the app” code. LOC (which, of course, isn’t always a good measure) is also 
reflected in this characterisation, with the trigger code having probably the 
fewest LOC, and the app code having the greatest. 
c) I don’t really follow the argument that somehow the trigger code in 
migrations is less desirable because we use higher level sqla abstractions in 
our main-line code - I’ve always seen migration as different and expected that 
we might have to do strange things there. Further, we should be aware of the 
time-periods involved: the migration cycle is a small % of the elapsed time the 
cloud is running (well, hopefully) - so again, do we solve the “issues of 
migration” as part of the migration cycle (which is what the trigger approach 
does), or make our code (effectively) continually migration aware (using 
versioned objects or in-app code)?
d) The actual process (for an operator) is simpler for a rolling upgrade 
process with Triggers than the alternative (since you don’t require several of 
the checkpoints, e.g. when you know you can move out of compatibility mode 
etc.). Operator error is also a cause of problems in upgrades (especially as 
the complexity of a cloud increases).

From a purely keystone perspective, my gut feeling is that actually the trigger 
approach is likely to lead to a more robust, not less robust, solution - due to 
the fact that we solve the very specific problems of a given migration (i.e. the 
need to keep column A in sync with column B) for a short period of time, right 
at the point of pain, with well established techniques - albeit complex ones 
that need experienced coders in those techniques. I actually prefer the small 
locality of complexity (marked with “there be dragons here, be careful”), as 
opposed to spreading medium pain over a large area, which by definition is 
updated by many…and may do the wrong thing inadvertently. It is simpler for 
operators.

I do recognise, however, the “let’s not do different stuff for a core project 
like keystone” argument as a powerful one. I just don’t know how to square this 
with the fact that although I started in the “versioned objects camp”, having 
worked through many of the issues I have come to believe that the trigger 
approach will be more reliable overall for this specific use case. From the 
other reactions to this thread, I don’t detect a lot of support for the trigger 
approach becoming our overall, cross-project solution.

The actual migrations in Keystone needed for Newton are minor, so one 
possibility is we use keystone as a guinea pig for this approach in Newton…if 
we had to undo this in a subsequent release, we are not talking about rafts of 
migration code to redo.

Henry



> On 1 Sep 2016, at 09:45, Robert Collins  wrote:
> 
> On 31 August 2016 at 01:57, Clint Byrum  wrote:
>> 
>> 
>> It's simple, these are the holy SQL schema commandments:
>> 
>> Don't delete columns, ignore them.
>> Don't change columns, create new ones.
>> When you create a column, give it a default that makes sense.
> 
> I'm sure you're aware of this but I think it's worth clarifying for non
> DBAish folk: non-NULL values can change a DDL statement's execution
> time from O(1) to O(N) depending on the DB in use. E.g. for Postgres
> DDL requires an exclusive table lock, and adding a column with any
> non-NULL value (including constants) requires calculating a new value
> for every row, vs just updating the metadata - see
> https://www.postgresql.org/docs/9.5/static/sql-altertable.html
> """
> When a column is added with ADD COLUMN, all existing rows in the table
> are initialized with the column's default value (NULL if no DEFAULT
> clause is specified). If there is no DEFAULT clause, this is merely a
> metadata change and does not require any immediate update of the
> table's data; the added NULL valu

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-09-01 Thread Robert Collins
On 31 August 2016 at 01:57, Clint Byrum  wrote:
>
>
> It's simple, these are the holy SQL schema commandments:
>
> Don't delete columns, ignore them.
> Don't change columns, create new ones.
> When you create a column, give it a default that makes sense.

I'm sure you're aware of this but I think it's worth clarifying for non
DBAish folk: non-NULL values can change a DDL statement's execution
time from O(1) to O(N) depending on the DB in use. E.g. for Postgres
DDL requires an exclusive table lock, and adding a column with any
non-NULL value (including constants) requires calculating a new value
for every row, vs just updating the metadata - see
https://www.postgresql.org/docs/9.5/static/sql-altertable.html
"""
When a column is added with ADD COLUMN, all existing rows in the table
are initialized with the column's default value (NULL if no DEFAULT
clause is specified). If there is no DEFAULT clause, this is merely a
metadata change and does not require any immediate update of the
table's data; the added NULL values are supplied on readout, instead.
"""

> Do not add new foreign key constraints.

What's the reason for this - if it's to avoid exclusive locks, I'd
note that the other rules above don't avoid exclusive locks - again,
DB specific, and for better or worse we are now testing on multiple DB
engines via 3rd party testing.

https://dev.launchpad.net/Database/LivePatching has some info from our
experience doing online and very fast offline patches in Launchpad.

-Rob



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-31 Thread Monty Taylor
On 08/25/2016 04:14 PM, Sean Dague wrote:
> On 08/25/2016 01:13 PM, Steve Martinelli wrote:
>> The keystone team is pursuing a trigger-based approach to support
>> rolling, zero-downtime upgrades. The proposed operator experience is
>> documented here:
>>
>>   http://docs.openstack.org/developer/keystone/upgrading.html
>>
>> This differs from Nova and Neutron's approaches to solve for rolling
>> upgrades (which use oslo.versionedobjects), however Keystone is one of
>> the few services that doesn't need to manage communication between
>> multiple releases of multiple service components talking over the
>> message bus (which is the original use case for oslo.versionedobjects,
>> and for which it is aptly suited). Keystone simply scales horizontally
>> and every node talks directly to the database.
>>
>> Database triggers are obviously a new challenge for developers to write,
>> honestly challenging to debug (being side effects), and are made even
>> more difficult by having to hand write triggers for MySQL, PostgreSQL,
>> and SQLite independently (SQLAlchemy offers no assistance in this case),
>> as seen in this patch:
>>
>>   https://review.openstack.org/#/c/355618/
>>
>> However, implementing an application-layer solution with
>> oslo.versionedobjects is not an easy task either; refer to Neutron's
>> implementation:
>>
>>
>> https://review.openstack.org/#/q/topic:bp/adopt-oslo-versioned-objects-for-db
>>
>>
>> Our primary concern at this point is how to effectively test the
>> triggers we write against our supported database systems, and their
>> various deployment variations. We might be able to easily drop SQLite
>> support (as it's only supported for our own test suite), but should we
>> expect variation in support and/or actual behavior of triggers across
>> the MySQLs, MariaDBs, Perconas, etc, of the world that would make it
>> necessary to test each of them independently? If you have operational
>> experience working with triggers at scale: are there landmines that we
>> need to be aware of? What is it going to take for us to say we support
>> *zero* downtime upgrades with confidence?
> 
> I would really hold off doing anything triggers related until there was
> sufficient testing for that, especially with potentially dirty data.
> 
> Triggers also really bring in a whole new DSL that people need to learn
> and understand, not just across this boundary, but in the future
> debugging issues. And it means that any errors happening here are now in
> a place outside of normal logging / recovery mechanisms.
> 
> There is a lot of value that in these hard problem spaces like zero down
> uptime we keep to common patterns between projects because there are
> limited folks with the domain knowledge, and splitting that even further
> makes it hard to make this more universal among projects.

I said this the other day in the IRC channel, and I'm going to say it
again here. I'm going to do it as bluntly as I can - please keep in
mind that I respect all of the humans involved.

I think this is a monstrously terrible idea.

There are MANY reasons for this - but I'm going to limit myself to two.

OpenStack is One Project


Nova and Neutron have an approach for this. It may or may not be ideal -
but it exists right now. While it can be satisfying to discount the
existing approach and write a new one, I do not believe that is in the
best interests of OpenStack as a whole. To diverge in _keystone_ - which
is one of the few projects that must exist in every OpenStack install -
when there exists an approach in the two other most commonly deployed
projects - is such a terrible example of the problems inherent in
Conway's Law that it makes me want to push up a proposal to dissolve all
of the individual project teams and merge all of the repos into a single
repo.

Make the oslo libraries Nova and Neutron are using better. Work with the
Nova and Neutron teams on a consolidated approach. We need to be driving
more towards an OpenStack that behaves as if it wasn't written by
warring factions of developers who barely communicate.

Even if the idea was one I thought was good technically, the above would
still trump that. Work with Nova and Neutron. Be more similar.

PLEASE

BUT - I also don't think it's a good technical solution. That isn't
because triggers don't work in MySQL (they do) - but because we've spent
the last six years explicitly NOT writing raw SQL. We've chosen an
abstraction layer (SQLAlchemy) which does its job well.

IF this were going to be accompanied by a corresponding shift in
approach to not support any backends but MySQL and to start writing our
database interactions directly in SQL in ALL of our projects - I could
MAYBE be convinced. Even then I think doing it in triggers is the wrong
place to put logic.

"Database triggers are obviously a new challenge for developers to
write, honestly challenging to debug (being side effects), and are made
even more difficult by having to hand write triggers for MySQL,
PostgreSQL, and SQLite independently (SQLAlchemy offers no assistance
in this case)"

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Mike Bayer



On 08/30/2016 08:04 PM, Clint Byrum wrote:


My direct experience with this was MySQL 5.0 and 5.1. They worked as
documented, and no I don't think they've changed much since then.

When they were actually installed into the schema and up to date with
the code that expected them, and the debugging individual was aware of them, 
things were fine.

However, every other imperative part of the code was asserted with git,
package managers, ansible, puppet, pick your choice of thing that puts
file on disk and restarts daemons. These things all have obvious entry
points too. X is where wsgi starts running code. Y is where flask hands
off to the app, etc. But triggers are special and go in the database at
whatever time they go in. This means you lose all the benefit of all of
the tools you're used to using to debug and operate on imperative code.


to use your phrasing, I'd characterize this as "an unnecessarily bleak 
view" of the use of triggers as a whole.  I've no doubt you worked with 
some horrible trigger code (just as I've worked with some horrible 
application code; I've worked with horrible stored procedure / trigger 
stuff too).


The triggers that have been in play in the current Keystone proposals as 
well as the one we were working with in Neutron were simple one liners 
that essentially act as custom constraints - they check a condition then 
raise an error if it fails.  In particular, MySQL doesn't have support 
for CHECK constraints, so if you want to assert that values going into a 
row have some quality more exotic than "not null", you might have to use 
a trigger to get this effect.
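
As a sketch of that "trigger as custom constraint" shape (the
credential columns here are hypothetical; SIGNAL is how a MySQL trigger
raises an error):

from sqlalchemy import create_engine, text

CHECK_TRIGGER = """
CREATE TRIGGER credential_check BEFORE INSERT ON credential
FOR EACH ROW
BEGIN
    IF NEW.encrypted_blob IS NULL AND NEW.blob IS NULL THEN
        SIGNAL SQLSTATE '45000'
        SET MESSAGE_TEXT = 'either blob or encrypted_blob must be set';
    END IF;
END
"""

engine = create_engine("mysql+pymysql://user:pw@localhost/keystone")  # assumed
with engine.begin() as conn:
    conn.execute(text(CHECK_TRIGGER))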


Clearly, a trigger that is so complex that it is invoking a whole series 
of imperative steps is not a trigger any of us should be considering. 
IMO these are not those triggers.





Of course, you can have books that get their edition 0 updated in book
while you're upgrading. But the editions feature code always treats
that old update as an update to edition 0.  It's still the same object
it always was, your app just makes some assumptions about it. You can
use a union in some cases where you need to see them all for instance,
and just select a literal '0' for the edition column of your union.


I find unions to be very awkward and really subject to poor performance. 
 Of course this can be made to work but I'm sticking to my preference 
for getting the data in the right shape on the write side, not the read 
side.




And one can say "old app is gone" when one knows it's gone. At that point,
one can run a migration that inserts 0 editions into book_edition, and
drops the book table. For OpenStack, we can say "all releases that used
that old schema are EOL, so we can simplify the code now". Our 6 month
pace and short EOL windows are built for this kind of thing.


Assuming we aren't able to use Nova's approach and we're stuck 
translating in the data access layer, we can simplify the code and put 
out a new release, although that "simplified" code now has to be 
"unsimplified" by all the *new* schema changes - code will always be 
carrying along junk to try and adapt it to the previous version of the 
software.   There's no problem if projects in this situation want to do 
it this way and I will gladly support everyone's efforts in going this 
route. However, I still think it's worth looking into approaches that 
can push the interaction between old and new app version into the write 
side instead of the read side, and if that interaction can be removed 
from the primary database access code into a separate layer.


To the degree that teams can just emulate Nova's finessing of the issue 
at the service level, that's even better.   This thread is just in 
response to particular teams who *want* to use triggers for a specific 
problem. Hopefully I will have time to flesh out my alternative 
technique for "application-level translation triggers" and maybe those 
folks might want to try that kind of thing too someday.








Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-08-30 18:15:14 -0400:
> 
> On 08/30/2016 04:43 PM, Clint Byrum wrote:
> >>
> >
> > Correct, it is harder for development. Since the database server has all
> > of the potential for the worst problems, being a stateful service, then
> > I believe moving complexity _out_ of it, is generally an operational
> > win, at the expense of some development effort. The development effort,
> > however, is mostly on the front of the pipeline where timelines can be
> > longer. Operations typically is operating under SLA's and with
> > requirements to move slowly in defense of peoples' data and performance
> > of the system. So I suggest that paying costs in dev, vs. at the
> > database is usually the highest value choice.
> >
> > This is of course not the case if timelines are short for development as
> > well, but I can't really answer the question in that case. For OpenStack,
> > we nearly always find ourselves with more time to develop, than operators
> > do to operate.
> 
> So the idea of triggers is hey, for easy things like column X is now 
> column Y elsewhere, instead of complicating the code, use a trigger to 
> maintain that value.   Your argument against triggers is: "Triggers 
> introduce emergent behaviors and complicate scaling and reasonable 
> debugging in somewhat hidden ways that
> can frustrate even the most experienced DBA."
> 
> I'd wager that triggers probably work a little more smoothly in modern 
> MySQL/Postgresql than a more classical "DBA" platform like a crusty old 
> MS SQL Server or Oracle, but more examples on these emergent behaviors 
> would be useful, as well as evidence that they apply to current versions 
> of database software that are in use within Openstack, and are 
> disruptive enough that even the most clear-cut case for triggers vs. 
> in-application complexity should favor in-app complexity without question.
> 

My direct experience with this was MySQL 5.0 and 5.1. They worked as
documented, and no I don't think they've changed much since then.

When they were actually installed into the schema and up to date with
the code that expected them, and the debugging individual was aware of them, 
things were fine.

However, every other imperative part of the code was asserted with git,
package managers, ansible, puppet, pick your choice of thing that puts
file on disk and restarts daemons. These things all have obvious entry
points too. X is where wsgi starts running code. Y is where flask hands
off to the app, etc. But triggers are special and go in the database at
whatever time they go in. This means you lose all the benefit of all of
the tools you're used to using to debug and operate on imperative code.

> >
> >>> I don't think it's all that ambitious to think we can just use tried and
> >>> tested schema evolution techniques that work for everyone else.
> >>
> >> People have been asking me for over a year how to do this, and I have no
> >> easy answer, I'm glad that you do.  I would like to see some examples of
> >> these techniques.
> >>
> >> If you can show me the SQL access code that deals with the above change,
> >> that would help a lot.
> >>
> >
> > So schema changes fall into several categories. But basically, the only
> > one that is hard, is a relationship change. Basically, a new PK. Here's
> > an example:
> >
> > Book.isbn was the PK, but we want to have a record per edition, so the
> > new primary key is (isbn, edition).
> >
> > Solution: Maintain two tables. You have created an entirely new object!
> >
> > CREATE TABLE book (
> >   isbn varchar(30) not null primary key,
> >   description text,
> > )
> >
> > CREATE TABLE book_editions (
> >   isbn varchar(30) not null,
> >   edition int not null,
> >   description text,
> >   primary key (isbn, edition),
> > )
> >
> > And now on read, your new code has to do this:
> >
> > SELECT b.isbn,
> >COALESCE(be.edition, 0) AS edition,
> >COALESCE(be.description, b.description) AS description
> > FROM book b
> >  LEFT OUTER JOIN book_editions be
> >  ON b.isbn = be.isbn
> > WHERE b.isbn = 'fooisbn'
> >
> > And now, if a book has only ever been written by old code, you get one
> > record with a 0 edition. And if it were written by the new system, the
> > new system would need to go ahead and duplicate the book description into
> > the old table for as long as we have code that might expect it.
> 
> So some pain points here are:
> 
> 1. you really can't ever trust what's in book_editions.description as 
> long as any "old" application is running, since it can put new data into 
> book.description at any time.  You shouldn't bother reading from it at 
> all, just write to it. You won't be able to use it until the next 
> version of the application, e.g. "new" + 1. Or if you support some kind 
> of "old app is gone! " flag that modifies the behavior of "new" app to 
> modify all its queries, which is even more awkward.
> 

Of course, you can have books that get their edition 0 updated in book
while you're upgrading.

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Lance Bragstad
Since the encrypted credential work is currently based on triggers, I spent
most of today documenting a walk-through migration from Mitaka to Newton
[0]. Regardless of the outcome discussed here - figured it would be worth
sharing since it's relevant to the thread. Most of the gist contains stuff
not directly related to the upgrade from Mitaka to Newton, like config
files and install processes. I included them anyway since I started with a
green field deployment on Ubuntu 16.04.

Everything is technically still up for review so if you notice anything
fishy about the implementation via the walkthrough feel free to leave a
comment.

[0]
https://gist.github.com/lbragstad/ddfb10f9f9048414d1f781ba006e95d1#file-migration-md

On Tue, Aug 30, 2016 at 5:15 PM, Mike Bayer  wrote:

>
>
> On 08/30/2016 04:43 PM, Clint Byrum wrote:
>
>>
>>>
>> Correct, it is harder for development. Since the database server has all
>> of the potential for the worst problems, being a stateful service, then
>> I believe moving complexity _out_ of it, is generally an operational
>> win, at the expense of some development effort. The development effort,
>> however, is mostly on the front of the pipeline where timelines can be
>> longer. Operations typically is operating under SLA's and with
>> requirements to move slowly in defense of peoples' data and performance
>> of the system. So I suggest that paying costs in dev, vs. at the
>> database is usually the highest value choice.
>>
>> This is of course not the case if timelines are short for development as
>> well, but I can't really answer the question in that case. For OpenStack,
>> we nearly always find ourselves with more time to develop, than operators
>> do to operate.
>>
>
> So the idea of triggers is hey, for easy things like column X is now
> column Y elsewhere, instead of complicating the code, use a trigger to
> maintain that value.   Your argument against triggers is: "Triggers
> introduce emergent behaviors and complicate scaling and reasonable
> debugging in somewhat hidden ways that
> can frustrate even the most experienced DBA."
>
> I'd wager that triggers probably work a little more smoothly in modern
> MySQL/Postgresql than a more classical "DBA" platform like a crusty old MS
> SQL Server or Oracle, but more examples on these emergent behaviors would
> be useful, as well as evidence that they apply to current versions of
> database software that are in use within Openstack, and are disruptive
> enough that even the most clear-cut case for triggers vs. in-application
> complexity should favor in-app complexity without question.
>
>
>
>
>
>> I don't think it's all that ambitious to think we can just use tried and
>> tested schema evolution techniques that work for everyone else.

>>>
>>> People have been asking me for over a year how to do this, and I have no
>>> easy answer, I'm glad that you do.  I would like to see some examples of
>>> these techniques.
>>>
>>> If you can show me the SQL access code that deals with the above change,
>>> that would help a lot.
>>>
>>>
>> So schema changes fall into several categories. But basically, the only
>> one that is hard, is a relationship change. Basically, a new PK. Here's
>> an example:
>>
>> Book.isbn was the PK, but we want to have a record per edition, so the
>> new primary key is (isbn, edition).
>>
>> Solution: Maintain two tables. You have created an entirely new object!
>>
>> CREATE TABLE book (
>>   isbn varchar(30) not null primary key,
>>   description text,
>> )
>>
>> CREATE TABLE book_editions (
>>   isbn varchar(30) not null,
>>   edition int not null,
>>   description text,
>>   primary key (isbn, edition),
>> )
>>
>> And now on read, your new code has to do this:
>>
>> SELECT b.isbn,
>>COALESCE(be.edition, 0) AS edition,
>>COALESCE(be.description, b.description) AS description
>> FROM book b
>>  LEFT OUTER JOIN book_editions be
>>  ON b.isbn = be.isbn
>> WHERE b.isbn = 'fooisbn'
>>
>> And now, if a book has only ever been written by old code, you get one
>> record with a 0 edition. And if it were written by the new system, the
>> new system would need to go ahead and duplicate the book description into
>> the old table for as long as we have code that might expect it.
>>
>
> So some pain points here are:
>
> 1. you really can't ever trust what's in book_editions.description as long
> as any "old" application is running, since it can put new data into
> book.description at any time.  You shouldn't bother reading from it at all,
> just write to it. You won't be able to use it until the next version of the
> application, e.g. "new" + 1. Or if you support some kind of "old app is
> gone! " flag that modifies the behavior of "new" app to modify all its
> queries, which is even more awkward.
>
> 2. deletes by "old" app of entries in "book" have to be synchronized
> offline by a background script of some kind.  You at least need to run a
> final, authoritative "clean up all the old book deletions" job before you
> go into "old app is gone" mode and the new app begins reading from
> book_editions alone.

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Mike Bayer



On 08/30/2016 04:43 PM, Clint Byrum wrote:




Correct, it is harder for development. Since the database server has all
of the potential for the worst problems, being a stateful service, then
I believe moving complexity _out_ of it, is generally an operational
win, at the expense of some development effort. The development effort,
however, is mostly on the front of the pipeline where timelines can be
longer. Operations typically is operating under SLA's and with
requirements to move slowly in defense of peoples' data and performance
of the system. So I suggest that paying costs in dev, vs. at the
database is usually the highest value choice.

This is of course not the case if timelines are short for development as
well, but I can't really answer the question in that case. For OpenStack,
we nearly always find ourselves with more time to develop, than operators
do to operate.


So the idea of triggers is hey, for easy things like column X is now 
column Y elsewhere, instead of complicating the code, use a trigger to 
maintain that value.   Your argument against triggers is: "Triggers 
introduce emergent behaviors and complicate scaling and reasonable 
debugging in somewhat hidden ways that can frustrate even the most 
experienced DBA."

I'd wager that triggers probably work a little more smoothly in modern 
MySQL/Postgresql than a more classical "DBA" platform like a crusty old 
MS SQL Server or Oracle, but more examples on these emergent behaviors 
would be useful, as well as evidence that they apply to current versions 
of database software that are in use within Openstack, and are 
disruptive enough that even the most clear-cut case for triggers vs. 
in-application complexity should favor in-app complexity without question.







I don't think it's all that ambitious to think we can just use tried and
tested schema evolution techniques that work for everyone else.


People have been asking me for over a year how to do this, and I have no
easy answer, I'm glad that you do.  I would like to see some examples of
these techniques.

If you can show me the SQL access code that deals with the above change,
that would help a lot.



So schema changes fall into several categories. But basically, the only
one that is hard, is a relationship change. Basically, a new PK. Here's
an example:

Book.isbn was the PK, but we want to have a record per edition, so the
new primary key is (isbn, edition).

Solution: Maintain two tables. You have created an entirely new object!

CREATE TABLE book (
  isbn varchar(30) not null primary key,
  description text,
)

CREATE TABLE book_editions (
  isbn varchar(30) not null,
  edition int not null,
  description text,
  primary key (isbn, edition),
)

And now on read, your new code has to do this:

SELECT b.isbn,
   COALESCE(be.edition, 0) AS edition,
   COALESCE(be.description, b.description) AS description
FROM book b
 LEFT OUTER JOIN book_editions be
 ON b.isbn = be.isbn
WHERE b.isbn = 'fooisbn'

And now, if a book has only ever been written by old code, you get one
record with a 0 edition. And if it were written by the new system, the
new system would need to go ahead and duplicate the book description into
the old table for as long as we have code that might expect it.


So some pain points here are:

1. you really can't ever trust what's in book_editions.description as 
long as any "old" application is running, since it can put new data into 
book.description at any time.  You shouldn't bother reading from it at 
all, just write to it. You won't be able to use it until the next 
version of the application, e.g. "new" + 1. Or if you support some kind 
of "old app is gone! " flag that modifies the behavior of "new" app to 
modify all its queries, which is even more awkward.


2. deletes by "old" app of entries in "book" have to be synchronized 
offline by a background script of some kind.  You at least need to run a 
final, authoritative "clean up all the old book deletions" job before 
you go into "old app is gone" mode and the new app begins reading from 
book_editions alone.


3. LEFT OUTER JOINs can be a major performance hit.   You can't turn it 
off here until you go to version "new + 1" (bad performance locked in 
for a whole release cycle) or your app has a "turn off old app mode" 
flag (basically you have to write two different database access layers).


Contrast to the trigger approach, which removes all the SELECT pain and 
moves it all to writes:


1. new application has no code whatsoever referring to old application

2. no performance hit on SELECT

3. no "wait til version "new+1"" and/or "old app is gone" switch

If we have evidence that triggers are always, definitely, universally 
going to make even this extremely simple use case non-feasible, great, 
let's measure and test for that.   But in a case like this they look 
very attractive and I'd hate to just dispense with them unilaterally 
without a case-by-case examination.
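
For the same book example, a sketch of the write-side trigger (MySQL
syntax; a matching AFTER INSERT trigger would be created the same way):

from sqlalchemy import create_engine, text

MIRROR_TRIGGER = (
    "CREATE TRIGGER book_mirror AFTER UPDATE ON book FOR EACH ROW "
    "INSERT INTO book_editions (isbn, edition, description) "
    "VALUES (NEW.isbn, 0, NEW.description) "
    "ON DUPLICATE KEY UPDATE description = NEW.description"
)

engine = create_engine("mysql+pymysql://user:pw@localhost/demo")  # assumed
with engine.begin() as conn:
    # Old-app writes to book now land in book_editions as edition 0,
    # so the new app's SELECT never needs the LEFT OUTER JOIN.
    conn.execute(text(MIRROR_TRIGGER))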


As I wrote this, I

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-08-30 14:56:15 -0400:
> 
> On 08/30/2016 09:57 AM, Clint Byrum wrote:
> >>
> >
> > As someone else brought up, this is an unnecessarily bleak view of how 
> > database
> > migrations work.
> 
> We aren't talking about database migrations.  We are talking about 
> *online* database migrations, where we would like both the *old* and 
> *new* versions of the code, talking to the database at the same time.
> 
> 
> If I write code that does this:
> 
> 
>  SELECT foo, bar FROM table
> 
> then I do a migration that replaces "bar" with some new table, the new 
> SQL is:
> 
>  SELECT table.foo, othertable.bar FROM table JOIN othertable ON 
> table.id = othertable.foo_id
> 
> Those two SQL statements are incompatible.  The "new" version of the 
> code must expect and maintain the old "bar" column for the benefit of 
> the "old" version of the code still reading and writing to it.   To me, 
> this seems to contradict your suggestion "don't delete columns, ignore 
> them".  We can't ignore "bar" above.
> 

It's hard to think about what you're saying without concrete examples,
but I'll try.

As I said, don't remove columns, ignore them. Of course, you can't ignore
them on writes, they still exist. If you have a new relationship for that
data, then yes, you still have to write to the old columns and tables so
that older versions of the code can find the data it needs.

Your join needs to be a left join, so that you get the data from the old
table when it was written by old code.
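
Concretely, the compatibility read for the earlier example might look
like this sketch (names follow the example; t aliases the "table" table):

from sqlalchemy import create_engine, text

COMPAT_READ = text("""
    SELECT t.foo,
           COALESCE(othertable.bar, t.bar) AS bar
      FROM `table` t
           LEFT OUTER JOIN othertable
                ON t.id = othertable.foo_id
""")

engine = create_engine("mysql+pymysql://user:pw@localhost/demo")  # assumed
with engine.connect() as conn:
    for row in conn.execute(COMPAT_READ):
        # Rows written by old code have no othertable entry, so the old
        # column's value is used as the fallback.
        print(row.foo, row.bar)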

> >
> > Following these commandments, one can run schema changes at any time. A
> > new schema should be completely ignorable by older code, because their
> > columns keep working, and no new requirements are introduced. New code
> > can deal with defaulted new columns gracefully.
> 
> You need to specify how new code deals with the above two totally 
> different SQL statements "gracefully", except that it has to accommodate 
> for both versions of the schema at the same time.   This may be 
> "graceful" in operator land but in developer land, there is no easy 
> solution for this.  Unless there is, and nobody has shown it to me yet:
> 

Correct, it is harder for development. Since the database server has all
of the potential for the worst problems, being a stateful service, then
I believe moving complexity _out_ of it, is generally an operational
win, at the expense of some development effort. The development effort,
however, is mostly on the front of the pipeline where timelines can be
longer. Operations typically is operating under SLA's and with
requirements to move slowly in defense of peoples' data and performance
of the system. So I suggest that paying costs in dev, vs. at the
database is usually the highest value choice.

This is of course not the case if timelines are short for development as
well, but I can't really answer the question in that case. For OpenStack,
we nearly always find ourselves with more time to develop, than operators
do to operate.

> > I don't think it's all that ambitious to think we can just use tried and
> > tested schema evolution techniques that work for everyone else.
> 
> People have been asking me for over a year how to do this, and I have no 
> easy answer, I'm glad that you do.  I would like to see some examples of 
> these techniques.
> 
> If you can show me the SQL access code that deals with the above change, 
> that would help a lot.
> 

So schema changes fall into several categories. But basically, the only
one that is hard, is a relationship change. Basically, a new PK. Here's
an example:

Book.isbn was the PK, but we want to have a record per edition, so the
new primary key is (isbn, edition).

Solution: Maintain two tables. You have created an entirely new object!

CREATE TABLE book (
  isbn varchar(30) not null primary key,
  description text,
)

CREATE TABLE book_editions (
  isbn varchar(30) not null,
  edition int not null,
  description text,
  primary key (isbn, edition),
)

And now on read, your new code has to do this:

SELECT b.isbn,
   COALESCE(be.edition, 0) AS edition,
   COALESCE(be.description, b.description) AS description
FROM book b
 LEFT OUTER JOIN book_editions be
 ON b.isbn = be.isbn
WHERE b.isbn = 'fooisbn'

And now, if a book has only ever been written by old code, you get one
record with a 0 edition. And if it were written by the new system, the
new system would need to go ahead and duplicate the book description into
the old table for as long as we have code that might expect it.
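
A sketch of that dual write from the new code's side (connection URL
assumed; one transaction covers both tables):

from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pw@localhost/demo")  # assumed

def save_book_edition(isbn, edition, description):
    with engine.begin() as conn:
        conn.execute(
            text("INSERT INTO book_editions (isbn, edition, description) "
                 "VALUES (:isbn, :edition, :description)"),
            {"isbn": isbn, "edition": edition, "description": description},
        )
        # Duplicate the description into the legacy table so that any
        # still-running old code keeps seeing current data.
        conn.execute(
            text("UPDATE book SET description = :description "
                 "WHERE isbn = :isbn"),
            {"isbn": isbn, "description": description},
        )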

Most other things are simpler and have quite obvious solutions.

> If the answer is, "oh well just don't do a schema change like that", 
> then we're basically saying we aren't really changing our schemas 
> anymore except for totally new features that otherwise aren't accessed 
> by the older version of the code.  That's fine.   It's not what people 
> coming to me are saying, though.
> 

I mean, yes and no. We should pay some r

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Dan Smith
>> I don't think it's all that ambitious to think we can just use
>> tried and tested schema evolution techniques that work for everyone
>> else.
> 
> People have been asking me for over a year how to do this, and I have
> no easy answer, I'm glad that you do.  I would like to see some
> examples of these techniques.

I'm not sure how to point you at the examples we have today because
they're not on a single line (or set of lines) in a single file. Nova
has moved a lot of data around at runtime using this approach in the
last year or so with good success.
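
The core of the pattern is small; a minimal oslo.versionedobjects
sketch (the Book object and its fields are invented for illustration):

from oslo_versionedobjects import base, fields

@base.VersionedObjectRegistry.register
class Book(base.VersionedObject):
    # 1.0: isbn, description; 1.1: added edition
    VERSION = '1.1'

    fields = {
        'isbn': fields.StringField(),
        'description': fields.StringField(),
        'edition': fields.IntegerField(default=0),
    }

    def obj_make_compatible(self, primitive, target_version):
        # Back-level the serialized form for peers still running 1.0.
        super(Book, self).obj_make_compatible(primitive, target_version)
        version = tuple(int(p) for p in target_version.split('.'))
        if version < (1, 1):
            primitive.pop('edition', None)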

> If you can show me the SQL access code that deals with the above
> change, that would help a lot.

We can't show you that, because as you said, there isn't a way to do
it...in SQL. That is in fact the point though: don't do it in SQL.

> If the answer is, "oh well just don't do a schema change like that", 
> then we're basically saying we aren't really changing our schemas 
> anymore except for totally new features that otherwise aren't
> accessed by the older version of the code.

We _are_ saying "don't change schema like that", but it's not a very
limiting requirement. It means you can't move things in a schema
migration, but that's all. Nova changes schema all the time.

In the last year or so, off the top of my head, nova has:

1. Moved instance flavors from row=value metadata storage to a JSON
   blob in another table
2. Moved core flavors, aggregates, keypairs and other structures from
   the cell database to the api database
3. Added uuid to aggregates
4. Added a parent_addr linkage in PCI device

...all online. Those are just the ones I have in my head that have
required actual data migrations. We've had dozens of schema changes that
enable new features that are all just new data and don't require any of
this.
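
Shape-wise, those online data migrations are batched
read-transform-write loops that run while the services stay up. A
sketch (table and column names invented):

from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:pw@localhost/nova")  # assumed

def transform(old_value):
    return old_value.upper()  # stand-in for the real conversion

def migrate_batch(limit=50):
    """Migrate up to `limit` unmigrated rows; return how many were done."""
    with engine.begin() as conn:
        rows = conn.execute(
            text("SELECT id, old_value FROM widget "
                 "WHERE new_value IS NULL LIMIT :limit"),
            {"limit": limit},
        ).fetchall()
        for row in rows:
            conn.execute(
                text("UPDATE widget SET new_value = :v WHERE id = :id"),
                {"v": transform(row.old_value), "id": row.id},
            )
    return len(rows)

while migrate_batch():
    pass  # keep going until no unmigrated rows remain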

> That's fine.   It's not what people coming to me are saying, though.

Not sure who is coming to you or what they're saying, but.. okay :)

If keystone really wants to use triggers to do this, then that's fine.
But I think the overwhelming response from this thread (which is asking
people's opinions on the matter) seems to be that they're an unnecessary
complication that will impede people debugging and working on that part
of the code base. We have such impediments elsewhere, but I think we
generally try to avoid doing one thing a hundred different ways to keep
the playing field as level as possible.

--Dan



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Mike Bayer



On 08/30/2016 09:57 AM, Clint Byrum wrote:




As someone else brought up, this is an unnecessarily bleak view of how database
migrations work.


We aren't talking about database migrations.  We are talking about 
*online* database migrations, where we would like both the *old* and 
*new* versions of the code, talking to the database at the same time.



If I write code that does this:


SELECT foo, bar FROM table

then I do a migration that replaces "bar" with some new table, the new 
SQL is:


SELECT table.foo, othertable.bar FROM table JOIN othertable ON 
table.id = othertable.foo_id


Those two SQL statements are incompatible.  The "new" version of the 
code must expect and maintain the old "bar" column for the benefit of 
the "old" version of the code still reading and writing to it.   To me, 
this seems to contradict your suggestion "don't delete columns, ignore 
them".  We can't ignore "bar" above.





Following these commandments, one can run schema changes at any time. A
new schema should be completely ignorable by older code, because their
columns keep working, and no new requirements are introduced. New code
can deal with defaulted new columns gracefully.


You need to specify how new code deals with the above two totally 
different SQL statements "gracefully", except that it has to accommodate 
for both versions of the schema at the same time.   This may be 
"graceful" in operator land but in developer land, there is no easy 
solution for this.  Unless there is, and nobody has shown it to me yet:




I don't think it's all that ambitious to think we can just use tried and
tested schema evolution techniques that work for everyone else.


People have been asking me for over a year how to do this, and I have no 
easy answer, I'm glad that you do.  I would like to see some examples of 
these techniques.


If you can show me the SQL access code that deals with the above change, 
that would help a lot.


If the answer is, "oh well just don't do a schema change like that", 
then we're basically saying we aren't really changing our schemas 
anymore except for totally new features that otherwise aren't accessed 
by the older version of the code.  That's fine.   It's not what people 
coming to me are saying, though.





Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Dan Smith
>> Even in the case of projects using versioned objects, it still
>> means a SQL layer has to include functionality for both versions of
>> a particular schema change which itself is awkward.

That's not true. Nova doesn't have multiple models to straddle a
particular change. We just...

> It's simple, these are the holy SQL schema commandments:
> 
> Don't delete columns, ignore them.
> Don't change columns, create new ones.
> When you create a column, give it a default that makes sense.
> Do not add new foreign key constraints.

...do this ^ :)

We can drop columns once they're long-since-unused, but we still don't
need duplicate models for that.

--Dan



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Clint Byrum
Excerpts from Mike Bayer's message of 2016-08-26 11:50:24 -0400:
> 
> On 08/25/2016 01:13 PM, Steve Martinelli wrote:
> > The keystone team is pursuing a trigger-based approach to support
> > rolling, zero-downtime upgrades. The proposed operator experience is
> > documented here:
> >
> >   http://docs.openstack.org/developer/keystone/upgrading.html
> >
> > This differs from Nova and Neutron's approaches to solve for rolling
> > upgrades (which use oslo.versionedobjects), however Keystone is one of
> > the few services that doesn't need to manage communication between
> > multiple releases of multiple service components talking over the
> > message bus (which is the original use case for oslo.versionedobjects,
> > and for which it is aptly suited). Keystone simply scales horizontally
> > and every node talks directly to the database.
> 
> 
> Hi Steve -
> 
> I'm a strong proponent of looking into the use of triggers to smooth 
> upgrades between database versions.Even in the case of projects 
> using versioned objects, it still means a SQL layer has to include 
> functionality for both versions of a particular schema change which 
> itself is awkward.   I'm also still a little worried that not every case 
> of this can be handled by orchestration at the API level, and not as a 
> single SQL layer method that integrates both versions of a schema change.
> 

Speaking as an operator, I'd rather have awkwardness happen in safe, warm
development, rather than in the cold, dirty, broken world of operations.

Speaking as a former DBA: Triggers introduce emergent behaviors and
complicate scaling and reasonable debugging in somewhat hidden ways that
can frustrate even the most experienced DBA. We've discussed FK's before,
and how they are a 1:1 trade-off of integrity vs. performance, and thus
deserve more scrutiny than they're typically given. Well IMO, triggers are
a 1:10 trade-off between development complexity and debugging complexity.

Speaking as a developer: Every case can in fact be handled simply and
in code without the database's help if we're willing to accept a small
level of imperfection and redundancy.

> Using triggers would resolve the issue of SQL-specific application code 
> needing to refer to two versions of a schema at once, at least for those 
> areas where triggers and SPs can handle it.   In the "ideal", it means 
> all the Python code can just refer to one version of a schema, and nuts 
> and bolts embedded into database migrations would handle all the 
> movement between schema versions, including the phase between expand and 
> contract.   Not that I think the "ideal" is ever going to be realized 
> 100%, but maybe in some / many places, this can work.
> 

As someone else brought up, this is an unnecessarily bleak view of how database
migrations work.

It's simple, these are the holy SQL schema commandments:

Don't delete columns, ignore them.
Don't change columns, create new ones.
When you create a column, give it a default that makes sense.
Do not add new foreign key constraints.

Following these commandments, one can run schema changes at any time. A
new schema should be completely ignorable by older code, because their
columns keep working, and no new requirements are introduced. New code
can deal with defaulted new columns gracefully.

Of course, once one can be certain that all app code is updated, one can
drop old columns and tables, and add FK constraints (if you so desire,
I personally think they're a waste of precious DB resources, but that
is a much more religious debate and I accept that it's not part of
this debate).

> So if Keystone wants to be involved in paving the way for working with 
> triggers, IMO this would benefit other projects in that they could 
> leverage this kind of functionality in those places where it makes sense.
> 
> The problem of "zero downtime database migrations" is an incredibly 
> ambitious goal and I think it would be wrong to exclude any one 
> particular technique in pursuing this.  A real-world success story would 
> likely integrate many different techniques as they apply to specific 
> scenarios, and triggers and SPs IMO are a really major one which I 
> believe can be supported.
> 

I don't think it's all that ambitious to think we can just use tried and
tested schema evolution techniques that work for everyone else.

> >
> > Database triggers are obviously a new challenge for developers to write,
> > honestly challenging to debug (being side effects), and are made even
> > more difficult by having to hand write triggers for MySQL, PostgreSQL,
> > and SQLite independently (SQLAlchemy offers no assistance in this case),
> > as seen in this patch:
> 
> So I would also note that we've been working on the availability of 
> triggers and stored functions elsewhere, a very raw patch that is to be 
> largely rolled into oslo.db is here:
> 
> https://review.openstack.org/#/c/314054/
> 
> This patch makes use of an Alembic pattern called "replaceable object",
> which is intended specifically as a means of versioning things like
> triggers and stored procedures.

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-30 Thread Flavio Percoco

On 25/08/16 13:13 -0400, Steve Martinelli wrote:

The keystone team is pursuing a trigger-based approach to support rolling,
zero-downtime upgrades. The proposed operator experience is documented here:

 http://docs.openstack.org/developer/keystone/upgrading.html

This differs from Nova and Neutron's approaches to solve for rolling
upgrades (which use oslo.versionedobjects), however Keystone is one of the
few services that doesn't need to manage communication between multiple
releases of multiple service components talking over the message bus (which
is the original use case for oslo.versionedobjects, and for which it is
aptly suited). Keystone simply scales horizontally and every node talks
directly to the database.

Database triggers are obviously a new challenge for developers to write,
honestly challenging to debug (being side effects), and are made even more
difficult by having to hand write triggers for MySQL, PostgreSQL, and
SQLite independently (SQLAlchemy offers no assistance in this case), as
seen in this patch:

 https://review.openstack.org/#/c/355618/

However, implementing an application-layer solution with
oslo.versionedobjects is not an easy task either; refer to Neutron's
implementation:


https://review.openstack.org/#/q/topic:bp/adopt-oslo-versioned-objects-for-db

Our primary concern at this point is how to effectively test the triggers
we write against our supported database systems, and their various
deployment variations. We might be able to easily drop SQLite support (as
it's only supported for our own test suite), but should we expect variation
in support and/or actual behavior of triggers across the MySQLs, MariaDBs,
Perconas, etc, of the world that would make it necessary to test each of
them independently? If you have operational experience working with
triggers at scale: are there landmines that we need to be aware of? What is
it going to take for us to say we support *zero* downtime upgrades with
confidence?


Hey Steve, Dolph,

Thanks for sending this out. There's been some discussions in the Glance
community about how we can implement rolling upgrades and it seems like Glance's
case is very similar to keystone's.

I'll make sure folks in the glance community are aware of this thread and reach
out.

Flavio

--
@flaper87
Flavio Percoco




Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-26 Thread Mike Bayer



On 08/25/2016 01:13 PM, Steve Martinelli wrote:

The keystone team is pursuing a trigger-based approach to support
rolling, zero-downtime upgrades. The proposed operator experience is
documented here:

  http://docs.openstack.org/developer/keystone/upgrading.html

This differs from Nova and Neutron's approaches to solve for rolling
upgrades (which use oslo.versionedobjects), however Keystone is one of
the few services that doesn't need to manage communication between
multiple releases of multiple service components talking over the
message bus (which is the original use case for oslo.versionedobjects,
and for which it is aptly suited). Keystone simply scales horizontally
and every node talks directly to the database.



Hi Steve -

I'm a strong proponent of looking into the use of triggers to smooth 
upgrades between database versions.Even in the case of projects 
using versioned objects, it still means a SQL layer has to include 
functionality for both versions of a particular schema change which 
itself is awkward.   I'm also still a little worried that not every case 
of this can be handled by orchestration at the API level, and not as a 
single SQL layer method that integrates both versions of a schema change.


Using triggers would resolve the issue of SQL-specific application code 
needing to refer to two versions of a schema at once, at least for those 
areas where triggers and SPs can handle it.   In the "ideal", it means 
all the Python code can just refer to one version of a schema, and nuts 
and bolts embedded into database migrations would handle all the 
movement between schema versions, including the phase between expand and 
contract.   Not that I think the "ideal" is ever going to be realized 
100%, but maybe in some / many places, this can work.


So if Keystone wants to be involved in paving the way for working with 
triggers, IMO this would benefit other projects in that they could 
leverage this kind of functionality in those places where it makes sense.


The goal of "zero downtime database migrations" is incredibly ambitious, 
and I think it would be wrong to exclude any one particular technique in 
pursuing it. A real-world success story would likely integrate many 
different techniques as they apply to specific scenarios, and triggers and 
SPs are, IMO, a really major one which I believe can be supported.





Database triggers are obviously a new challenge for developers to write,
honestly challenging to debug (being side effects), and are made even
more difficult by having to hand write triggers for MySQL, PostgreSQL,
and SQLite independently (SQLAlchemy offers no assistance in this case),
as seen in this patch:

  https://review.openstack.org/#/c/355618/

So I would also note that we've been working on the availability of 
triggers and stored functions elsewhere; a very raw patch, much of which 
is to be rolled into oslo.db, is here:


https://review.openstack.org/#/c/314054/

This patch makes use of an Alembic pattern called "replaceable object", 
which is intended specifically as a means of versioning things like 
triggers and stored procedures:


http://alembic.zzzcomputing.com/en/latest/cookbook.html#replaceable-objects
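
The cookbook pattern is worth sketching, since it is the piece that lets a
trigger live in the migration layer as versioned data. ReplaceableObject
below is the cookbook's plain value holder; create_trigger() and
drop_trigger() stand in for the custom Alembic operations the cookbook
shows how to register via Operations.register_operation -- they are not a
built-in Alembic API, so treat this as a sketch of the shape, not a
drop-in implementation.

    from alembic import op


    class ReplaceableObject(object):
        """Names a trigger and carries its DDL body as data."""

        def __init__(self, name, sqltext):
            self.name = name
            self.sqltext = sqltext


    # Defined once, in the migration layer, never in the model:
    sync_trigger = ReplaceableObject(
        "credential_insert_sync",
        "BEFORE INSERT ON credential FOR EACH ROW "
        "SET NEW.encrypted_blob = COALESCE(NEW.encrypted_blob, NEW.blob)",
    )


    def upgrade():
        # Emits CREATE TRIGGER <name> <sqltext>; a later migration can
        # issue a replace (drop + create) against the same named object.
        op.create_trigger(sync_trigger)


    def downgrade():
        op.drop_trigger(sync_trigger)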

Within the above Neutron patch, one thing I want to move towards is that 
things like triggers and SPs would only need to be specified once, in the 
migration layer, and not within the model. To achieve this, tests that 
work against MySQL and PostgreSQL would need to ensure that the test 
schema is built up using migrations, and not create_all(). This is already 
the case in some places and not in others. There is work ongoing in 
oslo.db to provide a modernized fixture system that supports enginefacade 
cleanly and allows migrations to be used efficiently (read: once per many 
tests) for all MySQL/PostgreSQL test suites, at 
https://review.openstack.org/#/c/351411/ .
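
As a rough illustration of the "once per many tests" idea, independent of
the oslo.db fixture work above (whose actual API may differ), a plain
session-scoped pytest fixture can build the schema via migrations exactly
once for a whole run; the alembic.ini path here is hypothetical.

    import pytest
    from alembic import command
    from alembic.config import Config


    @pytest.fixture(scope="session")
    def migrated_db():
        # Build the test schema through the real migration chain, not
        # metadata.create_all(), so triggers/SPs defined in migrations
        # are actually exercised.
        cfg = Config("alembic.ini")
        command.upgrade(cfg, "head")
        yield
        command.downgrade(cfg, "base")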


As far as SQLite goes, I have a simple opinion: migrations, triggers, and 
SPs should not be anywhere near a SQLite database. SQLite should be used 
strictly for simple model unit tests, with the schema created using 
create_all(), and that's it. The test fixture system accommodates this as 
well.




Our primary concern at this point is how to effectively test the
triggers we write against our supported database systems, and their
various deployment variations. We might be able to easily drop SQLite
support (as it's only supported for our own test suite), but should we
expect variation in support and/or actual behavior of triggers across
the MySQLs, MariaDBs, Perconas, etc., of the world that would make it
necessary to test each of them independently? If you have operational
experience working with triggers at scale: are there landmines that we
need to be aware of? What is it going to take for us to say we support
*zero* downtime upgrades with confidence?


*zero* downtime is an extremely difficult goal.   I appreciate that 
people are generally nervous about making more use of relatio

Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread Sean Dague

On 08/25/2016 01:13 PM, Steve Martinelli wrote:

The keystone team is pursuing a trigger-based approach to support
rolling, zero-downtime upgrades. The proposed operator experience is
documented here:

  http://docs.openstack.org/developer/keystone/upgrading.html

This differs from Nova and Neutron's approaches to solve for rolling
upgrades (which use oslo.versionedobjects), however Keystone is one of
the few services that doesn't need to manage communication between
multiple releases of multiple service components talking over the
message bus (which is the original use case for oslo.versionedobjects,
and for which it is aptly suited). Keystone simply scales horizontally
and every node talks directly to the database.

Database triggers are obviously a new challenge for developers to write,
honestly challenging to debug (being side effects), and are made even
more difficult by having to hand write triggers for MySQL, PostgreSQL,
and SQLite independently (SQLAlchemy offers no assistance in this case),
as seen in this patch:

  https://review.openstack.org/#/c/355618/

However, implementing an application-layer solution with
oslo.versionedobjects is not an easy task either; refer to Neutron's
implementation:


https://review.openstack.org/#/q/topic:bp/adopt-oslo-versioned-objects-for-db

Our primary concern at this point is how to effectively test the
triggers we write against our supported database systems, and their
various deployment variations. We might be able to easily drop SQLite
support (as it's only supported for our own test suite), but should we
expect variation in support and/or actual behavior of triggers across
the MySQLs, MariaDBs, Perconas, etc., of the world that would make it
necessary to test each of them independently? If you have operational
experience working with triggers at scale: are there landmines that we
need to be aware of? What is it going to take for us to say we support
*zero* downtime upgrades with confidence?


I would really hold off on doing anything trigger-related until there is 
sufficient testing for it, especially with potentially dirty data.


Triggers also bring in a whole new DSL that people need to learn and 
understand, not just across this boundary, but when debugging issues in 
the future. And it means that any errors happening there are now in a 
place outside of our normal logging / recovery mechanisms.


There is a lot of value, in hard problem spaces like zero-downtime 
upgrades, in keeping to common patterns between projects, because there 
are limited folks with the domain knowledge; splitting that knowledge even 
further makes it hard to make this universal among projects.


-Sean

--
Sean Dague
http://dague.net



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread gordon chung


On 25/08/16 01:13 PM, Steve Martinelli wrote:
The keystone team is pursuing a trigger-based approach to support rolling, 
zero-downtime upgrades. The proposed operator experience is documented here:

  http://docs.openstack.org/developer/keystone/upgrading.html

This differs from Nova and Neutron's approaches to solve for rolling upgrades 
(which use oslo.versionedobjects), however Keystone is one of the few services 
that doesn't need to manage communication between multiple releases of multiple 
service components talking over the message bus (which is the original use case 
for oslo.versionedobjects, and for which it is aptly suited). Keystone simply 
scales horizontally and every node talks directly to the database.


Just curious, but does Keystone have any IPC, or is it still just a single 
service interacting with the db? If the latter, you should be able to just 
apply migrations with no downtime as long as you don't modify/delete 
existing columns, as sketched below. Similar experience as others: I haven't 
really used stored procedures in a while, but they're a pain with respect to 
portability. Considering OpenStack has a habit of supporting every driver 
under the sun, I'm guessing driver-specific solutions will get more 
difficult over time.
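
For example, an additive-only migration of that kind (table and column
names invented) can be as small as the following sketch; nothing existing
is modified or dropped, so old and new code run against the same schema:

    from alembic import op
    import sqlalchemy as sa


    def upgrade():
        # Nullable, so old-release nodes that never reference the new
        # column are unaffected.
        op.add_column('project',
                      sa.Column('description_v2', sa.Text(),
                                nullable=True))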

cheers,


--
gord


Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread Dan Smith
> This differs from Nova and Neutron's approaches to solve for rolling
> upgrades (which use oslo.versionedobjects), however Keystone is one of
> the few services that doesn't need to manage communication between
> multiple releases of multiple service components talking over the
> message bus (which is the original use case for oslo.versionedobjects,
> and for which it is aptly suited). Keystone simply scales horizontally
> and every node talks directly to the database.

Yeah, o.vo gives you nothing really if all you want is a facade behind
which to hide the application-level migrations. That doesn't mean it
would be a bad thing to use, but maybe overkill vs. just writing a
couple wrappers.
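
As a rough sketch of that wrapper idea (column names hypothetical): a
single read facade can hide which schema version a given row was written
under, with no o.vo machinery involved.

    def read_blob(row):
        """Return the payload regardless of which release wrote the row."""
        # New-release nodes populate encrypted_blob; rows written by
        # old-release nodes may only have the legacy blob column set.
        encrypted = getattr(row, 'encrypted_blob', None)
        return encrypted if encrypted is not None else row.blob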

> Database triggers are obviously a new challenge for developers to write,
> honestly challenging to debug (being side effects), and are made even
> more difficult by having to hand write triggers for MySQL, PostgreSQL,
> and SQLite independently (SQLAlchemy offers no assistance in this case),
> as seen in this patch:
> 
>   https://review.openstack.org/#/c/355618/
> 
> However, implementing an application-layer solution with
> oslo.versionedobjects is not an easy task either; refer to Neutron's
> implementation:

Yeah, it's not trivial at the application level either, but at least it
is in Python and write-once for any kind of compatible backend. My
(extremely limited) experience with stored procedures is that they are
very difficult to get right, even for an expert in the technology, which
almost none of us are. Application-level migrations are significantly
simpler and exist closer to the domain of the rest of the code for a
specific new feature.

I will offer one bit of anecdotal information that may be relevant:
Several of the migrations that nova has done in the past have required
things like parsing/generating JSON, and making other API calls to look
up information needed to translate from one format to another. That
would (AFAIK) be quite difficult to do in the database itself, and may
mean you end up with a combination of both approaches in the long run.
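
A sketch of that kind of translation in Python (field names invented) --
trivial here, but genuinely painful to express in SQL or a stored
procedure:

    import json


    def translate_flavor_info(old_json):
        """Translate one row's JSON payload from the old format to the new."""
        old = json.loads(old_json)
        new = {"flavor": {"id": old["instance_type"]["flavorid"],
                          "name": old["instance_type"]["name"]}}
        return json.dumps(new)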

I don't think that Keystone necessarily needs to adopt the same approach
as the other projects (especially in the absence of things like
cross-version RPC compatibility) and so if stored procedures are really
the best fit then that's cool. They will themselves be a landmine in
front of me should I ever have to debug such a problem, but if they are
significantly better for the most part then so be it.

--Dan



Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread Matt Fischer
On Thu, Aug 25, 2016 at 1:13 PM, Steve Martinelli wrote:

> The keystone team is pursuing a trigger-based approach to support rolling,
> zero-downtime upgrades. The proposed operator experience is documented here:
>
>   http://docs.openstack.org/developer/keystone/upgrading.html
>
> This differs from Nova and Neutron's approaches to solve for rolling
> upgrades (which use oslo.versionedobjects), however Keystone is one of the
> few services that doesn't need to manage communication between multiple
> releases of multiple service components talking over the message bus (which
> is the original use case for oslo.versionedobjects, and for which it is
> aptly suited). Keystone simply scales horizontally and every node talks
> directly to the database.
>
> Database triggers are obviously a new challenge for developers to write,
> honestly challenging to debug (being side effects), and are made even more
> difficult by having to hand write triggers for MySQL, PostgreSQL, and
> SQLite independently (SQLAlchemy offers no assistance in this case), as
> seen in this patch:
>
>   https://review.openstack.org/#/c/355618/
>
> However, implementing an application-layer solution with
> oslo.versionedobjects is not an easy task either; refer to Neutron's
> implementation:
>
>   https://review.openstack.org/#/q/topic:bp/adopt-oslo-
> versioned-objects-for-db
>
> Our primary concern at this point is how to effectively test the triggers
> we write against our supported database systems, and their various
> deployment variations. We might be able to easily drop SQLite support (as
> it's only supported for our own test suite), but should we expect variation
> in support and/or actual behavior of triggers across the MySQLs, MariaDBs,
> Perconas, etc., of the world that would make it necessary to test each of
> them independently? If you have operational experience working with
> triggers at scale: are there landmines that we need to be aware of? What is
> it going to take for us to say we support *zero* downtime upgrades with
> confidence?
>
> Steve & Dolph
>
>

No experience to add for triggers, but I'm happy to help test this on a
MySQL Galera cluster. I'd also like to add thanks for looking into this. A
Keystone outage is a cloud outage, and being able to eliminate outages
from upgrades will benefit everyone.


Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread Sean M. Collins
Personally I had very bad experiences with stored procedures and
triggers in previous jobs, where the number of side effects that
occurred and the overall lack of maintainability of triggers and stored
procedures scared me off.

We handed off changes to stored procedures and
triggers to the DBAs, who had a tendency to not apply them correctly or
forget to apply them at a site. Then it was a total nightmare to try and
figure out why things wouldn't work, until we discovered that the
changes to an SP or trigger hadn't actually been applied.

Now, I don't think OpenStack as a project suffers the same
organizational dysfunction as my previous jobs, but just overall they're
hard to debug and maintain and I don't like to use them.

/rant

-- 
Sean M. Collins



[openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects

2016-08-25 Thread Steve Martinelli
The keystone team is pursuing a trigger-based approach to support rolling,
zero-downtime upgrades. The proposed operator experience is documented here:

  http://docs.openstack.org/developer/keystone/upgrading.html

This differs from Nova and Neutron's approaches to solve for rolling
upgrades (which use oslo.versionedobjects), however Keystone is one of the
few services that doesn't need to manage communication between multiple
releases of multiple service components talking over the message bus (which
is the original use case for oslo.versionedobjects, and for which it is
aptly suited). Keystone simply scales horizontally and every node talks
directly to the database.

Database triggers are obviously a new challenge for developers to write,
honestly challenging to debug (being side effects), and are made even more
difficult by having to hand write triggers for MySQL, PostgreSQL, and
SQLite independently (SQLAlchemy offers no assistance in this case), as
seen in this patch:

  https://review.openstack.org/#/c/355618/

However, implementing an application-layer solution with
oslo.versionedobjects is not an easy task either; refer to Neutron's
implementation:


https://review.openstack.org/#/q/topic:bp/adopt-oslo-versioned-objects-for-db

Our primary concern at this point is how to effectively test the triggers
we write against our supported database systems, and their various
deployment variations. We might be able to easily drop SQLite support (as
it's only supported for our own test suite), but should we expect variation
in support and/or actual behavior of triggers across the MySQLs, MariaDBs,
Perconas, etc., of the world that would make it necessary to test each of
them independently? If you have operational experience working with
triggers at scale: are there landmines that we need to be aware of? What is
it going to take for us to say we support *zero* downtime upgrades with
confidence?

Steve & Dolph