Re: [openstack-dev] [tc] Active or passive role with our database layer

Mike Bayer Sun, 21 May 2017 19:11:40 -0700


On 05/21/2017 03:38 PM, Monty Taylor wrote:

documentation on the sequence of steps the operator should take.
In the "active" approach, we still document expectations, but we also validate them. If they are not what we expect but can be changed at runtime, we change them, overriding conflicting environmental config, and if we can't, we hard-stop indicating an unsuitable environment. Rather than providing helper tools, we perform the steps needed ourselves, in the order they need to be performed, ensuring that they are done in the manner in which they need to be done.

We do this in places like tripleo. The MySQL configs and such are checked into the source tree, and they include details like innodb_file_per_table, timeouts used by haproxy, etc. I know tripleo is not the service itself like Nova, but it's also not exactly something we hand off to the operators to figure out from scratch either.

We do some of it in oslo.db as well. We set things like MySQL SQL_MODE. We try to make sure the unicode-ish flags are set up and that we're using utf-8 encoding.
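
To make the "validate, fix at runtime if possible, otherwise hard-stop" idea concrete, here is a minimal sketch in plain Python. It is not oslo.db code; the function name, the exception class and the setting values are all made up for illustration.

```python
class UnsuitableEnvironment(RuntimeError):
    """Raised when a required setting is wrong and cannot be changed at runtime."""


def enforce_settings(actual, expected, runtime_settable):
    """Return the settings we should override ourselves, or hard-stop.

    actual / expected: dicts mapping setting name -> value.
    runtime_settable: names that may be changed without a server restart.
    """
    overrides = {}
    for name, wanted in expected.items():
        if actual.get(name) == wanted:
            continue  # already what we expect
        if name in runtime_settable:
            overrides[name] = wanted  # we can fix this one ourselves
        else:
            raise UnsuitableEnvironment(
                "%s is %r, expected %r" % (name, actual.get(name), wanted))
    return overrides


# Hypothetical values, for illustration only.
server = {"sql_mode": "", "innodb_file_per_table": "ON"}
wanted = {"sql_mode": "TRADITIONAL", "innodb_file_per_table": "ON"}
to_apply = enforce_settings(server, wanted, runtime_settable={"sql_mode"})
```

Here `to_apply` holds only the settings that differ and can be changed per connection. A mismatch on anything else stops the service instead of letting it run in an environment it does not expect.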

Some examples:

* Character Sets / Collations
We currently enforce at testing time that all database migrations are explicit about InnoDB. We also validate in oslo.db that table character sets have the string 'utf8' in them (only on MySQL). We do not have any check for case-sensitive or case-insensitive collations (these affect sorting and comparison operations). Because we don't, different server config settings or different database backends for different clouds can actually behave differently through the REST API.
To deal with that:
First we'd have to decide whether case sensitive or case insensitive was what we wanted. If we decided we wanted case sensitive, we could add an enforcement of that in oslo.db, and write migrations to get from case insensitive indexes to case sensitive indexes on tables where we detected that a case insensitive collation had been used. If we decided we wanted to stick with case insensitive we could similarly add code to enforce it on MySQL. To enforce it actively on PostgreSQL, we'd need to either switch our code that's using comparisons to use the sqlalchemy case-insensitive versions explicitly, or maybe write some sort of overloaded driver for PG that turns all comparisons into case-insensitive, which would wrap both sides of comparisons in lower() calls (which has some indexing concerns, but let's ignore that for the moment). We could also take the 'external' approach and just document it, then define API tests and try to tie the insensitive behavior in the API to Interop Compliance. I'm not 100% sure how a db operator would remediate this - but PG has some fancy computed index features - so maybe it would be possible.
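
A small sketch of why the collation matters, and of what wrapping both sides in lower() does. It uses SQLite from the Python standard library purely as a stand-in so that it runs anywhere. The table names are invented, and the collation names on MySQL and PostgreSQL are different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cs (name TEXT)")                 # default BINARY collation
conn.execute("CREATE TABLE ci (name TEXT COLLATE NOCASE)")  # case-insensitive column
for table in ("cs", "ci"):
    conn.execute("INSERT INTO %s (name) VALUES ('Server-A')" % table)


def count(sql):
    """Run a COUNT(*) query and return the single integer result."""
    return conn.execute(sql).fetchone()[0]


# Same data, same query text, different answers depending on the collation.
sensitive = count("SELECT COUNT(*) FROM cs WHERE name = 'server-a'")
insensitive = count("SELECT COUNT(*) FROM ci WHERE name = 'server-a'")

# Wrapping both sides in lower() forces case-insensitive matching even on the
# case-sensitive column, at the cost of not using a plain index on "name".
wrapped = count("SELECT COUNT(*) FROM cs WHERE lower(name) = lower('server-a')")
```

The first query finds nothing, the second and third each find the row. That is the same kind of difference that can leak through the REST API when two clouds run with different collations.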


let's make the case sensitivity explicitly enforced!

A similar issue lurks with the fact that MySQL unicode storage is 3-byte by default and 4-byte is opt-in. We could take the 'external' approach and document it and assume the operator has configured their my.cnf with the appropriate default, or take an 'active' approach where we override it in all the models and make migrations to get us from 3 to 4 byte.

let's force MySQL to use utf8mb4! Although I am curious what is the actual use case we want to hit here (which gets into, zzzeek is ignorant as to which unicode glyphs actually live in 4-byte utf8 characters).
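
On which glyphs need 4 bytes: in UTF-8 those are the code points above U+FFFF, i.e. everything outside the Basic Multilingual Plane, which includes emoji and the rarer CJK ideographs in the supplementary planes. MySQL's 3-byte "utf8" cannot store them. A quick sketch to check a string; the function name and sample strings are just for illustration.

```python
def needs_utf8mb4(text):
    """True if any character lies outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in text)


samples = [
    "caf\u00e9",           # Latin with an accent
    "\u65e5\u672c\u8a9e",  # common CJK, still inside the BMP
    "\U0001F600",          # an emoji
    "\U00020000",          # a CJK Extension B ideograph
]

# Widest single character in each sample, measured in UTF-8 bytes.
widths = {s: max(len(ch.encode("utf-8")) for ch in s) for s in samples}
flags = {s: needs_utf8mb4(s) for s in samples}
```

Only the last two samples are flagged. Ordinary accented Latin and everyday CJK text fit in 3 bytes or fewer per character.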

* Schema Upgrades
The way you roll out online schema changes is highly dependent on your database architecture.
Just limiting to the MySQL world:
If you do Galera, you can roll them out in Total Order or Rolling fashion. Total Order locks basically everything while it's happening, so isn't a candidate for "online". In rolling you apply the schema change to one node at a time. If you do that, the application has to be able to deal with both forms of the table, and you have to deal with ensuring that data can replicate appropriately while the schema change is happening.

Galera replicates DDL operations. If I add a column on a node, it pops up on the other nodes too in a similar way as transactions are replicated, e.g. nearly synchronous. I would *assume* it has to do this in the context of its usual transaction ordering, even though MySQL doesn't do transactional DDL, so that if the cluster sees transaction A, schema change B, transaction C that depends on B, that ordering is serialized appropriately. However, even if it doesn't do that, the rolling upgrades we do don't start the services talking to the new schema structures until the DDL changes are complete, and Galera is near-synchronous replication.

Also speaking to the "active" question, we certainly have all kinds of logic in OpenStack (the optimistic update strategy in particular) that takes "Galera" into account. And of course we have Galera config inside of tripleo. So that's kind of the "active" approach, I think.

If you do DRBD active/passive or a single-node deployment you only have one upgrade operation to perform, but you will only lock certain things, depending on what schema change operations you were performing.
If you do master/slave, you can roll out the schema change to your slaves one at a time, wait for them all to catch up, then promote a slave, taking the current master out of commission - update the old master, then put it into the slave pool. Like Galera rolling, the app needs to be able to handle old and new versions and the replication stream needs to be able to replicate between the versions.
Making sure that the stream is able to replicate puts a set of limitations on the types of schema changes you can perform, but it is an understandable constrained set.

My current thinking for online upgrades is that the schema changes and the application speaking to those schema changes are at least isolated states of the OpenStack cluster as a whole. That's at least how it seems to work right now. Also right now, OpenStack has almost no code I'm aware of that takes advantage of true master / asynchronous slaves. While it's been kind of stuck in oslo.db for years, and in enginefacade I added new decorators that allow you to declare a method as safe to run in a "slave", applications are hardly using this feature at all. I vaguely recall one obscure feature in Nova maybe using it for something. But last I checked, even if you configure OpenStack with a "master" and "slave" database URL (which we support!), 90% of everything is on the "master" anyway (perhaps some projects that I never look at do in fact use the "slave", please let me know as I should probably be more familiar with that).

In either approach the OpenStack service has to be able to talk to both old and new versions of the schema. And in either approach we need to make sure to limit the schema change operations to the set that can be accomplished in an online fashion. We also have to be careful to not start writing values to new columns until all of the nodes have been updated, because the replication stream can't replicate the new column value to nodes that don't have the new column.

This is...what everyone (except keystone w/ the evil triggers) does already, I thought?

In either approach we can decide to limit the number of architectures we support for "online" upgrades.
In an 'external' approach, we make sure to do those things, we write documentation and we assume the database will be updated appropriately. We can document that if the deployer chooses to do Total Order on Galera, they will not have online upgrades. There will also have to be a deployer step to let the services know that they can start writing values to the new schema format once the upgrade is complete.
In an 'active' approach, we can notice that we have an update available to run, and we can drive it from code. We can check for Galera, and if it's there we can run the upgrade in Rolling fashion one node at a time with no work needed on the part of the deployer. Since we're driving the upgrade, we know when it's done, so we can signal ourselves to start using the new version. We'd obviously have to pick the set of acceptable architectures we can handle consistently orchestrating.
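
As a rough sketch of the 'active' flow described above: drive the change one node at a time, and only then signal that writes to the new schema format may begin. The `Node` class and function names are hypothetical stand-ins, not an existing OpenStack or Galera API, and real orchestration would also have to wait for replication to catch up between steps.

```python
class Node:
    """Hypothetical stand-in for one database node in the cluster."""

    def __init__(self, name, schema_version=1):
        self.name = name
        self.schema_version = schema_version

    def apply_schema_change(self, version):
        # A real implementation would run the DDL against this node.
        self.schema_version = version


def rolling_upgrade(nodes, target_version):
    """Upgrade nodes one at a time and return the order they were upgraded in."""
    order = []
    for node in nodes:
        if node.schema_version < target_version:
            node.apply_schema_change(target_version)
            order.append(node.name)
    return order


def new_schema_writes_allowed(nodes, target_version):
    """Writes to new columns are safe only once every node has the new schema."""
    return all(node.schema_version >= target_version for node in nodes)


cluster = [Node("db1"), Node("db2"), Node("db3")]
allowed_before = new_schema_writes_allowed(cluster, 2)
upgraded = rolling_upgrade(cluster, 2)
allowed_after = new_schema_writes_allowed(cluster, 2)
```

The point is that the code driving the upgrade is also the code that knows when it has finished, so no separate deployer step is needed to flip the switch.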
* Versions
It's worth noting that behavior for schema updates and other things changes over time with backend database version. We set minimum versions of other things, like libvirt and OVS - so we might also want to set minimum versions for what we can support in the database.

Agreed, though so far I don't think we've hit too many features that have an issue here. The MySQL/MariaDB 5.x set of features is ubiquitous now and that's pretty much what we target. In the PostgreSQL world, they are crazy with the new syntaxes every release (to my dismay, having to support them all) but none of these are really appropriate for OpenStack as long as we are targeting MySQL also.




That way we can know for a given release of OpenStack what DDL operations are safe to use for a rolling upgrade and what are not. That means detecting such a version and potentially refusing to perform an upgrade if the version isn't acceptable. That reduces the operator's ability to choose what version of the database software to run, but increases our ability to be able to provide tooling and operations that we can be confident will work.
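
A minimal sketch of "detect the version and refuse if it isn't acceptable". The minimum shown is invented for the example, and the function names are hypothetical. In practice the version string would come from the server (for instance `SELECT VERSION()` on MySQL).

```python
import re

# Hypothetical minimum, for illustration only.
MIN_VERSIONS = {"mysql": (5, 6)}


def parse_version(version_string):
    """Turn a string such as '5.7.18-log' into a tuple such as (5, 7, 18)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if not match:
        raise ValueError("unrecognised version: %r" % version_string)
    return tuple(int(part) for part in match.groups(default="0"))


def check_upgrade_allowed(backend, version_string):
    """Return the parsed version, or refuse if it is below the minimum."""
    minimum = MIN_VERSIONS[backend]
    found = parse_version(version_string)
    if found[:len(minimum)] < minimum:
        raise RuntimeError(
            "%s %s is older than the minimum supported %s"
            % (backend, version_string, ".".join(map(str, minimum))))
    return found


accepted = check_upgrade_allowed("mysql", "5.7.18-log")
```

Comparing tuples rather than strings avoids the classic mistake of treating "5.10" as older than "5.6".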

We definitely make sure that if we put a migration directive somewhere, it's going to work on the MySQL/MariaDB's that are in general use. I think there might have even been some behavior recently that was perhaps on the 5.5/5.6 border but I can't recall.

== Summary ==
These are just a couple of examples - but I hope they're at least mildly useful to explain some of the sorts of issues at hand - and why I think we need to clarify what our intent is separate from the issue of what databases we "support".
Some operations have one and only one "right" way to be done. For those operations if we take an 'active' approach, we can implement them once and not make all of our deployers and distributors each implement and run them. However, there is a cost to that. Automatic and prescriptive behavior has a higher dev cost that is proportional to the number of supported architectures. This then implies a need to limit deployer architecture choices.
On the other hand, taking an 'external' approach allows us to federate the work of supporting the different architectures to the deployers. This means more work on the deployer's part, but also potentially a greater amount of freedom on their part to deploy supporting services the way they want. It means that some of the things that have been requested of us - such as easier operation and an increase in the number of things that can be upgraded with no-downtime - might become prohibitively costly for us to implement.

I think right now we are doing a "hybrid". If you're on a MySQL variant, you get the cadillac version and if you're going with PostgreSQL, you get the stick shift. I'm not endorsing this but it does seem to work to some extent.

I honestly think that both are acceptable choices we can make and that for any given topic there are middle grounds to be found at any given moment in time.


ok i just said that

BUT - without a decision as to what our long-term philosophical intent in this space is that is clear and understandable to everyone, we cannot have successful discussions about the impact of implementation choices, since we will not have a shared understanding of the problem space or the solutions we're talking about.
For my part - I hear complaints that OpenStack is 'difficult' to operate and requests for us to make it easier. This is why I have been advocating some actions that are clearly rooted in an 'active' worldview.

I think this goes to a point I typed on the etherpad in the Boston session: I don't think that MySQL defaulting to 3-byte utf8, or a deployer who happens to use PostgreSQL suddenly getting case sensitive comparisons, are the big reasons OpenStack is "difficult". I find OpenStack to be really difficult but setting the db connection URL and running the "db-manage" scripts is kind of the easiest part of it (but of course, I'm super biased on that).

Finally, this is focused on the database layer but similar questions arise in other places. What is our philosophy on prescriptive/active choices on our part coupled with automated action and ease of operation vs. expanded choices for the deployer at the expense of configuration and operational complexity? For now let's see if we can answer it for databases, and see where that gets us.
Thanks for reading.

Monty

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

