On 05/17/2017 02:38 PM, Sean Dague wrote:

Some of the concerns/feedback has been "please describe things that are
harder by this being an abstraction", so examples are provided.

so let's go through this list:

- OpenStack services taking a more active role in managing the DBMS

mmmm, "managing" is vague to me, are we referring to the database service itself, e.g. starting / stopping / configuring? installers like tripleo do this now, pacemaker is standard in HA for control of services, I think I need some background here as to what the more active role would look like.


- The ability to have zero down time upgrade for services such as
  Keystone.

So "zero down time upgrades" seems to have broken into:

* "expand / contract with the code carefully dancing around the existence of two schema concepts simultaneously", e.g. nova, neutron. AFAIK there is no particular issue supporting multiple backends on this because we use alembic or sqlalchemy-migrate to abstract away basic ALTER TABLE types of feature.

* "expand / contract using server side triggers to reconcile the two schema concepts", e.g. keystone. This is more difficult because there is currently no "trigger" abstraction layer. Triggers represent more of an imperative programming model vs. typical SQL, which is why I've not taken on trying to build a one-size-fits-all abstraction for this in upstream Alembic or SQLAlchemy. However, it is feasible to build a "one-size-that-fits-openstack-online-upgrades" abstraction. I was trying to gauge interest in helping to create this back in the "triggers" thread, in my note at http://lists.openstack.org/pipermail/openstack-dev/2016-August/102345.html, which also referred to some very raw initial code examples. However, it received strong pushback from a wide range of openstack veterans, which led me to believe this was not a thing that was happening. Apparently Keystone has gone ahead and used triggers anyway, however I was not pulled into that process. But if triggers are to be "blessed" by at least some projects, I can likely work on this problem for MySQL / Postgresql agnosticism. If keystone is using triggers right now for online upgrades, I would ask, are they currently working on Postgresql as well with PG-specific triggers, or does Postgresql degrade into a "non-online" migration scenario if you're running Keystone?


- Consistent UTF8 4 & 5 byte support in our APIs

"5 byte support" appears to refer to utf-8's ability to be...well a total of 6 bytes. But in practice, unicode itself only needs 4 bytes and that is as far as any database supports right now since they target unicode (see https://en.wikipedia.org/wiki/UTF-8#Description). That's all any database we're talking about supports at most. So...lets assume this means four bytes.

From the perspective of database-agnosticism with regard to database and driver support for non-ascii characters, this problem was solved by SQLAlchemy well before Python 3 existed, back when many DBAPIs would literally crash if they received a u'' string and the rest of them would churn out garbage; SQLAlchemy implemented a full encode/decode layer on top of the Python DBAPI to fix this. The situation is vastly improved now that all DBAPIs support unicode natively.

However, on the MySQL side there is the added complexity that its "utf8" charset is a 3-byte-only storage model, and you have to use utf8mb4 if you want the four-byte model. I'm not sure right now which projects are specifically hitting issues related to this.

Postgresql doesn't have such a limitation. If your Postgresql server or specific database is set up for utf-8 (which should be the case), then you get full utf-8 character set support.

So I don't see the problem of "consistent utf8 support" as having much to do with whether or not we support Postgresql - you of course need your "CREATE DATABASE" to include the utf8 charset like we do on MySQL, but that's it.
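
Concretely, the only backend-specific pieces are the MySQL table options and the charset on the connection URL; this is a hypothetical table just to show where they live:

    from sqlalchemy import Column, Integer, MetaData, String, Table

    metadata = MetaData()

    # hypothetical table; the mysql_* keywords are MySQL-only table options
    # and are ignored by Postgresql, where 4-byte characters work as long
    # as the database was created with ENCODING 'UTF8'.  On the MySQL side
    # the driver also needs charset=utf8mb4 on the connection URL, e.g.
    # mysql+pymysql://user:pass@host/nova?charset=utf8mb4
    resource = Table(
        "resource", metadata,
        Column("id", Integer, primary_key=True),
        Column("display_name", String(255)),
        mysql_charset="utf8mb4",
        mysql_collate="utf8mb4_general_ci",
    )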


- The requirement that Postgresql libraries are compiled for new users
  trying to just run unit tests (no equiv is true for mysql because of
  the pure python driver).

I would suggest that new developers for whom the presence of things like postgresql client libraries is a challenge (but somehow they are running a MySQL server for their pure python driver to talk to?) don't actually have to worry about running the tests against Postgresql. This is how the "opportunistic" testing model in oslo.db has always worked; it only runs for the backends that you have set up.
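
To make that concrete, here is a rough illustration of the pattern (this is not the actual oslo.db API, and the openstack_citest URLs are just the conventional test credentials, given as an assumption): tests for a backend run only if that backend answers.

    import pytest
    import sqlalchemy as sa

    # assumed conventional test credentials; adjust for your environment
    BACKENDS = {
        "mysql": "mysql+pymysql://openstack_citest:openstack_citest"
                 "@localhost/openstack_citest",
        "postgresql": "postgresql+psycopg2://openstack_citest:openstack_citest"
                      "@localhost/openstack_citest",
    }


    def _backend_available(url):
        # the "opportunistic" part: probe the backend rather than require it
        try:
            with sa.create_engine(url).connect():
                return True
        except Exception:
            return False


    @pytest.mark.parametrize("backend", sorted(BACKENDS))
    def test_round_trip(backend):
        if not _backend_available(BACKENDS[backend]):
            pytest.skip("%s is not set up on this machine" % backend)
        with sa.create_engine(BACKENDS[backend]).connect() as conn:
            assert conn.execute(sa.text("SELECT 1")).scalar() == 1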

Also, openstack got approximately all the way through Kilo using the native python-MySQL driver, which required a compiled client library as well as the MySQL dependencies to be installed. The psycopg2 driver has a ton of whl's up on pypi (https://pypi.python.org/pypi/psycopg2) and all linux distros supply it as a package in any case, so an actual "compile" should not be needed. Also, this is Openstack....its basic existence is a kind of vastly enormous glue between thousands of native (ultimately C-compiled) libraries and packages, and it runs only on...linux. So this is just a weird point to bring up. Seems like a bit of a red herring to me.


- Consistency around case sensitivity collation defaults that lead to
  strange bugs around searching/updating names in resources.

Finally, a real issue-ish thing that requires a resolution. The good news here is that while MySQL defaults to case-insensitive collations (which I assume we like) and Postgresql has almost no support for case-insensitive collations (unless you use the odd CITEXT datatype), it is possible to make a case-sensitive collation behave case-insensitively at SQL query time much more easily than it would be to go the other way.

SQLAlchemy already has some case-insensitive operators, most notably "ilike()", which is a case-insensitive "LIKE" that is backend agnostic. If these search queries are just using LIKE then they only need to use ilike() from SQLAlchemy instead of like().
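
For example (hypothetical table and column), the same expression behaves identically on both backends:

    import sqlalchemy as sa

    # hypothetical table standing in for a project model
    instances = sa.table("instances", sa.column("display_name"))

    # like() is case-sensitive on Postgresql; ilike() compiles to ILIKE on
    # Postgresql and to lower(...) LIKE lower(...) on backends without a
    # native ILIKE, so matching is case-insensitive everywhere.
    stmt = sa.select(instances.c.display_name).where(
        instances.c.display_name.ilike("%web%"))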

If we are talking about the full range of operators like ==, !=, <, > etc., and/or if we are also concerned that developers may use like() when they really need to use ilike(), the issue can be addressed at the typing level as well. Using SQLAlchemy, a String datatype that guarantees case-insensitive comparisons is straightforward to construct. This would be a Python side replacement for the String type, and possibly Text, Unicode, etc. as needed. It does **not** imply any database schema migrations.

The hypothetical CaseInsensitiveString would override all the comparison operators to ensure that on the Postgresql (or other case-sensitive) backend, both sides of the expression are embedded within the SQL LOWER() function, so that these comparisons act case-insensitively. The trick then is to get the downstream projects to use this type (which would be in oslo.db) in their models, which is the usual herding-cats job. But this is a pretty solvable problem.
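
Here is a rough sketch of that idea under my assumptions (the CaseInsensitiveString name is made up, and a production version would restrict the LOWER() wrapping to comparison operators only):

    from sqlalchemy import String, func
    from sqlalchemy.types import TypeDecorator


    class CaseInsensitiveString(TypeDecorator):
        """Hypothetical String variant: expressions against columns of this
        type wrap both sides in LOWER(), so ==, LIKE, etc. behave
        case-insensitively even under case-sensitive collations."""

        impl = String
        cache_ok = True

        class comparator_factory(TypeDecorator.Comparator):
            def operate(self, op, *other, **kwargs):
                # lower both the column expression and the compared values;
                # a real version would do this only for comparison operators
                lowered = [func.lower(o) for o in other]
                return op(func.lower(self.expr), *lowered, **kwargs)

A model would then just declare Column('display_name', CaseInsensitiveString(255)) instead of String(255); no schema migration is involved.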

so to sum up:

1. "more active management role" - not totally sure what that refers to beyond what is already present

2. "zero downtime upgrades" - to the extent that projects use server side constructs like triggers, effort is needed to produce implementations for both backends. I can help with this effort but the downstream projects need to care and be open to reviewing / merging gerrits and things. Not sure what the state of trigger-based online upgrades is for Postgresql right now, or if there is a degradation mode
present in Keystone w/ Postgresql.

3. utf-8 - no problem

4. native clients - red herring?

5. case-insensitive - both query-level and model-level solutions exist that work within our existing frameworks and patterns


A statement around support was also put in there, because support only
meant QA jobs, or only developers for some folks. I think it's important
to ensure we paint the whole picture with how people get support in an
Open Source project.

There seems to be general agreement that we need to be more honest with
users, and that we've effectively been lying to them.

I feel like the current sticking points come down to whether:

* it's important that the operator community largely is already in one
camp or not
* future items listed that are harder are important enough to justify a
strict trade off here
* it's ok to have the proposal have a firm lean in tone, even though
its set of concrete actions is pretty reversible and doesn't commit to
future removal of postgresql


Also, as I stated on IRC, if some set of individuals came through and
solved all the future problems on the list for us as a community, my
care about how many DBs we supported would drastically decrease. It's
the fact that this is costing us the solving of real problems that we
want to solve (by making them too complex for anyone to take on) that
is my key concern. For folks asking the question about what they could
do to make pg a first class citizen, that's a pretty good starting point.

        -Sean


