On 05/21/2017 03:51 PM, Monty Taylor wrote:

So I don't see the problem of "consistent utf8 support" having much to
do with whether or not we support PostgreSQL - you of course need your
"CREATE DATABASE" to include the utf8 charset like we do on MySQL, but
that's it.

That's where we stand, which means that we're doing 3 byte UTF8 on MySQL,
and 4 byte on PG. That's actually an API facing difference today. It's
work to dig out of on the MySQL side, maybe the PG one is just all
super cool and done. But it's still a consideration point.

The biggest concern for me is that we're letting API behavior be dictated by database backend and/or database config choices. The API should behave like the API behaves.

The API should behave like, "we store utf-8". We should accept that "utf-8" means "up to four bytes" and make sure we are using utf8mb4 for all MySQL backends. That MySQL has made this bizarre decision about what utf-8 means is a bug in MySQL that the calling application needs to work around. Other databases that want to work with OpenStack need to also do utf-8 with four bytes. We can easily add some tests to oslo.db that round trip an assortment of unicode glyphs to confirm this (if there's one kind of test I've written more than anyone should, it's pushing out non-ascii bytes to a database and testing they come back the same).
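
A round-trip test of the kind described above could look roughly like the following. This is a minimal sketch, not the oslo.db test itself: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for a real backend, and the table name, column name, sample strings, and the helpers round_trip and max_utf8_width are all made up for illustration. Against a MySQL table using the three-byte "utf8" charset rather than utf8mb4, the same check would be expected to error or lose data on the four-byte samples.

```python
import sqlite3

# Sample strings chosen for illustration only. The last two contain code
# points outside the Basic Multilingual Plane, which need four bytes in UTF-8.
SAMPLES = [
    "plain ascii",
    "caf\u00e9",             # contains a 2-byte sequence
    "\u65e5\u672c\u8a9e",    # 3-byte sequences
    "\U0001F600",            # 4-byte sequence (emoji)
    "\U00020000 mixed",      # 4-byte sequence (CJK Extension B) plus ASCII
]


def round_trip(conn, values):
    """Write each string to a scratch table and read them back in order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS glyphs (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.execute("DELETE FROM glyphs")
    conn.executemany(
        "INSERT INTO glyphs (body) VALUES (?)", [(v,) for v in values]
    )
    rows = conn.execute("SELECT body FROM glyphs ORDER BY id").fetchall()
    return [row[0] for row in rows]


def max_utf8_width(text):
    """Return the largest number of UTF-8 bytes any one character needs."""
    return max(len(ch.encode("utf-8")) for ch in text)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in backend (assumption)
    # The stored values must come back identical to what was sent.
    assert round_trip(conn, SAMPLES) == SAMPLES
    # Confirm the sample set actually exercises four-byte sequences.
    assert max(max_utf8_width(s) for s in SAMPLES) == 4
```

The width check is there so the test cannot silently pass on a sample set that never leaves the three-byte range.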



Sure, it's work. But that's fine. The point of that list was that there
is stuff that is work because SQLA is a leaky abstraction. Which is fine
if there are people taking that work off the table.

I would not characterize this as SQLA being a leaky abstraction.

yeessss !   win!    :)


I'd say that at some point we didn't make a decision as to what we wanted to do with text input and how it would be stored or not stored and how it would be searched and sorted. Case sensitive collations have been available to us the entire time, but we never decided whether our API was case sensitive or case insensitive. OR - we *DID* decide that our API is case insensitive and the fact that it isn't on some deployments is a bug. I'm putting money on the 'nobody made a decision' answer.

I wasn't there, but perhaps early OpenStack versions didn't have "textual search" kinds of features? Maybe they were added by folks who didn't consider the case sensitivity issue at that time. I'd be strongly in favor of making use of oslo.db / SQLAlchemy constructs that are explicitly case sensitive or not. It's true, SQLAlchemy also does not force you to "make a decision" on this; if it did, this would be in the "hooray the abstraction did not leak!" category. But SQLA makes lots of these kinds of decisions to be kind of hands-off about things like this, as developers often don't want there to be a decision made here (lest it add even more to the "SQLAlchemy forces me to make so many decisions!" complaint I have to read on Twitter every day).
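
To illustrate what "explicitly case sensitive or not" means at the query level, here is a minimal sketch in which the caller has to state the choice rather than inherit it from the backend's default collation. It is an assumption-laden stand-in, not oslo.db or SQLAlchemy code: it uses the standard-library sqlite3 module and SQLite's built-in BINARY and NOCASE collations, and the servers table and the find_names helper are hypothetical. In SQLAlchemy the same intent would presumably be spelled with its collate() construct or an explicit lower() comparison.

```python
import sqlite3


def find_names(conn, needle, case_sensitive):
    """Match names exactly, with the case rule chosen by the caller."""
    # BINARY compares bytes; NOCASE folds ASCII letters only (SQLite built-ins).
    collation = "BINARY" if case_sensitive else "NOCASE"
    sql = (
        "SELECT name FROM servers WHERE name = ? COLLATE "
        + collation
        + " ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, (needle,))]


def make_demo_db():
    """Build an in-memory table with names that differ only by case."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE servers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO servers (name) VALUES (?)",
        [("Web01",), ("web01",), ("WEB01",), ("db01",)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_names(conn, "web01", case_sensitive=True))
    print(find_names(conn, "web01", case_sensitive=False))
```

Because the collation is named in the query, the result no longer depends on how a given deployment happened to configure its database.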






__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
