Hi Michael,

Yeah, the data types are the same database-wise; it's just the total row size that differs between InnoDB and NDB when it comes to table structure. So it's more of a decision point of:

A. Change the column size or type across the database types.
B. Have a mechanism to dynamically change the size or type of a specific column based on the database engine.

While I do like A, I know that it requires agreement and jumping through hoops to get there. Whereas B definitely requires some overhead in oslo.db and functions/args to make the dynamic change, but it reduces the impact, politics, and hoop jumping. Either way, though, we'll have to modify the migration scripts, deal with foreign keys, and still have to deal with any migrations that don't make proper use of oslo.db.

Let me know which way we should proceed. I'd like to get things moving forward again.

Thanks,
Octave

On 7/26/2017 4:02 PM, Michael Bayer wrote:
I realize now that we are in fact going for a total "row size", when I
was under the impression that ndb had a simple limit of 64 characters
for a VARCHAR.

As I was going on completely wrong assumptions, I'd like to
rethink the approach to datatypes.

I do think a real migration that simply reduces the sizes of selected
columns is the best approach in this case, and that the types like
AutoStringXYZ should go away completely.

To that end I've proposed reverting the one ndb patchset that has
merged which is the one in Cinder:

https://review.openstack.org/#/c/487603/

However, if Cinder declines to revert this, the "AutoXYZ" types in
oslo.db (which have also been released) will have to go through a
deprecation cycle.

Additionally, my concern that projects will not have any way to guard
against ever going over a 14K row size remains, and I still think that
checks need to be put in place in oslo.db that would sum the total row
size of any given table and raise an error if the limit is surpassed.
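Roughly, the check I have in mind could look like the sketch below: walk a table's columns, estimate each column's declared byte width, and raise if the total exceeds the ndb limit. This is purely illustrative; the names (check_ndb_row_size, MAX_NDB_ROW_SIZE) and the per-type byte estimates are assumptions, not oslo.db code.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table

MAX_NDB_ROW_SIZE = 14000  # approximate ndb per-row limit, in bytes


def _column_bytes(col):
    # VARCHAR(n) can take up to 3 bytes per character under utf8;
    # assume 3 here purely for illustration.
    if isinstance(col.type, String) and col.type.length:
        return col.type.length * 3
    return 8  # rough default for numeric / fixed-width types


def check_ndb_row_size(table):
    # Sum the estimated byte width of every column and fail loudly
    # if the table could never fit in an ndb row.
    total = sum(_column_bytes(c) for c in table.columns)
    if total > MAX_NDB_ROW_SIZE:
        raise ValueError(
            "table %s declares ~%d bytes per row, over the ndb "
            "limit of %d" % (table.name, total, MAX_NDB_ROW_SIZE))
    return total


metadata = MetaData()
t = Table("instances", metadata,
          Column("id", Integer, primary_key=True),
          Column("hostname", String(255)))
check_ndb_row_size(t)  # passes; raises for oversized tables
```

Something like this could be wired into metadata creation so a project finds out at schema-definition time, not at deploy time.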




On Wed, Jul 26, 2017 at 5:40 PM, Octave J. Orgeron
<octave.orge...@oracle.com> wrote:
Hi Michael,

Comments below..

On 7/26/2017 1:08 PM, Michael Bayer wrote:



On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" <octave.orge...@oracle.com>
wrote:

Hi Michael,

I understand that you want to abstract this completely away inside of
oslo.db. However, the reality is that making column changes based purely on
the size and type of that column, without understanding what that column is
being used for is extremely dangerous. You could end up clobbering a column
that needs a specific length for a value,



Nowhere in my example is the current length truncated.   Also, if two
distinct lengths truly must be maintained, we add a field "minimum_length".


prevent an index from working, etc.


That's what the indexable flag would achieve.

It wouldn't make sense to just do global changes on a column based on the size.


This seems to be what your patches are doing, however.


This is incorrect. I only change columns that meet my criteria for being
changed. I'm not globally changing columns across every table and service.
So to be clear and make sure we are on the same page:

Are you proposing that we continue to select specific columns and adjust
their size by using the below, instead of the ndb.Auto* functions?

oslo_db.sqlalchemy.String(<value>, indexable=<value>, ndb_size=<value>,
ndb_type=<value>)

i.e.

oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs, VARCHAR(64) on ndb

So if I need to change a column that today says:

sa.String(4096)

I would modify it to:

oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)

OR

Are you proposing that we change every single column across every single
database blindly using some logic in oslo.db, where even if a column doesn't
need to be changed, it gets changed based on the database engine type and
the size of the column?

So even if we have a table that doesn't need to be changed or touched, we
would end up with:

mysql_enable_ndb = True

sa.String(255) -> TINYTEXT

If that is the type of behavior you are aiming for, I don't think that makes
sense.





There are far more tables that fit in both InnoDB and NDB already than those
that don't. As I've stated many times before, the columns that I make
changes to are evaluated to understand:

1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one makes the
most sense to adjust?

I don't see a way of automating that and making it maintainable without a
lot more overhead in code and people.


My proposal is intended to *reduce* the great verbosity in the current
patches I see and remove the burden of every project having to be aware of
"ndb" every time a column is added.


I agree with using as few arguments as possible to the oslo.db.sqlalchemy.String
function. But at the same time, if a column needs to be adjusted, someone
has to put the right arguments there. As far as the burden goes, Oracle is
already taking the ownership of making MySQL Cluster work across services,
which means maintaining patches and creating new ones as projects evolve.

Also, if we want one behavior for NDB, another for PostgreSQL, and yet
another for DB2 or Oracle DB, wouldn't we need to be somewhat verbose on
what we want?

i.e.

String(8192, ndb_type=TEXT, pgs_type=text, db2_type=CLOB, ora_type=CLOB)
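For what it's worth, SQLAlchemy already has a mechanism for per-dialect type overrides, with_variant(), which is roughly what a multi-backend String would sit on. The helper below is only a sketch of that idea; backend_string and its mysql_type keyword are hypothetical, not an existing oslo.db API, and a real ndb implementation would key off an ndb-specific flag rather than the plain "mysql" dialect name.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import TEXT


def backend_string(length, mysql_type=None):
    """Sketch: a String that compiles differently per dialect."""
    t = sa.String(length)
    if mysql_type is not None:
        # Swap in the given type when compiling for MySQL;
        # other dialects keep the plain VARCHAR(length).
        t = t.with_variant(mysql_type, "mysql")
    return t


# Renders VARCHAR(8192) on the default dialect, TEXT on MySQL.
t = backend_string(8192, mysql_type=TEXT())
```

The verbosity question then becomes how many such keyword arguments one function can reasonably carry per backend.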




If we really want to remove the complexity here, why don't we just change the
sizes and types on these handful of table columns so that they fit within
both InnoDB and NDB?



Because that requires new migrations which are a great risk and
inconvenience to projects.


When it comes to projects that need table columns adjusted, so far we are
only talking about Cinder, Neutron, Nova, and Magnum. Also, let's keep in
mind that it's only a handful of tables that are being touched. I still feel
this is being blown out of proportion. Here are some metrics to consider,
the 4 services with tables that need to be adjusted:

Service:    # of tables with columns changed:
Cinder      1
Neutron     5
Nova        1
Magnum      2

With the exception of Magnum, those services tend to have over 75 or 100
tables. So I ask, are we blowing this out of proportion compared to the normal
churn on tables in these services? For example, Neutron dropped 30+ tables
and changed dozens. These databases are not so static over time to begin
with.



That way we don't need these functions and the tables are exactly the same.
That would only leave us with the createtable, savepoint/rollback, etc. stuff
to address, which is already taken care of in the ndb module in oslo.db. Then
we just fix the foreign key stuff as I've been doing, since it has zero impact
on InnoDB deployments and if anything ensures things are consistent. That
would then leave us to really focus on fixing migrations to use oslo.db and
pass the correct flags, which is a more lengthy process than the rest of
this.

I don't see the point in trying to make this stuff any more complicated.


The proposal is to make it simpler than it is right now.

Run through every column change you've proposed and show me which ones don't
fit into my proposed ruleset.   I will add additional declarative flags to
ensure those use cases are covered.





Octave


On 7/25/2017 12:20 PM, Michael Bayer wrote:
On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer <mba...@redhat.com> wrote:
oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
most
dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
VARCHAR(64) on ndb

This way, we can override the String with TINYTEXT or TEXT or change the
size for ndb.
oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs, TINYTEXT() on ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs, VARCHAR(64) on ndb
oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
oslo_db.sqlalchemy.String(80) -> VARCHAR(80) on most dbs, TINYTEXT() on ndb
oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(80) on most dbs, VARCHAR(55) on ndb

don't worry about implementation; can the above declaration ->
datatype mapping work?


In my patch for Neutron, you'll see a lot of the AutoStringText() calls
to
replace exceptionally long String columns (4096, 8192, and larger).
MySQL supports large VARCHAR now, OK.   Yeah, this could be
String(8192, ndb_type=TEXT) as well.
OK, no, sorry; each time I think of this I keep seeing the verbosity of
imports etc. in the code, because if we had:

String(80, ndb_type=TEXT)

then we have to import both String and TEXT, and then what if there's
ndb.TEXT, the code is still making an ndb-specific decision, etc.

I still see that this can be mostly automated from a simple ruleset
based on the size:

length <= 64:   VARCHAR(length) on all backends
length > 64, length <= 255:   VARCHAR(length) for most backends,
TINYTEXT for ndb
length > 255:   VARCHAR(length) for most backends, TEXT for ndb

the one case that seems outside of this is:

String(255)  where they have an index or key on the VARCHAR, and in
fact they only need < 64 characters to be indexed.  In that case you
don't want to use TINYTEXT, right?   So one exception:

oslo_db.sqlalchemy.types.String(255, indexable=True)

e.g. a declarative hint to the oslo_db backend to not use a LOB type.

then we just need oslo_db.sqlalchemy.types.String, and virtually
nothing except the import has to change, and a few keywords.

What we're trying to do in oslo_db is as much as possible state the
intent of a structure or datatype declaratively, and leave as much of
the implementation up to oslo_db itself.
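Taken together, the size ruleset plus the indexable exception could be sketched as the helper below. This is purely for illustration; ndb_type_for is a hypothetical name, not the eventual oslo.db API, and the real implementation would live behind the String type itself rather than a free function.

```python
from sqlalchemy import String
from sqlalchemy.dialects.mysql import TEXT, TINYTEXT


def ndb_type_for(length, indexable=False):
    """Pick the ndb-side type for a declared String(length) column,
    following the size ruleset above. Illustrative only."""
    if length <= 64 or indexable:
        # Short strings, and anything that backs an index or key,
        # stay VARCHAR: LOB types can't be indexed.
        return String(length)
    if length <= 255:
        return TINYTEXT()
    return TEXT()
```

The point being that a project author only ever states length and, rarely, indexable=True; the backend decision stays inside oslo.db.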

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


