subject:"\[openstack\-dev\] \[oslo.db\] \[ndb\] ndb namespace throughout openstack projects"


On 07/26/2017 07:58 PM, Michael Bayer wrote:
On Jul 26, 2017 7:45 PM, "Jay Pipes" > wrote:


On 07/26/2017 07:06 PM, Octave J. Orgeron wrote:

Hi Michael,

On 7/26/2017 4:28 PM, Michael Bayer wrote:


it at all.
thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255,
mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255,
mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.


I think the mysql_small_rowsize is a bit misleading since in one
case we are changing the size and the others the type. Perhaps:

mysql_alt_size=64
mysql_alt_type=sa.TINYTEXT
mysql_alt_type=sa.TEXT

alt standing for alternate. What do you think?


-1

I think it should be specific to NDB, since that's what the override
is for. I'd support something like:

  oslo_db.sqlalchemy.types.String(255, mysql_ndb_size=64)


Ok, I give up on that fight, fine.  mysql_ndb_xyz but at least build it 
into a nicely named type.   I know i come off as crazy changing my mind 
and temporarily forgetting key details but this is often how I 
internally come up with things...


Isn't that exactly what I'm proposing below? :)


Octave, I understand due to the table row size limitations the
desire to reduce some column sizes for NDB. What I'm not entirely
clear on is the reason to change the column *type* specifically for
NDB. There are definitely cases where different databases have
column types -- say, PostgreSQL's INET column type -- that don't
exist in other RDBMS. For those cases, the standard approach in
SQLAlchemy is to create a sqlalchemy ColumnType concrete class that
essentially translates the CREATE TABLE statement (and type
compilation/coercing) to specify the supported column type in the
RDBMS if it's supported otherwise defaults the column type to
something coerceable.

An example of this can be seen here for how this is done for IPv4
data in the apiary project:

https://github.com/gmr/apiary/blob/master/apiary/types.py#L49


I'd certainly be open to doing things like this for NDB, but I'd
first need to understand why you chose to convert the column types
for the columns that you did. Any information you can provide about
that would be great.

Best,
-jay


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Jul 26, 2017 7:45 PM, "Jay Pipes"  wrote:

On 07/26/2017 07:06 PM, Octave J. Orgeron wrote:

> Hi Michael,
>
> On 7/26/2017 4:28 PM, Michael Bayer wrote:
>
>>
>> it at all.
>> thinking out loud
>>
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
>> oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)
>>
>>
>> so if you don't have mysql_small_rowsize,  nothing happens.
>>
>>
> I think the mysql_small_rowsize is a bit misleading since in one case we
> are changing the size and the others the type. Perhaps:
>
> mysql_alt_size=64
> mysql_alt_type=sa.TINYTEXT
> mysql_alt_type=sa.TEXT
>
> alt standing for alternate. What do you think?
>

-1

I think it should be specific to NDB, since that's what the override is
for. I'd support something like:

 oslo_db.sqlalchemy.types.String(255, mysql_ndb_size=64)

Ok, I give up on that fight, fine.  mysql_ndb_xyz but at least build it
into a nicely named type.   I know i come off as crazy changing my mind and
temporarily forgetting key details but this is often how I internally come
up with things...

Octave, I understand due to the table row size limitations the desire to
reduce some column sizes for NDB. What I'm not entirely clear on is the
reason to change the column *type* specifically for NDB. There are
definitely cases where different databases have column types -- say,
PostgreSQL's INET column type -- that don't exist in other RDBMS. For those
cases, the standard approach in SQLAlchemy is to create a sqlalchemy
ColumnType concrete class that essentially translates the CREATE TABLE
statement (and type compilation/coercing) to specify the supported column
type in the RDBMS if it's supported otherwise defaults the column type to
something coerceable.

An example of this can be seen here for how this is done for IPv4 data in
the apiary project:

https://github.com/gmr/apiary/blob/master/apiary/types.py#L49

I'd certainly be open to doing things like this for NDB, but I'd first need
to understand why you chose to convert the column types for the columns
that you did. Any information you can provide about that would be great.

Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


On 07/26/2017 07:06 PM, Octave J. Orgeron wrote:

Hi Michael,

On 7/26/2017 4:28 PM, Michael Bayer wrote:


it at all.
thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.



I think the mysql_small_rowsize is a bit misleading since in one case we 
are changing the size and the others the type. Perhaps:


mysql_alt_size=64
mysql_alt_type=sa.TINYTEXT
mysql_alt_type=sa.TEXT

alt standing for alternate. What do you think?


-1

I think it should be specific to NDB, since that's what the override is 
for. I'd support something like:


 oslo_db.sqlalchemy.types.String(255, mysql_ndb_size=64)

Octave, I understand due to the table row size limitations the desire to 
reduce some column sizes for NDB. What I'm not entirely clear on is the 
reason to change the column *type* specifically for NDB. There are 
definitely cases where different databases have column types -- say, 
PostgreSQL's INET column type -- that don't exist in other RDBMS. For 
those cases, the standard approach in SQLAlchemy is to create a 
sqlalchemy ColumnType concrete class that essentially translates the 
CREATE TABLE statement (and type compilation/coercing) to specify the 
supported column type in the RDBMS if it's supported otherwise defaults 
the column type to something coerceable.


An example of this can be seen here for how this is done for IPv4 data 
in the apiary project:


https://github.com/gmr/apiary/blob/master/apiary/types.py#L49

I'd certainly be open to doing things like this for NDB, but I'd first 
need to understand why you chose to convert the column types for the 
columns that you did. Any information you can provide about that would 
be great.


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


On 07/26/2017 07:09 PM, Octave J. Orgeron wrote:

Hi Michael,

That is something we are working towards having our own CI where we 
could catch these things. I do think that having a function in the utils 
module of oslo.db to test the length of a table row would be handy 
though for projects to leverage as part of their unit tests.


Agreed, and this can be added relatively easily even without any support 
for NDB in the projects themselves (yet).


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-26 Thread Sean McGinnis

[snip]
> 
> I do think a real migration that simply reduces the sizes of selected
> columns is the best approach in this case, and that the types like
> AutoStringXYZ should go away completely.
> 
> To that end I've proposed reverting the one ndb patchset that has
> merged which is the one in Cinder:
> 
> https://review.openstack.org/#/c/487603/
> 
> However, if Cinder declines to revert this, the "AutoXYZ" types in
> oslo.db (which have also been released) will have to go through a
> deprecation cycle.
> 

I have just approved this revert. I would really like to see MySQL
cluster support added, but it appears we need to work though some
things yet. I'd rather we get something in early in Queens so we
have the whole cycle to work through any unintended consequences.

Sean

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


On 07/26/2017 07:01 PM, Octave J. Orgeron wrote:

Either way though, we'll have to ... still have to deal with any
migrations that don't make proper use of oslo.db.
Could you elaborate on the above? What about the Nova migrations aren't 
making proper use of oslo.db? Could you provide an example for us?


Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Michael,

That is something we are working towards having our own CI where we 
could catch these things. I do think that having a function in the utils 
module of oslo.db to test the length of a table row would be handy 
though for projects to leverage as part of their unit tests.


Octave

On 7/26/2017 4:54 PM, Michael Bayer wrote:



thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.



Also, these flags in theory would *only* be in an old migration file. 
  Going forward, we'd hope that Oracle will be running its own CI and 
ensuring new migrations don't go over the limits , right ?  New 
columns would ideally not have conditional datatype rules at all.






__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Michael,


On 7/26/2017 4:28 PM, Michael Bayer wrote:


it at all.
thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.



I think the mysql_small_rowsize is a bit misleading since in one case we 
are changing the size and the others the type. Perhaps:


mysql_alt_size=64
mysql_alt_type=sa.TINYTEXT
mysql_alt_type=sa.TEXT

alt standing for alternate. What do you think?

Octave

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Michael,

Yeah, the data types are the same database wise, it's just the total row 
size that different between InnoDB and NDB when it comes to table 
structure. So it's more of a decision point of:


A. Change the column size or type across the database types.
B. Have a mechanism to dynamically change the size or type of a specific 
column based on the database engine.


While I do like A, I know that it requires agreement and jumping through 
hoops to get there. As where B, definitely requires some overhead in 
oslo.db and functions/args to make the dynamic change, but it reduces 
the impacts, politics, and hoop jumping. Either way though, we'll have 
to modify the migration scripts, deal with foreign keys, and still have 
to deal with any migrations that don't make proper use of oslo.db.


Let me know which way we should proceed. I'd like to get things moving 
forward again.


Thanks,
Octave

On 7/26/2017 4:02 PM, Michael Bayer wrote:

I realize now that we are in fact going for a total "row size", when I
was under the impression that ndb had a simple limit of 64 characters
for a VARCHAR.

As I was going on the completely wrong assumptions, I'd like to
rethink the approach of datatypes.

I do think a real migration that simply reduces the sizes of selected
columns is the best approach in this case, and that the types like
AutoStringXYZ should go away completely.

To that end I've proposed reverting the one ndb patchset that has
merged which is the one in Cinder:

https://review.openstack.org/#/c/487603/

However, if Cinder declines to revert this, the "AutoXYZ" types in
oslo.db (which have also been released) will have to go through a
deprecation cycle.

Additionally, my concern that projects will not have any way to guard
against ever going over a 14K row size remains, and I still think that
checks need to be put in place in oslo.db that would sum the total row
size of any given table and raise an error if the limit is surpassed.




On Wed, Jul 26, 2017 at 5:40 PM, Octave J. Orgeron
 wrote:

Hi Michael,

Comments below..

On 7/26/2017 1:08 PM, Michael Bayer wrote:



On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
wrote:

Hi Michael,

I understand that you want to abstract this completely away inside of
oslo.db. However, the reality is that making column changes based purely on
the size and type of that column, without understanding what that column is
being used for is extremely dangerous. You could end up clobbering a column
that needs a specific length for a value,



Nowhere in my example is the current length truncated.   Also, if two
distinct lengths truly must be maintained we add a field "minimum_length".


prevent

  an index from working, etc.


That's what the indexable flag would achieve.

It

  wouldn't make sense to just do global changes on a column based on the
size.


This seems to be what your patches are doing, however.


This is incorrect. I only change columns that meet my criteria for being
changed. I'm not globally changing columns across every table and service.
So to be clear and make sure we are on the same page..

Are you proposing that we continue to select specific columns and adjust
their size by using the below, instead of the ndb.Auto* functions?

oslo_db.sqlalchemy.String(, indexable=, ndb_size=,
ndb_type=)

i.e.

oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
VARCHAR(64) on ndb

So if I need to change a column that today says:

sa.String(4096)

I would modify it to:

oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)

OR

Are you proposing that we change very single column across every single
database blindly using some logic in oslo.db, where even if a column doesn't
need to be changed, it gets changed based on the database engine type and
the size of the column?

So even if we have a table that doesn't need to be changed or touched, we
would end up with:

mysql_enable_ndb = True

sa.String(255) -> TINYTEXT

If that is the type of behavior you are aiming for, I think don't that makes
sense.





There are far more tables that fit in both InnoDB and NDB already than those
that don't. As I've stated many times before, the columns that I make
changes to are evaluated to understand:

1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one makes the
most sense to adjust?

I don't see a way of automating that and making it maintainable without a
lot more overhead in code and people.


My proposal is intended to *reduce* the great verbosity in the current
patches I see and remove the burden of every project having to be aware of
"ndb" every time a column is added.


I agree with using as few arguments to the oslo.

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Jul 26, 2017 6:28 PM, "Michael Bayer"  wrote:

On Wed, Jul 26, 2017 at 6:19 PM, Michael Bayer  wrote:
> On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>>
>> There is a bigger problem with this entire series of changes, whether
>> or not the "ndb" keyword is present.  Which is that projects need to
>> add new columns, tables, and make datatype changes all the time, and
>> they will not have any idea about the requirements for ndb or even
>> that it exists, nor will anyone have access to this platform for
>> development nor should they be expected to worry about it.   If they
>> not only have to fill in dozens of special "ndb" or generic-but-needed
>> by ndb flags, and then if they even have to worry about the sum of all
>> the sizes in a row, that means the ndb implementation will be
>> continuously broken across many projects in every release unless ndb
>> developers are checking every database change in every project at all
>> times.   Is that level of effort part of the plan?
>
> OK, I apologize, you answered that here:
>
> https://review.openstack.org/#/c/427970/26
>
>
>> Now considering that my company is heavily invested in using MySQL
Cluster (NDB) and that we use the kola framework, we have to keep an eye on
each of the services to make sure that it works. This is why you see lots
of other patches that I'm working on to fix services like Cinder, Neutron,
Nova, etc. So has time goes by, we will continue to make patches to enable
these services to work with NDB.
>
> If we were to approach this as real migrations that change the lengths
> of datatypes, that means the tables must be created at InnoDB first
> and migrated to NDB within the migrations.  Because NDB will disallow
> the initial creation, right?
>
> if the proposal is to modify the actual sizes in the original
> migration files, that's not something that can be done, unfortunately,
> it would be hugely risky because those migrations represent a snapshot
> of the actual schema.
>
> If we *do* need to keep doing something like the "AutoStringXYZ"
> approach I really want to change those names and not have any "ndb."
> in it at all.

thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.



Also, these flags in theory would *only* be in an old migration file.
Going forward, we'd hope that Oracle will be running its own CI and
ensuring new migrations don't go over the limits , right ?  New columns
would ideally not have conditional datatype rules at all.












>
> But all the options here seem kind of icky.
>
>
>
>
>
>
>>
>>
>>
>>
>>
>>
>>
>>>
>>> I don't see a way of automating that and making it maintainable without
a
>>> lot more overhead in code and people. If we really want to remove the
>>> complexity here, why don't we just change the sizes and types on these
>>> handful of table columns so that they fit within both InnoDB and NDB?
That
>>> way we don't need these functions and the tables are exactly the same?
That
>>> would only leave us with the createtable, savepoint/rollback, etc.
stuff to
>>> address which is already taken care of in the ndb module in oslo.db?
Then we
>>> just fix the foreign key stuff as I've been doing, since it has zero
impact
>>> on InnoDB deployments and if anything ensures things are consistent.
That
>>> would then leave us to really focus on fixing migrations to use oslo.db
and
>>> pass the correct flags, which is a more lengthy process than the rest of
>>> this.
>>>
>>> I don't see the point in trying to make this stuff anymore complicated.
>>>
>>> Octave
>>>
>>>
>>> On 7/25/2017 12:20 PM, Michael Bayer wrote:

 On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer 
wrote:
>>
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>> most
>> dbs, TINYTEXT for ndb
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for
most
>> dbs, TEXT for ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most
dbs,
>> VARCHAR(64) on ndb
>>
>> This way, we can override the String with TINYTEXT or TEXT or change
the
>> size for ndb.
>>>
>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>> TINYTEXT() on ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>> most dbs, VARCHAR(64) on ndb
>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs,
>>> TINYTEXT()
>>> on ndb
>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on
most
>>> dbs, VARCHAR(55) on ndb
>>>
>>> don't worry about implementation, can the above declaration ->
>>> datatype mapping work

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Wed, Jul 26, 2017 at 6:19 PM, Michael Bayer  wrote:
> On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>>
>> There is a bigger problem with this entire series of changes, whether
>> or not the "ndb" keyword is present.  Which is that projects need to
>> add new columns, tables, and make datatype changes all the time, and
>> they will not have any idea about the requirements for ndb or even
>> that it exists, nor will anyone have access to this platform for
>> development nor should they be expected to worry about it.   If they
>> not only have to fill in dozens of special "ndb" or generic-but-needed
>> by ndb flags, and then if they even have to worry about the sum of all
>> the sizes in a row, that means the ndb implementation will be
>> continuously broken across many projects in every release unless ndb
>> developers are checking every database change in every project at all
>> times.   Is that level of effort part of the plan?
>
> OK, I apologize, you answered that here:
>
> https://review.openstack.org/#/c/427970/26
>
>
>> Now considering that my company is heavily invested in using MySQL Cluster 
>> (NDB) and that we use the kola framework, we have to keep an eye on each of 
>> the services to make sure that it works. This is why you see lots of other 
>> patches that I'm working on to fix services like Cinder, Neutron, Nova, etc. 
>> So has time goes by, we will continue to make patches to enable these 
>> services to work with NDB.
>
> If we were to approach this as real migrations that change the lengths
> of datatypes, that means the tables must be created at InnoDB first
> and migrated to NDB within the migrations.  Because NDB will disallow
> the initial creation, right?
>
> if the proposal is to modify the actual sizes in the original
> migration files, that's not something that can be done, unfortunately,
> it would be hugely risky because those migrations represent a snapshot
> of the actual schema.
>
> If we *do* need to keep doing something like the "AutoStringXYZ"
> approach I really want to change those names and not have any "ndb."
> in it at all.

thinking out loud

oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=64)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TINYTEXT)
oslo_db.sqlalchemy.types.String(255, mysql_small_rowsize=sa.TEXT)


so if you don't have mysql_small_rowsize,  nothing happens.










>
> But all the options here seem kind of icky.
>
>
>
>
>
>
>>
>>
>>
>>
>>
>>
>>
>>>
>>> I don't see a way of automating that and making it maintainable without a
>>> lot more overhead in code and people. If we really want to remove the
>>> complexity here, why don't we just change the sizes and types on these
>>> handful of table columns so that they fit within both InnoDB and NDB? That
>>> way we don't need these functions and the tables are exactly the same? That
>>> would only leave us with the createtable, savepoint/rollback, etc. stuff to
>>> address which is already taken care of in the ndb module in oslo.db? Then we
>>> just fix the foreign key stuff as I've been doing, since it has zero impact
>>> on InnoDB deployments and if anything ensures things are consistent. That
>>> would then leave us to really focus on fixing migrations to use oslo.db and
>>> pass the correct flags, which is a more lengthy process than the rest of
>>> this.
>>>
>>> I don't see the point in trying to make this stuff anymore complicated.
>>>
>>> Octave
>>>
>>>
>>> On 7/25/2017 12:20 PM, Michael Bayer wrote:

 On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>>
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>> most
>> dbs, TINYTEXT for ndb
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>> dbs, TEXT for ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>> VARCHAR(64) on ndb
>>
>> This way, we can override the String with TINYTEXT or TEXT or change the
>> size for ndb.
>>>
>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>> TINYTEXT() on ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>> most dbs, VARCHAR(64) on ndb
>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs,
>>> TINYTEXT()
>>> on ndb
>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
>>> dbs, VARCHAR(55) on ndb
>>>
>>> don't worry about implementation, can the above declaration ->
>>> datatype mapping work ?
>>>
>>>
>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls
>> to
>> replace exceptionally long String columns (4096, 8192, and larger).
>
> MySQL supports large VARCHAR now, OK.   yeah this could be
> String(8192, ndb_type=TEXT) as well.

>>

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Wed, Jul 26, 2017 at 5:30 PM, Michael Bayer  wrote:
>
> There is a bigger problem with this entire series of changes, whether
> or not the "ndb" keyword is present.  Which is that projects need to
> add new columns, tables, and make datatype changes all the time, and
> they will not have any idea about the requirements for ndb or even
> that it exists, nor will anyone have access to this platform for
> development nor should they be expected to worry about it.   If they
> not only have to fill in dozens of special "ndb" or generic-but-needed
> by ndb flags, and then if they even have to worry about the sum of all
> the sizes in a row, that means the ndb implementation will be
> continuously broken across many projects in every release unless ndb
> developers are checking every database change in every project at all
> times.   Is that level of effort part of the plan?

OK, I apologize, you answered that here:

https://review.openstack.org/#/c/427970/26


> Now considering that my company is heavily invested in using MySQL Cluster 
> (NDB) and that we use the kola framework, we have to keep an eye on each of 
> the services to make sure that it works. This is why you see lots of other 
> patches that I'm working on to fix services like Cinder, Neutron, Nova, etc. 
> So has time goes by, we will continue to make patches to enable these 
> services to work with NDB.

If we were to approach this as real migrations that change the lengths
of datatypes, that means the tables must be created at InnoDB first
and migrated to NDB within the migrations.  Because NDB will disallow
the initial creation, right?

if the proposal is to modify the actual sizes in the original
migration files, that's not something that can be done, unfortunately,
it would be hugely risky because those migrations represent a snapshot
of the actual schema.

If we *do* need to keep doing something like the "AutoStringXYZ"
approach I really want to change those names and not have any "ndb."
in it at all.

But all the options here seem kind of icky.






>
>
>
>
>
>
>
>>
>> I don't see a way of automating that and making it maintainable without a
>> lot more overhead in code and people. If we really want to remove the
>> complexity here, why don't we just change the sizes and types on these
>> handful of table columns so that they fit within both InnoDB and NDB? That
>> way we don't need these functions and the tables are exactly the same? That
>> would only leave us with the createtable, savepoint/rollback, etc. stuff to
>> address which is already taken care of in the ndb module in oslo.db? Then we
>> just fix the foreign key stuff as I've been doing, since it has zero impact
>> on InnoDB deployments and if anything ensures things are consistent. That
>> would then leave us to really focus on fixing migrations to use oslo.db and
>> pass the correct flags, which is a more lengthy process than the rest of
>> this.
>>
>> I don't see the point in trying to make this stuff anymore complicated.
>>
>> Octave
>>
>>
>> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>>
>>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>
> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
> most
> dbs, TINYTEXT for ndb
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
> dbs, TEXT for ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
> VARCHAR(64) on ndb
>
> This way, we can override the String with TINYTEXT or TEXT or change the
> size for ndb.
>>
>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>> TINYTEXT() on ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>> most dbs, VARCHAR(64) on ndb
>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs,
>> TINYTEXT()
>> on ndb
>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
>> dbs, VARCHAR(55) on ndb
>>
>> don't worry about implementation, can the above declaration ->
>> datatype mapping work ?
>>
>>
> In my patch for Neutron, you'll see a lot of the AutoStringText() calls
> to
> replace exceptionally long String columns (4096, 8192, and larger).

 MySQL supports large VARCHAR now, OK.   yeah this could be
 String(8192, ndb_type=TEXT) as well.
>>>
>>> OK, no, sorry each time I think of this I keep seeing the verbosity of
>>> imports etc. in the code, because if we had:
>>>
>>> String(80, ndb_type=TEXT)
>>>
>>> then we have to import both String and TEXT, and then what if there's
>>> ndb.TEXT, the code is still making an ndb-specific decision, etc.
>>>
>>> I still see that this can be mostly automated from a simple ruleset
>>> based on the size:
>>>
>>> length <= 64 :VARCHAR(length) on all backends
>>> length

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

I realize now that we are in fact going for a total "row size", when I
was under the impression that ndb had a simple limit of 64 characters
for a VARCHAR.

As I was going on the completely wrong assumptions, I'd like to
rethink the approach of datatypes.

I do think a real migration that simply reduces the sizes of selected
columns is the best approach in this case, and that the types like
AutoStringXYZ should go away completely.

To that end I've proposed reverting the one ndb patchset that has
merged which is the one in Cinder:

https://review.openstack.org/#/c/487603/

However, if Cinder declines to revert this, the "AutoXYZ" types in
oslo.db (which have also been released) will have to go through a
deprecation cycle.

Additionally, my concern that projects will not have any way to guard
against ever going over a 14K row size remains, and I still think that
checks need to be put in place in oslo.db that would sum the total row
size of any given table and raise an error if the limit is surpassed.




On Wed, Jul 26, 2017 at 5:40 PM, Octave J. Orgeron
 wrote:
> Hi Michael,
>
> Comments below..
>
> On 7/26/2017 1:08 PM, Michael Bayer wrote:
>
>
>
> On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
> wrote:
>
> Hi Michael,
>
> I understand that you want to abstract this completely away inside of
> oslo.db. However, the reality is that making column changes based purely on
> the size and type of that column, without understanding what that column is
> being used for is extremely dangerous. You could end up clobbering a column
> that needs a specific length for a value,
>
>
>
> Nowhere in my example is the current length truncated.   Also, if two
> distinct lengths truly must be maintained we add a field "minimum_length".
>
>
> prevent
>
>  an index from working, etc.
>
>
> That's what the indexable flag would achieve.
>
> It
>
>  wouldn't make sense to just do global changes on a column based on the
> size.
>
>
> This seems to be what your patches are doing, however.
>
>
> This is incorrect. I only change columns that meet my criteria for being
> changed. I'm not globally changing columns across every table and service.
> So to be clear and make sure we are on the same page..
>
> Are you proposing that we continue to select specific columns and adjust
> their size by using the below, instead of the ndb.Auto* functions?
>
> oslo_db.sqlalchemy.String(, indexable=, ndb_size=,
> ndb_type=)
>
> i.e.
>
> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
> dbs, TINYTEXT for ndb
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
> dbs, TEXT for ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
> VARCHAR(64) on ndb
>
> So if I need to change a column that today says:
>
> sa.String(4096)
>
> I would modify it to:
>
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)
>
> OR
>
> Are you proposing that we change very single column across every single
> database blindly using some logic in oslo.db, where even if a column doesn't
> need to be changed, it gets changed based on the database engine type and
> the size of the column?
>
> So even if we have a table that doesn't need to be changed or touched, we
> would end up with:
>
> mysql_enable_ndb = True
>
> sa.String(255) -> TINYTEXT
>
> If that is the type of behavior you are aiming for, I think don't that makes
> sense.
>
>
>
>
>
> There are far more tables that fit in both InnoDB and NDB already than those
> that don't. As I've stated many times before, the columns that I make
> changes to are evaluated to understand:
>
> 1. What populates it?
> 2. Who consumes it?
> 3. What are the possible values and required lengths?
> 4. What is the impact of changing the size or type?
> 5. Evaluated against the other columns in the table, which one makes the
> most sense to adjust?
>
> I don't see a way of automating that and making it maintainable without a
> lot more overhead in code and people.
>
>
> My proposal is intended to *reduce* the great verbosity in the current
> patches I see and remove the burden of every project having to be aware of
> "ndb" every time a column is added.
>
>
> I agree with using as few arguments to the oslo.db.sqlalchemy.String
> function. But at the same time, if a column needs to be adjusted, someone
> has to put the right arguments there. As far as the burden goes, Oracle is
> already taking the ownership of making MySQL Cluster work across services,
> which means maintaining patches and creating new ones as projects evolve.
>
> Also, if we want one behavior for NDB, another for PostgreSQL, and yet
> another for DB2 or Oracle DB, wouldn't we need to be somewhat verbose on
> what we want?
>
> i.e.
>
> String(8192, ndb_type=TEXT, pgs_type=text, db2_type=CLOB, ora_type=CLOB)
>
>
>
>
> If
>
>  we really want to remove the complexity here, why don't we just change the
> sizes and types on these handful of table columns so that they fit within
> both InnoDB and NDB?
>
>
>
> Because

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Michael,

Comments below..

On 7/26/2017 1:08 PM, Michael Bayer wrote:



On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
mailto:octave.orge...@oracle.com>> wrote:


Hi Michael,

I understand that you want to abstract this completely away inside
of oslo.db. However, the reality is that making column changes
based purely on the size and type of that column, without
understanding what that column is being used for is extremely
dangerous. You could end up clobbering a column that needs a
specific length for a value, 




Nowhere in my example is the current length truncated.   Also, if two 
distinct lengths truly must be maintained we add a field "minimum_length".



prevent

 an index from working, etc. 



That's what the indexable flag would achieve.

It

 wouldn't make sense to just do global changes on a column based
on the size.


This seems to be what your patches are doing, however.


This is incorrect. I only change columns that meet my criteria for being 
changed. I'm not globally changing columns across every table and 
service. So to be clear and make sure we are on the same page..


Are you proposing that we continue to select specific columns and adjust 
their size by using the below, instead of the ndb.Auto* functions?


oslo_db.sqlalchemy.String(, indexable=, ndb_size=, 
ndb_type=)


i.e.

oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for 
most dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most 
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs, 
VARCHAR(64) on ndb


So if I need to change a column that today says:

sa.String(4096)

I would modify it to:

oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)

OR

Are you proposing that we change very single column across every single 
database blindly using some logic in oslo.db, where even if a column 
doesn't need to be changed, it gets changed based on the database engine 
type and the size of the column?


So even if we have a table that doesn't need to be changed or touched, 
we would end up with:


mysql_enable_ndb = True

sa.String(255) -> TINYTEXT

If that is the type of behavior you are aiming for, I think don't that 
makes sense.







There are far more tables that fit in both InnoDB and NDB already
than those that don't. As I've stated many times before, the
columns that I make changes to are evaluated to understand:

1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one
makes the most sense to adjust?

I don't see a way of automating that and making it maintainable
without a lot more overhead in code and people. 



My proposal is intended to *reduce* the great verbosity in the current 
patches I see and remove the burden of every project having to be 
aware of "ndb" every time a column is added.


I agree with using as few arguments to the oslo.db.sqlalchemy.String 
function. But at the same time, if a column needs to be adjusted, 
someone has to put the right arguments there. As far as the burden goes, 
Oracle is already taking the ownership of making MySQL Cluster work 
across services, which means maintaining patches and creating new ones 
as projects evolve.


Also, if we want one behavior for NDB, another for PostgreSQL, and yet 
another for DB2 or Oracle DB, wouldn't we need to be somewhat verbose on 
what we want?


i.e.

String(8192, ndb_type=TEXT, pgs_type=text, db2_type=CLOB, ora_type=CLOB)





If

 we really want to remove the complexity here, why don't we just
change the sizes and types on these handful of table columns so
that they fit within both InnoDB and NDB? 




Because that requires new migrations which are a great risk and 
inconvenience to projects.


When it comes to projects that need table columns adjusted, so far we 
are only talking about Cinder, Neutron, Nova, and Magnum. Also, let's 
keep in mind that it's only a handful of tables that are being touched. 
I still feel this is being blown out of proportion. Here are some 
metrics to consider, the 4 services with tables that need to be adjusted:


Service:# of Tables with columns changed:
Cinder 1
Neutron  5
Nova1
Magnum 2

With the exception of Magnum, those services tend to have over 75 or 100 
tables. So I ask, are we blowing this out of proportion compared the 
normal churn on tables in these services? For example, Neutron dropped 
30+ tables and changed dozens. These databases are not so static over 
time to begin with.




That

 way we don't need these functions and the tables are exactly the
same? That would only leave us with the createtable,
savepoint/rollback, etc. stuff to address which is already taken
care of in the ndb module in oslo.db? Then w

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Tue, Jul 25, 2017 at 3:27 PM, Octave J. Orgeron
 wrote:
> 5. Evaluated against the other columns in the table, which one makes the
> most sense to adjust?


well, the above point is one I've been trying to get a straight answer
on for a long time.

"evaluated against other columns" suggests we cannot change the size
of a datatype in isolation, instead we are trying to create a total
length of the row.Otherwise, the size of the "other" columns
should not matter.

Then, in fact yes I do see you aren't changing every size, in 216_havana I see:

Column('display_name', String(length=255))  -> no change
Column('display_description', String(length=255)), -> becomes TINYTEXT
Column('os_type', String(length=255)) -> becomes VARCHAR(64)

The "display_name" column will render VARCHAR(255).  Which means, ndb
can have a VARCHAR(255).  But in the case of os_type, you shrink it to
be VARCHAR(64) for ndb. Why?   What happens if it stays
VARCHAR(255) ?

There is a bigger problem with this entire series of changes, whether
or not the "ndb" keyword is present.  Which is that projects need to
add new columns, tables, and make datatype changes all the time, and
they will not have any idea about the requirements for ndb or even
that it exists, nor will anyone have access to this platform for
development nor should they be expected to worry about it.   If they
not only have to fill in dozens of special "ndb" or generic-but-needed
by ndb flags, and then if they even have to worry about the sum of all
the sizes in a row, that means the ndb implementation will be
continuously broken across many projects in every release unless ndb
developers are checking every database change in every project at all
times.   Is that level of effort part of the plan?







>
> I don't see a way of automating that and making it maintainable without a
> lot more overhead in code and people. If we really want to remove the
> complexity here, why don't we just change the sizes and types on these
> handful of table columns so that they fit within both InnoDB and NDB? That
> way we don't need these functions and the tables are exactly the same? That
> would only leave us with the createtable, savepoint/rollback, etc. stuff to
> address which is already taken care of in the ndb module in oslo.db? Then we
> just fix the foreign key stuff as I've been doing, since it has zero impact
> on InnoDB deployments and if anything ensures things are consistent. That
> would then leave us to really focus on fixing migrations to use oslo.db and
> pass the correct flags, which is a more lengthy process than the rest of
> this.
>
> I don't see the point in trying to make this stuff anymore complicated.
>
> Octave
>
>
> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>
>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:

 oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
 most
 dbs, TINYTEXT for ndb
 oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
 dbs, TEXT for ndb
 oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
 VARCHAR(64) on ndb

 This way, we can override the String with TINYTEXT or TEXT or change the
 size for ndb.
>
> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
> TINYTEXT() on ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
> most dbs, VARCHAR(64) on ndb
> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs,
> TINYTEXT()
> on ndb
> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
> dbs, VARCHAR(55) on ndb
>
> don't worry about implementation, can the above declaration ->
> datatype mapping work ?
>
>
 In my patch for Neutron, you'll see a lot of the AutoStringText() calls
 to
 replace exceptionally long String columns (4096, 8192, and larger).
>>>
>>> MySQL supports large VARCHAR now, OK.   yeah this could be
>>> String(8192, ndb_type=TEXT) as well.
>>
>> OK, no, sorry each time I think of this I keep seeing the verbosity of
>> imports etc. in the code, because if we had:
>>
>> String(80, ndb_type=TEXT)
>>
>> then we have to import both String and TEXT, and then what if there's
>> ndb.TEXT, the code is still making an ndb-specific decision, etc.
>>
>> I still see that this can be mostly automated from a simple ruleset
>> based on the size:
>>
>> length <= 64 :VARCHAR(length) on all backends
>> length > 64, length <= 255:   VARCHAR(length) for most backends,
>> TINYTEXT for ndb
>> length > 4096:  VARCHAR(length) for most backends, TEXT for ndb
>>
>> the one case that seems outside of this is:
>>
>> String(255)  where they have an index or key on the VARCHAR, and in
>> fact they only need < 64 characters to be indexed.  In that case you
>> don't want to use TINY

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

Still working through a full response but wanted to point something
important out about something you say below

On 07/25/2017 03:27 PM, Octave J. Orgeron wrote:

If we really want to remove the
complexity here, why don't we just change the sizes and types on these
handful of table columns so that they fit within both InnoDB and NDB?

Keep in mind that, unfortunately, the choice of string field lengths on
the underlying database columns many times are directly exposed via
corresponding JSONSchema objects for REST API endpoints.

Here's an example:

The POST /servers endpoint accepts a "metadata" field in the JSON
request body. This request body is constrained using a JSONSchema
validation system. The JSONSchema for this little field is here:

https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/schemas/servers.py#L36

which points to here:

https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L333-L341

You will note that the above schema has a maxLength attribute of 255.
This means that metadata key/value pairs are limited by the API to a
length of 255. If we were to, say, change that value to something
different just for NDB, we'd need to make an adjustment to the public
REST API validation.

This is just one concern with this approach.

Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" 
wrote:

Hi Michael,

I understand that you want to abstract this completely away inside of
oslo.db. However, the reality is that making column changes based purely on
the size and type of that column, without understanding what that column is
being used for is extremely dangerous. You could end up clobbering a column
that needs a specific length for a value,

Nowhere in my example is the current length truncated.   Also, if two
distinct lengths truly must be maintained we add a field "minimum_length".

prevent

 an index from working, etc.

That's what the indexable flag would achieve.

It

 wouldn't make sense to just do global changes on a column based on the
size.

This seems to be what your patches are doing, however.

There are far more tables that fit in both InnoDB and NDB already than
those that don't. As I've stated many times before, the columns that I make
changes to are evaluated to understand:

1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one makes the
most sense to adjust?

I don't see a way of automating that and making it maintainable without a
lot more overhead in code and people.

My proposal is intended to *reduce* the great verbosity in the current
patches I see and remove the burden of every project having to be aware of
"ndb" every time a column is added.

If

 we really want to remove the complexity here, why don't we just change the
sizes and types on these handful of table columns so that they fit within
both InnoDB and NDB?

Because that requires new migrations which are a great risk and
inconvenience to projects.

That

 way we don't need these functions and the tables are exactly the same?
That would only leave us with the createtable, savepoint/rollback, etc.
stuff to address which is already taken care of in the ndb module in
oslo.db? Then we just fix the foreign key stuff as I've been doing, since
it has zero impact on InnoDB deployments and if anything ensures things are
consistent. That would then leave us to really focus on fixing migrations
to use oslo.db and pass the correct flags, which is a more lengthy process
than the rest of this.

I don't see the point in trying to make this stuff anymore complicated.

The proposal is to make it simpler than it is right now.

Run though every column change youve proposed and show me which ones don't
fit into my proposed ruleset.   I will add additional declarative flags to
ensure those use cases are covered.

Octave

On 7/25/2017 12:20 PM, Michael Bayer wrote:

> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>>> dbs, TINYTEXT for ndb
>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>> dbs, TEXT for ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>> VARCHAR(64) on ndb
>>>
>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>> size for ndb.
>>>
 oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
 TINYTEXT() on ndb
 oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
 most dbs, VARCHAR(64) on ndb
 oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
 oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
 oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT()
 on ndb
 oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
 dbs, VARCHAR(55) on ndb

 don't worry about implementation, can the above declaration ->
 datatype mapping work ?

 In my patch for Neutron, you'll see a lot of the AutoStringText() calls
>>> to
>>> replace exceptionally long String columns (4096, 8192, and larger).
>>>
>> MySQL supports large VARCHAR now, OK.   yeah this could be
>> String(8192, ndb_type=TEXT) as well.
>>
> OK, no, sorry each time I think of this I keep seeing the verbosity of
> imports etc. in the code, because if we had:
>
> String(80, ndb_type=TEXT)
>
> then we have to import both String and TEXT, and then what if there's
> ndb.TEXT, the code is still making an ndb-specific decision, etc.
>
> I still see that this can be mostly automated from a simple ruleset
> based on the size:
>
> length <= 64 :VARCHAR(length) on all backends
> length > 64, length <= 255:   VARCHAR(length) for most backends,
> TINYTEXT for ndb
> length > 4096:  VARCHAR(length) for most backends, TEXT for ndb
>
> the one case that seems outside of this is:
>
> String(255)  where they have an index or key on the VARCHAR, and in
> fact they only need < 64 characters to be indexed.  In that case you
> don't want to use TINYTEXT, right?   So one exception:
>
> oslo_db.sqlalchemy.types.String(255, indexable=True)
>
> e.g. a declarative hint to the

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-25 Thread Octave J. Orgeron


Hi Michael,

I understand that you want to abstract this completely away inside of 
oslo.db. However, the reality is that making column changes based purely 
on the size and type of that column, without understanding what that 
column is being used for is extremely dangerous. You could end up 
clobbering a column that needs a specific length for a value, prevent an 
index from working, etc. It wouldn't make sense to just do global 
changes on a column based on the size. There are far more tables that 
fit in both InnoDB and NDB already than those that don't. As I've stated 
many times before, the columns that I make changes to are evaluated to 
understand:


1. What populates it?
2. Who consumes it?
3. What are the possible values and required lengths?
4. What is the impact of changing the size or type?
5. Evaluated against the other columns in the table, which one makes the 
most sense to adjust?


I don't see a way of automating that and making it maintainable without 
a lot more overhead in code and people. If we really want to remove the 
complexity here, why don't we just change the sizes and types on these 
handful of table columns so that they fit within both InnoDB and NDB? 
That way we don't need these functions and the tables are exactly the 
same? That would only leave us with the createtable, savepoint/rollback, 
etc. stuff to address which is already taken care of in the ndb module 
in oslo.db? Then we just fix the foreign key stuff as I've been doing, 
since it has zero impact on InnoDB deployments and if anything ensures 
things are consistent. That would then leave us to really focus on 
fixing migrations to use oslo.db and pass the correct flags, which is a 
more lengthy process than the rest of this.


I don't see the point in trying to make this stuff anymore complicated.

Octave

On 7/25/2017 12:20 PM, Michael Bayer wrote:

On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:

oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
VARCHAR(64) on ndb

This way, we can override the String with TINYTEXT or TEXT or change the
size for ndb.

oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
TINYTEXT() on ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
most dbs, VARCHAR(64) on ndb
oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT()
on ndb
oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
dbs, VARCHAR(55) on ndb

don't worry about implementation, can the above declaration ->
datatype mapping work ?



In my patch for Neutron, you'll see a lot of the AutoStringText() calls to
replace exceptionally long String columns (4096, 8192, and larger).

MySQL supports large VARCHAR now, OK.   yeah this could be
String(8192, ndb_type=TEXT) as well.

OK, no, sorry each time I think of this I keep seeing the verbosity of
imports etc. in the code, because if we had:

String(80, ndb_type=TEXT)

then we have to import both String and TEXT, and then what if there's
ndb.TEXT, the code is still making an ndb-specific decision, etc.

I still see that this can be mostly automated from a simple ruleset
based on the size:

length <= 64 :VARCHAR(length) on all backends
length > 64, length <= 255:   VARCHAR(length) for most backends,
TINYTEXT for ndb
length > 4096:  VARCHAR(length) for most backends, TEXT for ndb

the one case that seems outside of this is:

String(255)  where they have an index or key on the VARCHAR, and in
fact they only need < 64 characters to be indexed.  In that case you
don't want to use TINYTEXT, right?   So one exception:

oslo_db.sqlalchemy.types.String(255, indexable=True)

e.g. a declarative hint to the oslo_db backend to not use a LOB type.

then we just need oslo_db.sqlalchemy.types.String, and virtually
nothing except the import has to change, and a few keywords.

What we're trying to do in oslo_db is as much as possible state the
intent of a structure or datatype declaratively, and leave as much of
the implementation up to oslo_db itself.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-25 Thread Michael Bayer

On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer  wrote:
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>> dbs, TINYTEXT for ndb
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>> dbs, TEXT for ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>> VARCHAR(64) on ndb
>>
>> This way, we can override the String with TINYTEXT or TEXT or change the
>> size for ndb.
>
>>>
>>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>>> TINYTEXT() on ndb
>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>>> most dbs, VARCHAR(64) on ndb
>>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT()
>>> on ndb
>>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
>>> dbs, VARCHAR(55) on ndb
>>>
>>> don't worry about implementation, can the above declaration ->
>>> datatype mapping work ?
>>>
>>>
>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls to
>> replace exceptionally long String columns (4096, 8192, and larger).
>
> MySQL supports large VARCHAR now, OK.   yeah this could be
> String(8192, ndb_type=TEXT) as well.

OK, no, sorry each time I think of this I keep seeing the verbosity of
imports etc. in the code, because if we had:

String(80, ndb_type=TEXT)

then we have to import both String and TEXT, and then what if there's
ndb.TEXT, the code is still making an ndb-specific decision, etc.

I still see that this can be mostly automated from a simple ruleset
based on the size:

length <= 64 :VARCHAR(length) on all backends
length > 64, length <= 255:   VARCHAR(length) for most backends,
TINYTEXT for ndb
length > 4096:  VARCHAR(length) for most backends, TEXT for ndb

the one case that seems outside of this is:

String(255)  where they have an index or key on the VARCHAR, and in
fact they only need < 64 characters to be indexed.  In that case you
don't want to use TINYTEXT, right?   So one exception:

oslo_db.sqlalchemy.types.String(255, indexable=True)

e.g. a declarative hint to the oslo_db backend to not use a LOB type.

then we just need oslo_db.sqlalchemy.types.String, and virtually
nothing except the import has to change, and a few keywords.

What we're trying to do in oslo_db is as much as possible state the
intent of a structure or datatype declaratively, and leave as much of
the implementation up to oslo_db itself.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-25 Thread Octave J. Orgeron

Hi Graham,

Comments below..

On 7/25/2017 5:04 AM, Graham Hayes wrote:

On 24/07/17 20:37, Octave J. Orgeron wrote:

Rather than having all the projects make use of
oslo_db.sqlalchemy.ndb.AutoStringTinyText / AutoStringSize, we add new
generic types to oslo.db :

oslo_db.sqlalchemy.types.SmallString
oslo_db.sqlalchemy.types.String

(or similar )

Internally, the ndb module would be mapping its implementation for
AutoStringTinyText and AutoStringSize to these types. Functionality
would be identical, just the naming convention exported to downstream
consuming projects would no longer refer to "ndb." for
datatypes.

I think this would make sense.

AutoStringSize, you pass two parameters, one being the non-NDB size and
the second being the NDB size. The point here is where you need to
reduce the size of the column to fit within the NDB limits, but you want
to preserve the String varchar type because it might be used in a key,
index, etc. I only use these in cases where the impacts are very low..
for example where a column is used for keeping track of status (up,
down, active, inactive, etc.) that don't require 255 varchars.

In many cases, the use of these could be removed by simply changing the
columns to more appropriate types and sizes. There is a tremendous
amount of wasted space in many of the databases. I'm more than willing
to help out with this if teams decide they would rather do that instead
as the long-term solution. Until then, these functions enable the use of
both with minimal impact.

Another thing to keep in mind is that the only services that I've had to
adjust column sizes for are:

Cinder
Neutron
Nova
Magnum

The other services that I'm working on like Keystone, Barbican, Murano,
Glance, etc. only need changes to:

1. Ensure that foreign keys are dropped and created in the correct order
when changing things like indexes, constraints, etc. Many services do
these proper steps already, there are just cases where this has been
missed because InnoDB is very forgiving on this. But other databases are
not.
2. Fixing the database migration and sync operations to use oslo.db,
pass the right parameters, etc. Something that should have been done in
the first place, but hasn't. So this more of a house cleaning step to
insure that services are using oslo.db correctly.

The only other oddball use case is deal with disabling nested
transactions, where Neutron is the only one that does this.

On the flip side, here is a short list of services that I haven't had to
make ANY changes for other than having oslo.db 4.24 or above:

aodh
gnocchi
heat
ironic
manila

Which projects are you looking at?

If it's covered by the kolla framework, it's on the list :)

I see a list of String(255)'s changed to one type or the other without
any clear notion why one would use one or the other. Having names
that define simply the declared nature of the type would be most
appropriate.

One has to look at what the column is being used for and decide what
appropriate remediation steps are. This takes time and one must research
what kind of data goes in the column, what puts it there, what consumes
it, and what remediation would have the least amount of impact.

I have been out of the loop for a while - but I thought we were
settling on one database, (MySQL over pgSQL) to ensure that we
no longer had to have weird conditionals in our database layers and
migrations?

Is this something that someone is willing to commit to maintaining for
all projects?

I am just concerned we are adding in more custom code just as we are
trying to indicate that we moving to MySQL (which I understand as MySQL
like DB using an InnoDB based engine e.g. Maria, MySQL, Percona)[1]

- Graham

MySQL Cluster is a MySQL database. It uses a different storage engine
and clustering framework. You can read about the benefits and
differences here:

https://review.openstack.org/#/c/429940/3/specs/mysql-cluster-support.rst

Oracle is committed to maintaining these patches, because our OpenStack
distribution uses MySQL Cluster out of the box and has since version 2.
Aside from services like Cinder, Neutron, and Nova, the impact of
supporting MySQL Cluster is proving to be minimal. Even for Cinder, I
only had to touch one table to make it work. The list of services that
require nothing more than oslo.db 4.24 or above is increasing. Really
the bigger issue that I'm seeing at this point are services that don't
make proper use of oslo.db for their migrations.

So I want to make sure people understand that this isn't dumping a ton
of custom code on every service. With the ndb module in oslo.db, a lot
of the logic is abstracted. All that remains is to deal with are a few
database tables, handling

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-25 Thread Graham Hayes

On 24/07/17 20:37, Octave J. Orgeron wrote:



>>
>> Rather than having all the projects make use of
>> oslo_db.sqlalchemy.ndb.AutoStringTinyText / AutoStringSize, we add new
>> generic types to oslo.db :
>>
>> oslo_db.sqlalchemy.types.SmallString
>> oslo_db.sqlalchemy.types.String
>>
>> (or similar )
>>
>> Internally, the ndb module would be mapping its implementation for
>> AutoStringTinyText and AutoStringSize to these types.   Functionality
>> would be identical, just the naming convention exported to downstream
>> consuming projects would no longer refer to "ndb." for
>> datatypes.
> 
> I think this would make sense.
> 
>>
> 

> 
> AutoStringSize, you pass two parameters, one being the non-NDB size and
> the second being the NDB size. The point here is where you need to
> reduce the size of the column to fit within the NDB limits, but you want
> to preserve the String varchar type because it might be used in a key,
> index, etc. I only use these in cases where the impacts are very low..
> for example where a column is used for keeping track of status (up,
> down, active, inactive, etc.) that don't require 255 varchars.
> 
> In many cases, the use of these could be removed by simply changing the
> columns to more appropriate types and sizes. There is a tremendous
> amount of wasted space in many of the databases. I'm more than willing
> to help out with this if teams decide they would rather do that instead
> as the long-term solution. Until then, these functions enable the use of
> both with minimal impact.
> 
> Another thing to keep in mind is that the only services that I've had to
> adjust column sizes for are:
> 
> Cinder
> Neutron
> Nova
> Magnum
> 
> The other services that I'm working on like Keystone, Barbican, Murano,
> Glance, etc. only need changes to:
> 
> 1. Ensure that foreign keys are dropped and created in the correct order
> when changing things like indexes, constraints, etc. Many services do
> these proper steps already, there are just cases where this has been
> missed because InnoDB is very forgiving on this. But other databases are
> not.
> 2. Fixing the database migration and sync operations to use oslo.db,
> pass the right parameters, etc. Something that should have been done in
> the first place, but hasn't. So this more of a house cleaning step to
> insure that services are using oslo.db correctly.
> 
> The only other oddball use case is deal with disabling nested
> transactions, where Neutron is the only one that does this.
> 
> On the flip side, here is a short list of services that I haven't had to
> make ANY changes for other than having oslo.db 4.24 or above:
> 
> aodh
> gnocchi
> heat
> ironic
> manila

Which projects are you looking at?

>>
>> 3. it's not clear (I don't even know right now by looking at these
>> reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
>> For example in
>> https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
>>
>> I see a list of String(255)'s changed to one type or the other without
>> any clear notion why one would use one or the other.  Having names
>> that define simply the declared nature of the type would be most
>> appropriate.
> 
> One has to look at what the column is being used for and decide what
> appropriate remediation steps are. This takes time and one must research
> what kind of data goes in the column, what puts it there, what consumes
> it, and what remediation would have the least amount of impact.
> 

I have been out of the loop for a while - but I thought we were
settling on one database, (MySQL over pgSQL) to ensure that we
no longer had to have weird conditionals in our database layers and
migrations?

Is this something that someone is willing to commit to maintaining for
all projects?

I am just concerned we are adding in more custom code just as we are
trying to indicate that we moving to MySQL (which I understand as MySQL
like DB using an InnoDB based engine e.g. Maria, MySQL, Percona)[1]

- Graham

1 - Thinking about it - should https://review.openstack.org/#/c/427880
refer to InnoDB vs just MySQL ?


0x23BA8E2E.asc
Description: application/pgp-keys


signature.asc
Description: OpenPGP digital signature
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-24 Thread Michael Bayer

On Mon, Jul 24, 2017 at 5:10 PM, Octave J. Orgeron
 wrote:
> I don't think it makes sense to make these global. We don't need to change
> all occurrences of String(255) to TinyText for example. We make that
> determination through understanding the table structure and usage. But I do
> like the idea of changing the second option to ndb_size=, I think that makes
> things very clear. If you want to collapse the use cases.. what about?:
>
> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
> dbs, TINYTEXT for ndb
> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
> dbs, TEXT for ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
> VARCHAR(64) on ndb
>
> This way, we can override the String with TINYTEXT or TEXT or change the
> size for ndb.

OK.   See, originally when I was pushing for an ndb "dialect", that
hook lets us say String(255).with_variant(TEXT, "ndb") which is what I
was going for originally.  However, since we went with a special flag
and not a dialect, using ndb_type / ndb_size is *probably* fine.


>
>>
>> oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
>> TINYTEXT() on ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
>> most dbs, VARCHAR(64) on ndb
>> oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
>> oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
>> oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT()
>> on ndb
>> oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
>> dbs, VARCHAR(55) on ndb
>>
>> don't worry about implementation, can the above declaration ->
>> datatype mapping work ?
>>
>> Also where are we using AutoStringText(), it sounds like this is just
>> what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
>> string type, comes out as CLOB etc).
>>
> In my patch for Neutron, you'll see a lot of the AutoStringText() calls to
> replace exceptionally long String columns (4096, 8192, and larger).

MySQL supports large VARCHAR now, OK.   yeah this could be
String(8192, ndb_type=TEXT) as well.


>
>
>
>
>>
>>
>>> In many cases, the use of these could be removed by simply changing the
>>> columns to more appropriate types and sizes. There is a tremendous amount
>>> of
>>> wasted space in many of the databases. I'm more than willing to help out
>>> with this if teams decide they would rather do that instead as the
>>> long-term
>>> solution. Until then, these functions enable the use of both with minimal
>>> impact.
>>>
>>> Another thing to keep in mind is that the only services that I've had to
>>> adjust column sizes for are:
>>>
>>> Cinder
>>> Neutron
>>> Nova
>>> Magnum
>>>
>>> The other services that I'm working on like Keystone, Barbican, Murano,
>>> Glance, etc. only need changes to:
>>>
>>> 1. Ensure that foreign keys are dropped and created in the correct order
>>> when changing things like indexes, constraints, etc. Many services do
>>> these
>>> proper steps already, there are just cases where this has been missed
>>> because InnoDB is very forgiving on this. But other databases are not.
>>> 2. Fixing the database migration and sync operations to use oslo.db, pass
>>> the right parameters, etc. Something that should have been done in the
>>> first
>>> place, but hasn't. So this more of a house cleaning step to insure that
>>> services are using oslo.db correctly.
>>>
>>> The only other oddball use case is deal with disabling nested
>>> transactions,
>>> where Neutron is the only one that does this.
>>>
>>> On the flip side, here is a short list of services that I haven't had to
>>> make ANY changes for other than having oslo.db 4.24 or above:
>>>
>>> aodh
>>> gnocchi
>>> heat
>>> ironic
>>> manila
>>>
 3. it's not clear (I don't even know right now by looking at these
 reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
 For example in

 https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
 I see a list of String(255)'s changed to one type or the other without
 any clear notion why one would use one or the other.  Having names
 that define simply the declared nature of the type would be most
 appropriate.
>>>
>>>
>>> One has to look at what the column is being used for and decide what
>>> appropriate remediation steps are. This takes time and one must research
>>> what kind of data goes in the column, what puts it there, what consumes
>>> it,
>>> and what remediation would have the least amount of impact.
>>>
 I can add these names up to oslo.db and then we would just need to
 spread these out through all the open ndb reviews and then also patch
 up Cinder which seems to be the only ndb implementation that's been
 merged so far.

 Keep in mind this is really me trying to correct my own mistake, as I
 helped design and approved of the original approach here where

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Michael,

Comments below..

On 7/24/2017 2:49 PM, Michael Bayer wrote:

On Mon, Jul 24, 2017 at 3:37 PM, Octave J. Orgeron
 wrote:

For these, here is a brief synopsis:

AutoStringTinyText, will convert a column to the TinyText type. This is used
for cases where a 255 varchar string needs to be converted to a text blob to
make the row fit within the NDB limits. If you are using ndb, it'll convert
it to TinyText, otherwise it leaves it alone. The reason that TinyText type
was chosen is because it'll hold the same 255 varchars and saves on space.

AutoStringText, does the same as the above, but converts the type to Text
and is meant for use cases where you need more than 255 varchar worth of
space. Good examples of these uses are where outputs of hypervisor and OVS
commands are dumped into the database.

AutoStringSize, you pass two parameters, one being the non-NDB size and the
second being the NDB size. The point here is where you need to reduce the
size of the column to fit within the NDB limits, but you want to preserve
the String varchar type because it might be used in a key, index, etc. I
only use these in cases where the impacts are very low.. for example where a
column is used for keeping track of status (up, down, active, inactive,
etc.) that don't require 255 varchars.

Can the "auto" that is supplied by AutoStringTinyText and
AutoStringSize be merged?


I don't think it makes sense to make these global. We don't need to 
change all occurrences of String(255) to TinyText for example. We make 
that determination through understanding the table structure and usage. 
But I do like the idea of changing the second option to ndb_size=, I 
think that makes things very clear. If you want to collapse the use 
cases.. what about?:


oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for 
most dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most 
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs, 
VARCHAR(64) on ndb


This way, we can override the String with TINYTEXT or TEXT or change the 
size for ndb.




oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
TINYTEXT() on ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
most dbs, VARCHAR(64) on ndb
oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT() on ndb
oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
dbs, VARCHAR(55) on ndb

don't worry about implementation, can the above declaration ->
datatype mapping work ?

Also where are we using AutoStringText(), it sounds like this is just
what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
string type, comes out as CLOB etc).

In my patch for Neutron, you'll see a lot of the AutoStringText() calls 
to replace exceptionally long String columns (4096, 8192, and larger).








In many cases, the use of these could be removed by simply changing the
columns to more appropriate types and sizes. There is a tremendous amount of
wasted space in many of the databases. I'm more than willing to help out
with this if teams decide they would rather do that instead as the long-term
solution. Until then, these functions enable the use of both with minimal
impact.

Another thing to keep in mind is that the only services that I've had to
adjust column sizes for are:

Cinder
Neutron
Nova
Magnum

The other services that I'm working on like Keystone, Barbican, Murano,
Glance, etc. only need changes to:

1. Ensure that foreign keys are dropped and created in the correct order
when changing things like indexes, constraints, etc. Many services do these
proper steps already, there are just cases where this has been missed
because InnoDB is very forgiving on this. But other databases are not.
2. Fixing the database migration and sync operations to use oslo.db, pass
the right parameters, etc. Something that should have been done in the first
place, but hasn't. So this more of a house cleaning step to insure that
services are using oslo.db correctly.

The only other oddball use case is deal with disabling nested transactions,
where Neutron is the only one that does this.

On the flip side, here is a short list of services that I haven't had to
make ANY changes for other than having oslo.db 4.24 or above:

aodh
gnocchi
heat
ironic
manila


3. it's not clear (I don't even know right now by looking at these
reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
For example in
https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
I see a list of String(255)'s changed to one type or the other without
any clear notion why one would use one or the other.  Having names
that define simply the declared nature of the type would be most
appropriate.


One has to look at what the column i

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

2017-07-24 Thread Michael Bayer

On Mon, Jul 24, 2017 at 3:37 PM, Octave J. Orgeron
 wrote:
> For these, here is a brief synopsis:
>
> AutoStringTinyText, will convert a column to the TinyText type. This is used
> for cases where a 255 varchar string needs to be converted to a text blob to
> make the row fit within the NDB limits. If you are using ndb, it'll convert
> it to TinyText, otherwise it leaves it alone. The reason that TinyText type
> was chosen is because it'll hold the same 255 varchars and saves on space.
>
> AutoStringText, does the same as the above, but converts the type to Text
> and is meant for use cases where you need more than 255 varchar worth of
> space. Good examples of these uses are where outputs of hypervisor and OVS
> commands are dumped into the database.
>
> AutoStringSize, you pass two parameters, one being the non-NDB size and the
> second being the NDB size. The point here is where you need to reduce the
> size of the column to fit within the NDB limits, but you want to preserve
> the String varchar type because it might be used in a key, index, etc. I
> only use these in cases where the impacts are very low.. for example where a
> column is used for keeping track of status (up, down, active, inactive,
> etc.) that don't require 255 varchars.

Can the "auto" that is supplied by AutoStringTinyText and
AutoStringSize be merged?


oslo_db.sqlalchemy.String(255) -> VARCHAR(255) on most dbs,
TINYTEXT() on ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on
most dbs, VARCHAR(64) on ndb
oslo_db.sqlalchemy.String(50) -> VARCHAR(50) on all dbs
oslo_db.sqlalchemy.String(64) -> VARCHAR(64) on all dbs
oslo_db.sqlalchemy.String(80) -> VARCHAR(64) on most dbs, TINYTEXT() on ndb
oslo_db.sqlalchemy.String(80, ndb_size=55) -> VARCHAR(64) on most
dbs, VARCHAR(55) on ndb

don't worry about implementation, can the above declaration ->
datatype mapping work ?

Also where are we using AutoStringText(), it sounds like this is just
what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
string type, comes out as CLOB etc).




>
> In many cases, the use of these could be removed by simply changing the
> columns to more appropriate types and sizes. There is a tremendous amount of
> wasted space in many of the databases. I'm more than willing to help out
> with this if teams decide they would rather do that instead as the long-term
> solution. Until then, these functions enable the use of both with minimal
> impact.
>
> Another thing to keep in mind is that the only services that I've had to
> adjust column sizes for are:
>
> Cinder
> Neutron
> Nova
> Magnum
>
> The other services that I'm working on like Keystone, Barbican, Murano,
> Glance, etc. only need changes to:
>
> 1. Ensure that foreign keys are dropped and created in the correct order
> when changing things like indexes, constraints, etc. Many services do these
> proper steps already, there are just cases where this has been missed
> because InnoDB is very forgiving on this. But other databases are not.
> 2. Fixing the database migration and sync operations to use oslo.db, pass
> the right parameters, etc. Something that should have been done in the first
> place, but hasn't. So this more of a house cleaning step to insure that
> services are using oslo.db correctly.
>
> The only other oddball use case is deal with disabling nested transactions,
> where Neutron is the only one that does this.
>
> On the flip side, here is a short list of services that I haven't had to
> make ANY changes for other than having oslo.db 4.24 or above:
>
> aodh
> gnocchi
> heat
> ironic
> manila
>
>>
>> 3. it's not clear (I don't even know right now by looking at these
>> reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
>> For example in
>> https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
>> I see a list of String(255)'s changed to one type or the other without
>> any clear notion why one would use one or the other.  Having names
>> that define simply the declared nature of the type would be most
>> appropriate.
>
>
> One has to look at what the column is being used for and decide what
> appropriate remediation steps are. This takes time and one must research
> what kind of data goes in the column, what puts it there, what consumes it,
> and what remediation would have the least amount of impact.
>
>>
>> I can add these names up to oslo.db and then we would just need to
>> spread these out through all the open ndb reviews and then also patch
>> up Cinder which seems to be the only ndb implementation that's been
>> merged so far.
>>
>> Keep in mind this is really me trying to correct my own mistake, as I
>> helped design and approved of the original approach here where
>> projects would be consuming against the "ndb." namespace.  However,
>> after seeing it in reviews how prevalent the use of this extremely
>> backend-specific name is, I think the use of the name should be much

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

Hi Michael,

Comments below..

On 7/24/2017 9:13 AM, Michael Bayer wrote:

On Mon, Jul 24, 2017 at 10:01 AM, Jay Pipes wrote:

I would much prefer to *add* a brand new schema migration that handles
conversion of the entire InnoDB schema at a certain point to an
NDB-compatible one *after* that point. That way, we isolate the NDB changes
to one specific schema migration -- and can point users to that one specific
migration in case bugs arise. This is the reason that every release we add a
number of "placeholder" schema migration numbered files to handle situations
such as these.

I understand that Oracle wants to support older versions of OpenStack in
their distribution and that's totally cool with me. But, the proper way IMHO
to do this kind of thing is to take one of the placeholder migrations and
use that as the NDB-conversion migration. I would posit that since Oracle
will need to keep some not-insignificant amount of Python code in their
distribution fork of Nova in order to bring in the oslo.db and Nova NDB
support, that it will actually be *easier* for them to maintain a *separate*
placeholder schema migration for all NDB conversion work instead of changing
an existing schema migration with a new patch.

OK, if it is feasible for the MySQL engine to build out the whole
schema as InnoDB and then do a migrate that changes the storage engine
of all tables to NDB and then also changes all the datatypes, that can
work. If you want to go that way, then fine.

Unfortunately, to do that, you'd have to drop all of the constraints,
foreign keys, and probably indexes before doing a change to table type.
Then go back and put them in all into place. You also have to deal with
changing your NDB cluster configuration to force all of the traffic to a
single node since InnoDB tables are not replicated across an NDB
cluster. So this is a lot more overhead for operators and introduces
greater risks.

However, I may be missing something but I'm not seeing the practical
difference. This new "ndb" migration still goes into the source
tree, still gets invoked for all users, and if the "if ndb_enabled()"
flag is somehow broken, it breaks just as well if it's in a brand new
migration vs. if it's in an old migration.

Suppose "if ndb_enabled(engine)" is somehow broken. Either it crashes
the migrations, or it runs inappropriately.

If the conditional is in a brand new migration file that's pushed out
in Queens, *everybody* runs it when they upgrade, as well as when they
do fresh installation, and they get the breakage.

if the conditional is in havana 216, *everybody* gets it when they do
a fresh installation, and they get the breakage. Upgraders do not.

How is "new migration" better than "make old migration compatible" ?

Again, fine by me if the other approach works, I'm just trying to see
where I'm being dense here.

Keep in mind that existing migrations *do* break and have to be fixed
- because while the migration files don't change, the databases they
talk to do. The other thread I introduced about Mariadb 10.2 now
refusing to DROP columns that have a CHECK constraint is an example,
and will likely mean lots of old migration files across openstack
projects will need adjustments.

Exactly! I've seen plenty of cases where these scripts have been patched
to fix problems that crop up in later migrations. So doing these changes
is not that alien to OpenStack, even for Nova:

http://git.openstack.org/cgit/openstack/nova/log/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py

Also, another good point that everyone should be working on fixing is
that with in MySQL 5.7.x you'll get warnings about duplicate keys,
indexes, constraints, etc. that WILL NOT be supported in a future
release. So these scripts have to be patched or MySQL support for these
databases will be broken in the not so distant future.

All the best,
-jay

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects


Hi Jay,

Comments below..

On 7/24/2017 8:01 AM, Jay Pipes wrote:

+Dan Smith

Good morning Mike :) Comments inline...

On 07/23/2017 08:05 PM, Michael Bayer wrote:

On Sun, Jul 23, 2017 at 6:10 PM, Jay Pipes  wrote:
Glad you brought this up, Mike. I was going to start a thread about 
this.

Comments inline.

On 07/23/2017 05:02 PM, Michael Bayer wrote:
Well, besides that point (which I agree with), that is attempting to 
change

an existing database schema migration, which is a no-no in my book ;)


OK this point has come up before and I always think there's a bit of
an over-broad kind of purism being applied (kind of like when someone
says, "never use global variables ever!" and I say, "OK so sys.modules
is out right ?" :)  ).


I'm not being a purist. I'm being a realist :) See below...


I agree with "never change a migration" to the extent that you should
never change the *SQL steps emitted for a database migration*. That
is, if your database migration emitted "ALTER TABLE foobar foobar
foobar" on a certain target databse, that should never change. No
problem there.

However, what we're doing here is adding new datatype logic for the
NDB backend which are necessarily different; not to mention that NDB
requires more manipulation of constraints to make certain changes
happen.  To make all that work, the *Python code that emits the SQL
for the migration* needs to have changes made, mostly (I would say
completely) in the form of new conditionals for NDB-specific concepts.
In the case of the datatypes, the migrations will need to refer to
a SQLAlchemy type object that's been injected with the ndb-specific
logic when the NDB backend is present; I've made sure that when the
NDB backend is *not* present, the datatypes behave exactly the same as
before.


No disagreement here.


So basically, *SQL steps do not change*, but *Python code that emits
the SQL steps* necessarily has to change to accomodate for when the
"ndb" flag is present - this because these migrations have to run on
brand new ndb installations in order to create the database. If Nova
and others did the initial "create database" without using the
migration files and instead used a create_all(), things would be
different, but that's not how things work (and also it is fine that
the migrations are used to build up the DB).


So, I see your point here, but my concern here is that if we *modify* 
an existing schema migration that has already been tested to properly 
apply a schema change for MySQL/InnoDB and PostgreSQL with code that 
is specific to NDB, we introduce the potential for bugs where users 
report that the same migration works sometimes but fails other times.


I don't think that the testing issues should be a concern here because 
I've been working to make sure that the tests work with both InnoDB and 
NDB. It's a pain, but again, we are only talking about a handful of the 
services. Bottom line here is that if you are not using NDB, the changes 
have zero effect on your setup.




I would much prefer to *add* a brand new schema migration that handles 
conversion of the entire InnoDB schema at a certain point to an 
NDB-compatible one *after* that point. That way, we isolate the NDB 
changes to one specific schema migration -- and can point users to 
that one specific migration in case bugs arise. This is the reason 
that every release we add a number of "placeholder" schema migration 
numbered files to handle situations such as these.


The only problem with this approach is that it assumes you are on InnoDB 
to start out with, which is not the use case here. This is for new 
installations or ones that started out with NDB, so we have to start out 
with the base schema in the scripts working.


I understand that Oracle wants to support older versions of OpenStack 
in their distribution and that's totally cool with me. But, the proper 
way IMHO to do this kind of thing is to take one of the placeholder 
migrations and use that as the NDB-conversion migration. I would posit 
that since Oracle will need to keep some not-insignificant amount of 
Python code in their distribution fork of Nova in order to bring in 
the oslo.db and Nova NDB support, that it will actually be *easier* 
for them to maintain a *separate* placeholder schema migration for all 
NDB conversion work instead of changing an existing schema migration 
with a new patch.


And this is the whole point of the work that I'm doing. Getting upstream 
so that others can benefit and so that we don't have to waste cycles 
maintaining custom code. Instead, we do all of the work upstream and 
that will enable our customers to more easily upgrade from one release 
to another. FYI, we have been using NDB since version 2 of our product. 
We are working on version 4 right now.




All the best,
-jay

__ 


OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 
openstack-dev-requ...@lists.open

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

Hi Jay,

Comments below..

Thanks,
Octave

On 7/23/2017 4:10 PM, Jay Pipes wrote:
Glad you brought this up, Mike. I was going to start a thread about
this. Comments inline.

On 07/23/2017 05:02 PM, Michael Bayer wrote:

I've been working with Octave Oregon in assisting with new rules and
datatypes that would allow projects to support the NDB storage engine
with MySQL.

To that end, we've made changes to oslo.db in [1] to support this, and
there are now a bunch of proposals such as [2] [3] to implement new
ndb-specific structures in projects.

The reviews for all downstream projects except Cinder are still under
review. While we have a chance to avoid a future naming problem, I am
making the following proposal:

Rather than having all the projects make use of
oslo_db.sqlalchemy.ndb.AutoStringTinyText / AutoStringSize, we add new
generic types to oslo.db :

oslo_db.sqlalchemy.types.SmallString
oslo_db.sqlalchemy.types.String

This is precisely what I was going to suggest because I was not going
to go along with the whole injection of NDB-name-specific column types
in Nova. :)

(or similar )

Reasons for doing so include:

Right, my thoughts exactly.

if IBM wanted Openstack to run on DB2 (again?) and wanted to add a
"db2.String" implementation to oslo.db for example, the naming and
datatypes would need to be opened up as above in any case; might as
well make the change now before the patch sets are merged.

Yep.

2. The names "AutoStringTinyText" and "AutoStringSize" themselves are
confusing and inconsistent w/ each other (e.g. what is "auto"? one is
"auto" if its String or TinyText and the other is "auto" if its
String, and..."size"?)

Yes. Oh God yes. The MySQL TINY/MEDIUM/BIG [INT|TEXT] data types were
always entirely irrational and confusing. No need to perpetuate that
terminology.

FYI, the TINYTEXT is part of the MySQL syntax and dialect. So it's not
alien to MySQL folks.

Well, besides that point (which I agree with), that is attempting to
change an existing database schema migration, which is a no-no in my
book ;)

Unfortunately, if we don't modify the scripts, we can't create the
schemas on the NDB database. Tables have to fit in the row limits. So
unless we have a way to override the scripts, we have to modify them.

I can add these names up to oslo.db and then we would just need to
spread these out through all the open ndb reviews and then also patch
up Cinder which seems to be the only ndb implementation that's been
merged so far.

Keep in mind this is really me trying to correct my own mistake, as I
helped design and approved of the original approach here where
projects would be consuming against the "ndb." namespace. However,
after seeing it in reviews how prevalent the use of this extremely
backend-specific name is, I think the use of the name should be much
less frequent throughout projects and only surrounding logic that is
purely to do with the ndb backend and no others. At the datatype
level, the chance of future naming conflicts is very high and we
should fix this mistake (my mistake) before it gets committed
throughout many downstream projects.

I had a private conversation with Octave on Friday. I had mentioned
that I was upset I didn't know about the series of patches to oslo.db
that added that module. I would certainly have argued against that
approach. Please consider hitting me with a cluestick next time
something of this nature pops up. :)

Also, as I told Octave, I have no problem whatsoever with NDB Cluster.
I actually think it's a pretty brilliant piece of engineering -- and
have for over a decade since I worked at MySQL.

My complaint regarding the code patch proposed to Nova was around the
hard-coding of the ndb namespace into the model definitions.

Best,
-jay

[1] https://review.openstack.org/#/c/427970/

[2] https://review.openstack.org/#/c/446643/

[3] https://review.o

Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects