Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-11 Thread Mark McLoughlin
On Wed, 2013-07-10 at 14:14 -0600, John Griffith wrote:

 
 Given that Cinder doesn't have anybody actively engaged in this other
 than what's being proposed and worked on by Boris and folks, we'd be a
 willing candidate for most of these changes, particularly if they're
 accepted in Nova to begin with.
 
 
 On the question of having it in oslo-incubator or not: I think that's
 ultimately likely to be the best thing, but as is evident from this
 thread there are a number of things that will have to be sorted out
 before that happens, and I'm not convinced that "move things to Oslo
 first, then fix" is the right answer.  In my opinion things should be
 pretty solid before they go into the Oslo repo, but that's just my
 2 cents.
 
 
 As is evident from the approval of the BPs in Cinder and the reviews on
 the patches submitted thus far, Cinder is fine with going in the
 direction/implementations proposed by Boris.  I would like to see the
 debate around the archiving strategy and the use of alembic settled,
 but regardless, on the Cinder side I would like to move forward and
 make progress.  As there's no other real effort to improve the DB code
 in Cinder (which I think is needed and very valuable), I'm fine with
 most of what's being proposed.

My conclusion from that (admittedly based on limited understanding)
would be that everything Boris is proposing makes sense to copy from
Nova to oslo-incubator so Cinder can re-use it, with the exception of
the DB archiving strategy.

i.e. we'd improve Nova's DB archiving strategy before having Cinder
adopt it.

Cheers,
Mark.





Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-11 Thread Boris Pavlovic
Mark, John, Nikola,

Currently we would like to put only two functions into oslo:
1) a generic method for creating a shadow table
2) a generic method for checking that the columns in the shadow and main
table are the same

So the migration that adds shadow tables could be done after all the
other work, once we finish improving the db-archiving utils (which move
deleted rows to the shadow tables), to avoid the problems that Nikola
noticed.

These two functions won't be affected; they are already used in Nova and
will be used in Cinder and Glance in the future. So I don't see any
problem with pushing them into oslo at this moment.
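
For illustration, a minimal sketch of what these two helpers could look
like (names and signatures here are illustrative, not the final oslo
API):

    from sqlalchemy import MetaData, Table

    def create_shadow_table(migrate_engine, table_name, prefix='shadow_'):
        # Shadow tables deliberately copy only the columns of the
        # original table -- no indexes, FKs or unique constraints.
        meta = MetaData(bind=migrate_engine)
        table = Table(table_name, meta, autoload=True)
        columns = [column.copy() for column in table.columns]
        shadow = Table(prefix + table_name, meta, *columns)
        shadow.create(checkfirst=True)
        return shadow

    def check_shadow_table(migrate_engine, table_name, prefix='shadow_'):
        # Verify that the shadow table carries exactly the same columns
        # (by name and type) as the main table.
        meta = MetaData(bind=migrate_engine)
        table = Table(table_name, meta, autoload=True)
        shadow = Table(prefix + table_name, meta, autoload=True)
        def signature(t):
            return sorted((c.name, type(c.type).__name__) for c in t.columns)
        if signature(table) != signature(shadow):
            raise ValueError('shadow table %s is out of sync' % shadow.name)
        return True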


Best regards,
Boris Pavlovic




On Thu, Jul 11, 2013 at 11:25 AM, Mark McLoughlin mar...@redhat.com wrote:

 [snip]



Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-09 Thread Mark McLoughlin
On Mon, 2013-07-08 at 14:15 +0200, Nikola Đipanov wrote:
 On 05/07/13 14:26, Boris Pavlovic wrote:
  [snip]
 
 This is fine in principle, however I don't think we should push it
 without considering the details (where the devil is apparently).
 I am arguing that DB archiving should be re-done and is broken
 conceptually (example below), and I think it would be suboptimal (to say
 the least) to get it everywhere first and then fix it.
 
 Just saying a hand-wavy "yeah, but once it's in Oslo we can fix it" is
 wrong - especially for functionality that is younger than the time it
 will likely take it to 'graduate' from Oslo.

I'm not following this DB archiving debate closely enough to take a
position either way, but 

I think what you're really arguing is that no other project should adopt
this approach to DB archiving. I'm fine with saying that it shouldn't
move into oslo-incubator if it will only be used in Nova.

So - the debate to have is which projects are proposing to adopt this DB
archiving strategy and whether it makes sense for them to adopt it as is
and fix it up later, or adopt an entirely different approach.

Cheers,
Mark.




Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-09 Thread Boris Pavlovic
Hi Mark, Nikola, David,

Our work is not just about unification. It improves the situation in
every project (not only in Nova).


I would also like to give my opinion about DB archiving ;)

Let's start with the problem, the abstract solution, the current
solution, and why this solution is OK.

*) Problem
Records are never actually deleted from the DB, so our DB will die.
*) Abstract solution
We should somehow remove old records. I see only one solution: create
shadow tables and have utilities that are smart enough to move data in
such a way that the shadow and main tables are absolutely independent.
*) Current solution
1) Create shadow tables
2) Simple utils that move deleted records from each table to its shadow
table

*) Problems in the current solution
If we just move deleted records to the shadow table, we have to do all
the joins (as in Nikola's migration example).

So the problem is not in the shadow-table approach; the problem is that
the current utils are not smart enough.
And in oslo there is only the code that creates shadow tables and checks
that the shadow and main tables are synced.

One more nit: such migrations (like the one Nikola made) are pretty rare.

So I don't see any reason to block this DB archiving code in oslo or to
block this approach. It should be improved, not replaced.
What's more, we are ready to improve it.


Best regards,
Boris Pavlovic

On Tue, Jul 9, 2013 at 3:05 PM, Mark McLoughlin mar...@redhat.com wrote:

 [snip]




Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-08 Thread Nikola Đipanov
On 05/07/13 14:26, Boris Pavlovic wrote:
 Hi all,
 
 I would like to explain the very high-level steps of our work:
 1) Sync the DB work in all projects (we have what we have; let it be in
 one place)
 2) Refactor the DB work in one place (not independently in each project)
 
 So I understand that our code around the DB is not ideal, but let it be
 in one place at first.
 

This is fine in principle, however I don't think we should push it
without considering the details (where the devil is apparently).
I am arguing that DB archiving should be re-done and is broken
conceptually (example below), and I think it would be suboptimal (to say
the least) to get it everywhere first and then fix it.

Just saying a hand-wavy "yeah, but once it's in Oslo we can fix it" is
wrong - especially for functionality that is younger than the time it
will likely take it to 'graduate' from Oslo.

 [snip]
 
 Also there is no exponential growth of JOINs when we are using shadow
 tables:
 
 In migrations we should:
 a) Do the same actions on columns (drop, alter) in main and shadow
 b) Do the same actions on tables (create/drop/rename)
 c) Do the same actions on data in tables
 
 So you are doing actions separately on the main tables and shadow
 tables, but after the migration our tables should be synced.
 
 And it is easier to make the same actions twice, on the main and shadow
 table, in one migration than in separate migrations.
 

This is only true if you have one table with no relations that need to
be considered.

Here is an example of when it gets tricky - say you have a table T1 and
a migration that adds a column c1 that relies on some data from table
T2, where T1 has a FK that points to T2. And say, for the sake of
argument, that the objects represented by rows in T1 and T2 have
different lifetimes in the system (think instances and devices, groups,
quotas, networks... this is common in our data model).

In order to properly migrate and assign values to the newly created c1
you will need to:

* Add the column c1 to the live T1
* join on live T2 *and* shadow T2 to get the data needed and populate
the new column.
* Add the column c1 to the shadow T1
* join on live T2 *and* shadow T2 to get the data needed and populate
the new column.

Hence - exponentially more joins, as I stated in my previous email.
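
To make this concrete, here is a condensed sketch of such a migration
(table and column names are made up; it assumes sqlalchemy-migrate's
changeset extension is active, as in a normal migration script):

    from sqlalchemy import Column, Integer, MetaData, Table

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        t2 = Table('t2', meta, autoload=True)
        shadow_t2 = Table('shadow_t2', meta, autoload=True)

        for t1 in (Table('t1', meta, autoload=True),
                   Table('shadow_t1', meta, autoload=True)):
            # create_column() is added to Table by sqlalchemy-migrate
            t1.create_column(Column('c1', Integer))
            # Rows in (shadow_)t1 may reference rows living in either the
            # live t2 *or* shadow_t2, so the data fix-up has to consider
            # both sources for each of the two targets.
            for t2_source in (t2, shadow_t2):
                migrate_engine.execute(
                    t1.update()
                      .where(t1.c.t2_id == t2_source.c.id)
                      .values(c1=t2_source.c.some_value))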

Now - this was the *simplest* possible example - things get potentially
much more complicated if the new column relies on previous state of data
(say - counters of some sort), if you need to get data from a third
table (think many-to-many relationships) etc.

If you need a real example - take a look at migration 186 in the current
trunk.

As I said in the previous email, and based on the examples above - this
design decision (unconstrained rows) makes it difficult to reason about
data in the system!

I personally - as a developer working on the codebase - am not happy
making this trade-off in favour of archiving in this way - and would
like to see some design decisions changed, or at the very least a
broader consensus that the state as-is is actually OK and we don't need
to worry about it.

 -
 
 About the db_sync downtime (upgrading from one DB version to another)
 (IRC)
 
 DB archiving just helps us to reduce this time. One possible variant
 (high level):
 1) Move our deleted rows to shadow_tables

This step is, in the case of the workflow you describe here:
  1) mandatory
  2) completely defeating the purpose of unconstrained rows, if in order
to migrate we have to move *all* of them to shadow tables, which may
take a non-trivial amount of time.

 2) Copy shadow_tables from schema to tmp_schema
 3) Drop data from shadow_tables
 4) Make migrations on schema:
 a) As the shadow tables are empty, all migrations will be done really fast
 b) As our original tables have only non-deleted rows, the migration
 will also be done much faster.
 5) Run Nova
 6) 

Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-05 Thread Boris Pavlovic
Hi all,

I would like to explain the very high-level steps of our work:
1) Sync the DB work in all projects (we have what we have; let it be in
one place)
2) Refactor the DB work in one place (not independently in each project)

So I understand that our code around the DB is not ideal, but let it be
in one place at first.

--
About DB archiving.
--
Let me describe how it works, for contributors not familiar with it:

For each table (which has columns, indexes, unique constraints, FKs,
etc.) we have a shadow table that has only the columns (no indexes,
unique constraints, or FKs).

And then we have a utility that moves records that are marked as deleted
from the original table to the shadow table.

This was done by David Ripton in Nova in Grizzly.
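
In code, the core of such a utility is roughly the following (a sketch
only; the real util works per-table and in batches):

    from sqlalchemy import MetaData, Table

    def archive_deleted_rows(migrate_engine, table_name):
        meta = MetaData(bind=migrate_engine)
        table = Table(table_name, meta, autoload=True)
        shadow = Table('shadow_' + table_name, meta, autoload=True)
        deleted_rows = table.select().where(table.c.deleted != 0)
        # Copy the soft-deleted rows over, then remove them from the
        # original table.
        migrate_engine.execute(
            shadow.insert().from_select([c.name for c in table.columns],
                                        deleted_rows))
        migrate_engine.execute(table.delete().where(table.c.deleted != 0))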

-

After a few months I found that there were tons of migrations for the
original tables and no migrations for the shadow tables.
So I implemented this BP
https://blueprints.launchpad.net/nova/+spec/db-improve-archiving which
does the following:
a) syncs the shadow tables with the originals
b) adds a test that checks that:
  1) for each original table we have a shadow table
  2) we don't have extra shadow tables
  3) the shadow tables have the same columns as the originals

Why is this so important:
1) If the shadow and original tables are not synced, there are two
possible results after the archiving util is run:
  a) it will fail
  b) (worse) it will corrupt the data in the shadow table
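
A tiny demonstration of these failure modes against an in-memory sqlite
DB (table names are made up):

    from sqlalchemy import create_engine

    engine = create_engine('sqlite://')
    engine.execute('CREATE TABLE foo (id INTEGER, deleted INTEGER, name TEXT)')
    # The shadow table missed one migration and has a different shape:
    engine.execute('CREATE TABLE shadow_foo (id INTEGER, name TEXT)')
    engine.execute("INSERT INTO foo VALUES (1, 1, 'doomed')")
    # The positional copy below fails with a column-count mismatch; if
    # the column counts still happened to match (e.g. a column was
    # renamed), the values would instead land silently in the wrong
    # columns.
    engine.execute('INSERT INTO shadow_foo SELECT * FROM foo WHERE deleted != 0')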

--

Also there is no exponential growth of JOINs when we are using shadow
tables:

In migrations we should:
a) Do the same actions on columns (drop, alter) in main and shadow
b) Do the same actions on tables (create/drop/rename)
c) Do the same actions on data in tables

So you are doing actions separately on the main tables and shadow
tables, but after the migration our tables should be synced.

And it is easier to make the same actions twice, on the main and shadow
table, in one migration than in separate migrations.

-

About the db_sync downtime (upgrading from one DB version to another)
(IRC)

DB archiving just helps us to reduce this time. One possible variant
(high level):
1) Move our deleted rows to shadow_tables
2) Copy shadow_tables from schema to tmp_schema
3) Drop data from shadow_tables
4) Make migrations on schema:
a) As the shadow tables are empty, all migrations will be done really fast
b) As our original tables have only non-deleted rows, the migration will
also be done much faster.
5) Run Nova
6) Make migrations on tmp_schema
7) Copy from tmp_schema back to schema (if it is required for some reason)

So, for example, writing utilities that are able to do this will be very
useful.
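
For one table, a high-level sketch of steps 1-4 (MySQL-flavoured SQL;
the schema and table names are made up):

    from sqlalchemy import create_engine

    engine = create_engine('mysql://user:pass@host/nova')  # placeholder URL
    # 1) move soft-deleted rows out of the live table
    engine.execute('INSERT INTO shadow_foo SELECT * FROM foo WHERE deleted != 0')
    engine.execute('DELETE FROM foo WHERE deleted != 0')
    # 2) park the archived rows in a temporary schema
    engine.execute('CREATE TABLE tmp_schema.shadow_foo AS SELECT * FROM shadow_foo')
    # 3) empty the shadow table so migrations over it are instant
    engine.execute('TRUNCATE TABLE shadow_foo')
    # 4) run db_sync against the now much smaller live schema, start
    #    Nova, then migrate tmp_schema and copy rows back at leisure
    #    (steps 5-7)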
--


So, what I think about DB archiving:
it is a great thing that helps us:
1) to reduce migration downtime
2) to reduce the number of rows in the original tables and improve
performance

And I think that the test that checks that the original and shadow
tables are synced is required here.


Best regards,
Boris Pavlovic





On Fri, Jul 5, 2013 at 3:41 PM, Nikola Đipanov ndipa...@redhat.com wrote:

 On 02/07/13 19:50, Boris Pavlovic wrote:
 
*) DB Archiving
   a) create shadow tables
   b) add tests that check that the shadow and main tables are synced.
   c) add code that works with the shadow tables.
 

 Hi Boris & all,

 I have a few points regarding the db archiving work that I am growing
 more concerned about, so I thought I might mention them on this thread.
 I pointed them out ad hoc on a recent review
 https://review.openstack.org/#/c/34643/ and there is some discussion
 there already, although it was not very fruitful.

 I feel that there were a few design oversights and as a result it has a
 couple of rough edges I noticed.

 First issue is about the fact that shadow tables do not present a view
 of the world themselves but are just unconstrained rows copied from
 live tables.

 This is understandably done for performance reasons while archiving
 (with current design ideas in place), but also causes issues when
 migrations affect more than one table. Especially if data migrations
 need to look at more tables at once, the actual number of table joins
 needed in order to consider everything grows exponentially. It could be
 argued that these are not that common, but it is something that will
 make development more difficult and migrations painful once it comes up.

 To put it shortly - this property generally makes it harder to reason
 about data.

 Second point (and it ties in with the first one since it makes it
 difficult to fix) - Maybe shadow table migrations should be kept
 separate, and made optional? Currently there is a check that will fail
 the tests unless the migration is done on both tables, which I think
 should be removed in favour of separate migrations. Developers should
 still migrate both of course - but deployers should be able to choose
 not to do it according to their needs/scale. I am sure there are people
 on this list that can chip in more on this subject (I've had a brief
 discussion with lifeless on this topic on IRC).

 I'm afraid that if you 

Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-03 Thread Johannes Erdfelt
On Wed, Jul 03, 2013, Michael Still mi...@stillhq.com wrote:
 On Wed, Jul 3, 2013 at 3:50 AM, Boris Pavlovic bo...@pavlovic.me wrote:
 
  Question:
  Why should we put the sqlalchemy-migrate monkey patches in oslo when
  we are planning to switch to alembic?
 
  Answer:
  If we don't put the sqlalchemy-migrate monkey patches in oslo, we
  won't be able to work on point 7 at all until points 8 and 10 are
  implemented in every project. Also, the work around point 8 is not
  finished, so we are not able to implement point 10 in any project. So
  this blocks almost all work in all projects. I think that these
  100-200 lines of code are not such a big price to pay for saving a few
  cycles of time.
 
 We've talked in the past (Folsom summit?) about alembic, but I'm not
 aware of anyone who is actually working on it. Is someone working on
 moving us to alembic? If not, it seems unfair to block database work
 on something no one is actually working on.

I've started working on a non-alembic migration path that was discussed
at the Grizzly summit.

While alembic is better than sqlalchemy-migrate, it still requires long
downtimes when some migrations are run. We discussed moving to an
expand/contract cycle where migrations add new columns, allow migrations
to slowly (relatively speaking) migrate data over, then (possibly) remove
any old columns.
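
For example, renaming a column under such an expand/contract scheme
might look roughly like this (MySQL-flavoured; all names here are made
up):

    from sqlalchemy import create_engine

    engine = create_engine('mysql://user:pass@host/nova')  # placeholder URL
    # expand: an additive change, safe to run while the old code is live
    engine.execute('ALTER TABLE instances ADD COLUMN new_name VARCHAR(255)')
    # migrate: backfill in small batches so no long lock is held;
    # repeat until zero rows are affected
    engine.execute('UPDATE instances SET new_name = old_name '
                   'WHERE new_name IS NULL LIMIT 1000')
    # contract: only once no running code reads the old column any more
    engine.execute('ALTER TABLE instances DROP COLUMN old_name')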

JE




Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-03 Thread Doug Hellmann
On Wed, Jul 3, 2013 at 6:50 AM, Michael Still mi...@stillhq.com wrote:

  [snip]

 We've talked in the past (Folsom summit?) about alembic, but I'm not
 aware of anyone who is actually working on it. Is someone working on
 moving us to alembic? If not, it seems unfair to block database work
 on something no one is actually working on.


That's not quite what happened. Unfortunately the conversation happened in
gerrit, IRC, and email, so it's a little hard to piece together from the
outside.

I had several concerns with the nature of this change, not the least of
which is that it monkey-patches a third-party library to add a feature
instead of just modifying that library upstream.

The patch I objected to (https://review.openstack.org/#/c/31016) modifies
the sqlite driver inside sqlalchemy-migrate to support some migration
patterns that it does not support natively. There's no blueprint linked
from the commit message on the patch I was reviewing, so I didn't have the
full background. The description of the patch, and the discussion in
gerrit, initially led me to believe this was for unit tests for the
migrations themselves. I pointed out that it didn't make any sense to test
the migrations on a database no one would use in production, especially if
we had to monkey patch the driver to make the migrations work in the first
place.
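
For context: sqlite's ALTER TABLE supports only RENAME TO and ADD
COLUMN, so e.g. dropping a column has to be emulated by rebuilding the
table - roughly the kind of thing such a patch ends up doing. A sketch,
with made-up table and column names:

    from sqlalchemy import create_engine

    engine = create_engine('sqlite://')
    engine.execute('CREATE TABLE foo (id INTEGER, name TEXT, junk TEXT)')
    # emulate DROP COLUMN junk by rebuilding the table:
    engine.execute('ALTER TABLE foo RENAME TO foo_tmp')
    engine.execute('CREATE TABLE foo (id INTEGER, name TEXT)')
    engine.execute('INSERT INTO foo (id, name) SELECT id, name FROM foo_tmp')
    engine.execute('DROP TABLE foo_tmp')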

Boris clarified that the tests were the general nova tests, at which point
I asked why nova was relying on the migrations to set up a database for its
tests instead of just using the models. Sean cleared up the history on that
point, and although I'm still not happy with the idea of putting code in
oslo with the pre-declared plan to remove it (rather than consider it for
graduation), I agreed that the pragmatic thing to do for now is to live
with the monkey patched version of sqlalchemy-migrate.

At this point, I have removed my -2 from the patch, but I haven't had a
chance to fully review the code. I voted 0 to unblock it in case other
reviewers had time to look at it before I was able to come back. That
hasn't happened, but the patch is no longer blocked.

Somewhere during that conversation, I suggested looking at alembic as an
alternative. However, alembic clearly states in its documentation that
migrations on sqlite are not supported because of the database's limited
support for alter statements, and that patches would be welcome if
someone wants to contribute those features. If we do need this feature to support
good unit tests of SQLalchemy-based projects, we should eventually move it
out of oslo and into alembic, then move our migration scripts to use
alembic. It would make the most sense to do that on a release boundary,
when we normally collapse the migration scripts anyway. Even better would
be if we could make the models and migration scripts produce databases that
are compatible enough for testing the main project, and then run tests for
the migrations themselves against real databases as a separate step. Based
on the plan Boris has posted, it sounds like he is working toward both of
these goals.

Doug





Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-03 Thread Monty Taylor


On 07/03/2013 07:26 AM, Johannes Erdfelt wrote:
 [snip]
 
 I've started working on a non-alembic migration path that was discussed
 at the Grizzly summit.

 While alembic is better than sqlalchemy-migrate, it still requires long
 downtimes when some migrations are run. We discussed moving to an
 expand/contract cycle where migrations add new columns, allow migrations
 to slowly (relatively speaking) migrate data over, then (possibly) remove
 any old columns.

I think if you're working on a non-alembic plan and Boris is working on
an alembic plan, then something is going to be unhappy in the
not-too-distant future. Can we get alignment on this?



Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-03 Thread Monty Taylor


On 07/02/2013 10:50 AM, Boris Pavlovic wrote:
 ###
 Goal
 ###
 
 We should fix the work with the DB, unify it across all projects, and
 use oslo code for all common things.

Just wanted to say a quick word that isn't about migrations...

Thank you. This is all great, and I'm thrilled someone is taking on the
task of fixing what is probably one of OpenStack's biggest nightmares.

 In more words:
 
 DB API
 
   *) Fully cover it with tests.
 
   *) Run tests against all backends (now they are run only against
 sqlite).
 
   *) Unique constraints (instead of select + insert)
  a) Provide unique constraints.
  b) Add missing unique constraints.
 
   *) DB Archiving
  a) create shadow tables
  b) add tests that check that the shadow and main tables are synced.
  c) add code that works with the shadow tables.
 
   *) DB API performance optimization
 a) Remove unused joins.
 b) 1 query instead of N (where it is possible).
 c) Add methods that could improve performance.
 d) Drop unused methods.
 
   *) DB reconnect
 a) Don't break a huge task if we lose the connection for a moment;
 just retry the DB query.
 
   *) DB Session cleanup
 a) do not use the session parameter in public DB API methods.
 b) fix places where we are doing N queries in N transactions instead
 of 1.
 c) fetch only the data that is used (e.g. query.count() instead of
 len(query.all())).
 
 
 
 DB Migrations
 
   *) Test DB migrations against all backends and real data.
 
   *) Fix: DB schemas after migrations should be the same across
 different backends.
 
   *) Fix hidden bugs that are caused by wrong migrations:
  a) fix indexes, e.g. migration 152 in Nova drops all indexes that
 have a deleted column
  b) fix wrong types
  c) drop unused tables
 
   *) Switch from sqlalchemy-migrate to something that is not dead (e.g.
 alembic).
 
 
 
 DB Models
 
   *) Fix: the schema that is created by the Models should be the same
 as the one after migrations.
  
   *) Fix: unit tests should be run on a DB that was created by the
 Models, not by migrations.
 
   *) Add a test that checks that the Models are synced with the
 migrations.
 
 
 
 Oslo Code
 
   *) Base Sqlalchemy Models.
 
   *) Work around engine and session.
 
   *) SqlAlchemy Utils - that help us with migrations and tests.
 
   *) Test migrations Base.
 
   *) Use a common test wrapper that allows us to run tests on different
 backends.
 
 
 ###
 Implementation
 ###
 
   This is a really, really huge task. And we are almost done with
 Nova =).
 
   In OpenStack there is only one approach for such work ("baby steps"
 driven development). So we are making tons of patches that can be
 easily reviewed. But there are also minuses to such an approach: it is
 pretty hard to track the work at a high level, and sometimes there are
 misunderstandings.
  
   For example, with the oslo code: in a few words, at this moment we
 would like to add monkey patching for sqlalchemy-migrate to oslo (for
 some time). And I got a reasonable question from Doug Hellmann: why?
 Because of our "baby steps". But if you don't have the list of baby
 steps, it is pretty hard to understand why they need this thing, and
 why we don't switch to alembic first. So I would like to describe our
 road map and write out the list of baby steps.
 
 
 ---
 
 OSLO
 
   *) (Merged) Base code for Models and sqlalchemy engine (session)
 
   *) (On review) Sqlalchemy utils that are used to:
   1. Fix bugs in sqlalchemy-migrate
   2. Provide base code for migrations that add unique constraints.
   3. Provide utils for db archiving that help us to create and check
 shadow tables.
 
   *) (On review) Testtools wrapper
    We should have only one testtools wrapper in all projects. This
 is one of the base steps in the task of running tests against all
 backends.
 
   *) (On review) Test migrations base
    Base classes that allow us to test our migrations against all
 backends on real data
 
   *) (On review, not finished yet) DB Reconnect.
 
   *) (Not finished) Test that checks that schemas and models are synced
 
 ---
 
 ${PROJECT_NAME}
 
 
 In different projects we can work absolutely simultaneously, and the
 first candidates are Glance and Cinder. But within a project we can
 also work simultaneously. Here is the workflow:
 
 
   1) (SYNC) Use base code for Models and sqlalchemy engines (from oslo)
 
   2) (SYNC) Use test migrations base (from oslo)
 
   3) (SYNC) Use SqlAlchemy utils (from oslo)
 
   4) (1 patch) Switch to OSLO DB code
 
   5) (1 patch) Remove ported test migrations
 
   6) (1 Migration) Provide unique constraints (change type of deleted
 column)
 
   7) (1 Migration) Add shadow tables
  a) Create shadow tables
  b) Add a test that checks that they are always synced
 
   8) (N Migrations) UniqueConstraint/Session/Optimization workflow:
  a) (1 patch) Add/Improve/Refactor tests for the part of the api that
 is connected with the model
  b) (1 patch) Fix session
  c) (1 patch)

Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-03 Thread Boris Pavlovic
Hi Monty,

 I think if you're working on a non-alembic plan and Boris is working on
 an alembic plan, then something is going to be unhappy in the
 not-too-distant future. Can we get alignment on this?


As I said before, we are preparing our DB code for the move from
sqlalchemy-migrate to something else.
There will be tons of work before we are able to rewrite our migration
scripts for alembic or anything else.

And we are not sure that we would like to use alembic =)


Best regards,
Boris Pavlovic



On Wed, Jul 3, 2013 at 9:30 PM, Monty Taylor mord...@inaugust.com wrote:



 [snip]

Re: [openstack-dev] Work around DB in OpenStack (Oslo, Nova, Cinder, Glance)

2013-07-02 Thread Ben Nemec
One small addition I would suggest is a step to remove the unused 
sqlalchemy-migrate code once this is all done.  That's my main concern 
with moving it to Oslo right now.


Also, is this a formal blueprint(s)?  Seems like it should be.

-Ben

On 2013-07-02 12:50, Boris Pavlovic wrote:
 [snip]