On Jun 6, 2014, at 8:12 PM, Devananda van der Veen <devananda....@gmail.com> 
wrote:

> I think some things are broken in the oslo-incubator db migration code.
> 
> Ironic moved to this when Juno opened and things seemed fine, until recently 
> when Lucas tried to add a DB migration and noticed that it didn't run... So I 
> looked into it a bit today. Below are my findings.
> 
> Firstly, I filed this bug and proposed a fix, because I think that tests that 
> don't run any code should not report that they passed -- they should report 
> that they were skipped.
>   https://bugs.launchpad.net/oslo/+bug/1327397
>   "No notice given when db migrations are not run due to missing engine"
> 
> Then, I edited the test_migrations.conf file appropriately for my local mysql 
> service, ran the tests again, and verified that migration tests ran -- and 
> they passed. Great!
> 
> Now, a little background... Ironic's TestMigrations class inherits from 
> oslo's BaseMigrationTestCase, then "opportunistically" checks each back-end, 
> if it's available. This opportunistic checking was inherited from Nova so 
> that tests could pass on developer workstations where not all backends are 
> present (e.g., I have mysql installed, but not postgres), and still 
> transparently run on all backends in the gate. I couldn't find such 
> opportunistic testing in the oslo db migration test code, unfortunately - but 
> maybe it's well hidden.
> 
> Anyhow. When I stopped the local mysql service (leaving the configuration 
> unchanged), I expected the tests to be skipped, but instead I got two 
> surprise failures:
> - test_mysql_opportunistically() failed because setUp() raises an exception 
> before the test code could call _have_mysql()
> - test_mysql_connect_fail() actually failed! Again, because setUp() raises an 
> exception before the test itself could run
> 
> Unfortunately, there's one more problem... when I run the tests in parallel, 
> they fail randomly because sometimes two test threads run different migration 
> tests, and the setUp() for one thread (remember, it calls _reset_databases) 
> blows up the other test.
> 
> Out of 10 runs, it failed three times, each with different errors:
>   NoSuchTableError: `chassis`
>   ERROR 1007 (HY000) at line 1: Can't create database 'test_migrations'; 
> database exists
>   ProgrammingError: (ProgrammingError) (1146, "Table 
> 'test_migrations.alembic_version' doesn't exist")
> 
> As far as I can tell, this is all coming from:
>   
> https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/test_migrations.py#L86;L111

Hello -

Just an introduction: I’m Mike Bayer, the creator of SQLAlchemy and Alembic 
migrations.   I’ve just joined as a full-time OpenStack contributor, and helping 
to improve processes such as this one is my primary responsibility.

I’ve had several conversations already about how migrations are run within the 
test suites of various OpenStack projects.   I’m somewhat surprised by this 
approach of dropping and recreating the whole database for individual tests. 
Running tests in parallel is obviously made very difficult by this style, but 
even beyond that, a lot of databases don’t respond well to frequent dropping and 
rebuilding of tables and/or databases.   While SQLite and MySQL are probably the 
most forgiving of this, a backend like PostgreSQL will aggressively lock tables 
against being dropped if any open transactions or result cursors on those tables 
remain, and on a backend like Oracle, schema operations become prohibitively 
slow.   Dropping and creating tables is in general not a speedy task on any 
backend, and for a test suite that runs many tests against a fixed schema, I 
don’t see why a full drop is necessary.

If you look at SQLAlchemy’s own tests, they do in fact create tables for each 
test, or just as often for a specific suite of tests.  However, that’s because 
SQLAlchemy’s tests are testing SQLAlchemy itself, so the database schemas used 
for these tests are typically built explicitly for small groups of tests or for 
individual tests, and there are ultimately thousands of small “mini schemas” 
built up and torn down over the course of a run.   A lot of framework code is 
involved within the test suite to keep pickier databases like PostgreSQL and 
Oracle happy when building up and dropping tables so frequently.

However, when testing an application that uses a fixed set of tables, as should 
be the case for most if not all OpenStack apps, there’s no reason these tables 
need to be recreated for every test.   The approach I typically recommend is 
that the test suite include a “per suite” step which creates the test schema 
just once (with or without CREATE DATABASE; I’m not a fan of tests running 
CREATE DATABASE, as it isn’t a command that’s readily available in all 
environments).   The tests themselves then run within a transactional container, 
such that each test performs all of its work within a context that never 
actually commits any data to the database; a test that calls “session.commit()” 
runs within a container that doesn’t actually emit the COMMIT, and if support is 
needed for tests that also emit “session.rollback()”, the container can be 
written to support that paradigm as well (I helped some Dropbox devs with such 
an approach recently).

In this way, the database migrations are exercised, but only once, at the 
beginning, in order to build up the schema; the tests then run with very little 
database-level setup/teardown overhead, and parallel testing becomes much 
easier.   A drop of the entire set of tables can wait until the test suite 
completes.  Because tests run within transactions, and assuming READ COMMITTED 
isolation is established, they don’t even see the data being written by other 
tests running in parallel.
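
For the “migrations run once per suite” part, assuming an Alembic setup with an 
alembic.ini that points at a dedicated test database (the names here are 
hypothetical), a module-level fixture could look like:

    from alembic import command
    from alembic.config import Config

    # assumed to point at a dedicated, pre-existing test database
    ALEMBIC_CFG = Config("alembic.ini")

    def setUpModule():
        # exercise every migration exactly once to build up the schema
        command.upgrade(ALEMBIC_CFG, "head")

    def tearDownModule():
        # drop the whole set of tables only after the suite completes
        command.downgrade(ALEMBIC_CFG, "base")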

It remains to be seen which aspects of OpenStack I’m going to get involved with 
first, though this migration and testing issue seems to be a big one.   I’d 
love to get comments from the community here as to how this process might be 
improved, and whether a rearrangement of the fixture system along the lines I 
describe would be helpful and feasible.

- mike


