On 8/8/15 3:50 PM, Ian McCullough wrote:

So I've continued thinking about this a bit, mainly because trying to keep the database schema and object model in sync is effectively a kind of "double entry" system, and I have been burned *so. many. times.* in the past by (N>1)-entry systems where the different "truths" get out of sync, leading to subtle but week-ruining bugs.

Of course, this is something we all share. I only ask that you consider that the Alembic migration files and other similar concepts might be thought of differently, as they are intended to be an *immutable* truth; that is, a particular migration file can never be incorrect, because it represents a point in time that has already passed. But I guess that's why you favor the migration files as that source of truth.


In the abstract, my goal is to have a single source of truth and to have that truth 'flow' through the system (in my case, I'm trying to use the Alembic version scripts, but I really don't care what form the truth takes as long as there's only one of them).

Michael said:

    However, where it gets thorny is that neither Alembic migrations
    nor SQLAlchemy metadata are supersets of each other


My approach here has been to "change" this by leveraging the ability to attach arbitrary, ancillary data to the `info` property of `SchemaItem`, storing any additional information necessary to re-create the models (i.e. making the Alembic migrations a superset of the SQLAlchemy metadata /that I need, for my specific purposes/). Then, once I've captured that metadata, I push it up into the database (in my case via Postgres's COMMENT feature, but in other DBs it could just be an arbitrary table), where it can be read back by a separate build-time tool that generates my models.py file from the database. Ideally, there would be a way to cut out the database so that you could just run the Alembic scripts and get out the appropriate metadata, but going through the database is an acceptable detour for me (especially now that I've wrapped up the necessary fixtures to spool up an ephemeral, local Postgres installation). Considering that you could conceivably even ship pickled Python object graphs to this kind of "sidecar" storage, I suspect that there probably /is/ enough flexibility to capture all of the SQLAlchemy metadata, if you carried this to its logical conclusion.
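To make the "sidecar" idea concrete, here's a minimal sketch of stashing extra model-level facts in a column's `info` dict and serializing them for storage (the keys "python_default" and "exposed_in_api", and the table name, are hypothetical; the serialized JSON is what would be pushed into a COMMENT or side table):

```python
import json
from sqlalchemy import Table, Column, Integer, String, MetaData

metadata = MetaData()
users = Table(
    "users", metadata,
    Column("id", Integer, primary_key=True),
    # info= accepts arbitrary ancillary data on any SchemaItem
    Column("name", String(50),
           info={"python_default": "''", "exposed_in_api": True}),
)

# Collect the ancillary info for every column that has any, keyed by
# "table.column", so it can be pushed into the database as a COMMENT
# (Postgres) or written to an arbitrary side table:
sidecar = {
    f"{users.name}.{col.name}": dict(col.info)
    for col in users.columns if col.info
}
print(json.dumps(sidecar))
```

A build-time tool can then read this payload back out of the database and regenerate the model definitions with the Python-only details restored.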

I realize this approach is probably too "restrictive" to be useful in the general case, but I figured I'd share my thoughts and hacks anyway. Conceptually, I think the best thing, long term, would be for Alembic to be able to handle both DB schema migration and object model migration, and to serve as a single source of truth for systems willing to operate completely within Alembic's purview. Based on your comments about mine being an unusual workflow, I assume many folks won't want to work this way, but for those who consider single sources of truth to be critical, I think it could be a win.

If anyone is interested in talking more about this, let me know.

A technique you might want to consider, which we use in OpenStack, is that as part of our CI suite we actually run the migrations fully and then diff the schema as built against the metadata. This function is available from Alembic as compare_metadata: http://alembic.readthedocs.org/en/rel_0_7/api.html#alembic.autogenerate.compare_metadata. Basically you want it to return an empty list, meaning nothing has changed.

This allows the Alembic migrations and the current metadata to remain separate, but they are checked for accuracy against each other as part of the test suite. It is of course only as accurate as SQLAlchemy reflection (so things like CHECK constraints, triggers, etc. that aren't currently reflected aren't included, unless you augment the comparison with these), but as far as those aspects of the model that make a difference from the Python side go, e.g. names of tables and columns and datatypes, those would all be affirmed.





Regards,
Ian



On Monday, August 3, 2015 at 9:36:19 AM UTC-4, Michael Bayer wrote:



    On 8/1/15 6:59 PM, Ian McCullough wrote:
    I've been getting up to speed with SQLAlchemy and alembic and
    having a great time of it. This is some great stuff!

    One thing that's been confounding me is this: My Alembic schema
    revisions are 'authoritative' for my metadata (i.e. I've started
    from scratch using alembic to build the schema from nothing), yet
    it doesn't appear that the metadata that exists in my alembic
    scripts can be leveraged by my models in my main app. So far,
    I've been maintaining effectively two versions of the metadata,
    one in the form of the "flattened projection" of my alembic
    schema rev scripts, and another in my application models scripts.
    I understand that there are some facilities to auto-re-generate
    the metadata from the RDBMS on the application side, but that
    seems potentially "lossy", or at least subject to the whims of
    whatever DBAPI provider I'm using.

    Is there a way to pull this flattened projection of metadata out
    of alembic and into my app's models at runtime? (i.e. import
    alembic, read the version from the live DB, then build the
    metadata by playing the upgrade scripts forward, not against the
    database, but against a metadata instance?) It seems like a
    fool's errand to try to keep my app models in sync with the
    flattened projection of the schema revisions by hand. My
    assumption is that I'm missing something super-obvious here.

    There's a lot to say on this issue.  The idea of the migrations
    themselves driving the metadata would be nice, and I think that
    the recent rewrite of Django South does something somewhat
    analogous to this.

    Also, the reorganization of Alembic operations into objects that
    you can hang any number of operations upon (due in Alembic 0.8)
    is something that we'd leverage to make this kind of thing
    happen.

    However, where it gets thorny is that neither Alembic migrations
    nor SQLAlchemy metadata are supersets of each other.  That is,
    there are many things in SQLAlchemy metadata that currently have
    no formal representation in Alembic operations; the primary
    example is Python-side default functions on columns, which have
    no relevance to emitting ALTER statements.  On the Alembic side,
    a set of migrations that takes care to use only the official
    Alembic op.* operations, and also does not use "execute()" for
    any of them, is the only way to guarantee that each change is
    potentially representable in SQLAlchemy metadata.  A migration
    that emits op.execute("ALTER TABLE foo ADD COLUMN xyz") wouldn't
    work here, and a migration that has lots of conditionals and
    runtime logic might also not be useful in this way.

    SQLAlchemy Table and Column objects also do not support removal
    from their parents.  This would be necessary in order to
    represent "drop" mutations as targeted at a SQLAlchemy metadata
    structure.  It is something that could be implemented, but SQLA
    has always made a point not to get into it, because it's very
    complicated to handle "cascades" of dependent objects, whether
    that means raising an error or mimicking other functionality of a
    real "drop" operation.

    Finally, the whole workflow of Alembic up until now has been
    organized for the opposite approach: the MetaData is the
    authoritative model, and migrations are generated using tools
    like autogenerate to minimize how much they need to be coded by
    hand (and there is of course no issue of maintaining the same
    code in two places, because migration scripts are a fixed point
    in time once created).  This model is practical for many reasons:
    all of the above, plus that it is compatible with applications
    that weren't using migrations up to a point or were using some
    other system, plus that it allows easy pruning of old migrations.






    Thanks,
    Ian



--
You received this message because you are subscribed to the Google Groups "sqlalchemy-alembic" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy-alembic+unsubscr...@googlegroups.com <mailto:sqlalchemy-alembic+unsubscr...@googlegroups.com>.
For more options, visit https://groups.google.com/d/optout.
