On 8/8/15 3:50 PM, Ian McCullough wrote:

So I've continued thinking about this a bit, mainly because trying to keep the database schema and object model in sync is effectively a kind of "double entry" system, and I have been burned *so. many. times.* in the past by (N>1)-entry systems where the different "truths" get out of sync, leading to subtle but week-ruining bugs.

Of course, this is something we all share. I only ask that you consider that the Alembic migration files and other similar concepts might be thought of differently, as they are intended to be an *immutable* truth; that is, a particular migration file can never be incorrect, because it represents a point in time that has already passed. But I guess that's why you favor the migration files as that source of truth.


In the abstract, my goal is to have a single source of truth and to have that truth 'flow' through the system (in my case, I'm trying to use the Alembic version scripts, but I really don't care what form the truth takes as long as there's only one of them).

Michael said:

    However, where it gets thorny is that neither Alembic migrations
    nor SQLAlchemy metadata are supersets of each other


My approach here has been to "change" this by leveraging the ability to attach arbitrary, ancillary data to the `info` property of `SchemaItem`, storing any additional information necessary to re-create the models (i.e. making the Alembic migrations a superset of the SQLAlchemy metadata /that I need, for my specific purposes/). Then, once I've captured that metadata, I push it up into the database (in my case via Postgres's COMMENT feature, but in other DBs it could just be an arbitrary table), where it can be read back by a separate build-time tool that generates my models.py file from the database. Ideally, there would be a way to cut out the database so that you could just run the Alembic scripts and get out the appropriate metadata, but going through the database is an acceptable detour for me (especially now that I've wrapped up the necessary fixtures to spool up an ephemeral, local Postgres installation). Considering that you could conceivably even ship pickled Python object graphs to this kind of "sidecar" storage, I suspect that there probably /is/ enough flexibility to capture all of the SQLAlchemy metadata, if you carried this to its logical conclusion.
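To make the "sidecar" idea concrete, here's a minimal sketch of stashing extra model-level facts in a column's `info` dict and serializing them for storage (the keys "python_default" and "exposed_in_api", and the table name, are hypothetical; the serialized JSON is what would be pushed into a COMMENT or side table):

```python
import json
from sqlalchemy import Table, Column, Integer, String, MetaData

metadata = MetaData()
users = Table(
    "users", metadata,
    Column("id", Integer, primary_key=True),
    # info= accepts arbitrary ancillary data on any SchemaItem
    Column("name", String(50),
           info={"python_default": "''", "exposed_in_api": True}),
)

# Collect the ancillary info for every column that has any, keyed by
# "table.column", so it can be pushed into the database as a COMMENT
# (Postgres) or written to an arbitrary side table:
sidecar = {
    f"{users.name}.{col.name}": dict(col.info)
    for col in users.columns if col.info
}
print(json.dumps(sidecar))
```

A build-time tool can then read this payload back out of the database and regenerate the model definitions with the Python-only details restored.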

I realize this approach is probably too "restrictive" to be useful in the general case, but I figured I'd share my thoughts and hacks anyway. Conceptually, I think the best thing, long term, would be for Alembic to be able to handle both DB schema migration and object model migration, and to serve as a single source of truth for systems willing to operate completely within Alembic's purview. Based on your comments about mine being an unusual workflow, I assume many folks won't want to work this way, but for those who consider single sources of truth to be critical, I think it could be a win.

If anyone is interested in talking more about this, let me know.

A technique you might want to consider, which we use in OpenStack, is that as part of our CI suite we actually run the migrations fully and then diff the schema as built against the metadata. This function is available from Alembic as compare_metadata: http://alembic.readthedocs.org/en/rel_0_7/api.html#alembic.autogenerate.compare_metadata. Basically you want it to return an empty list, meaning nothing has changed.

This allows the Alembic migrations and the current metadata to remain separate, but they are checked for accuracy against each other as part of the test suite. It is of course only as accurate as SQLAlchemy reflection (so things like CHECK constraints, triggers, etc. that aren't currently reflected aren't included, unless you augment the comparison with these), but as far as those aspects of the model that make a difference from the Python side go, e.g. names of tables and columns and datatypes, those would all be affirmed.





Regards,
Ian



On Monday, August 3, 2015 at 9:36:19 AM UTC-4, Michael Bayer wrote:



    On 8/1/15 6:59 PM, Ian McCullough wrote:
    I've been getting up to speed with SQLAlchemy and alembic and
    having a great time of it. This is some great stuff!

    One thing that's been confounding me is this: My Alembic schema
    revisions are 'authoritative' for my metadata (i.e. I've started
    from scratch using alembic to build the schema from nothing), yet
    it doesn't appear that the metadata that exists in my alembic
    scripts can be leveraged by my models in my main app. So far,
    I've been maintaining effectively two versions of the metadata,
    one in the form of the "flattened projection" of my alembic
    schema rev scripts, and another in my application models scripts.
    I understand that there are some facilities to auto-re-generate
    the metadata from the RDBMS on the application side, but that
    seems potentially "lossy", or at least subject to the whims of
    whatever DBAPI provider I'm using.

    Is there a way to pull this flattened projection of metadata out
    of alembic and into my app's models at runtime? (i.e. import
    alembic, read the version from the live DB, then build the
    metadata by playing the upgrade scripts forward, not against the
    database, but against a metadata instance?) It seems like a
    fool's errand to try to keep my app models in sync with the
    flattened projection of the schema revisions by hand. My
    assumption is that I'm missing something super-obvious here.

    There's a lot to say on this issue.  The idea of the migrations
    themselves driving the metadata would be nice, and I think that
    the recent rewrite of Django South does something somewhat
    analogous to this.

    Also, the reorganization of Alembic operations into objects that
    you can hang any number of operations upon (due in Alembic 0.8)
    is something that we'd leverage to make this kind of thing
    happen.

    However, where it gets thorny is that neither Alembic migrations
    nor SQLAlchemy metadata are supersets of each other.  That is,
    there are many things in SQLAlchemy metadata that currently have
    no formal representation in Alembic operations; the primary
    example is Python-side default functions on columns, which have
    no relevance to emitting ALTER statements.  On the Alembic side,
    a set of migrations that takes care to use only the official
    Alembic op.* operations, and also does not use "execute()" for
    any of them, is the only way to guarantee that each change is
    potentially representable in SQLAlchemy metadata.  A migration
    that emits op.execute("ALTER TABLE foo ADD COLUMN xyz") wouldn't
    work here, and a migration that has lots of conditionals and
    runtime logic might also not be useful in this way.

    SQLAlchemy Table and Column objects also do not support removal
    from their parents.  This would be necessary in order to
    represent "drop" mutations as targeted at a SQLAlchemy metadata
    structure.  It is something that could be implemented, but SQLA
    has always made a point not to get into it, because it's very
    complicated to handle "cascades" of dependent objects, whether
    that means raising an error or mimicking other functionality of a
    real "drop" operation.

    Finally, the whole workflow of Alembic up until now has been
    organized for the opposite approach: the MetaData is the
    authoritative model, and migrations are generated using tools
    like autogenerate to minimize how much they need to be coded by
    hand (and there is of course no issue of maintaining the same
    code in two places, because migration scripts are a fixed point
    in time once created).  This model is practical for many reasons:
    all of the above, plus that it is compatible with applications
    that weren't using migrations up to a point or were using some
    other system, plus that it allows easy pruning of old migrations.






    Thanks,
    Ian



--
You received this message because you are subscribed to the Google Groups "sqlalchemy-alembic" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy-alembic+unsubscr...@googlegroups.com <mailto:sqlalchemy-alembic+unsubscr...@googlegroups.com>.
For more options, visit https://groups.google.com/d/optout.
