On 8/8/15 3:50 PM, Ian McCullough wrote:
So I've continued thinking about this a bit, mainly because trying to
keep the database schema and object model in sync is effectively a
kind of "double entry" system, and I have been burned *so. many.
times.* in the past by (N>1)-entry systems where the different
"truths" get out of sync, leading to subtle but week-ruining bugs.
Of course, this is something we all share. I only ask that you consider
that the Alembic migration files and other similar concepts might be
thought of differently, as they are intended to be an *immutable* truth;
that is, a particular migration file can never be incorrect, because it
represents a point in time that has already passed. But I guess that's
why you favor the migration files as that source of truth.
In the abstract, my goal is to have a single source of truth and to
have that truth 'flow' through the system (in my case, I'm trying to
use the alembic version scripts, but I really don't care what form the
truth takes as long as there's only one of them.)
Michael said:
However, where it gets thorny is that neither Alembic migrations
nor SQLAlchemy metadata are supersets of each other
My approach here has been to "change" this by leveraging the ability
to specify arbitrary, ancillary data in the `info` property of
`SchemaItem` to store any/all additional information necessary to
re-create the models (i.e. making alembic migrations a superset of the
SQLAlchemy metadata /that I need, for my specific purposes/). Then,
once I've captured that metadata, I push it up into the database (in
my case, I'm using Postgres's COMMENTs feature, but in other DBs it
could just be an arbitrary table), which can then be used by another,
build-time tool to generate my models.py file from the database.
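To make the round trip concrete, here is a minimal, hypothetical sketch of the "sidecar" idea described above. It uses SQLite and a plain table in place of Postgres COMMENTs, and a `py_default` key as a stand-in for whatever would actually live in `SchemaItem.info`; none of these names come from SQLAlchemy or Alembic themselves:

```python
import json
import sqlite3

# Sketch of the sidecar round trip: extra model information (the kind
# of thing one might stash in SchemaItem.info) is serialized to JSON,
# stored alongside the schema, and read back by a build-time tool that
# regenerates the models.

def store_sidecar(conn, table, column, extra):
    """Persist ancillary model metadata for one column."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS model_sidecar "
        "(tbl TEXT, col TEXT, extra TEXT, PRIMARY KEY (tbl, col))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO model_sidecar VALUES (?, ?, ?)",
        (table, column, json.dumps(extra)),
    )

def load_sidecar(conn, table, column):
    """Read the metadata back, as a build-time tool would."""
    row = conn.execute(
        "SELECT extra FROM model_sidecar WHERE tbl = ? AND col = ?",
        (table, column),
    ).fetchone()
    return json.loads(row[0]) if row else None

conn = sqlite3.connect(":memory:")
store_sidecar(conn, "user", "id", {"py_default": "uuid.uuid4"})
print(load_sidecar(conn, "user", "id"))  # {'py_default': 'uuid.uuid4'}
```

With Postgres, `store_sidecar` would instead emit a COMMENT ON COLUMN statement carrying the JSON payload; the shape of the round trip is the same.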
Ideally, there would be a way to cut out the database so that you
could just run the alembic scripts and get out the appropriate
metadata, but going through the database is an acceptable detour for
me (especially now that I've wrapped up the necessary fixtures to
spool up an ephemeral, local Postgres installation.) Considering that
you could conceivably even ship pickled Python object graphs to this
kind of "sidecar" storage, I suspect that there probably /is/ enough
flexibility to capture all the SQLAlchemy metadata, if you carried
this to its logical conclusion.
I realize this approach is probably too "restrictive" to be useful in
the general case, but I figured I'd share my thoughts and hacks
anyway. Conceptually, I think the best thing, long term, would be for
alembic to be able to handle both DB schema migration and object model
migration, and to serve as a single source of truth for systems
willing to operate completely within alembic's purview. Based on your
comments about mine being an unusual workflow, I assume many folks
won't want to work this way, but for those who consider single sources
of truth to be critical, I think it could be a win.
If anyone is interested in talking more about this, let me know.
a technique you might want to consider, which we do in OpenStack, is
that as part of our CI suite we actually run the migrations fully and
then do a diff of the schema as developed against the metadata. This
function is available from Alembic using the compare_metadata function:
http://alembic.readthedocs.org/en/rel_0_7/api.html#alembic.autogenerate.compare_metadata.
Basically you want it to return nothing, meaning nothing has changed.
This allows the alembic migrations and the current metadata to remain
separate, but they are checked for accuracy against each other as part
of the test suite. It is of course only as accurate as SQLAlchemy
reflection goes (so things like check constraints, triggers, etc. that
aren't currently reflected aren't included, unless you augment the
comparison with these), but the aspects of the model that make a
difference from the Python side, e.g. names of tables, columns, and
datatypes, would all be affirmed.
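The shape of that CI check can be sketched with nothing but the standard library. A real suite would run the Alembic migrations and call `alembic.autogenerate.compare_metadata` against a SQLAlchemy `MetaData`; the DDL strings and table/column dicts below are stand-ins for those, using an in-memory SQLite database as the throwaway target:

```python
import sqlite3

# Run "migrations" against an ephemeral database, reflect the result,
# and diff it against the model's idea of the schema.  As with
# compare_metadata, an empty diff means migrations and model agree.

MIGRATIONS = [  # stand-ins for what the version scripts would emit
    "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE user ADD COLUMN email TEXT",
]

MODEL = {"user": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"}}

def reflect(conn):
    """Reflect table/column/type info from a live SQLite database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (tbl,) in tables:
        cols = conn.execute(f"PRAGMA table_info({tbl})").fetchall()
        schema[tbl] = {c[1]: c[2] for c in cols}  # name -> declared type
    return schema

def schema_diff(migrations, model):
    """Apply migrations to an ephemeral DB and diff against the model."""
    conn = sqlite3.connect(":memory:")
    for ddl in migrations:
        conn.execute(ddl)
    reflected = reflect(conn)
    # the test suite asserts this comes back empty
    return [] if reflected == model else [(reflected, model)]

print(schema_diff(MIGRATIONS, MODEL))  # [] means migrations and model agree
```

The same caveat applies as above: the check is only as good as the reflection, so anything the reflection step cannot see (check constraints, triggers) is silently outside the comparison.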
Regards,
Ian
On Monday, August 3, 2015 at 9:36:19 AM UTC-4, Michael Bayer wrote:
On 8/1/15 6:59 PM, Ian McCullough wrote:
I've been getting up to speed with SQLAlchemy and alembic and
having a great time of it. This is some great stuff!
One thing that's been confounding me is this: My Alembic schema
revisions are 'authoritative' for my metadata (i.e. I've started
from scratch using alembic to build the schema from nothing), yet
it doesn't appear that the metadata that exists in my alembic
scripts can be leveraged by my models in my main app. So far,
I've been maintaining effectively two versions of the metadata,
one in the form of the "flattened projection" of my alembic
schema rev scripts, and another in my application models scripts.
I understand that there are some facilities to auto-regenerate
the metadata from the RDBMS on the application side, but that
seems potentially "lossy", or at least subject to the whims of
whatever DBAPI provider I'm using.
Is there a way to pull this flattened projection of metadata out
of alembic and into my app's models at runtime? (i.e. import
alembic, read the version from the live DB, then build the
metadata by playing the upgrade scripts forward, not against the
database, but against a metadata instance?) It seems like a
fool's errand to try to keep my app models in sync with the
flattened projection of the schema revisions by hand. My
assumption is that I'm missing something super-obvious here.
There's a lot to say on this issue. The idea of the migrations
themselves driving the metadata would be nice, and I think that
the recent rewrite of django south does something somewhat
analogous to this.
Also, the reorganization of Alembic operations into objects that
you can hang any number of operations upon, which is due in
Alembic 0.8, is something that we'd leverage to make this
kind of thing happen.
However, where it gets thorny is that neither Alembic migrations
nor SQLAlchemy metadata are supersets of each other. That is,
there are many things in SQLAlchemy metadata that currently have no
formal representation in Alembic operations; the primary example
is that of Python-side default operations on columns, which have
no relevance to emitting ALTER statements. On the Alembic side,
a set of migrations that takes care to only use the official
Alembic op.* operations, and also does not use "execute()" for any
of them, is the only way to guarantee that each change is
potentially representable in SQLAlchemy metadata. A migration
that emits op.execute("ALTER TABLE foo ADD COLUMN xyz") wouldn't
work here, and a migration that has lots of conditionals and
runtime logic might also not be useful in this way.
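The thorniness here can be illustrated with a toy "replay" of migrations against an in-memory model rather than a database, which is roughly what building metadata from migrations would require. The operation tuples below are hypothetical stand-ins for Alembic's real operation objects, not its API; the point is only that a structured operation can be interpreted, while a raw `execute()` string cannot:

```python
# Replay structured migration operations against a {table: {column:
# type}} model.  Structured ops (create_table, add_column, drop_column)
# can be applied; an opaque op carrying raw SQL has no structured
# meaning here, so the replay must refuse it.

def replay(operations):
    """Build a {table: {column: type}} model from structured operations."""
    model = {}
    for op in operations:
        kind = op[0]
        if kind == "create_table":
            _, table, columns = op
            model[table] = dict(columns)
        elif kind == "add_column":
            _, table, column, coltype = op
            model[table][column] = coltype
        elif kind == "drop_column":
            _, table, column = op
            del model[table][column]
        else:
            # the analogue of op.execute("ALTER TABLE foo ADD COLUMN xyz")
            raise ValueError(f"cannot replay opaque operation: {kind}")
    return model

ops = [
    ("create_table", "user", [("id", "INTEGER")]),
    ("add_column", "user", "email", "TEXT"),
]
print(replay(ops))  # {'user': {'id': 'INTEGER', 'email': 'TEXT'}}
```

A migration script full of conditionals and runtime logic poses the same problem in a different form: even if each individual op is structured, the sequence of ops is no longer a fixed, replayable record.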
SQLAlchemy Table and Column objects also do not support removal
from their parents. This would be necessary in order to
represent "drop" mutations as targeted at a SQLAlchemy metadata
structure. This is something that could be implemented, but SQLA
has always made a point of not getting into it, because it's very
complicated to handle "cascades" of dependent objects, whether
that means raising an error or mimicking other functionality of a
real "drop" operation.
Finally, the whole of Alembic up until now has been
organized for the opposite workflow; the MetaData is the
authoritative model, and migrations are generated using tools like
autogenerate to minimize how much they need to be coded by hand
(and there is of course no issue of maintaining the same code in
two places because migration scripts are a fixed point in time
once created). This model is practical for many reasons; all of
the above reasons, plus that it is compatible with applications
that weren't using migrations up to a point or were using some other
system, plus that it allows easy pruning of old migrations.
Thanks,
Ian
--
You received this message because you are subscribed to the Google
Groups "sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy-alembic+unsubscr...@googlegroups.com
<mailto:sqlalchemy-alembic+unsubscr...@googlegroups.com>.
For more options, visit https://groups.google.com/d/optout.