On Jun 26, 2012, at 7:59 AM, Iwan Vosloo wrote:
> Hi there,
>
> When we test, we sometimes create temporary classes just for a specific test
> (or group of tests). You might want to inherit from a class that is part of
> production code, for example and override a method for testing purposes, etc.
> When these temporary classes are declarative sqlalchemy mapped classes, we
> run into all sorts of problems.
>
> What we do is something along the lines of:
> 1) create the tables for the actual model
> 2) commit
> 3) define the test classes and create their tables
> 4) run the test logic
> 5) roll back
>
> Steps 1-2 happen one for a test run with many tests.
> Steps 3-5 represents one test, so they can be repeated.
>
> However, it seems that some cleanup is needed of mappers / the metadata /
> classes that inherit from Base, etc AFTER 5.
> The sqlalchemy Table objects and mappers are now out of sync with the
> database (which has been rolled back).
>
> What can one do to get all the sqlalchemy related stuff (metadata, mappers,
> tables and how these were set on existing classes) back in sync with the
> rolled-back database schema? We know the metadata can be cleared (and
> reflected); we know we can clear mappers and recreate them... and we've tried
> several of these things to no avail. The example given below, however should
> work without too much clearing, surely?
I do have some simple solutions to this issue, however since I suspect there
may be more complexity to your use case that isn't illustrated in your example,
I'll give you the whole background.
When you deal with a declarative Base, there are two registries at work. One
is the MetaData object associated with it, which has a dictionary called
"tables" stuck on it. The other is the "_decl_class_registry", which is a
dictionary of class names to mapped classes.
The purpose of these registries is primarily to facilitate configuration based
on string names. For example, if you say ForeignKey("mytable.id"), that
string "mytable" needs to be resolved to a Table object; if you say
relationship("SomeClass"), the string "SomeClass" needs to be resolved to a
class object. This is the biggest reason Base and MetaData need to follow the
registry pattern.
The MetaData object in particular also has several other purposes for this
registry. It serves as a "singleton" registry so that there is only one
instance of a particular Table for a particular name. This is important
because SQLAlchemy's expression language uses the identity of a particular
clause element to indicate it's lexical identity within a statement. If you
had two distinct Table objects, both with the same name, SQLAlchemy would treat
them as distinct "FROM" clauses among other things leading to incorrect SQL.
So the singleton registry here encourages that multiple Table objects of the
same name aren't created.
MetaData is also the "collection of record" for a set of Table objects, such as
when you say "metadata.reflect()", MetaData is where the Table objects are
stored. If they weren't there, they'd fall out of scope (unless
metadata.reflect() were modified to actually return the collection of Table
objects).
So MetaData's registry is strong-referencing, meaning entries do not get
removed when they are fully dereferenced elsewhere. A Table can be removed
from the MetaData using remove(), however this has caveats which I'll detail in
a moment.
The "_decl_class_registry" on declarative Base is also not a weak-referencing
dictionary - so when you make a subclass of Base, and then lose all references
to that class, it is still present in _decl_class_registry. To that extent,
Base is not automatically managing of temporary classes. It is possible to
replace _decl_class_registry with a WeakValueDictionary, which you could pass
into declarative_base() via the "class_registry" argument. I would say that
perhaps we should make _decl_class_registry weak referencing by default in any
case, as it is only used at configuration time so is not a performance concern.
I've created #2524 to think about that and it's likely this will be in 0.8 as
it seems to work fine.
Outside of using declarative with the ORM, there is also a registry of
underlying mapper() objects to their mapped classes, which is the original
"classical" system of mapping classes to tables. This registry is in fact weak
referencing. So the original intent behind the concept of "mapping a class
to a table" did include that this is a one-way setup, where when the class
falls out of scope, it's mapping would also (with similar caveats to those of
Tables). SQLAlchemy's test suite creates new classes, maps them, and then
just forgets about them regularly. This is supported.
So what is the big caveat of removing individual tables, individual mappers
from a full configuration of tables and classes ? The answer is that the
table/class/mapper in question may be referred to by other table/class/mappers
- and there is no functionality in place to surgically sever these connections.
The connections that may be established to a particular mapped class are:
1. the Table may be referred to by the ForeignKey(Constraint) of one or more
remote tables.
2. the Table may be referred to by any number of mappings.
3. the mapper() may be referred to by the relationship()/backref of one or more
remote mappers.
4. the mapper() may be a superclass of another mapping, in which case there's
obviously a high degree of dependency, or
5. the mapper() may be a subclass of another mapping, in which case there are
still connections - a superclass mapping needs to be aware of all it's
descendants in many cases, such as when organizing a flush as well as using
"polymorphic" queries. These linkages are not weak-referencing at the moment
though perhaps they could be.
So when any of #1-#5 above are present, simply dropping a class from
_decl_class_registry or removing a table from MetaData via remove() is not
enough. The mapper/class/Table is still referred to by other linkages. The
nature of these linkages is obviously very involved, and it's way out of scope
for SQLAlchemy to be able to surgically remove individual linkages from a graph
of mapped classes and tables. This is why SQLAlchemy in general does not
support the explicit removal of individual mappings and greatly downplays
metadata.remove() - the problem *may* be much more complicated.
The good news about #1-#5 is that none of these scenarios will be present *if
you didn't create this scenario yourself* - SQLAlchemy itself never does any of
these things without being told. If you made a Table that nobody refers to by
foreign key, and you're going to drop all the mappings that point to it, then
that's fine - remove it and it's gone. If you made a mapper() on a class that
is not referred to by any relationship(), and is not present within a
joined/single table inheritance hierarchy, also fine - when you lose all
references to that class, the mapper() is also dropped, and if you remove the
class from Base._decl_class_registry, it is also gone.
What follows from the assumption that none of #1-#5 are present, is that since
it's established that your mappers/classes/tables aren't referred to by other
mappers/classes/tables that you aren't dropping, there is no need to even use
the same MetaData object and Base as registries. I'll illustrate using this
below.
However, you stated that "You might want to inherit from a class that is part
of production code". Assuming this "production class" is mapped, that changes
everything. When you make a subclass mapping of an existing mapper, you are
making not just a subclass but a new mapper() as well, and you're really
changing the configuration significantly, including that your class is going to
register itself with the base. A subclass mapping is strongly referenced by
its base mapper at the moment along many paths, as the mapper() is big on
memoizing things so that it doesn't have to recalculate expensive things more
than once. I've added http://www.sqlalchemy.org/trac/ticket/2526 to think
about this, but it would require lots of weak references, and weak references
are expensive to use.
The rationale here seems to be that you'd like to override methods on your
class to do different things for tests, and I'd submit that this is an
architectural problem where your classes are not testable. I'd seek here to
use compositional patterns, rather than inheriting patterns, in order to do
things like injecting mock data and objects into production objects.
Anyway, some more background on your example. There's actually a bug here,
which is that extend_existing in conjunction with a Table that has already been
used is causing a failure in locating the primary key after an INSERT, which is
because there's a memoized attribute _autoincrement_column that isn't being
reset (see http://www.sqlalchemy.org/trac/ticket/2525).
It was not anticipated that extend_existing would be used for a Table that's
already been used - the purpose of extend_existing is so that you can create a
Table with some columns while at the same time reflecting other columns from
the database. Meaning, extend_existing is only to help create a fully formed
Table, not to re-build in place a Table that is already completed. But in this
case you obviously just saw the error message and did what it said. Usually
SQLAlchemy's approach of "suggesting fixes" helps more than not but this would
be an exception to that.
Anyway, the test as is can work fine, regardless of this error, by getting rid
of the unnecessary extend_existing and just removing that table at the end:
Base.metadata.remove(TestClass.__table__)
simple enough. But as I mentioned before, if your classes/tables have no
dependencies, just use a new MetaData. Using a new Base makes it easier, but
let's say you want to share the Base - you might have noticed your test also
emits this warning "The classname 'TestClass' is already in the registry of
this declarative base", so we'll use a WeakValueDictionary to nix that. We'll
use two MetaData objects as well:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import *
import weakref
connection_uri = 'sqlite:///:memory:'
engine = create_engine(connection_uri, echo=True)
Session = sessionmaker(autoflush=True, autocommit=False, bind=engine)
Base = declarative_base(class_registry=weakref.WeakValueDictionary())
session = Session()
class RealClass(Base):
__tablename__ = 'normal_class'
id = Column(Integer, primary_key=True)
attr = Column(String(50))
Base.metadata.create_all(bind=session.connection()) # Creating the real
classes (once per run)
# there is no need to "commit" here. SQLA DDL does this automatically.
def test():
class TestBase(Base):
# abstract Base with its own MetaData. Example + docs:
#
http://docs.sqlalchemy.org/en/rel_0_7/orm/extensions/declarative.html#abstract
metadata = MetaData()
__abstract__ = True
class TestClass(TestBase):
__tablename__ = 'simple_test_class'
attr = Column(String(50))
id = Column(Integer, primary_key=True)
TestBase.metadata.create_all(bind=session.connection())
session.add(RealClass(attr='sdf'))
session.flush()
session.add(TestClass(attr='sdf'))
session.flush()
print session.query(RealClass).all()
print session.query(TestClass).all()
for i in xrange(3):
try:
test()
finally:
session.rollback()
>
> We always get the first test to work, but some subsequent tests tend to fail.
> Of course, there are a large number of scenarios - test classes with
> relationships to 'real' classes, inheritance, etc. Those scenarios can
> complicate things. But here is the simplest basic example we'd like to get
> working:
>
> from pkg_resources import require
>
> require('sqlalchemy==0.7.7')
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.ext.declarative import *
>
> connection_uri = 'sqlite:///:memory:'
>
> engine = create_engine(connection_uri, echo=True)
> Session = sessionmaker(autoflush=True, autocommit=False, bind=engine)
> Base = declarative_base()
> session = Session()
>
>
> class RealClass(Base):
> __tablename__ = 'normal_class'
> id = Column(Integer, primary_key=True)
> attr = Column(String(50))
>
> Base.metadata.create_all(bind=session.connection()) # Creating the real
> classes (once per run)
> session.commit() #
> Committed also...
>
>
> def test():
> class TestClass(Base):
> __tablename__ = 'simple_test_class'
> attr = Column(String(50))
> id = Column(Integer, primary_key=True)
>
> __table_args__ = {
> 'extend_existing': True
> }
>
> Base.metadata.create_all(bind=session.connection()) # To create the test
> classes
>
> session.add(RealClass(attr='sdf'))
> session.flush()
> session.add(TestClass(attr='sdf'))
> session.flush()
>
> print session.query(RealClass).all()
> print session.query(TestClass).all()
>
>
> try:
> test() # Works
> finally:
> session.rollback()
>
> try:
> test() # Breaks the second time round
> finally:
> session.rollback()
>
>
> Regards
> - Iwan
>
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/sqlalchemy/-/snYsPmsRjOEJ.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/sqlalchemy?hl=en.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.