Re: [sqlalchemy] Temporary mapped classes for testing

Michael Bayer Tue, 26 Jun 2012 10:13:35 -0700

On Jun 26, 2012, at 7:59 AM, Iwan Vosloo wrote:

> Hi there,
> 
> When we test, we sometimes create temporary classes just for a specific test 
> (or group of tests).  You might want to inherit from a class that is part of 
> production code, for example and override a method for testing purposes, etc. 
> When these temporary classes are declarative sqlalchemy mapped classes, we 
> run into all sorts of problems.
> 
> What we do is something along the lines of:
>  1) create the tables for the actual model
>  2) commit
>  3) define the test classes and create their tables
>  4) run the test logic
>  5) roll back
> 
> Steps 1-2 happen one for a test run with many tests.
> Steps 3-5 represents one test, so they can be repeated.
> 
> However, it seems that some cleanup is needed of mappers / the metadata / 
> classes that inherit from Base, etc AFTER 5.
> The sqlalchemy Table objects and mappers are now out of sync with the 
> database (which has been rolled back).
> 
> What can one do to get all the sqlalchemy related stuff (metadata, mappers, 
> tables and how these were set on existing classes) back in sync with the 
> rolled-back database schema?  We know the metadata can be cleared (and 
> reflected); we know we can clear mappers and recreate them... and we've tried 
> several of these things to no avail. The example given below, however should 
> work without too much clearing, surely?


I do have some simple solutions to this issue, however since I suspect there 
may be more complexity to your use case that isn't illustrated in your example, 
I'll give you the whole background.

When you deal with a declarative Base, there are two registries at work.  One 
is the MetaData object associated with it, which has a dictionary called 
"tables" stuck on it.   The other is the "_decl_class_registry", which is a 
dictionary of class names to mapped classes.   

The purpose of these registries is primarily to facilitate configuration based 
on string names.   For example, if you say ForeignKey("mytable.id"), that 
string "mytable" needs to be resolved to a Table object; if you say 
relationship("SomeClass"), the string "SomeClass" needs to be resolved to a 
class object.   This is the biggest reason Base and MetaData need to follow the 
registry pattern.  

The MetaData object in particular also has several other purposes for this 
registry.   It serves as a "singleton" registry so that there is only one 
instance of a particular Table for a particular name.  This is important 
because SQLAlchemy's expression language uses the identity of a particular 
clause element to indicate it's lexical identity within a statement.  If you 
had two distinct Table objects, both with the same name, SQLAlchemy would treat 
them as distinct "FROM" clauses among other things leading to incorrect SQL.  
So the singleton registry here encourages that multiple Table objects of the 
same name aren't created.

MetaData is also the "collection of record" for a set of Table objects, such as 
when you say "metadata.reflect()", MetaData is where the Table objects are 
stored.  If they weren't there, they'd fall out of scope (unless 
metadata.reflect() were modified to actually return the collection of Table 
objects).

So MetaData's registry is strong-referencing, meaning entries do not get 
removed when they are fully dereferenced elsewhere.  A Table can be removed 
from the MetaData using remove(), however this has caveats which I'll detail in 
a moment.

The "_decl_class_registry" on declarative Base is also not a weak-referencing 
dictionary - so when you make a subclass of Base, and then lose all references 
to that class, it is still present in _decl_class_registry.  To that extent, 
Base is not automatically managing of temporary classes.   It is possible to 
replace _decl_class_registry with a WeakValueDictionary, which you could pass 
into declarative_base() via the  "class_registry" argument.    I would say that 
perhaps we should make _decl_class_registry weak referencing by default in any 
case, as it is only used at configuration time so is not a performance concern. 
 I've created #2524 to think about that and it's likely this will be in 0.8 as 
it seems to work fine.

Outside of using declarative with the ORM, there is also a registry of 
underlying mapper() objects to their mapped classes, which is the original 
"classical" system of mapping classes to tables.  This registry is in fact weak 
referencing.    So the original intent behind the concept of "mapping a class 
to a table" did include that this is a one-way setup, where when the class 
falls out of scope, it's mapping would also (with similar caveats to those of 
Tables).   SQLAlchemy's test suite creates new classes, maps them, and then 
just forgets about them regularly.  This is supported.

So what is the big caveat of removing individual tables, individual mappers 
from a full configuration of tables and classes ?  The answer is that the 
table/class/mapper in question may be referred to by other table/class/mappers 
- and there is no functionality in place to surgically sever these connections. 
  The connections that may be established to a particular mapped class are:

1. the Table may be referred to by the ForeignKey(Constraint) of one or more 
remote tables.
2. the Table may be referred to by any number of mappings.
3. the mapper() may be referred to by the relationship()/backref of one or more 
remote mappers.
4. the mapper() may be a superclass of another mapping, in which case there's 
obviously a high degree of dependency, or
5. the mapper() may be a subclass of another mapping, in which case there are 
still connections - a superclass mapping needs to be aware of all it's 
descendants in many cases, such as when organizing a flush as well as using 
"polymorphic" queries.   These linkages are not weak-referencing at the moment 
though perhaps they could be.

So when any of #1-#5 above are present, simply dropping a class from 
_decl_class_registry or removing a table from MetaData via remove() is not 
enough.  The mapper/class/Table is still referred to by other linkages.  The 
nature of these linkages is obviously very involved, and it's way out of scope 
for SQLAlchemy to be able to surgically remove individual linkages from a graph 
of mapped classes and tables.   This is why SQLAlchemy in general does not 
support the explicit removal of individual mappings and greatly downplays 
metadata.remove() - the problem *may* be much more complicated.

The good news about #1-#5 is that none of these scenarios will be present *if 
you didn't create this scenario yourself* - SQLAlchemy itself never does any of 
these things without being told.  If you made a Table that nobody refers to by 
foreign key, and you're going to drop all the mappings that point to it, then 
that's fine - remove it and it's gone.  If you made a mapper() on a class that 
is not referred to by any relationship(), and is not present within a 
joined/single table inheritance hierarchy, also fine - when you lose all 
references to that class, the mapper() is also dropped, and if you remove the 
class from Base._decl_class_registry, it is also gone.

What follows from the assumption that none of #1-#5 are present, is that since 
it's established that your mappers/classes/tables aren't referred to by other 
mappers/classes/tables that you aren't dropping, there is no need to even use 
the same MetaData object and Base as registries.   I'll illustrate using this 
below.

However, you stated that "You might want to inherit from a class that is part 
of production code".   Assuming this "production class" is mapped, that changes 
everything.   When you make a subclass mapping of an existing mapper, you are 
making not just a subclass but a new mapper() as well, and you're really 
changing the configuration significantly, including that your class is going to 
register itself with the base.    A subclass mapping is strongly referenced by 
its base mapper at the moment along many paths, as the mapper() is big on 
memoizing things so that it doesn't have to recalculate expensive things more 
than once.  I've added http://www.sqlalchemy.org/trac/ticket/2526 to think 
about this, but it would require lots of weak references, and weak references 
are expensive to use.

The rationale here seems to be that you'd like to override methods on your 
class to do different things for tests, and I'd submit that this is an 
architectural problem where your classes are not testable.   I'd seek here to 
use compositional patterns, rather than inheriting patterns, in order to do 
things like injecting mock data and objects into production objects.

Anyway, some more background on your example.    There's actually a bug here, 
which is that extend_existing in conjunction with a Table that has already been 
used is causing a failure in locating the primary key after an INSERT, which is 
because there's a memoized attribute _autoincrement_column that isn't being 
reset (see http://www.sqlalchemy.org/trac/ticket/2525).

It was not anticipated that extend_existing would be used for a Table that's 
already been used - the purpose of extend_existing is so that you can create a 
Table with some columns while at the same time reflecting other columns from 
the database.   Meaning, extend_existing is only to help create a fully formed 
Table, not to re-build in place a Table that is already completed.  But in this 
case you obviously just saw the error message and did what it said.   Usually 
SQLAlchemy's approach of "suggesting fixes" helps more than not but this would 
be an exception to that.

Anyway, the test as is can work fine, regardless of this error, by getting rid 
of the unnecessary extend_existing and just removing that table at the end:

    Base.metadata.remove(TestClass.__table__)

simple enough.  But as I mentioned before, if your classes/tables have no 
dependencies, just use a new MetaData.   Using a new Base makes it easier, but 
let's say you want to share the Base - you might have noticed your test also 
emits this warning "The classname 'TestClass' is already in the registry of 
this declarative base", so we'll use a WeakValueDictionary to nix that.   We'll 
use two MetaData objects as well:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import *
import weakref

connection_uri = 'sqlite:///:memory:'

engine = create_engine(connection_uri, echo=True)
Session = sessionmaker(autoflush=True, autocommit=False, bind=engine)
Base = declarative_base(class_registry=weakref.WeakValueDictionary())
session = Session()

 
class RealClass(Base):
    __tablename__ = 'normal_class'
    id = Column(Integer, primary_key=True)
    attr =  Column(String(50))

Base.metadata.create_all(bind=session.connection())   # Creating the real 
classes (once per run)
# there is no need to "commit" here.  SQLA DDL does this automatically.

def test():
    class TestBase(Base):
        # abstract Base with its own MetaData.  Example + docs:
        # 
http://docs.sqlalchemy.org/en/rel_0_7/orm/extensions/declarative.html#abstract
        metadata = MetaData()
        __abstract__ = True

    class TestClass(TestBase):
        __tablename__ = 'simple_test_class'
        attr =  Column(String(50))
        id = Column(Integer, primary_key=True)

    TestBase.metadata.create_all(bind=session.connection())

    session.add(RealClass(attr='sdf'))
    session.flush()
    session.add(TestClass(attr='sdf'))
    session.flush()

    print session.query(RealClass).all()
    print session.query(TestClass).all()

for i in xrange(3):
    try:
        test()
    finally:
        session.rollback()










> 
> We always get the first test to work, but some subsequent tests tend to fail. 
> Of course, there are a large number of scenarios - test classes with 
> relationships to 'real' classes, inheritance, etc. Those scenarios can 
> complicate things. But here is the simplest basic example we'd like to get 
> working:
> 
> from pkg_resources import require
> 
> require('sqlalchemy==0.7.7')
> 
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.ext.declarative import *
> 
> connection_uri = 'sqlite:///:memory:'
> 
> engine = create_engine(connection_uri, echo=True)
> Session = sessionmaker(autoflush=True, autocommit=False, bind=engine)
> Base = declarative_base()
> session = Session()
> 
>  
> class RealClass(Base):
>     __tablename__ = 'normal_class'
>     id = Column(Integer, primary_key=True)
>     attr =  Column(String(50))
> 
> Base.metadata.create_all(bind=session.connection())   # Creating the real 
> classes (once per run)
> session.commit()                                                       # 
> Committed also...
> 
> 
> def test():
>     class TestClass(Base):
>         __tablename__ = 'simple_test_class'
>         attr =  Column(String(50))
>         id = Column(Integer, primary_key=True)
> 
>         __table_args__ = {
>             'extend_existing': True
>             }
> 
>     Base.metadata.create_all(bind=session.connection())  # To create the test 
> classes
> 
>     session.add(RealClass(attr='sdf'))
>     session.flush()
>     session.add(TestClass(attr='sdf'))
>     session.flush()
> 
>     print session.query(RealClass).all()
>     print session.query(TestClass).all()
> 
> 
> try:
>     test()  # Works
> finally:
>     session.rollback()
> 
> try:
>     test()  # Breaks the second time round
> finally:
>     session.rollback()
> 
> 
> Regards
> - Iwan
> 
> 
> 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To view this discussion on the web visit 
> https://groups.google.com/d/msg/sqlalchemy/-/snYsPmsRjOEJ.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/sqlalchemy?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Temporary mapped classes for testing

Reply via email to