Re: [sqlalchemy] Can I restrict the set of mappers that configure_mappers() works on?

Steven Winfield Wed, 22 Apr 2015 08:16:24 -0700

Sort of... it goes something like this:

In some database (call it "code") there is a table - script(name text, 
contents text) - which is wrapped using an sqlalchemy class called Script.
In our environment we fire up fresh python interpreters, import the library 
that defines Script, install a PEP302-compliant import hook in 
sys.meta_path, and hand off the interpreter to the end-user (either to an 
interactive session or we carry on executing a given script).
When python hits an "import" statement our import hook is asked if it can 
import the given module, so we query the database to find out and - if 
found - we compile it and put it in sys.modules (all details are in the 
PEP). This is all transparent to the user.
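
For readers unfamiliar with this kind of hook, here is a minimal sketch of the idea, not our actual importer. It assumes a current Python 3 interpreter, so it uses the find_spec/exec_module protocol that superseded PEP 302's original find_module/load_module methods. The SCRIPTS dict is a hypothetical stand-in for the script table, and ScriptFinder and dbscript_demo are made-up names for illustration only.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for the script(name, contents) table.
SCRIPTS = {
    "dbscript_demo": "ANSWER = 42\n",
}


class ScriptFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object; source text comes from `fetch`."""

    def __init__(self, fetch):
        # fetch: callable taking a module name, returning source text or None
        self._fetch = fetch

    def find_spec(self, fullname, path=None, target=None):
        source = self._fetch(fullname)
        if source is None:
            return None  # not ours; let the next finder try
        spec = importlib.util.spec_from_loader(fullname, self)
        spec.loader_state = source  # carry the source over to exec_module
        return spec

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        source = module.__spec__.loader_state
        code = compile(source, "<db:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


# Appended last, so it is only consulted for names the standard finders miss.
sys.meta_path.append(ScriptFinder(SCRIPTS.get))

import dbscript_demo  # resolved through the hook above
```

In the real setup the fetch callable is where the database query happens, which is why every import of an unknown module name ends up issuing a query.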


The problem comes when the users define some of their own sqlalchemy 
wrappers for tables in a completely separate database and, further, they 
split those definitions across two or more scripts and tables in one script 
refer to tables in another... let's say that classes A1 and A2 are defined 
in module foo.a and class B in foo.b, and that A1 has a relationship to B, 
which has a relationship to A2. 

Both foo.a and foo.b need to be imported before the mappers can be properly 
configured. But a query against the code db is needed to determine whether a 
script exists and then to retrieve its contents, and that query triggers 
mapper configuration after only one of foo.a and foo.b has been imported, so 
configuration fails.

In reality we have nearly 300 sqlalchemy classes wrapping tables across 
about 20 scripts, all of which live in the code database. In total there 
are about 2000 scripts in there.
So far we have managed by caching the contents of _all_ 2000 scripts 
client-side to prevent further queries, but this isn't sustainable.
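
An alternative to caching everything, along the lines of not using sqlalchemy for the importer at all (as floated earlier in the thread), would be for the hook to read the script table over a plain DB-API connection, which never goes near the ORM's mapper configuration. A minimal sketch under stated assumptions: sqlite3 is only a stand-in for whatever driver the code database really uses, and fetch_script is a hypothetical helper name.

```python
import sqlite3

# In-memory database standing in for the "code" database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE script (name TEXT PRIMARY KEY, contents TEXT)")
conn.execute("INSERT INTO script VALUES (?, ?)", ("foo.a", "X = 1\n"))


def fetch_script(conn, name):
    """Return the source text for `name`, or None if no such script exists."""
    row = conn.execute(
        "SELECT contents FROM script WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]
```

A function like this could be handed to the import hook as its lookup callable, so that importing foo.a and then foo.b involves no ORM query in between.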

I'd be interested to hear your thoughts. Congrats on v1.0.0 btw!

Cheers,
Steve.


On Tuesday, April 21, 2015 at 5:47:06 PM UTC+1, Michael Bayer wrote:
>
>  
>
> On 4/21/15 12:25 PM, Steven Winfield wrote:
>  
> OK, thanks for the quick answer - I guess I shouldn't be using sqlalchemy 
> for the importer then, since this necessarily has to perform queries in 
> order for scripts to be imported.
>
>
> what's that about ?   some kind of dynamic scripting environment?   If 
> you're using a special kind of importer, there should still be ways to make 
> it work, because at least there is some top-level control still being 
> exercised over the loading of these scripts.
>
>
>  
> I might attempt restricting mapper configuration to a group of tables - I 
> think the performance penalty would be quite small, as the categorization 
> tests only need to be done when new mappers need configuring 
> (Mapper._new_mappers == True) and should only occur once per table, but 
> I'll see if this is the case.
>
> On Tuesday, April 21, 2015 at 4:19:12 PM UTC+1, Michael Bayer wrote: 
>>
>>  
>>
>> On 4/21/15 9:31 AM, Steven Winfield wrote:
>>  
>> Hi, 
>>
>>  It seems like configuration is attempted for all new mappers, globally, 
>> whenever a query is done. So if library A and B both use sqlalchemy, and A 
>> imports B before A's mappers can be properly initialised (e.g. there is a 
>> relationship("ClassnameAsString") call somewhere that can't be resolved 
>> yet), and B does something to trigger mapper configuration, then it will 
>> fail. 
>> This occurs even if A and B make separate calls to declarative_base(), 
>> even with explicitly different metadata and bound engines. 
>>  
>>
>> no there's not, and the short answer is that libraries shouldn't be 
>> triggering mapper configuration (and definitely not doing ORM queries) at 
>> import time, and/or the imports of A and B should be organized such that B 
>> imports fully before A starts doing things.   Either these libraries have 
>> inter-dependencies, in which case this implies mapper configuration should 
>> be across all of the mappings in both, or they don't, in which case the 
>> imports of A and B should not be from within each other.
>>
>> An enhancement that would limit configuration to groups of mappings is a 
>> feasible proposal but we don't have that right now.     Wouldn't be that 
>> easy to do without adding a performance penalty since the check for "new 
>> mappers" would have to be limited to some categorization, meaning lookups 
>> in critical sections.
>>
>>
>>
>>
>>
>>
>>  
>>  Here's a boiled-down version of the problem that I've been playing 
>> with, which shows that the relationship between Parent and Child is 
>> configured when a query on Test is done - even though it may be part of a 
>> different library and in a different database:
>>
>> from sqlalchemy import Column, Integer, Text, ForeignKey, create_engine
>> from sqlalchemy.ext.declarative import declarative_base
>> from sqlalchemy.orm import sessionmaker, relationship
>> import traceback
>>
>> Base1 = declarative_base()
>>
>> class Test(Base1):
>>     __tablename__ = "test"
>>     id = Column(Integer, primary_key=True)
>>
>> Base2 = declarative_base()
>>
>> class Parent(Base2):
>>     __tablename__ = "parent"
>>     id = Column(Integer, primary_key=True)
>>
>> def deferred_parent():
>>     traceback.print_stack()  # show what triggers relationship setup
>>     return Parent
>>
>> class Child(Base2):
>>     __tablename__ = "child"
>>     id_parent = Column(Integer, ForeignKey(Parent.id), primary_key=True)
>>     name = Column(Text, primary_key=True)
>>     parent = relationship(deferred_parent)
>>
>> engine = create_engine('sqlite://')
>> Session = sessionmaker(bind=engine)
>> session = Session()
>> try:
>>     session.query(Test).all()
>> except Exception:
>>     # The "test" table is never created, so the SELECT itself fails;
>>     # only the stack printed by deferred_parent() matters here.
>>     pass
>>
>>   
>>  ...the important bit of the traceback being:
>>
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\session.py", line 1165, in query
>>     return self._query_cls(entities, self, **kwargs)
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\query.py", line 108, in __init__
>>     self._set_entities(entities)
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\query.py", line 118, in _set_entities
>>     self._set_entity_selectables(self._entities)
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\query.py", line 151, in _set_entity_selectables
>>     ent.setup_entity(*d[entity])
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\query.py", line 2997, in setup_entity
>>     self._with_polymorphic = ext_info.with_polymorphic_mappers
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\util\langhelpers.py", line 726, in __get__
>>     obj.__dict__[self.__name__] = result = self.fget(obj)
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\mapper.py", line 1871, in _with_polymorphic_mappers
>>     configure_mappers()
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\mapper.py", line 2583, in configure_mappers
>>     mapper._post_configure_properties()
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\mapper.py", line 1688, in _post_configure_properties
>>     prop.init()
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\interfaces.py", line 144, in init
>>     self.do_init()
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\relationships.py", line 1549, in do_init
>>     self._process_dependent_arguments()
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\relationships.py", line 1605, in _process_dependent_arguments
>>     self.target = self.mapper.mapped_table
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\util\langhelpers.py", line 726, in __get__
>>     obj.__dict__[self.__name__] = result = self.fget(obj)
>>   File "R:\sw\external\20150407-0\python27\lib\site-packages\sqlalchemy-0.9.7-py2.7-win32.egg\sqlalchemy\orm\relationships.py", line 1522, in mapper
>>     argument = self.argument()
>>   File "user!winfis!sqlalchemy!query_triggers_relationship_config.py", line 19, in deferred_parent
>>     traceback.print_stack()
>>  
>>  
>>  Is there some method that I've missed of delaying mapper configuration? 
>> Aren't the only mappers that need to be set up those that share metadata 
>> with entities in the query, or any metadata bound to the engine that will 
>> be used? 
>> Perhaps configure_mappers() could take an optional metadata/engine and 
>> only set up mappers that are related to this?
>>
>>  As you can see, I'm doing this with 0.9.7 but looking at the 1.0.0 code 
>> I think I'd have the same problem.
>>
>>  
>>  If it helps, (and you're not already bored) here's our use-case:
>> We have one library that implements a PEP302 import hook, which fetches 
>> python code from a database and compiles it. This is managed by sqlalchemy.
>> Some of the code in the database also uses sqlalchemy and defines other 
>> sets of ORM-mapped classes, completely unrelated to the first set, and 
>> which relate to tables that reside in completely different databases.
>> If a query needs to be executed to fetch and compile some code while 
>> another set of classes are not ready to have their mappers initialised then 
>> exceptions are raised.
>>
>>  Thanks,
>> Steve.
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
