As you may know if you've been reading this list for a while, I'm
working on a GUI application that can have multiple "document"
editors open at the same time. Since it's not a web application, I
don't have user interactions nicely diced into isolated requests.
Rather, a user may be editing a document in one window, then load a
new window and drag an item from the new window back to the original
window. Since the user may be editing many different "documents" at
the same time (each of which consists of a graph of related data
entities), I decided to give each editor (window) its own SQLAlchemy
session. Objects in each session are effectively isolated from
each other and can be edited and saved without affecting other open
documents.
SQLAlchemy's heavy dependence on the thread-local session pattern is
giving me bad headaches. I need to be able to guarantee that all data
entities belonging to a given document are loaded in a specific
session, but the thread-local pattern doesn't always allow me to do
that (it will silently load some objects in the wrong session and
I'll have strange bugs as a result).
Since SQLAlchemy has been written very well, it is almost possible to
do what I want by monkey-patching the parts of SA that deal with
thread local. However, I really want to avoid that because it would
most likely make for a very painful experience each time I upgrade to a new
version of SQLAlchemy.
[Aside] Yesterday I updated my SA source from r1025 to HEAD. I
haven't found anything broken yet, which is a huge testament to the
quality of development used on this project! Way to go Mike!!
I had tried to do a session-per-window pattern by monkey-patching
objectstore.session_registry to consult my application for the active
window's session. However, this broke down with drag/drop between
windows. If an object is dragged from the active window (we'll call
it window_1) and dropped on a non-active window (window_2), then
objects loaded from the database by window_2 in response to the drop
are loaded in window_1's session. Ouch. In fact, this will always
happen when two windows need to communicate with each other: only one
window can be active at a time. It's even possible that neither
window involved in the communication is active, which means that both
windows involved in the communication will be loading their objects
in some other window's session. Double ouch.
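
A toy model in plain Python makes the failure concrete. Nothing here is SQLAlchemy code: App, Window, Session, current_session, and handle_drop are names invented for illustration, and the registry is reduced to "return the active window's session".

```python
# Toy model, not SQLAlchemy code: every name here is hypothetical.
class Session(object):
    def __init__(self, name):
        self.name = name
        self.objects = []


class App(object):
    """Stands in for the GUI application; tracks the active window."""
    def __init__(self):
        self.active_window = None

    def current_session(self):
        # Roughly what the patched registry did: always answer with
        # the active window's session.
        return self.active_window.session


class Window(object):
    def __init__(self, app, name):
        self.app = app
        self.session = Session(name)

    def handle_drop(self, item):
        # The drop handler loads related objects, but the session is
        # resolved globally instead of from this window.
        sess = self.app.current_session()
        sess.objects.append("loaded for " + item)
        return sess


app = App()
window_1 = Window(app, "window_1")
window_2 = Window(app, "window_2")
app.active_window = window_1          # the drag starts in window_1

used = window_2.handle_drop("item")   # the drop lands on window_2
print(used.name)                      # prints window_1
```

The object requested by window_2 ends up in window_1's session, and window_2's own session stays empty.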
Luckily I have confined the data access code to a few specific areas,
so whatever solution I come up with will not require me to rewrite
the entire application.
So I set out to find out exactly what SA is doing behind the scenes.
First, I monkey-patched objectstore.session_registry to disable
thread-local sessions. This allowed me to find out when the default
thread-local session was being used:

    class NotImplementedProxy(object):
        def __getattr__(self, name):
            raise NotImplementedError("thread-local sessions are disabled")
        def __setattr__(self, name, value):
            raise NotImplementedError("thread-local sessions are disabled")

    sa.objectstore.session_registry = sa.util.ScopedRegistry(NotImplementedProxy)

This forced me to execute every mapper statement "using(session)",
which was exactly what I wanted anyway. The problem arises when I
load an object with lazy attributes. When I do obj.lazy_attr I
get a "thread-local disabled" exception:

    Traceback (most recent call last):
      ...
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/attributes.py", line 62, in __get__
        return self.manager.get_list_attribute(obj, self.key)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/attributes.py", line 345, in get_list_attribute
        return self.get_history(obj, key, **kwargs)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/attributes.py", line 414, in get_history
        return self.get_unexec_history(obj, key).history(**kwargs)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/attributes.py", line 256, in history
        value = self.callable_()
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/mapping/properties.py", line 618, in lazyload
        result = self.mapper.select_whereclause(self.lazywhere, order_by=order_by, params=params)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/mapping/mapper.py", line 557, in select_whereclause
        return self._select_statement(statement, params=params)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/mapping/mapper.py", line 577, in _select_statement
        return self.instances(statement.execute(**params), **kwargs)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/mapping/mapper.py", line 317, in instances
        self._instance(row, imap, result, populate_existing=populate_existing)
      File "/Users/dmiller/Code/SQLAlchemy/lib/sqlalchemy/mapping/mapper.py", line 871, in _instance
        if sess.has_key(identitykey):
      File "<path to NotImplementedProxy>", line 3, in __getattr__
        raise NotImplementedError("thread-local sessions are disabled")
    NotImplementedError: thread-local sessions are disabled

The problem is with Mapper._instance(...), which calls
objectstore.get_session() to get a session to load the lazy
attribute. It should check for obj._sa_session and use that instead.
If that session is not valid, the mapper should raise an exception
about the object not being associated with a session (this is what
Hibernate does).
Note that this problem could happen to anyone who loads some objects,
pushes a new session, then loads lazy attributes of the original
objects. The child objects loaded by lazy attributes will be loaded
in the new session. In most cases we want child objects to be loaded
in the same session as the parent object.
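
The difference between the two behaviours can be shown with a small stand-alone model. This is not SA code: the session stack, get_session, Parent, and Child below are simplified stand-ins, and only the _sa_session attribute name is taken from the discussion above.

```python
# Toy model, not SQLAlchemy code: every name below is hypothetical.
class Session(object):
    def __init__(self, name):
        self.name = name


_session_stack = [Session("default")]   # stands in for the thread-local


def get_session():
    return _session_stack[-1]


class Child(object):
    def __init__(self, session):
        self._sa_session = session


class Parent(object):
    def __init__(self, session):
        self._sa_session = session

    def children_via_global(self):
        # Roughly the current behaviour: ask the global registry.
        return [Child(get_session())]

    def children_via_parent(self):
        # Suggested behaviour: use the session the parent lives in,
        # and complain loudly if there is none.
        sess = getattr(self, "_sa_session", None)
        if sess is None:
            raise RuntimeError("object is not associated with a session")
        return [Child(sess)]


parent = Parent(get_session())          # loaded in "default"
_session_stack.append(Session("new"))   # someone pushes a new session

print(parent.children_via_global()[0]._sa_session.name)   # prints new
print(parent.children_via_parent()[0]._sa_session.name)   # prints default
```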
I hope I have communicated that I have a use case where thread-local
is not appropriate. Now I will try to show that the session-
resolution mechanism can be abstracted to a well-defined interface.
The thread-local pattern can simply be the default implementation,
but it will be pluggable so any pattern may be used.
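
One possible shape for such an interface, sketched in plain Python. SessionResolver, ThreadLocalResolver, and OwnerResolver are hypothetical names, not part of SA; the sketch only assumes that a session can be created by calling a factory and that loaded objects carry an _sa_session attribute.

```python
import threading


class SessionResolver(object):
    """Hypothetical interface: decides which session an operation uses."""
    def get_session(self, obj=None):
        raise NotImplementedError


class ThreadLocalResolver(SessionResolver):
    """Default pattern: one session per thread, created on first use."""
    def __init__(self, session_factory):
        self._factory = session_factory
        self._local = threading.local()

    def get_session(self, obj=None):
        if not hasattr(self._local, "session"):
            self._local.session = self._factory()
        return self._local.session


class OwnerResolver(SessionResolver):
    """Alternative pattern: use the session the object already belongs to."""
    def get_session(self, obj=None):
        sess = getattr(obj, "_sa_session", None)
        if sess is None:
            raise RuntimeError("object is not associated with a session")
        return sess


class FakeSession(object):
    pass


class Doc(object):
    pass


default = ThreadLocalResolver(FakeSession)
same = default.get_session() is default.get_session()

doc = Doc()
doc._sa_session = FakeSession()
owned = OwnerResolver().get_session(doc) is doc._sa_session
```

Code that needs a session would call resolver.get_session(obj) and never touch a global; which resolver is installed becomes an application decision.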
The main thing that needs to go is the notion that there is a global
session somewhere. This is a problem with the current mapper
implementation. Remember way back in CS1 when they told you not to
use global variables? This is why. Whether you like it or not, thread-
local is a global variable (within the context of a thread). It's
handy because a resource can be stored there and any function that
needs it can just use it. This is where the beauty of object oriented
programming comes in. The concept of an object allows initialization
code to put some stuff on the object and then later it can use that
stuff without referring to a global context. This is what needs to
happen with the mapper.
The "using" thing is a hack that says "swap the thread-local session
for a moment while I do this thing". In theory this will work, but in
practice (when lazy attributes are loaded) the mapper tries to use
the thread-local session when I'm not around to say "use this session".
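
The swap-and-restore idea, and the way a later lazy load escapes it, can be modelled in a few lines. This is a simplified stand-in written with a context manager, not SA's actual "using" implementation; get_session, using, and Lazy are illustrative names, and sessions are represented by plain strings.

```python
import contextlib
import threading

_local = threading.local()
DEFAULT = "default-session"


def get_session():
    # Falls back to the default when nothing has been swapped in.
    return getattr(_local, "session", DEFAULT)


@contextlib.contextmanager
def using(session):
    # Swap the thread-local session, then restore it on the way out.
    previous = get_session()
    _local.session = session
    try:
        yield
    finally:
        _local.session = previous


class Lazy(object):
    def load(self):
        # A lazy attribute resolves the session at access time.
        return get_session()


with using("window_2-session"):
    obj = Lazy()
    inside = obj.load()

outside = obj.load()      # accessed later, after the swap has ended
print(inside, outside)    # prints window_2-session default-session
```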
This could be much simpler if there were a concept of a session/engine-
bound Query. This Query object would execute SQL constructed by a
mapper on an engine and store the results in a session. It would
decouple the mapper from the engine and the thread-local session. It
would become the job of the query to execute(...) mapper-generated
SQL on its engine and then associate all resulting mapped objects
with its session. Session-bound objects would use their session to
load lazy attributes (exactly what Hibernate does, except in
Hibernate the reference to the session is stored in the lazy
attribute proxy instead of on the parent object). A thread-local
implementation of this Query object would be trivial to write, and
that implementation could be the default used by everyone who doesn't
want to worry about session scoping.
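
A rough sketch of what such a Query might look like. This is only an illustration of the proposal, not an existing SA class: Query, FakeMapper, FakeEngine, FakeSession, and their methods (build_select, make_instance, register) are all invented here so the example runs on its own.

```python
# Hypothetical sketch: none of these classes exist in SQLAlchemy.
class FakeEngine(object):
    def execute(self, sql):
        # Pretend the database returned two rows.
        return [{"id": 1}, {"id": 2}]


class Item(object):
    def __init__(self, row):
        self.id = row["id"]


class FakeMapper(object):
    """Only builds SQL and turns rows into objects; knows no session."""
    def build_select(self, criteria):
        return "SELECT ... WHERE %r" % (criteria,)

    def make_instance(self, row):
        return Item(row)


class FakeSession(object):
    def __init__(self):
        self.objects = []

    def register(self, obj):
        self.objects.append(obj)


class Query(object):
    """Bound to one engine and one session for its whole lifetime."""
    def __init__(self, mapper, engine, session):
        self.mapper = mapper
        self.engine = engine
        self.session = session

    def select(self, **criteria):
        sql = self.mapper.build_select(criteria)
        rows = self.engine.execute(sql)
        objects = [self.mapper.make_instance(row) for row in rows]
        for obj in objects:
            # Every result is tied to this query's session, so later
            # lazy loads can find it without a global lookup.
            obj._sa_session = self.session
            self.session.register(obj)
        return objects


window_session = FakeSession()
query = Query(FakeMapper(), FakeEngine(), window_session)
items = query.select(kind="item")
```

A thread-local flavour would differ only in where the session argument comes from; the mapper itself never needs to know.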
I realize this covers some of the same material I have already
covered in the "Decoupling Tables/Mappers from engine" thread. I
wrote that in my own spare time for the benefit of SQLAlchemy. I'm
writing this from work where I actually need a real solution (I'm not
complaining or saying the community must provide one--I know that's
not what open-source is about). In the short term I'll try to hack my
own solution as best I can. I'll try to contribute patches as I come
up with them.
Thanks for your help.
~ Daniel
_______________________________________________
Sqlalchemy-users mailing list
Sqlalchemy-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users