On Dec 13, 2010, at 8:31 PM, Vince Ng wrote:

> Hi,
> 
> I've been attempting to try to add an abstract layer of Memcache on top of 
> the current version of SQLAlchemy, but I've been running into a few different 
> issues and wanted to get a few thoughts on the best approach on loading 
> objects.
> 
> I've studied the examples at:
> http://www.sqlalchemy.org/trac/browser/examples/beaker_caching
> My issue with these methods are they cache simple queries and relationships, 
> but in a fairly simple way and doesn't account for invalidation.  They also 
> seem to require more explicit query calls versus some of the sophisticated 
> lazy-loading property relationship mapping of SQLAlchemy.
> 
> My goal is to add Memcache at the object layer based on primary key (and to 
> control caching of specific object types, versus caching all queries 
> generated).  All ColumnProperty's of an object may be stored in cache, and 
> relationships and RelationshipProperty's are ignored for now.  The objects 
> that the relationships point to may be stored in cache on their primary key.  
> If an object is updated/deleted, then the object in cache will be invalidated.
> 
> Basic Example of What I'd Like to Accomplish:
>       # User has property 'image' which has a 'image_id' foreign key relation 
> to Image object
> 
>       # initial request
>       user = Session.query(User).get(1)
>               # checks memcache for user ID 1
>               # nothing in memcache, queries DB for user ID 1 (image_id for 
> user 1 is 100)
>               # stores user 1 data in memcache
>       image = user.image
>               # checks memcache for image ID 1
>               # nothing in memcache, queries DB for image ID 100
>               # stores image 100 data in memcache
> 
>       # separate subsequent request
>       user = Session.query(User).get(1)
>               # checks memcache for user ID 1, found it!
>               # populate User object with data from memcache
>       image = user.image
>               # checks memcache for image ID 100, found it!
>               # populate Image object with data from memcache
>       image.view_count = image.view_count + 1
>       Session.commit()
>               # invalidate image ID 100 in memcache
> 
> I suspect the best way to issue invalidate requests to Memcache will be to 
> create a MapperExtension and use before_update() and before_delete().

I'd go with SessionExtension instead - you can gather up everything from the 
"dirty" list in one go and invalidate.


> 
> However, I'm having much more trouble figuring out where and how I should 
> store data in Memcache and when/how to load from it. 

if you're going for get(), subclass Query and override get() to check memcache 
first.   This is pretty much what LOLApps! talked about in their Pycon talk 
last year.   But this approach is not as comprehensive as that of the Beaker 
example, which places itself at the central point of __iter__() and can make 
decisions about all queries, including that it can cache results not just on 
primary key but on filter criterion too.

The loading of related many-to-ones like "user.image" uses part of the 
functionality of query.get(), but not the get() method itself, so again 
overriding __iter__() as stated in the example is the most focused place.   The 
Beaker example illustrates how to augment exactly the process of the 
many-to-one lazyload, using custom query options that are invoked at the point 
of load.  The RelationshipCache() represents an object that, when placed in the 
"options" collection of a Query, is invoked directly before the lazyload 
occurs.   It looks at the "path" that is to be loaded, of the form (mapper, 
attributename), i.e. such as (user_mapper, 'image'), then compares this path 
into its own list of paths to determine if the Query should be instructed to 
pull from cache.   You can modify this to do something like 
"user_mapper.get_property('image').direction is 
sqlalchemy.orm.interfaces.MANYTOONE" to apply caching to all m2os.   Running 
the Beaker example for a few hours and using pdb to step through each line of 
code should make it clearer what's going on.


> MapperExtensions only offer hooks after data has been read from the DB.  I've 
> looked into subclassing the Query class and overriding the get() function (as 
> well as a few other functions), but the problem I run into is that I can load 
> the regular columns properties, but will run into issues with lazy-loaded 
> RelationshipProperties with mostly the error:
> "DetachedInstanceError: Parent instance <User at 0x7fc7503942d0> is not bound 
> to a Session; lazy load operation of attribute 'images' cannot proceed"

You need your objects to be bound to Sessions when they load things - this so 
they are aware of the database transaction they mean to participate in, as well 
as where they should be loading unloaded (or in this case, cache-expired as 
well) data.  The Beaker example uses query.merge_result() for all loaded 
results to ensure that objects returned participate within a Session.



-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to