Re: [SQLObject] Rationale: Support for cache distribution delegates in SQLObject

Stuart Bishop Fri, 03 Mar 2006 20:45:01 -0800

David Faure wrote:

> So then, what are the data access modes currently supported? (or at  
> least proposed to be supported)
> 
> a) more consistent read: "select ... for update" (proposed)
> b) less consistent read: "select"
> c) even less consistent read: from cache
> 
> There is a tension between (b) and (c):  select from the database  
> server will provide access to recent updates from other clients,  
> while the cache will provide significantly better performance.


The differences between read consistency do not occur when running in a
SERIALIZEABLE transaction isolation level, which is currently the default
with SQLObject and PostgreSQL (as it is the default setting with psycopg1).
This, and ensuring the caches are destroyed at the end of each transaction,
is how we are keeping both read consistency and cache speed.

> We should ask the question, "Can we have the performance of the cache  
> with inclusion of recent updates from other clients?"
> 
> If so, then the need for (b) diminishes and we can rely more and more  
> on (a) and on (c).

Using FOR UPDATE should be kept to a minimum. Using it increases the chances
of deadlocks from other processes. Using it as a mechanism to ensure your
cache is correct is somewhat self defeating as other processes will be
blocked waiting for the locks.

> We (myself and David Faure) are proposing that SQLObject support  
> cache mechanisms that have the following properties:
> 
> - when an object is persisted to the database, a registered delegate  
> may distribute that object update to other client caches, either  
> directly or via one or more cache manager processes
> - when an object is requested from the cache, a registered delegate  
> handles the actual lookup from a potentially user-supplied cache  
> mechanism, e.g., one that uses shared memory
> 
> In our particular application, we want each SQLObject update to  
> notify one cache manager process per server that hosts clients of the  
> "object database."  Each cache manager populates a (single writer,  
> many readers) shared memory database of objects for that server.   
> Each SQLObject client has the option of fetching object state from  
> the shared memory.
> 
> Of course the API for these delegates should be carefully  
> considered.  For example, it would probably be good if the update  
> notification delegate were not triggered until after committing the  
> transaction that touched the object(s) in question, so as not to  
> advertise object state that never really existed.
> 
> I would be interested in thoughts on this approach, and in particular  
> in arguments against it.  I would hope that the latter would be  
> accompanied by a better suggestion for how to examine recent state of  
> SQLObjects in a distributed system with queries outweighing data  
> modifications by at least 50000 to 1.

In some ways, it seems like you are building into the application layer a
work around for a lack in your database layer (no SERIALIZABLE transaction
isolation level in MySQL?). However, what you do gain is the ability to keep
your cache alive across transaction boundaries *if* you only have data
changes being made through processes using SQLObject and your shared cache.
It seems a decent approach if replication cannot meet your needs. Does MySQL
give you two phase commit? If not, you are going to have a race condition
that may or may not be important in your application (actually - that could
be avoided if the updater could lock the cache during commit to the RDB).

Is your high query load due to the volume of transactions, or SQLObjects
natural desire to issue 10,000 individual queries where one intelligent JOIN
would do? We are finding that identifying and optimizing transactions that
are issuing too many queries are giving us better improvements than any
amount of caching or improved hardware could.

You may want to investigate ZODB and ZEO as your object cache. If it meets
your performance needs you might find all the heavy lifting is already done
for you, and replacing the cache simply a matter of setting up a RAM based
ZEO store and per server caches and replacing SQLObjects cache with an ZODB
backed OOBTree of IOBTree.



-- 
Stuart Bishop <[EMAIL PROTECTED]>
http://www.stuartbishop.net/

signature.asc
Description: OpenPGP digital signature

Re: [SQLObject] Rationale: Support for cache distribution delegates in SQLObject

Reply via email to