In article <85659fdd-511b-4aea-9c4b-17a4bbb88...@googlegroups.com>,
> My problem: I have a large database of interconnected objects which I need to
> process with a combination of short- and long-lived workers. These objects
> are mostly read-only (i.e. any of them can be changed/marked-as-deleted, but
> that happens infrequently). The workers may or may not be within one Python
> process, or even on one system.
> I've been doing this with a "classic" session-based SQLAlchemy ORM approach,
> but that ends up way too slow and memory-intensive, as each thread gets its own
> copy of every object it needs. I don't want that.
> My existing code does object loading and traversal by simple attribute
> access; I'd like to keep that if at all possible.
> Ideally, what I'd like to have is an object server which mediates write
> access to the database and then sends change/invalidation notices to the
> workers. (Changes are infrequent enough that I don't care if a worker gets a
> notice it's not interested in.)
> I don't care if updates are applied immediately or are only visible to the
> local process until committed. I also don't need fancy indexing or query
> abilities; if necessary I can go to the storage backend for that. (That
> should be SQL, though a NoSQL back-end would be nice to have.)
> Does something like this already exist, somewhere out there, or do I need to
> write this, or does somebody know of an alternate solution?
If you want to go NoSQL, I think what you're describing is a MongoDB
replica set (http://docs.mongodb.org/manual/replication/). One of the
replicas is the primary, to which all writes are directed. You can have
some number of secondaries, which get all the changes applied to the
primary, and spread out the load for read access. If you want a vaguely
SQLAlchemy-flavored ORM, there's mongoengine (http://mongoengine.org/).
On the other hand, this may be overkill for what you're trying to do.
Can you give us some more quantitative idea of your requirements? How
many objects? How much total data is being stored? How many queries
per second, and what is the acceptable latency for a query?
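
If the answers turn out to be modest, the object-server idea from your
post may be small enough to write yourself. Here is a rough sketch of the
shape I have in mind, not a finished design. All the names (ObjectServer,
SharedCache, Proxy) are made up for illustration, a plain dict stands in
for the real SQL or NoSQL backend, and everything lives in one process.
Workers on other processes or machines would need some transport for the
invalidation notices, which this leaves out.

```python
import threading


class ObjectServer:
    """Hypothetical single writer: owns the store, sends invalidation notices."""

    def __init__(self, store):
        self._store = store              # stand-in backend: {obj_id: {field: value}}
        self._lock = threading.Lock()
        self._subscribers = []

    def subscribe(self, callback):
        with self._lock:
            self._subscribers.append(callback)

    def load(self, obj_id):
        with self._lock:
            return dict(self._store[obj_id])   # hand out a copy, not the original

    def update(self, obj_id, **fields):
        with self._lock:
            self._store[obj_id].update(fields)
            subscribers = list(self._subscribers)
        # Notify outside the lock so a callback can never deadlock against load().
        for notify in subscribers:
            notify(obj_id)


class SharedCache:
    """One per worker process; every thread in it shares the same copies."""

    def __init__(self, server):
        self._server = server
        self._lock = threading.Lock()
        self._fields = {}
        self.loads = 0                   # counts backend loads, for illustration
        server.subscribe(self.invalidate)

    def get(self, obj_id):
        return Proxy(self, obj_id)

    def fields(self, obj_id):
        with self._lock:
            if obj_id not in self._fields:
                self._fields[obj_id] = self._server.load(obj_id)
                self.loads += 1
            return self._fields[obj_id]

    def invalidate(self, obj_id):
        # Drop the cached copy; the next attribute access reloads it.
        with self._lock:
            self._fields.pop(obj_id, None)


class Proxy:
    """Keeps plain attribute access, loading lazily through the shared cache."""

    def __init__(self, cache, obj_id):
        self._cache = cache
        self._obj_id = obj_id

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. the stored fields.
        fields = self._cache.fields(self._obj_id)
        try:
            return fields[name]
        except KeyError:
            raise AttributeError(name) from None


if __name__ == "__main__":
    server = ObjectServer({1: {"name": "root", "child_id": 2},
                           2: {"name": "leaf"}})
    cache = SharedCache(server)
    node = cache.get(1)
    print(node.name, cache.get(node.child_id).name)
    server.update(2, name="renamed")     # writer side; the cache entry is dropped
    print(cache.get(node.child_id).name)
```

The point of the proxy is that traversal code keeps using simple
attribute access, while the data itself exists once per process instead
of once per thread. Notices are broadcast to every subscriber whether or
not it holds the object, which matches your remark that uninteresting
notices are acceptable.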