[elephant-devel] Discussions

Ian Eslick Fri, 16 May 2008 20:23:21 -0700

Robert said:

> I'll go out on a limb and say that offering object-level caching is
> the single biggest performance enhancement we make for the most common
> cases.

A clarifying question. How did you ensure ACID properties in the DCM scenario in the presence of threading? Without letting BDB or SQL know about the reads that you've done, you can't tell if a prior transaction has clobbered data that you are currently using, because the reads are directly from memory.

E.g., you can easily read the old 'balance' on the checking account, do your computing while someone else has written that same object, then write back an incorrect value.
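
To make the failure concrete, here is a minimal sketch in Python (chosen only for brevity). The Store class and its methods are hypothetical stand-ins, not Elephant, BDB, or DCM code, and the two "threads" are interleaved by hand so the outcome is deterministic. It shows the blind write-back losing an update, and one possible guard: a per-key version check at write time.

```python
# Hypothetical illustration only; not Elephant, BDB, or DCM code.

class Store:
    """Stands in for the database: a value and a version counter per key."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def read(self, key):
        # Return the value together with the version that was current.
        return self.data[key], self.version[key]

    def write(self, key, value):
        # Unconditional write: bumps the version, checks nothing.
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def write_if_unchanged(self, key, value, seen_version):
        # Commit only if nobody has written the key since we read it.
        if self.version[key] != seen_version:
            return False
        self.write(key, value)
        return True


store = Store()
store.write("balance", 100)

# Thread A reads the balance into memory; the store keeps no record of this.
a_balance, a_version = store.read("balance")

# Thread B withdraws 30 while A is still computing.
b_balance, _ = store.read("balance")
store.write("balance", b_balance - 30)

# A blindly writes back its deposit of 50: B's withdrawal is lost.
store.write("balance", a_balance + 50)
lost_update_balance = store.data["balance"]  # 150, but 120 was intended

# Put B's committed state back, then let A try again with a version check.
store.write("balance", 70)
accepted = store.write_if_unchanged("balance", a_balance + 50, a_version)
```

In the second attempt the stale write is rejected, so A would have to re-read and retry. That only works because the store learns, at write time, which version A's read was based on.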

Rucksack tracks changes by versioning objects in memory and rolling back newer versions when older versions are committed. This is a copy-on-write model which keeps everything in memory during the transaction, but then writes the txn log and a version of the object to disk, updating the in-memory 'valid' version as appropriate.


Leslie had a good related e-mail on this topic a few days ago:

I don't know what the best decision might be here.
But I have a use case that might help; it has the following
features:

 * I access the slots of two persistent objects.

 * The number of the slots and the times requested
   together produce very bad performance (think seconds)
   even with PM txn caching (for comparison, BDB is about
   three times faster)

 * The environment is multi-threaded (web server), but the
   slots won't be changed by any other process.

 * Ideally the slots would be cached only for this one
   function and the functions called by it (and only
   per-invocation, i.e. slot caches get refreshed right at
   the beginning of the function).

 * This is currently the only place in my app where I would
   need the performance advantages of slot caching. In all
   other places ACID is highly preferred and speed is sufficient.

 * The desired behaviour can be somewhat modelled by CLSQL's
   OO interface:

     - get the objects from the DB at the beginning

     - work with those in-memory objects

     - write back the values to the DB at the end of the process

   The difference is that I don't want the whole object (other slot
   values of it might be changed from outside!) but only a few
   selected slots.

I think we can basically do this today. A refresh command simply reads from the DB for all cached slots (in a transaction this is thread safe and avoids the aforementioned problem). You operate on the cached data, nothing happens in the transaction, and at the end you do a 'save' and those cached slots get written to disk.

I think this meets Leslie's use case, and I think it's an hour or two to implement on top of what is already there.
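
The refresh/save cycle described above might look roughly like the following. This is a sketch under assumptions, written in Python for brevity: SlotCache, its method names, and the dictionary standing in for the DB are all hypothetical, and the lock merely stands in for a real transaction. Only the selected slots are read and written back, so other slots of the same object can still be changed from outside.

```python
import threading

# Hypothetical sketch; none of these names are Elephant's actual API.


class SlotCache:
    """Caches a few selected slots of one persistent object."""

    def __init__(self, db, lock, obj_id, slot_names):
        self.db = db              # dict keyed by (obj_id, slot_name)
        self.lock = lock          # stands in for a DB transaction
        self.obj_id = obj_id
        self.slot_names = list(slot_names)
        self.slots = {}
        self.dirty = set()

    def refresh(self):
        # Read all cached slots in one "transaction".
        with self.lock:
            self.slots = {
                name: self.db[(self.obj_id, name)] for name in self.slot_names
            }
        self.dirty.clear()

    def get(self, name):
        return self.slots[name]

    def set(self, name, value):
        # Pure in-memory update; nothing touches the DB here.
        self.slots[name] = value
        self.dirty.add(name)

    def save(self):
        # Write back only the cached slots that were modified.
        with self.lock:
            for name in self.dirty:
                self.db[(self.obj_id, name)] = self.slots[name]
        self.dirty.clear()


db = {("acct", "balance"): 100, ("acct", "notes"): "old"}
lock = threading.Lock()

cache = SlotCache(db, lock, "acct", ["balance"])
cache.refresh()                                  # start of the function
cache.set("balance", cache.get("balance") + 50)  # work on cached data
db[("acct", "notes")] = "changed elsewhere"      # uncached slot changes outside
cache.save()                                     # end of the function
```

After the save, the 'balance' slot holds the new value and the outside change to 'notes' is untouched. Note that this still relies on Leslie's stated assumption that nobody else writes the cached slots in between.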


> However, I don't know if this is more important than a native-lisp
> backend, or a query-language. For the next year at least I am working
> at a job rather than working on my lisp application; and even then I was
> happy with the performance I was getting out of DCM.  So I personally
> don't have performance need that drives anything.  I wish I knew how
> many new users we would have from better performance vs. a native-lisp
> backend vs. a query-language, or what our existing users would prefer.

My two dollars on this topic is that the most interesting thing to improve adoption and overall utility is a lisp-only backend to get going with. The most interesting value to the current users, including myself, is a query system that manages and abstracts some of the performance query hacks that today you have to write yourself in lisp, often over and over.

I think of the query system, by the way, as a DSL (domain specific language) extension of lisp, not a SQL syntax. So it's not an either/or; it's exactly what Lisp was meant to do: enable linguistic abstraction that makes thinking about a given problem easier. That's what I think when I hear 'lisp as the query language'.

Rucksack strikes me as the best way to start on the lisp-only front, because so much is there. It's a non-trivial port/adaptation, so someone needs to be willing to put in a week or two (at least) of serious effort.

I think we may also be able to change it so that it only writes a transaction log and doesn't write the underlying DB unless something is flushed from the cache. What I like about Rucksack for a more prevalence-style model (and maybe I'm misreading this and it's not flushing objects to disk on each write) is that it already implements versioning as its transaction model, which gets around fine-grained locking performance problems. If we add in Robert's DCM ideas about having a cache instead of the whole DB in memory, then we could imagine writing flushed objects to disk and effectively incrementally syncing the memory objects to disk rather than having to do a full snapshot every so often.
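
As a rough illustration of the log-plus-cache idea, here is a small Python sketch. It is not how Rucksack or DCM work today; LoggedCache and everything in it is hypothetical, a list stands in for the on-disk transaction log, a dict stands in for the underlying DB, and the eviction policy is assumed to be least-recently-used. Every write goes to the log and the cache, and the DB is touched only when a modified object is flushed from the cache.

```python
from collections import OrderedDict

# Hypothetical sketch of "log every write, sync to the DB only on flush".


class LoggedCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # in-memory objects, least recent first
        self.dirty = set()          # cached keys not yet written to the DB
        self.log = []               # stands in for the transaction log
        self.disk = {}              # stands in for the underlying DB

    def write(self, key, value):
        # Every write is logged, but the DB itself is not touched.
        self.log.append((key, value))
        self.cache[key] = value
        self.cache.move_to_end(key)
        self.dirty.add(key)
        self._evict()

    def read(self, key):
        if key not in self.cache:
            # Cache miss: load the object from the DB.
            self.cache[key] = self.disk[key]
            self._evict()
        self.cache.move_to_end(key)
        return self.cache[key]

    def _evict(self):
        # Flush least recently used objects; write them only if modified.
        while len(self.cache) > self.capacity:
            key, value = self.cache.popitem(last=False)
            if key in self.dirty:
                self.disk[key] = value
                self.dirty.discard(key)


store = LoggedCache(capacity=2)
store.write("a", 1)
store.write("b", 2)
disk_before_flush = dict(store.disk)  # nothing has been written to the DB yet
store.write("c", 3)                   # "a" is flushed and written to the DB
disk_after_flush = dict(store.disk)
value_a = store.read("a")             # reloaded from the DB; "b" is flushed
```

Crash recovery (replaying the log over the DB) and log truncation are left out; the point is only that the DB is synced incrementally, one flushed object at a time, instead of by a periodic full snapshot.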


Regards,
Ian


_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
