Hi, Thanks for working on this. I am particularly interested in the potential of this backend because I need (a) speed and (b) the ability to build a binary which will run on end-user machines without requiring them to install a separate library.
If you send me the necessary patches (or get them integrated into their respective libraries), I would be happy to help you test this code on my application. I look forward to hearing from you. On Mon, Feb 2, 2009 at 12:29 PM, Ian Eslick <esl...@media.mit.edu> wrote: > Hi All, > > I was inspired the other day to come up with a minimal, quick-and- > dirty, all-lisp data store for Elephant based on cl-prevalence. The > problem with cl-prevalence is that every access to a persistent slot > has to be explicitly transactionally protected so it can be > recovered. This is onerous and violates the abstractions provided by > the rest of the elephant data stores. > > The general idea is to hide the cl-prevalence transaction model inside > the existing meta-object protocol. Instead of the usual hash table, > we use trees from cl-collections as our replacement for btrees and > indexes - this gives us reasonably efficient successor/predecessor ops > and makes it easier to implement the mapping and cursor APIs. > > It's not going to be nearly as fast as a full prevalence solution, but > it will be faster than BDB and should be much easier to install and > work with on a single-image basis. The new db-clp store requires a > patches to cl-prevalence and cl-containers. I'll see about getting > those put into the mainstream, but can send them to anyone interesting > in hacking on this in the meantime. > > I checked a prototype of this into elephant-1.0 today; it does not > interfere with existing functionality at all and I'm not sure yet > whether it will be part of 1.0. You can open a new store, subject to > the following caveats, by: > > In elephant root: ln -s src/contrib/eslick/db-clp/ele-clp.asd ele- > clp.asd > > (open-store '(:CLP "/home/me/db/")) > > where /home/me/db/ is a fresh directory. > > 85%+ of the tests pass and a ton of stuff does work: persistent > classes, btrees, dup-btrees, indexed-btrees, mapping (mostly), and > class indexing (mostly). > > > However, there are still some very serious holes. > The ones I'm aware of are: > > - You can only create, but not re-open a store > (this is due to a bootstrapping problem in recreating persistent > instances > when loading a snapshot) > > - No cursors are supported (all those tests fail), but should be easy > as there is a good match between the RB tree and the btree > abstractions but you will need to add a couple of functions to cl- > containers. > > - On-line recovery has not been tested. > > - I'm unsure of how cl-prevalence guarantees global transaction > serializability - I don't see locks anywhere in the code. > > - I faked out a bunch of serializer tests by including the default > serializer, > because they depend on buffer streams which the xml serializer > doesn't support; > the tests should be fixed to be more general and not rely on buffer > streams. > > - Some issues in schema evolution tests > > Performance issues: > > - Reads need to be serialized so are currently considered transactions > for simplicity, but they write a transaction log - they should be > rewritten to only use the serialization mechanism but not write a log > and the grabbing of a lock during serialization should be done once > per with-transaction call. > > - with-transaction is a no-op, a significant performance enhancement > would be to bundle a set of primitive operations such as tx-write- > slot, tx-remove-kv and write a single prevalence log entry for them. > This would avoid a disk write per primop. > > - I started using splay trees, retreated to RB trees due to bugs and > finally retreated to binary-search-trees due to more bugs. Moving to > a more efficient, balanced tree data structure will improve the > performance of all operations. However, several tests use linear > insertion, reducing the asymptotic behavior of the tree to that of a > list - it really grinds the tests to a halt. > > Obvious next steps, in order of increasing difficulty: > - Implement cursor API > - Fix red-black or splay tree implementation in cl-containers > - Figure out how to load from a snapshot > - Use with-transaction to improve performance > - Read transaction optimizations > > Regards, > Ian > > > > _______________________________________________ > elephant-devel site list > elephant-devel@common-lisp.net > http://common-lisp.net/mailman/listinfo/elephant-devel > -- Elliott Slaughter "Any road followed precisely to its end leads precisely nowhere." - Frank Herbert
_______________________________________________ elephant-devel site list elephant-devel@common-lisp.net http://common-lisp.net/mailman/listinfo/elephant-devel