Re: Tokyo Cabinet

The API is nearly identical to BDB's. I think a TC version of the datastore would be pretty easy to do. The only way it makes sense to me to do this is to deprecate the BDB data store as of the next major release. Any thoughts on this?

Deprecation:

Speaking of which, what is the thinking on the CL-SQL store? With postmodern, there is a SQL interface - do we want to maintain CL-SQL long term? It does cover more SQL backends, including the easy-to-use SQLite, but it's another fork to maintain. I'm agnostic, but I thought I'd toss that out for Robert to comment on.

Re: DirectoryStorage

That's essentially the idea behind Rucksack, if I recall correctly. Finding a way to leverage the existing filesystem is a good idea. That may be a way to bypass the low-level locking, Btree design, paging and caching issues in the near term and perhaps the long term. Does anyone have a sense of the performance implications of this vs. BDB?

Some potential issues:
- Lots of file handles being created and destroyed
  (i.e., when walking an index, to get a primary value you have to open and
  close each object's file)
- Lots of open file handles!
- Efficient index implementation
- Secondary indices
- We have to update whole objects on each commit, not just slot values as today. This may or may not matter given that an object usually lives on one BDB page and locking in BDB is done at the page level...
- I think we still need a C interface to a POSIX function to lock a file explicitly; does Windows have a similar interface?
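
A rough sketch of the one-file-per-object idea with explicit file locking, in Python for brevity rather than Lisp. The function names, the ".obj" suffix and the directory layout are made up for illustration; `fcntl.flock` is POSIX-only and advisory, so this says nothing about what Windows would need.

```python
import fcntl
import os


def write_object(root, oid, payload):
    """Rewrite the whole object file while holding an exclusive lock."""
    path = os.path.join(root, f"{oid}.obj")  # hypothetical layout: <root>/<oid>.obj
    # Open without truncating, so nothing is destroyed before we hold the lock.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other holder remains
        try:
            f.truncate(0)
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_object(root, oid):
    """Read the whole object file under a shared lock."""
    path = os.path.join(root, f"{oid}.obj")
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)  # many readers may hold this at once
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Note that every read and write opens and closes a file, which is exactly the file-handle churn listed above, and that each write replaces the whole object rather than a single slot.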

Ian


On Feb 13, 2008, at 1:33 PM, Ben wrote:

one (perhaps insane) idea to make an all-lisp backend easier to
implement was to leverage the underlying file system ala ZODB
directory storage, since the file system is probably using B-trees
anyways.  there are fairly good architecture docs on

http://dirstorage.sourceforge.net/

tokyo cabinet looks good too.

b

On Feb 13, 2008 7:41 AM, Ian Eslick <[EMAIL PROTECTED]> wrote:
In general, I'm with Henrik on this.  I'd rather see us get Elephant
to a reasonable degree of feature completeness before we start to add
more non-lisp datastore functionality.  You can use postmodern for
licensing purposes and BDB for performance.

The answer to all of this, I think, is having a native lisp version
that has BDB's performance and no licensing restrictions.  Then
supporting the other two becomes: Postmodern for a higher degree of
reliability as well as for distributed systems and BDB for legacy
reasons.

I have a pretty good idea in my head of what an all-lisp backend
requires and having one would lay to rest all of these discussions of
bringing up "yet another backend".  Edi Weitz and I discussed
collaborating on this, but unfortunately he had some other projects
that took priority.

Is there a small critical mass of people out there that care enough
about this that they'd be willing to contribute to such a project?  I
don't have the time to do it on my own, but if we broke it up into
small projects over the next handful of months, I don't think it's a
ton of work. I can put in a solid chunk of integration work in mid to
late April.

So what is involved?

The tricky problems I've discovered so far are:
- An efficient model of BTree-like storage for Elephant
  1) BDB-like paged data + explicit page cache + operations over fields
  2) Something more customized?
- Efficient pointer-based indexing (BTree plus ptrs to data in main BTree)
- Performing sorting and searching on serialized data rather than having to
  deserialize to sort as in the clsql backend (required to do BTree insertions)
- Transaction/logging architecture; how to store transaction data, track
  conflicts, etc. (at lisp layer, in page cache ala BDB?)
  multi-thread and multi-process safe?
- Locking to enable transactions on all 3 platforms; multi-process safe?
  (Is there a free C library that does this already? I think having a simple
  library that compensates for some of the missing features in lisps is fine.)
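
One way to sort and search on serialized data without deserializing is an order-preserving key encoding, so that a plain bytewise comparison gives the right answer. The sketch below is in Python for brevity and is not Elephant's serializer; the tag bytes, the 64-bit integer limit and the `encode_key` name are assumptions made for illustration.

```python
import struct


def encode_key(value):
    """Serialize an int or str so that bytewise order matches natural order.

    Assumed format (hypothetical): a one-byte type tag, then the payload.
    Keys of different types sort by tag, ints before strings.
    """
    if isinstance(value, int):
        # Bias a signed 64-bit integer into the unsigned range and store it
        # big-endian, so more significant bytes are compared first.
        return b"\x01" + struct.pack(">Q", value + (1 << 63))
    if isinstance(value, str):
        # UTF-8 bytes compare in the same order as Unicode code points.
        return b"\x02" + value.encode("utf-8")
    raise TypeError(f"unsupported key type: {type(value).__name__}")
```

With keys encoded this way, a BTree insertion only needs to compare byte strings, e.g. `sorted(keys, key=encode_key)` orders mixed negative and positive integers correctly.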

Some additional considerations:
- Do we add support for persistent heap garbage collection?
- Do we want to add support for large persistent sets?
- Do we want a server mode for N:1 distributed transactions?

This is by no means a trivial design, but I think if we sketched out
the architecture there are a set of subsystems that could be made
somewhat independent:
- BTrees and disk storage
- Database maintenance ops (reconstruct DB from log files, dump DB,
  optimize, etc.)
- transaction support and logging
- low-level locking library
- online garbage collection

Cheers,
Ian


On Feb 13, 2008, at 10:11 AM, Henrik Hjelte wrote:

I had never heard of this project, but it seems that Tokyo Cabinet
describes itself as fast, has transactions and can handle multiple
clients, which is good. And it has a tcp/ip interface and protocol, so
you wouldn't even need uffi/cffi to interface with it from Lisp. Tokyo
Cabinet seems to map well to the bdb model, so it should probably be
easier to write an interface for it than for the sql backends. One
observation, though: do we need yet another backend at this time? There
are other things to fix first on my personal wishlist.

Henrik
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
