Hello all,
although I am still reviewing CouchDB to better understand its
design, I would like to ask for comments on a small idea.
I would like to add another index structure to CouchDB (a Merkle tree)
and have been asking myself what the best way of doing this would be.
I only have a rough idea of how tightly couch_btree is woven into CouchDB.
Therefore I would like to hear comments from you experienced developers
on some of my ideas:
My suggestion is a MySQL-like approach (pluggable engines) for CouchDB,
that is, to factor out several components into generic behaviours:
e.g.
- a couch_gen_tree:
  abstracts access to couch_btree
Maybe even a
- couch_gen_storage:
  e.g. file system, file storage access, etc.
- couch_gen_replicator:
  an imperative approach to tree / storage replication.
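
To make the couch_gen_tree idea a bit more concrete, here is a minimal
sketch of what "pluggable index structures" could look like. It is written
in Python purely for illustration; in CouchDB this would be an Erlang
behaviour, and I have not checked it against the real couch_btree API.
All names (GenTree, SortedTree, MerkleTree, Database) are hypothetical.

```python
import hashlib
from abc import ABC, abstractmethod
from bisect import insort
from typing import Optional


class GenTree(ABC):
    """Hypothetical stand-in for a couch_gen_tree behaviour."""

    @abstractmethod
    def insert(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def lookup(self, key: str) -> Optional[bytes]: ...


class SortedTree(GenTree):
    """Plays the role of couch_btree here: keeps keys in sorted order."""

    def __init__(self) -> None:
        self._keys: list = []
        self._values: dict = {}

    def insert(self, key: str, value: bytes) -> None:
        if key not in self._values:
            insort(self._keys, key)  # keep the key list ordered
        self._values[key] = value

    def lookup(self, key: str) -> Optional[bytes]:
        return self._values.get(key)


class MerkleTree(GenTree):
    """Same interface, plus a root hash that summarises all entries."""

    def __init__(self) -> None:
        self._values: dict = {}

    def insert(self, key: str, value: bytes) -> None:
        self._values[key] = value

    def lookup(self, key: str) -> Optional[bytes]:
        return self._values.get(key)

    def root_hash(self) -> str:
        # Leaves: one hash per (key, value) pair, in key order.
        level = [
            hashlib.sha256(k.encode() + b"\0" + self._values[k]).digest()
            for k in sorted(self._values)
        ]
        if not level:
            return hashlib.sha256(b"").hexdigest()
        # Pairwise-hash each level until a single root remains.
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])  # duplicate the last node if odd
            level = [
                hashlib.sha256(level[i] + level[i + 1]).digest()
                for i in range(0, len(level), 2)
            ]
        return level[0].hex()


class Database:
    """The database only talks to the GenTree interface."""

    def __init__(self, index: GenTree) -> None:
        self._index = index

    def save(self, doc_id: str, body: bytes) -> None:
        self._index.insert(doc_id, body)

    def open(self, doc_id: str) -> Optional[bytes]:
        return self._index.lookup(doc_id)
```

The point is only that Database never needs to know which tree it was
given: Database(SortedTree()) and Database(MerkleTree()) behave the same
for save/open, while the Merkle variant additionally lets two replicas
compare root_hash() to detect whether their contents differ.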
As I said: I am new to CouchDB's code, so I cannot really judge what
the current layering looks like, or whether these three components can
even be split out.
IMHO there would be several benefits in having the flexibility brought
by couch_gen_*:
- new use-cases for CouchDB:
  - R-trees: for adding another way of querying
    documents (e.g. nearest neighbour search)
  - genome databases
  - a special data structure for indexing tags
  - ...
- with a flexible storage layer, CouchDB could run on top of other
infrastructures and products, like S3, SimpleDB, AppEngine, etc.
- (the following is just a guess:) a cleaner CouchDB codebase with a
clear layering and separation of components
- possibly (again, just a guess): with the plugin approach, we can more
easily support advanced indexing and db management schemes, like
distributed storage access, distributed transactions, etc.
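
For the flexible storage layer mentioned above, a similar sketch, again in
Python only for illustration. It assumes an append-only storage model with
two operations (append and read); the names GenStorage, MemoryStorage,
FileStorage and store_and_fetch are hypothetical and not taken from
CouchDB's code. A backend for S3 or similar would implement the same two
operations.

```python
import os
from abc import ABC, abstractmethod


class GenStorage(ABC):
    """Hypothetical stand-in for a couch_gen_storage behaviour."""

    @abstractmethod
    def append(self, data: bytes) -> int:
        """Append data and return the offset it was written at."""

    @abstractmethod
    def read(self, offset: int, length: int) -> bytes:
        """Read length bytes starting at offset."""


class MemoryStorage(GenStorage):
    """Keeps everything in a growing in-memory buffer."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, data: bytes) -> int:
        offset = len(self._buf)
        self._buf += data
        return offset

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._buf[offset:offset + length])


class FileStorage(GenStorage):
    """Appends to a single file on the local file system."""

    def __init__(self, path: str) -> None:
        self._path = path
        open(path, "ab").close()  # create the file if it does not exist

    def append(self, data: bytes) -> int:
        with open(self._path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # current end of file
            f.write(data)
            return offset

    def read(self, offset: int, length: int) -> bytes:
        with open(self._path, "rb") as f:
            f.seek(offset)
            return f.read(length)


def store_and_fetch(storage: GenStorage, payload: bytes) -> bytes:
    """Code written against GenStorage works with any backend."""
    offset = storage.append(payload)
    return storage.read(offset, len(payload))
```

Whether CouchDB's real file access can be reduced to an interface this
small is exactly the kind of thing I cannot yet judge.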
Martin