Re: couch_gen_btree: pluggable storage / tree engines

Robert Dionne Fri, 06 Feb 2009 04:32:14 -0800


Robert Dionne
Chief Programmer
dio...@dionne-associates.com
203.231.9961




On Feb 1, 2009, at 8:52 AM, Martin Scholl wrote:

Hello Robert,


Robert Dionne wrote:
Martin,

  I'm very keen on relationships between documents. Coming from the
description logic community, I'd like to allow users to declare certain fields that relate documents and then compute transitive closures over
dags whose nodes are documents and whose arcs the fields of interest.
This goes against the grain of couchdb as collections of unrelated
documents, I know, but it's what I want to do as couchdb's schema-less
design offers many advantages over relational databases. Relational
databases aren't that great for storing graphs either.
I like the idea (reminds me of an RDF DB btw), especially when used
together with views.

It does, though I'm convinced the OWL/RDF community is laboring under a delusion that the semantic web can be enabled if we just get enough ontologies out there and can federate them. Even RDF is overkill for most applications.

  I don't need to run full classification algorithms in the document
store, but would like to just maintain relationships (user-defined) and
transitive closures of them. Inferencing would perhaps be better done
externally similar to the hypercouch work. So this would best be served
by pluggable indexing and maybe pluggable storage, though I think I
could live without the latter for now.
With Antony's latest hints (thank you Antony!) in mind, I think I will
implement first sketches in an external way. FTI is implemented in
the same way afair.
  So I'm very excited about your ideas. I too have been reviewing the
code with this in mind and I would agree with others that it's perhaps a post 1.0 task. From the little time I've spent chasing down a couple of bugs I've seen there are a few subtle aspects to it. I've also noticed that the style of design in this community is more bottom-up, which is how it should be when building something new, so prototypes are perhaps
better for fleshing out ideas. Anyway I'm very happy to help and
collaborate on this as I can.
Great! I will just publish my results on github. I hope, others will
join then.
What worries me most is that I am still unsure how to differentiate between design docs and indexing schemes, and when to use which infrastructure.
Applied to the doc-relationship example you gave: how should
"intermediate results" of the dag processing be treated? As documents?
Should they be put into view functions? Should views be able to hint,
which indexing scheme is to be used? Depending on the index type,
indexing and doc / view-processing can become inherently coupled and
complex. Is this still CouchDB then?

Great question. I'd say no, it runs entirely against the grain of what CouchDB is. Documents aren't supposed to be related to one another. But relational databases don't handle this kind of thing either, so I figure why not CouchDB, as it offers other features that solve lots of problems. Here's a typical use case (quoted words are documents, those with asterisks are fields between documents):


"heart disease"  is *located_in* the "heart"
"myopathy" is *located_in* the "myocardium"
"myocardium" is *part_of* the "heart"

A reasoner might allow one to compose two relations, e.g. *located_in* composed with *part_of* is equal to *located_in*, and thus conclude that myopathy is a disease of the heart.
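
To make the composition rule concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the triple representation, the `FACTS` and `RULES` names and the `closure` function are all made up for this example and are not CouchDB API. It applies the single rule above to the three facts until nothing new can be derived, which is a naive recomputation rather than the incremental approach an index would need.

```python
# Hypothetical illustration only: none of these names come from CouchDB.
# Facts are (subject, relation, object) triples linking document ids.
FACTS = {
    ("heart disease", "located_in", "heart"),
    ("myopathy", "located_in", "myocardium"),
    ("myocardium", "part_of", "heart"),
}

# Composition rules: (r1, r2) -> r3 means "a r1 b" and "b r2 c" imply "a r3 c".
RULES = {("located_in", "part_of"): "located_in"}


def closure(facts, rules):
    """Apply the composition rules until no new triple appears (naive fixpoint)."""
    known = set(facts)
    while True:
        derived = {
            (a, rules[(r1, r2)], c)
            for (a, r1, b) in known
            for (b2, r2, c) in known
            if b == b2 and (r1, r2) in rules
        }
        if derived <= known:
            return known
        known |= derived


# Only the newly inferred links, i.e. what the reasoner adds to the stored facts.
inferred = closure(FACTS, RULES) - FACTS
print(inferred)
```

With these three facts the only inferred link is that "myopathy" is *located_in* the "heart".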

So these transitive closures of links between documents would need to be incrementally computed and treated the same as views. I think this would be best implemented with plugins in the same VM? This kind of processing seems to require a tighter coupling than something like full text indexing.


regards,

Bob



Martin
