At work we have a web app project that we're considering rewriting, both
in a different language and on a different type of database system. I
won't go over why we're switching from PHP to Ruby without Rails right
now, but I'm trying to find a database system that fits our working
structure better than what we are currently using.

Right now we're basically abusing a single MySQL table to create a
semantic-like structure: it's just an "atoms" table with a subject
column, a predicate column, and a value column. This structure simply is
not scaling the way we want it to. Unless we go beyond simple queries and
embed raw SQL in the application with a huge pile of unreadable JOIN
statements, all we can do is inefficiently grab dozens of pieces of data
individually, over and over, as we walk the structure.

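
As a rough illustration of that table's shape, here is a minimal Python sketch of one widget flattened into (subject, predicate, value) rows. The `to_atoms` helper and the one-row-per-list-element layout are assumptions made for the example, not a description of the real schema.

```python
# Hypothetical helper: flatten one widget into (subject, predicate, value)
# rows, the shape of the "atoms" table described above.

def to_atoms(uid, widget):
    """Return a list of (subject, predicate, value) tuples for one widget."""
    atoms = []
    for predicate, value in widget.items():
        if isinstance(value, list):
            # Assumption: a list is stored as one atom per element.
            atoms.extend((uid, predicate, item) for item in value)
        else:
            atoms.append((uid, predicate, value))
    return atoms


widget = {"isa": "widget", "type": "container", "hasJit": ["uidBBB", "uidCCC"]}
atoms = to_atoms("uidAAA", widget)
for atom in atoms:
    print(atom)
```

Even this small widget becomes four rows, which hints at why reassembling a whole tree from such a table takes either many small queries or a stack of self-joins.
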
Ultimately, I have no interest in building some complex querying system
to make everything more readable and compatible with SQL when I know
that systems actually meant for this kind of data structure already
exist. I've been looking over various semantic database engines (because
our structure quite simply is a semantic model), but I've had trouble
getting anything working. My biggest issue is getting data out of the
engine: the interfaces from one language to another aren't documented
well, and the communities feel fairly inactive when it comes to getting
help.

So I'm looking at CouchDB to see if it's possible to use it efficiently
with our data model.

I'll dive into the basic structure now. This is heavily oversimplified
compared to the actual structure of our data, but it should be enough to
show what the issue is and what we'd need out of CouchDB.

Our system is built up of Widgets. These are arranged like a tree most
of the time, though not always (occasionally a Widget can have multiple
parents, though usually only one of them matters; it's not really
relevant to the discussion, so I won't go into it). We use a structure
somewhat like:

"uidZZZ": {
  "isa": "widget",
  "type": "page",
  "state": {},
  "hasJit": [
    "uidYYY"
  ]
},
"uidYYY": {
  "isa": "widget",
  "type": "container",
  "state": {},
  "hasJit": [
    "uidAAA"
  ]
},
"uidAAA": {
  "isa": "widget",
  "type": "container",
  "state": {},
  "hasJit": [
    "uidBBB",
    "uidCCC"
  ]
},
"uidBBB": {
  "isa": "widget",
  "type": "text",
  "state": {
    "content": "Some text content"
  }
},
"uidCCC": {
  "isa": "jit",
  "type": "text",
  "state": {
    "content": "Some text content"
  }
}

Understand that this tree hierarchy can grow fairly large even for a
simple page. The big issue currently is that we have to ask the database
first for the ids of the widgets under a page, then individually ask for
information about each of those widgets along with the ids of the widgets
under them, then query again for the widgets under those, and so on. All
of that means requests going back and forth between the app and the
database, when ideally we'd just give the database the id of the page,
tell it to walk the tree hierarchy, and have it return all the relevant
widgets for the page at once; we could handle the rest on our own.

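
The walk described above can be sketched in Python as follows. This is only an illustration under stated assumptions: a plain dict (`DOCS`) stands in for the database, `walk` is a made-up name, and "state" is left out of the sample documents for brevity. It says nothing about whether or how a database could do the same thing server-side.

```python
from collections import deque

# Sample documents from the example structure, keyed by uid.
# Assumption: an in-memory dict stands in for the real database.
DOCS = {
    "uidZZZ": {"isa": "widget", "type": "page", "hasJit": ["uidYYY"]},
    "uidYYY": {"isa": "widget", "type": "container", "hasJit": ["uidAAA"]},
    "uidAAA": {"isa": "widget", "type": "container",
               "hasJit": ["uidBBB", "uidCCC"]},
    "uidBBB": {"isa": "widget", "type": "text"},
    "uidCCC": {"isa": "jit", "type": "text"},
}


def walk(docs, root_id):
    """Breadth-first walk from root_id; returns {uid: doc} for the subtree."""
    found = {}
    queue = deque([root_id])
    while queue:
        uid = queue.popleft()
        if uid in found or uid not in docs:
            # Skip widgets already visited (multiple parents) and dangling ids.
            continue
        found[uid] = docs[uid]
        queue.extend(docs[uid].get("hasJit", []))
    return found


page = walk(DOCS, "uidZZZ")      # the whole page
subtree = walk(DOCS, "uidAAA")   # a nested widget as the root
print(list(page))
print(list(subtree))
```

In this sketch each dictionary lookup stands for one round trip to the database, which is the cost we'd like to avoid; the root can be any uid, not just a page.
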
So, CouchDB does look extremely promising, with views and the potential
for generating things like inverse relationships database-side. But is
it possible to set up something like a walk, where we can query the
database for all objects relevant to a certain tree structure? (Note
that the root of that tree doesn't have to be a page; in our dynamic
system it's perfectly valid to ask for stuff relative to one of those
nested widgets rather than asking for the entire page.)
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire)