On Tue, May 4, 2010 at 2:20 PM, Laurence Rowe <l...@lrowe.co.uk> wrote:
> On 2 May 2010 22:03, Luciano Ramalho <luci...@ramalho.org> wrote:
>> In these, we don't store serialized objects, but just the data to
>> reconstruct the objects. But the data is not completely dismembered in
>> some normalized form.
>> In a semi-structured database the data graph can follow very closely
>> the original object graph, which makes retrieval easier for the
>> programmer and more efficient for the database. And the schema is
>> self-describing, which means if you have a database backup, then you
>> are able to get to the data even if you don't have the software that
>> put it there.
> ...
>> Here is a formal definition, from the same source:
>> """
>> A semi-structured data instance is a rooted, directed graph in which
>> the edges carry labels representing schema components, and leaf nodes
>> (i.e., nodes without any outgoing edges) are labeled with data values
>> (integers, reals, strings, etc.).
>> """
> I suspect that databases such as CouchDB and the others you mention
> are not well suited to graph traversal. Efficient traversal must occur
> near the data, otherwise you pay the latency cost on each edge
> traversed. In ZODB this is achieved through the object cache - you
> expect most of the target object's parents to already be present in
> the cache when serving a request. In other systems traversal happens
> in the database - for an example see
> http://highscalability.com/neo4j-graph-database-kicks-buttox.

The problem with that formal definition I quoted is that it is
academic and not very practical. There is a related concept of graph
databases, but that is not what interests me for content management.

In practice, for content management we don't need very deep graphs.
But it is highly useful to be able to store more data along with a
document. For instance, tags, attributions, comments (to a degree) are
all things that can be easily stored within a document in CouchDB and
similar databases. Even references to other objects are much easier to
manage if they are stored with the document than as foreign keys (for
example, related documents can be represented as a list of OIDs
instead of records in a link table). We know all that because we use
the ZODB in pretty much the same way, right?
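
To make that concrete, here is a minimal sketch in plain Python of what I
mean by storing things with the document. It does not use CouchDB or the
ZODB: the `store` dict, the `save`, `load` and `related` helpers, and all
field names and ids are made up for illustration. The point is only that
tags, attributions, comments and a list of related ids travel with the
document, and that the stored form is readable JSON.

```python
import json

# Hypothetical in-memory store standing in for a document database:
# it maps a document id to the JSON text of the whole document.
store = {}


def save(doc):
    """Serialize the whole document, embedded data included, under its id."""
    store[doc["_id"]] = json.dumps(doc)


def load(doc_id):
    """Rebuild the document from its stored JSON text."""
    return json.loads(store[doc_id])


def related(doc):
    """Resolve the embedded list of ids into the referenced documents."""
    return [load(ref) for ref in doc.get("related", [])]


save({"_id": "doc-2", "title": "ZODB notes", "tags": ["zodb"]})
save({
    "_id": "doc-1",
    "title": "Semi-structured storage",
    "tags": ["couchdb", "cms"],
    "attribution": {"author": "someone", "license": "CC-BY"},
    "comments": [{"by": "reader", "text": "Nice overview"}],
    # References are kept with the document instead of in a link table.
    "related": ["doc-2"],
})

article = load("doc-1")
titles = [d["title"] for d in related(article)]
```

One fetch of "doc-1" brings back its tags, attribution and comments, and
following the related documents costs one extra lookup per id, with no
join against a separate table.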


Repoze-dev mailing list
