On Tue, May 4, 2010 at 2:20 PM, Laurence Rowe <l...@lrowe.co.uk> wrote:
> On 2 May 2010 22:03, Luciano Ramalho <luci...@ramalho.org> wrote:
>
>> In these, we don't store serialized objects, but just the data to
>> reconstruct the objects. But the data is not completely dismembered in
>> some normalized form.
>> In a semi-structured database the data graph can follow very closely
>> the original object graph, which makes retrieval easier for the
>> programmer and more efficient for the database. And the schema is
>> self-describing, which means if you have a database backup, then you
>> are able to get to the data even if you don't have the software that
>> put it there.
>
> ...
>
>> Here is formal definition, from the same source:
>>
>> """
>> A semi-structured data instance is a rooted, directed graph in which
>> the edges carry labels representing schema components, and leaf nodes
>> (i.e., nodes without any outgoing edges) are labeled with data values
>> (integers, reals, strings, etc.).
>> """
>
> I suspect that databases such as CouchDB and the others you mention
> are not well suited to graph traversal. Efficient traversal must occur
> near the data, otherwise you pay the latency cost on each edge
> traversed. In ZODB this is achieved through the object cache - you
> expect most of the target object's parents to already be present in
> the cache when serving a request. In other systems traversal happens
> in the database - for an example see
> http://highscalability.com/neo4j-graph-database-kicks-buttox.

The problem with the formal definition I quoted is that it is academic
and not very practical. There is a related concept, graph databases,
but that is not what interests me for content management applications.

In practice, content management does not need very deep graphs. What
is highly useful is being able to store more data along with a
document. For instance, tags, attributions and comments (to a degree)
are all things that can easily be stored within a document in CouchDB
and similar databases. Even references to other objects are much
easier to manage when they are stored with the document rather than as
foreign keys (for example, related documents can be represented as a
list of OIDs instead of records in a link table).

We know all that because we use the ZODB in pretty much the same way,
right?

Cheers,

Luciano
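
P.S. To make the point about references concrete, here is a minimal
sketch in plain Python. It does not use the CouchDB or ZODB APIs; the
OIDs, field names and helper functions are all made up for
illustration. It only contrasts keeping a list of related OIDs inside
the document with keeping the same relationships in a separate link
table.

```python
import json

# Hypothetical documents keyed by OID. Field names are illustrative
# only; this is not a CouchDB or ZODB schema.
documents = {
    "doc-1": {
        "title": "Semi-structured storage",
        "tags": ["zodb", "couchdb"],
        "comments": [{"author": "reader", "text": "What about traversal?"}],
        "related": ["doc-2", "doc-3"],  # references stored with the document
    },
    "doc-2": {"title": "Graph databases", "tags": [], "comments": [], "related": []},
    "doc-3": {"title": "Object caches", "tags": [], "comments": [], "related": []},
}


def related_titles(oid):
    """Resolve the related OIDs stored inside one document."""
    return [documents[ref]["title"] for ref in documents[oid]["related"]]


# The same relationships expressed as rows in a separate link table.
link_table = [("doc-1", "doc-2"), ("doc-1", "doc-3")]


def related_titles_via_links(oid):
    """Resolve related documents by scanning the link table."""
    return [documents[dst]["title"] for src, dst in link_table if src == oid]


# The serialized document carries its own field names, so it can be
# read back without the software that wrote it.
serialized = json.dumps(documents["doc-1"], sort_keys=True)
```

Both lookups return the same titles. The difference is that in the
first case everything about doc-1, including its tags, comments and
related list, travels together in one record.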