Re: CouchDB x RDF databases comparison

Nitin Borwankar Thu, 07 May 2009 13:05:02 -0700

Demetrius Nunes wrote:

Hi there,


We are evaluating new technologies for managing semi-structured data and
documents in one of our applications. We've grown tired of wrestling with
relational databases for this.

I would like to know why I would prefer to use CouchDB instead of an RDF
database, such as Sesame or Mulgara.

I know some of the RDF advantages, such as open standards, interoperability,
rules engines, semantic queries, community and tool support, maturity, etc.

But I really like the simplicity of the CouchDB model.

Can anyone enlighten me?

Thanks a lot,
Demetrius

Hi Demetrius,

We (bibkn.org) have investigated and used SQL databases, an RDF store (Virtuoso) and CouchDB for bibliographic metadata management. I am the project manager and data architect for this project.

Relational databases are often a first choice but have many limitations in the management of loosely typed, messy, string-based data sets. So we are in agreement on not using that technology.

We, bibkn.org, need both the schemalessness of CouchDB at one end of our workflow and the strong typing of RDF at the other end of the workflow, when all our data has been cleaned up and "ontologized". So we don't see this as an either/or between CouchDB and RDF stores.

However, we can definitely say one thing: if you need just the flexible schema aspect and are using RDF to give you that, then that is massive overkill, and the conceptual overhead of RDF (ontologies, schemas, namespaces, completely normalized everything, i.e. URIs for subject, predicate, object) is simply not worth it. If, however, you want to do logical inference and reasoning over your data, then clearly the RDF and semantic machinery gives you a whole lot of goodness that is worth the overhead.

So CouchDB is not a substitute for an RDF store, but you may be using an RDF store for the lesser things it gives you (flexible schema), and in that case CouchDB can do a lot more for you at a much lower overhead and with much greater ease of use and integration into existing tools.
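To make the flexible-schema point concrete, here is a minimal Python sketch, not our actual code. It only builds the HTTP PUT requests that CouchDB's document API expects (PUT /{db}/{docid} with a JSON body) and does not send them. The server URL, the database name "biblio", the helper name and all record fields are made up for illustration. Note that the two records have different fields and nothing has to be declared up front.

```python
import json
import urllib.parse
import urllib.request

# Assumptions for illustration only: a local CouchDB and a database called "biblio".
COUCH_URL = "http://localhost:5984"
DB_NAME = "biblio"


def build_put_request(doc_id, doc):
    """Build (but do not send) the PUT request that would store `doc` under `doc_id`."""
    # Document IDs go into the URL path, so they are percent-encoded.
    url = f"{COUCH_URL}/{DB_NAME}/{urllib.parse.quote(doc_id, safe='')}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Two hypothetical records with different shapes; no schema is declared anywhere.
records = {
    "rec-1": {"title": "An untitled preprint", "author": "Doe, J."},
    "rec-2": {"title": "Another paper", "authors": ["Roe, R.", "Poe, E."], "year": "2008?"},
}

requests = [build_put_request(doc_id, doc) for doc_id, doc in records.items()]

# To actually store a document you would call urllib.request.urlopen(req)
# against a running CouchDB instance where the database already exists.
```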

Additionally, SPARQL (like SQL) is not really meant for text search, which is critical for loosely typed data. So even at our RDF end we have a Solr instance for rapid text search over the RDF store.

We also have couchdb-lucene as an extension on our CouchDB instance, and this has given us everything we need at the loosely typed data end of our workflow.

So if semi-structured data and document management is your primary use case and there is no semantic/ontology/inference component, then forget RDF stores and just go with CouchDB.

In our project we are developing a format on top of JSON to export bibliographic metadata for integration into JSON-friendly data consumers; it also happens to have an easy mapping to RDF.

So even if you go to Couch now, you may be able to integrate into an RDF store at some later stage if the need arises.
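To give a feel for why a JSON document can map to RDF without much trouble, here is a rough Python sketch. It is not our export format: the base URI, the vocabulary URI, the function names and the record fields are all invented for illustration. The idea is simply that the document ID becomes the subject, each key becomes a predicate, and each value (or each element of a list value) becomes a literal object.

```python
# Hypothetical URIs, used only for this illustration.
BASE = "http://example.org/record/"
VOCAB = "http://example.org/vocab/"


def record_to_triples(record):
    """Turn a flat JSON-style dict into (subject, predicate, object) tuples."""
    subject = BASE + record["id"]
    triples = []
    for key, value in record.items():
        if key == "id":
            continue
        # A list value yields one triple per element.
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((subject, VOCAB + key, str(item)))
    return triples


def to_ntriples(triples):
    """Serialize triples as N-Triples lines with plain string literals.

    Only backslashes, double quotes and newlines are escaped here,
    which is enough for simple strings but not a complete serializer.
    """
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


record = {"id": "rec-2", "title": "Another paper", "authors": ["Roe, R.", "Poe, E."]}
print(to_ntriples(record_to_triples(record)))
```

Nested objects, typed literals and links between records would need real design decisions, which is where the ontology work comes in.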


Hope this helps,

Nitin Borwankar,
Project Manager,  Bibliographic Knowledge Network
bibkn.org
