Hi Nitin, Great answer. Thanks a lot. One more question...
I am in Javaland here, so another viable option for my application is
JCR, for example the Apache Jackrabbit implementation. Did you happen to
take a look at that as well? I think JCR has even more similarities with
CouchDB than RDF does. How would you compare JCR and CouchDB?

Thanks a lot,
Demetrius

On Thu, May 7, 2009 at 5:04 PM, Nitin Borwankar wrote:

> Demetrius Nunes wrote:
>
>> Hi there,
>>
>> We are evaluating new technologies for managing semi-structured data and
>> documents in one of our applications. We've got tired of wrestling
>> relational databases for this.
>>
>> I would like to know why I would prefer to use CouchDB instead of an RDF
>> database, such as Sesame or Mulgara.
>>
>> I know some of the RDF advantages, such as open standards,
>> interoperability, rules engines, semantic queries, community and tool
>> support, maturity, etc.
>>
>> But I really like the simplicity of the CouchDB model.
>>
>> Can anyone enlighten me?
>>
>> Thanks a lot,
>> Demetrius
>>
> Hi Demetrius,
>
> We (bibkn.org) have investigated and used SQL databases, an RDF store
> (Virtuoso) and CouchDB for bibliographic metadata management. I am the
> project manager and data architect for this project.
> Relational databases are often a first choice but have many limitations
> in the management of loosely typed, messy, string-based data sets. So we
> are in agreement on not using that technology.
>
> We, bibkn.org, need both the schemalessness of CouchDB at one end of our
> workflow and the strongly-typedness of RDF at the other end of the
> workflow, when all our data has been cleaned up and "ontologized". So we
> don't see this as an either/or between CouchDB and RDF stores.
>
> However we can definitely say one thing - if you need just the flexible
> schema aspect and are using RDF to give you that, then that is massive
> overkill, and the conceptual overhead of RDF (ontology, schemas,
> namespaces, completely normalized everything, i.e. URIs for subject,
> predicate, object) is simply not worth it. If however you want to do
> logical inference and reasoning over your data, then clearly the RDF and
> semantic machinery gives you a whole lot of goodness that is worth the
> overhead.
>
> So CouchDB is not a substitute for an RDF store, but you may be using an
> RDF store for the lesser things it gives you (flexible schema), and in
> that case CouchDB can do a lot more for you at a much lower overhead and
> with much greater ease of use and integration into existing tools.
>
> Additionally, SPARQL (like SQL) is not really meant for text search,
> which is critical for loosely typed data. So even at our RDF end we have
> a Solr instance for rapid text search over the RDF store.
> We also have couchdb-lucene as an extension on our CouchDB instance, and
> this has given us everything we need at the loosely typed data end of
> our workflow.
>
> So if semi-structured data and document management is your primary use
> case and there is no semantic/ontology/inference component, then forget
> RDF stores and just go with CouchDB.
>
> In our project we are developing a format on top of JSON to export
> bibliographic metadata for integration into JSON-friendly data
> consumers; it also happens to have an easy mapping to RDF.
> So even if you go to Couch now, you may be able to integrate into an
> RDF store at some later stage if the need arises.
>
> Hope this helps,
>
> Nitin Borwankar,
> Project Manager, Bibliographic Knowledge Network
> bibkn.org
>

--
____________________________
http://www.demetriusnunes.com
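
To make the "completely normalized everything" point in the quoted reply a bit more concrete, here is a minimal sketch of how one schemaless JSON document (the kind CouchDB stores as-is) might be flattened into subject/predicate/object triples. It is only an illustration under stated assumptions: the `example.org` namespace, the sample record, and the `to_triples` helper are all hypothetical and are not the bibkn.org format or any RDF library's API.

```python
# Hypothetical namespace and sample record, for illustration only.
BASE = "http://example.org/bib/"

# A schemaless document: fields can be added or omitted per record.
doc = {
    "_id": "borges1941",
    "title": "The Garden of Forking Paths",
    "authors": ["Jorge Luis Borges"],
    "year": 1941,
}


def to_triples(doc, base=BASE):
    """Flatten one JSON document into (subject, predicate, object) triples.

    The subject and every predicate become URIs under the assumed
    namespace; literal values are kept as plain Python values.
    """
    subject = base + "record/" + doc["_id"]
    triples = []
    for key, value in doc.items():
        if key == "_id":
            continue  # the id is already encoded in the subject URI
        predicate = base + "terms/" + key
        # A list-valued field yields one triple per element.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, predicate, v))
    return triples


triples = to_triples(doc)
for triple in triples:
    print(triple)
```

The document stays a single self-describing unit on the CouchDB side, while the triple form needs a URI scheme agreed on up front for every subject and predicate. That is the overhead the reply argues is only worth paying when inference or reasoning is actually required.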
