Thanks for all the thoughtful responses. They really helped.

On Fri, May 8, 2009 at 4:09 AM, Daniel Friesen <[email protected]> wrote:
> There are also XML databases (XQuery) to compare (I'll just use X for
> simplicity). I ended up starting to use Sedna over at my work.
>
> CouchDB uses JSON, X uses XML.
> CouchDB uses views, X uses XQuery, which has some simple indexing and a
> significantly powerful and understandable query language.
> CouchDB has a Lucene plugin, Sedna can have an extra full-text index
> feature enabled.
> Updating data in CouchDB requires an entire document be updated, X
> databases can modify small parts of the document.
> CouchDB saves a new document each change, X works on a current document.
> CouchDB handles conflicts using conflict resolution, X makes the
> modification query on the current document in order of queries
> (transactions are also supported).
> CouchDB uses an HTTP REST API, most X databases use a normal binary
> protocol (Sedna seems to have a good set of libraries for most languages).
> CouchDB is distributed and scalable.
> In X databases documents can be grouped into collections. (These can also
> be used in queries.)
> It's probably a moot point, but XQuery is W3C standardized and implemented
> by a number of databases.
>
> IMHO compiling a comparison of alternative databases and seeing what
> features work best for what data you're working with is the best option.
>
> I went through the semantic databases myself too, because our company had
> "Semantics" in mind. I had issues getting them to work and finding help for
> most of them, and ended up finding that our data better fit the
> document-based database type. For us TQL was the only one with a
> significant improvement (we really needed the walk capabilities); other
> than that, Semantics were only a little better than an RDBMS (although we
> were actually using an RDBMS in an ugly semantic-like hack: an atoms table
> with 3 columns).
> Our reason for moving away from RDBMSs was a need to remove the large
> number of queries going between our app and the database.
> We had a huge amount of hierarchical data the entire app was based around
> (a tree structure wasn't even guaranteed; something could have multiple
> parents referencing it and be part of multiple trees).
> We decided on Sedna (XQuery) rather than CouchDB because CouchDB's views
> couldn't handle our hierarchical data in multiple documents, and we
> couldn't put everything in one document because we update small pieces of
> data a lot, which doesn't work out well with how entire documents need to
> be modified in Couch (transmitting the entire document to modify a single
> value, a new document revision saved each time, getting a conflict because
> an unrelated part of the document was modified).
>
> Personally I have an idea for another type of database. The one thing I've
> always wanted was one that is program oriented, i.e. simplifying a database
> down to what it is: centralized data storage. Instead of a query language,
> embedding an existing programming language into the database environment.
> I wrote a bit of API drafting on it.
>
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire)
>
>
> Nitin Borwankar wrote:
>
>> Demetrius Nunes wrote:
>>
>>> Hi Nitin,
>>>
>>> Great answer. Thanks a lot. One more question...
>>>
>>> I am in the Javaland here, so another viable option for my application is
>>> using JCR, such as the Apache Jackrabbit implementation.
>>>
>>>
>>
>> Hi Demetrius,
>>
>> I am a refugee from Javaland so am familiar with the power and limitations
>> of Java. Yes, I have looked at JCR and JackRabbit in a previous project.
>> These days I just recoil from the verbosity and conceptual layers you
>> encounter when coding simple things in Java. And then there's XML.....
>> So I would have held my nose and used JackRabbit if CouchDB didn't exist -
>> but in my mind it's a distant second in practice even if it is conceptually
>> similar and close in theory.
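
On Daniel's point above about whole-document updates in Couch: here is a minimal sketch of why two writers touching unrelated fields can still collide. It is a toy in-memory stand-in, not CouchDB's API. `ToyDocStore`, `Conflict` and the document fields are made-up names, and the integer revisions are a simplification (real CouchDB revisions are strings, and a stale write is rejected with an HTTP 409).

```python
import copy


class Conflict(Exception):
    """Raised when a write is based on a stale revision."""


class ToyDocStore:
    """Hypothetical in-memory store with whole-document, revision-checked writes."""

    def __init__(self):
        self._docs = {}  # doc id -> (revision number, document body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, copy.deepcopy(body)

    def put(self, doc_id, body, rev=None):
        # The caller must name the revision it read; anything else is stale.
        current_rev = self._docs[doc_id][0] if doc_id in self._docs else None
        if rev != current_rev:
            raise Conflict(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        # The whole body is stored again, even if only one field changed.
        self._docs[doc_id] = (new_rev, copy.deepcopy(body))
        return new_rev


store = ToyDocStore()
store.put("node-1", {"title": "Root", "parents": [], "hits": 0})

# Two clients read the same revision.
rev_a, doc_a = store.get("node-1")
rev_b, doc_b = store.get("node-1")

# Client A changes one field and sends the whole document back.
doc_a["hits"] += 1
store.put("node-1", doc_a, rev=rev_a)

# Client B changes an unrelated field, but its revision is now stale.
doc_b["title"] = "Root node"
try:
    store.put("node-1", doc_b, rev=rev_b)
    conflicted = False
except Conflict:
    conflicted = True

# Client B has to re-read, re-apply its change and resend everything.
rev_b, doc_b = store.get("node-1")
doc_b["title"] = "Root node"
final_rev = store.put("node-1", doc_b, rev=rev_b)
```

With many small updates to one large document, that read, modify, resend loop is the cost Daniel describes. Splitting the data into several documents avoids it but then runs into the view limitation he mentions.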
>>
>> Personally when I see layer upon layer of abstraction in Java architecture
>> diagrams I wonder how much of my CPU cost is going into converting from
>> strings, to TypeA to LayeredClassB to factoryC to ORM D to EJB4 to disk and
>> back again all the way to strings. So I am moving away from Java except
>> when the best-of-breed solution is in Java (Lucene/Solr) - so I don't hate
>> Java - I just need to justify the overhead that it brings both in coding
>> and in the build/install/deploy process.
>>
>> CouchDB has minimal overhead in roundtrip datatype translations - it's
>> what I call "WYSIWIS" - "what you see is what you store", i.e. JSON.
>> There are people looking at an alternative to LAMP which they call JS3 -
>> JavaScript in all three layers - browser/helma/couchdb (helma,
>> helma.org, is a middle tier layer written in Java, runs on Jetty, uses JS
>> as the language for doing UI templates and also ORM) - I personally think
>> CouchDB + CouchDBViews just makes it JS2 - browser-CouchDB.
>>
>> I would suggest you download Rhino (JS interpreter in Java) from Mozilla
>> and start playing with both CouchDB and JackRabbit and then see.
>>
>> Did I sound biased? :-)
>>
>>
>> Nitin Borwankar,
>> Project Manager, Bibliographic Knowledge Network.
>> bibkn.org
>>
>>> Did you happen to take a look at that as well? I think JCR has even more
>>> similarities with CouchDB than RDF.
>>>
>>> How would you compare JCR and CouchDB?
>>>
>>> Thanks a lot,
>>> Demetrius
>>>
>>> On Thu, May 7, 2009 at 5:04 PM, Nitin Borwankar <[email protected]>
>>> wrote:
>>>
>>>
>>>> Demetrius Nunes wrote:
>>>>
>>>>
>>>>> Hi there,
>>>>>
>>>>> We are evaluating new technologies for managing semi-structured data
>>>>> and documents in one of our applications. We've got tired of wrestling
>>>>> relational databases for this.
>>>>>
>>>>> I would like to know why I would prefer to use CouchDB instead of an
>>>>> RDF database, such as Sesame or Mulgara.
>>>>>
>>>>> I know some of the RDF advantages, such as open standards,
>>>>> interoperability, rules engines, semantic queries, community and tool
>>>>> support, maturity, etc.
>>>>>
>>>>> But I really like the simplicity of the CouchDB model.
>>>>>
>>>>> Can anyone enlighten me?
>>>>>
>>>>> Thanks a lot,
>>>>> Demetrius
>>>>>
>>>>>
>>>> Hi Demetrius,
>>>>
>>>> We (bibkn.org) have investigated and used SQL databases, an RDF store
>>>> (Virtuoso) and CouchDB for bibliographic metadata management. I am the
>>>> project manager and data architect for this project.
>>>> Relational databases are often a first choice but have many limitations
>>>> in the management of loosely typed, messy, string-based data sets. So we
>>>> are in agreement on not using that technology.
>>>>
>>>> We, bibkn.org, need both the schemalessness of CouchDB at one end of our
>>>> workflow and the strongly-typedness of RDF at the other end of the
>>>> workflow, when all our data has been cleaned up and "ontologized". So we
>>>> don't see this as an either/or between CouchDB and RDF stores.
>>>> However we can definitely say one thing - if you need just the flexible
>>>> schema aspect and are using RDF to give you that, then that is massive
>>>> overkill and the conceptual overhead of RDF (ontology, schemas,
>>>> namespaces, completely normalized everything, i.e. URIs for subject,
>>>> predicate, object) is simply not worth it. If however you want to do
>>>> logical inference and reasoning over your data then clearly the RDF and
>>>> semantic machinery gives you a whole lot of goodness that is worth the
>>>> overhead.
>>>>
>>>> So CouchDB is not a substitute for an RDF-store, but you may be using an
>>>> RDF-store for the lesser things it gives you (flexible schema) and in
>>>> that case CouchDB can do a lot more for you at a much lower overhead and
>>>> much greater ease of use and integration into existing tools.
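
To make the "completely normalized everything" point concrete, here is a rough sketch of one record stored as a single JSON document next to the same record flattened into subject/predicate/object triples. Everything in it is an assumption for illustration: the `http://example.org/` namespace, the `to_triples` helper, and the record itself are made up, and this is neither real RDF serialization nor the BibKN format.

```python
import json

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def to_triples(subject, record):
    """Flatten a dict into (subject, predicate, object) tuples.

    Every key becomes a predicate URI; list values yield one triple each.
    Objects are left as plain literals, which a real RDF store would type.
    """
    triples = []
    for key, value in record.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((subject, BASE + key, item))
    return triples


# A made-up bibliographic record.
record = {
    "title": "An Example Paper",
    "authors": ["A. Author", "B. Author"],
    "year": 2009,
}

# Document-store view: the record is stored as it is written.
as_json = json.dumps(record, sort_keys=True)

# Triple-store view: one statement per value, each needing agreed URIs.
as_triples = to_triples(BASE + "paper/1", record)

print(as_json)
for triple in as_triples:
    print(triple)
```

The document form needs no agreement on vocabulary up front, which is the flexible-schema benefit. The triple form only pays off once those predicate URIs mean something to an ontology or a reasoner.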
>>>>
>>>> Additionally SPARQL (like SQL) is not really meant for text search,
>>>> which is critical for loosely typed data. So even at our RDF end we have
>>>> a Solr instance for rapid text search over the RDF store.
>>>> Additionally we have couchdb-lucene as an extension on our CouchDB
>>>> instance and this has given us everything we need at the loosely typed
>>>> data end of our workflow.
>>>>
>>>> So if semi-structured data and document management is your primary use
>>>> case and there is no semantic/ontology/inference component then forget
>>>> RDF-stores and just go with CouchDB.
>>>>
>>>> In our project we are developing a format on top of JSON to export
>>>> bibliographic metadata for integration into JSON-friendly data
>>>> consumers; it also happens to have an easy mapping to RDF.
>>>> So even if you go to Couch now you may be able to integrate into an
>>>> RDF-store at some later stage if the need arises.
>>>>
>>>> Hope this helps,
>>>>
>>>> Nitin Borwankar,
>>>> Project Manager, Bibliographic Knowledge Network
>>>> bibkn.org
>>>>
>>>
>

--
____________________________
http://www.demetriusnunes.com
