Hi,

2008/3/31 Andreas Hartmann <[EMAIL PROTECTED]>:
> Martin wrote:
> > I've talked to my supervisor and comparison to other technologies with
> > some benchmarks could be interesting.
>
> I wonder if it is really appropriate, or even possible, to compare
> "technologies" regarding performance. IMO the performance is rather an
> aspect of the implementation.

I for one would be very interested in seeing cross-technology comparisons for various common use cases. There are many cases for which relational databases are better suited than content repositories, but the opposite is also true.

See http://www.mit.edu/~dna/vldb07hstore.pdf for a very interesting paper on the limits of traditional RDBM systems. One of the key points related to JCR is the observation (see section 3.1) that many typical applications use "tree schemas", i.e. a hierarchy of 1-n relationships that map very well to the hierarchical model in Jackrabbit. Most notably, a hierarchical database can in many cases avoid expensive JOINs for such schemas.

It would be really cool to see a thesis that evaluates JCR content repositories in light of the above paper.

> - SQL query speed comparison with MySQL/PostgreSQL
> - read/write comparisons with filesystems

I'm sure that Jackrabbit will lose on both of those comparisons. The main benefit in using a JCR content repository comes not from duplicating content structures found in existing storage models, but from going beyond their current limitations.

For example, any non-trivial RDBMS application requires a number of joins that can easily become quite expensive. Standard JCR doesn't even support joins as a query concept, but the tree hierarchy gives 1-n relationships and thus many 1-n joins essentially for free. Thus I'd not compare the raw query performance between a relational database and a content repository, but rather the higher-level performance for selected use cases based on a content model that's designed to best leverage the capabilities of the underlying system.

The same goes for JCR versus the file system. Most non-trivial applications that use the file system for storage end up using XML files or other parsed resources for handling fine-grained content like individual dates, strings, numbers, etc.
A content repository natively supports such fine-grained content, so many read and update operations that target such "small data" are much more convenient and often faster than in a file-based solution that requires explicit parsing and serializing (not to mention locking) of larger chunks of data.

BR,

Jukka Zitting
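
As a rough illustration of the tree-schema point above, here is a toy sketch of the same 1-n lookup done once with a JOIN over two tables and once by walking child nodes. It is not JCR or Jackrabbit API code: the `Node` class, the `blog`/`post` tables and the sample data are all made up for the example, and it only shows the shape of the access, not any performance result.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a repository node (not the JCR Node API)."""
    name: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child):
        # Attach a child node and return it so calls can be chained.
        self.children.append(child)
        return child


def build_relational():
    """Flat model: parent and child rows live in separate tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE post (id INTEGER PRIMARY KEY, blog_id INTEGER, title TEXT);
        INSERT INTO blog VALUES (1, 'news'), (2, 'notes');
        INSERT INTO post VALUES (1, 1, 'Hello'), (2, 1, 'World'), (3, 2, 'Other');
        """
    )
    return conn


def build_tree():
    """Tree model: the 1-n relationship is simply parent -> children."""
    root = Node("root")
    news = root.add(Node("news"))
    news.add(Node("post1", {"title": "Hello"}))
    news.add(Node("post2", {"title": "World"}))
    notes = root.add(Node("notes"))
    notes.add(Node("post3", {"title": "Other"}))
    return root


def titles_via_join(conn, blog_name):
    """Resolve the 1-n relationship with an explicit JOIN."""
    rows = conn.execute(
        "SELECT p.title FROM blog b JOIN post p ON p.blog_id = b.id "
        "WHERE b.name = ? ORDER BY p.id",
        (blog_name,),
    ).fetchall()
    return [title for (title,) in rows]


def titles_via_tree(root, blog_name):
    """Resolve the same relationship by reading the children of one node."""
    for blog in root.children:
        if blog.name == blog_name:
            return [post.properties["title"] for post in blog.children]
    return []


join_titles = titles_via_join(build_relational(), "news")
tree_titles = titles_via_tree(build_tree(), "news")
print(join_titles, tree_titles)
```

Both functions return the same titles; the difference is that the tree version needs no join condition, because the relationship is already encoded in where the content lives.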
