Hi,

On Tue, Mar 6, 2012 at 5:01 PM, Jukka Zitting <[email protected]> wrote:
> Rather than discuss this issue in the abstract, I suggest that we
> define a set of relevant performance benchmarks, and use them for
> evaluating potential alternatives.
In addition to this specific case, I think it's important that we define
and implement a good set of performance and scalability benchmarks as
early as possible. That allows us to get a good picture of where we are
and which areas and potential bottlenecks need more focus. Such a set of
benchmarks should also make it easy to evaluate alternative designs and
produce hard evidence to help resolve potential disagreements.

So what should we benchmark then? Here's one idea to get us started:

* Large, flat hierarchy (selected pages-articles dump from Wikipedia)
* Time it takes to load all articles (ideally as a single transaction)
* Amount of disk space used
* Time it takes to iterate over all articles
* Number of reads by X clients in Y seconds (power-law distribution)
* Number of writes by X clients in Y seconds (power-law distribution)

Ideally we'd design the benchmarks so that they can be run not just
against different configurations of Oak, but also against Jackrabbit 2.x
and other databases (SQL and NoSQL) like Oracle, PostgreSQL, CouchDB
and MongoDB.

To start with, I'd target the following basic deployment configurations:

* 1 node, MB-range test sets (small embedded or development/testing deployment)
* 4 nodes, GB-range test sets (mid-size non-cloud deployment)
* 16 nodes, TB-range test sets (low-end cloud deployment)

WDYT?

BR,

Jukka Zitting
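[Editor's note: the power-law access pattern mentioned for the read/write benchmarks could be driven by a simple Zipf sampler like the sketch below. This is an illustrative example only, using plain JDK classes; the class name, exponent, and item count are assumptions, not part of any actual Oak benchmark code.]

```java
import java.util.Random;

// Sketch: sample article indices under a Zipf (power-law) distribution,
// so a few "hot" articles receive most of the simulated client traffic.
class ZipfSampler {
    private final double[] cdf;   // cumulative probabilities over n items
    private final Random random;

    ZipfSampler(int n, double exponent, long seed) {
        cdf = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 1.0 / Math.pow(i + 1, exponent);  // rank-based weight
            cdf[i] = sum;
        }
        for (int i = 0; i < n; i++) {
            cdf[i] /= sum;  // normalize so cdf[n-1] == 1.0
        }
        random = new Random(seed);  // fixed seed keeps runs reproducible
    }

    // Returns a 0-based item index; low indices are sampled most often.
    int next() {
        double u = random.nextDouble();
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {  // binary search for the first cdf[i] >= u
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

Each simulated client would call next() to pick which article to read or write, giving the skewed access pattern the benchmark asks for without needing a real traffic trace.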
