On 8.3.12 14:17, Jukka Zitting wrote:
So what should we benchmark then? Here's one idea to get us started:

* Large, flat hierarchy (selected pages-articles dump from Wikipedia)
   * Time it takes to load all articles (ideally in a single transaction)
   * Amount of disk space used
   * Time it takes to iterate over all articles
   * Number of reads by X clients in Y seconds (power-law distribution)
   * Number of writes by X clients in Y seconds (power-law distribution)
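The power-law read/write mix above could be driven by a sampler along these lines (a sketch only; the class name, parameters, and seed are illustrative, not part of any existing benchmark code):

```java
import java.util.Random;

// Hypothetical helper for the read/write benchmarks: picks article ranks
// following a power-law (Zipf) distribution, so a few "hot" articles get
// most of the traffic, roughly as on a real wiki.
public class ZipfPicker {
    private final double[] cdf;   // cumulative probabilities over n articles
    private final Random random;

    public ZipfPicker(int n, double exponent, long seed) {
        cdf = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 1.0 / Math.pow(i + 1, exponent);
            cdf[i] = sum;
        }
        for (int i = 0; i < n; i++) {
            cdf[i] /= sum;        // normalize to a proper CDF
        }
        random = new Random(seed);
    }

    /** Returns an article rank in [0, n), rank 0 being the most popular. */
    public int next() {
        double u = random.nextDouble();
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {         // binary search for the first cdf[i] >= u
            int mid = (lo + hi) / 2;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        ZipfPicker picker = new ZipfPicker(10000, 1.0, 42L);
        int[] hits = new int[10000];
        for (int i = 0; i < 100000; i++) {
            hits[picker.next()]++;
        }
        System.out.println("hits for rank 0: " + hits[0]);
        System.out.println("hits for rank 9999: " + hits[9999]);
    }
}
```

Each simulated client would draw its next article from such a sampler and issue a read or write against it, so the hot-spot behavior is comparable across all tested backends.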

Ack. In addition, we should add tests that check that large numbers of direct child nodes (millions) work. That is, adding a child node should take constant time irrespective of how many child nodes already exist. This use case seems quite important to us.
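A minimal sketch of what such a flat-hierarchy test could look like, using a hash-map-backed in-memory node as a stand-in for the repository (in the real benchmark the map operation would be replaced by the corresponding JCR addNode call against the backend under test):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a flat-hierarchy benchmark: add N direct child nodes under a
// single parent and report the time per batch. If additions are constant
// time, batch times should stay roughly flat as the child count grows.
public class FlatHierarchyBench {
    static final int TOTAL = 1_000_000;   // "millions" of direct children
    static final int BATCH = 100_000;

    public static void main(String[] args) {
        // Stand-in for the parent node's child list; a real run would go
        // through the repository API instead of a plain HashMap.
        Map<String, Object> children = new HashMap<>();
        long start = System.nanoTime();
        for (int i = 0; i < TOTAL; i++) {
            children.put("child-" + i, Boolean.TRUE);
            if ((i + 1) % BATCH == 0) {
                long now = System.nanoTime();
                System.out.printf("children: %,d  batch time: %d ms%n",
                        i + 1, (now - start) / 1_000_000);
                start = now;
            }
        }
        System.out.println("total children: " + children.size());
    }
}
```

The per-batch timings give a quick visual check for the constant-time property; a backend whose child lookup degrades with fan-out would show steadily growing batch times instead of a flat profile.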

Michael

Ideally we'd design the benchmarks so that they can be run against not
just different configurations of Oak, but also Jackrabbit 2.x and
other databases (SQL and NoSQL) like Oracle, PostgreSQL, CouchDB and
MongoDB.

To start with, I'd target the following basic deployment configurations:

* 1 node, MB-range test sets (small embedded or development/testing deployment)
* 4 nodes, GB-range test sets (mid-size non-cloud deployment)
* 16 nodes, TB-range test sets (low-end cloud deployment)

Sounds like a good idea to me. Having such deployment configurations and the testing infrastructure ready from the beginning should help a lot during further development.

Michael


WDYT?

BR,

Jukka Zitting
