Hi all,
we are currently working on an application using Jackrabbit (CRX) with a lot of content (more than 15000 documents). To meet the requirements we had to create relations between our documents (like symlinks on Unix systems). For example, we have a document /docs/generated/document1 that should be referenced from several other locations, say:

/some/path/refToDocument1
/some/other/path/refToDocument1
/and/another/path/refToDocument1

We implemented this using a property in which we store the path to the original document. This reference is resolved in a higher layer of our application.

Now we have to obtain a list of all referenced documents below a given path, filtered by properties currently stored on the document itself and conditionally sorted by a set of properties. So far we have tried two approaches:

1. Setting a mixin type on the references, which we can query for. This gives us an unsorted, unfiltered (very large) result set, which we filter and sort afterwards.
2. Iterating (using multiple threads) over the whole tree below the given path, collecting only the nodes that match the given filter. Sorting is done afterwards.

Neither solution performed well. In (1) the search took about 900 ms (which I think is OK for about 10000 entries in the result set) and the filtering took about 3000 ms. In (2) the traversal, including filtering but not sorting, took 4500 ms. So neither solution is suitable for our project, and we are looking for a better way to model the given requirement.

So what is the best way to work with relational content in Jackrabbit?

The last idea I had to solve the performance issue was to reduce the size of the result set by querying the documents directly, applying filtering and sorting using Lucene, but this failed due to the complex sorting we have to implement. For example: order by property a when b does not exist, otherwise order by b. So is it possible to implement conditional sorting using the properties available in the index?
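To make the sorting rule concrete, here is a minimal sketch in plain Java of what "order by a when b does not exist, otherwise by b" means. It does not use the Jackrabbit API at all: documents are modelled as simple maps, and the class name ConditionalSort and its helper methods are hypothetical names made up for this illustration. It only shows the in-memory form of the rule, roughly what we currently have to do after the query returns.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: documents are plain maps, not JCR nodes.
class ConditionalSort {

    // Effective sort key: property "b" if it exists, otherwise property "a".
    static String sortKey(Map<String, String> props) {
        String b = props.get("b");
        return b != null ? b : props.get("a");
    }

    // Orders documents by the effective key; documents with neither property go last.
    static final Comparator<Map<String, String>> BY_B_ELSE_A =
            Comparator.comparing(ConditionalSort::sortKey,
                    Comparator.nullsLast(Comparator.<String>naturalOrder()));

    // Returns a sorted copy and leaves the input list untouched.
    static List<Map<String, String>> sorted(List<Map<String, String>> docs) {
        List<Map<String, String>> copy = new ArrayList<>(docs);
        copy.sort(BY_B_ELSE_A);
        return copy;
    }

    // Small helper to build a test document; null means "property does not exist".
    static Map<String, String> doc(String a, String b) {
        Map<String, String> props = new HashMap<>();
        if (a != null) props.put("a", a);
        if (b != null) props.put("b", b);
        return props;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("z", null)); // no b, so a is used
        docs.add(doc("a", "m"));  // b exists, so b is used and a is ignored
        docs.add(doc("b", null)); // no b, so a is used
        for (Map<String, String> d : sorted(docs)) {
            System.out.println(sortKey(d) + " <- " + d);
        }
    }
}
```

What we are looking for is a way to have this ordering applied by the index itself, so that the large result set never has to be loaded and sorted in application code.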
Any other hints regarding performance improvements are very welcome. (bundleCacheSize has already been increased to about 10% of the available heap size ;-))

Thanks so far,
Dirk Rudolph

T-Systems Multimedia Solutions GmbH
Organizational unit CCS
Dirk Rudolph
Software Development, OCJP
Street address: Riesaer Straße 5, 01129 Dresden
Postal address: Postfach 10 02 24, 01072 Dresden
+49 351 2820-5363 (Tel)
E-Mail: [email protected]
Internet: http://www.t-systems-mms.com

T-Systems Multimedia Solutions GmbH
Supervisory board: Klaus Werner (Chairman)
Management: Peter Klingenburg, Susanne Heger, Dr. Rolf Werner
Commercial register: Amtsgericht Dresden HRB 11433
Registered office: Dresden
VAT ID: DE 811 807 949
