Hi Bertrand,

Some numbers:

Jackrabbit: Ingest = 43189ms. Retrieve = 580ms
JPA/Database: Ingest = 86ms. Retrieve = 33ms.

The data structure looks like this:

Instrument:
  String name;
  String model;
  ...

Dataset:
  Instrument instrument;
  String type;
  String units;
  List<Value> values;

Value:
  Date time;
  Double value;

In Jackrabbit the path looks like
/<instrument>/<dataset>/YYYY/MM/DD/<value>

I can probably improve the ingest time by an order of magnitude by more
intelligent session handling, but the retrieval also needs to be improved
and I don't know how.

In the production system, using PostgreSQL as the back end, with 100,000
points across 50 instruments, it takes about 3 seconds to execute the query
to retrieve the dataset. This needs to be < 1s at worst, as it feeds other
systems.

Advice would be gratefully received.

Cheers
Nigel

2009/8/15 Bertrand Delacretaz <[email protected]>

> Hi Nigel,
>
> On Sat, Aug 15, 2009 at 6:32 AM, Nigel Sim<[email protected]> wrote:
> > ...Thanks for your suggestion. Unfortunately, even in the simplest case of 100
> > nodes in the root node, the time taken to retrieve is too long. If I could
> > resolve this fundamental speed issue then I could apply your solution to
> > help me scale my system....
>
> How much is too long, and how do you retrieve the nodes?
> I'm curious, as retrieving 100 nodes by navigating the JCR
> parent/child relationships should not be that slow.
>
> > ...I think I just need to bite the bullet and admit my use case doesn't really
> > map on Jackrabbit :)...
>
> If you tell us a bit more about your data structure, someone might be
> able to help.
> Did you have a look at http://wiki.apache.org/jackrabbit/DavidsModel ?
> That can help structure things in a JCR-friendly way.
>
> -Bertrand
>

--
JCU eResearch Centre
School Of Business (IT)
James Cook University
