Hello Ian,

On Mon, Oct 11, 2010 at 6:55 PM, Ian Boston wrote:
> Hi,
> We have a Query that we are having performance problems with. The people
> doing load testing are saying we will need a cluster of 5, 4 core servers to
> support 600 concurrent users. For us this is not good, since that not many
> users and a lot of hardware.
>
> The query is:
> //*...@sling:resourceType='sakai/user-home']/*[fn:name() = 'public' or
> fn:name() = 'pages']//*[jcr:contains(.,'keyword')]]
>
> The content structure is of the form:
> /_user/i/ie/ieb/ieb2/ieb23389
>    - sling:resourceType = sakai/user-home
>    public/*/*/*/*.....*/ somecontent
>    pages/*/*/*/* .... */someothercontent
>
> where there may be many nodes (many = thousands) that match the query per
> "sakai/user-home" node, but we only want to get a list "sakai/user-home"
> nodes.
>
> At the moment we are having to iterate through all the nodes (10,000) and
> filter them to get a distinct list (10)
>
> We have been looking at using aggregates in the index configuration, but with
> no success yet.

I don't think this will help you.

> Is there a better way of formulating the XPath to just return the nodes we
> want ?

The problem is that, in Lucene terms, you have quite a challenging query. As a
matter of fact, it is a very hard query to execute, let alone in a hierarchical
structure like Jackrabbit's. I'll give it another thought, but I am not sure
whether your use case can be handled in a performant way.

Ard

> Ian

-- 
Hippo
Europe • Amsterdam Oosteinde 11 • 1017 WT Amsterdam • +31 (0)20 522 4466
USA • San Francisco 185 H Street Suite B • Petaluma CA 94952-5100 • +1 (707) 773 4646
Canada • Montréal 5369 Boulevard St-Laurent • Montréal QC H2T 1S5 • +1 (514) 316 8966
www.onehippo.com • www.onehippo.org
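
For reference, the client-side filtering described in the quoted message (iterating over every hit and reducing the hits to a distinct list of "sakai/user-home" nodes) might look roughly like the sketch below. It is not taken from the thread and does not use the JCR API: it works on plain path strings, the class and method names are hypothetical, the second user path in `main` is invented for illustration, and it assumes the user-home path itself never contains a `public` or `pages` segment.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: collapse search-hit paths to distinct user-home paths.
public class UserHomeDedup {

    // Cut a hit path at the first "/public/" or "/pages/" segment and return
    // the prefix, i.e. the path of the enclosing user-home node. Returns null
    // when the hit lies under neither subtree.
    // Assumption: the user-home path itself contains neither segment.
    public static String userHomeOf(String hitPath) {
        int cut = -1;
        for (String marker : new String[] {"/public/", "/pages/"}) {
            int i = hitPath.indexOf(marker);
            if (i >= 0 && (cut < 0 || i < cut)) {
                cut = i;
            }
        }
        return cut < 0 ? null : hitPath.substring(0, cut);
    }

    // Reduce many hit paths to the distinct user-home paths, keeping the
    // order in which each user-home was first seen.
    public static List<String> distinctUserHomes(Iterable<String> hitPaths) {
        Set<String> homes = new LinkedHashSet<>();
        for (String path : hitPaths) {
            String home = userHomeOf(path);
            if (home != null) {
                homes.add(home);
            }
        }
        return new ArrayList<>(homes);
    }

    public static void main(String[] args) {
        // Example paths; the second user is made up for illustration.
        List<String> hits = List.of(
            "/_user/i/ie/ieb/ieb2/ieb23389/public/a/b/somecontent",
            "/_user/i/ie/ieb/ieb2/ieb23389/pages/c/someothercontent",
            "/_user/j/jo/joe/joe1/joe12345/public/x/somecontent");
        System.out.println(distinctUserHomes(hits));
    }
}
```

This still touches every hit, so it only illustrates the filtering step; it does nothing about the cost of the query itself.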
