Hello Dennis,

it is slow because you are using a child axis twice in the query (once from jcr:root and once within the where clause). Explaining why is out of scope for the user list, but quite some time ago I wrote a few guidelines (most of them are still valid):
http://n4.nabble.com/Explanation-and-solutions-of-some-Jackrabbit-queries-regarding-performance-td516614.html#a516614

I am not sure what nodetype jcr:content is, but suppose: my:contenttype. Now, if your query would be:

//element(*,my:contenttype)[fn:lower-case(@cms:virtualPath)= '" + vpath + "']";

the query will be instant. Just take the parent node of the result and you should be fine.

Just wondering, are you building a brand new cms on jcr? I am not sure what the @cms:virtualPath holds, but if you also need virtual environments showing the same jcr nodes in different tree structures you might wanna take a look here [1].

Regards Ard

[1] http://www.onehippo.org/cms7

On Thu, Dec 3, 2009 at 9:47 AM, Dennis van der Laan <[email protected]> wrote:
> Hi,
>
> It seems querying on a property is very slow on our system (running
> Jackrabbit 1.6.0): almost 1 second per query which would normally return
> 0 or 1 result.
>
> We use jcr:content nodes of type nt:unstructured to store the contents
> of a file (in the jcr:data property) and we store an array of Strings in
> a property cms:virtualPath on the same node. So basically, every file in
> our repository has the JCR path and zero or more virtual paths in the
> cms:virtualPath property. If we want to add a virtual path to a file, we
> have to check if the virtual path does not exist already.
> For this, we use an XPath query in the following code snippet:
>
> String vpath = QueryUtil.escapeForAttributeSearch(path.toLowerCase());
> String query = "/jcr:root//element(*,nt:hierarchyNode)[fn:lower-case(jcr:content/@cms:virtualPath) = '" + vpath + "']";
> Query q = queryManager.createQuery(query, Query.XPATH);
> NodeIterator ni = q.execute().getNodes();
> if (ni.getSize() == 0) {
>     throw new ItemNotFoundException("Unable to find item by virtual path: " + path);
> }
> else if (ni.getSize() > 1) {
>     throw new IllegalStateException("More than 1 item on virtual path: " + path);
> }
> else {
>     return ni.nextNode();
> }
>
> Our repository now contains around 500,000 virtual paths, more or less
> divided over 150,000 files which are evenly distributed over more than
> 1000 (nested) folders.
>
> The repository runs on an Intel Nehalem Xeon (2 x 2.5GHz) running
> Solaris 10 and the repository database (for datastore, filesystem, etc)
> runs on the same specs, on a different server, running Oracle 10g.
>
> When we try to add virtual paths in a batch (about 2000 virtual path
> properties for 1000 files) and all virtual paths already exist (so the
> above query returns 1 virtual path), we see a 100% load of the our
> Tomcat application (which means 1 core fully utilized).
>
> I would expect a JCR repository to be able to handle this kind of
> queries. How are these properties indexed? Is it possible to optimize
> the repository for this kind of queries? Or should I use a different
> query? The alternative would be to keep a different database which keeps
> track of the virtual paths, but keeping that in sync with the JCR
> repository would be a pain, at the least.
>
> Thanks for your ideas about this issue,
> Kind regards,
> Dennis van der Laan
>
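
PS: for reference, a minimal sketch of how the rewritten query string from my reply could be built. The class and method names are hypothetical, my:contenttype is only the supposed node type from above, and the quote doubling is an assumption standing in for your own QueryUtil.escapeForAttributeSearch, which I have not seen.

```java
import java.util.Locale;

public class VirtualPathQuery {

    // Hypothetical helper: builds an XPath query that matches the property
    // directly on the content node, without the /jcr:root prefix and without
    // a child axis step inside the where clause.
    static String buildQuery(String nodeType, String virtualPath) {
        // Assumption: a single quote inside an XPath string literal is
        // escaped by doubling it.
        String escaped = virtualPath.toLowerCase(Locale.ROOT).replace("'", "''");
        return "//element(*," + nodeType + ")"
                + "[fn:lower-case(@cms:virtualPath) = '" + escaped + "']";
    }

    public static void main(String[] args) {
        // The hit is the content node itself; the file node is its parent.
        System.out.println(buildQuery("my:contenttype", "/Docs/Report.pdf"));
    }
}
```

The resulting string goes into queryManager.createQuery(query, Query.XPATH) as in your snippet, and you would call getParent() on the returned node to get the file.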
