[ https://issues.apache.org/jira/browse/JCR-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Serge Huber updated JCR-2835:
-----------------------------
Fix Version/s: 2.2.1
> Poor performance of ISDESCENDANTNODE on SQL 2 queries
> -----------------------------------------------------
>
> Key: JCR-2835
> URL: https://issues.apache.org/jira/browse/JCR-2835
> Project: Jackrabbit Content Repository
> Issue Type: Improvement
> Components: jackrabbit-core, query
> Affects Versions: 2.2.0, 2.2.1, 2.3.0
> Reporter: Serge Huber
> Fix For: 2.2.1, 2.3.0
>
> Attachments: DescendantSearchTest.png,
> JCR-2835-use-DescendantSelfAxisQuery.patch, JCR-2835_PerformanceTests.patch,
> JCR-2835_Poor_performance_on_ISDESCENDANTNODE_constraint_v1.patch,
> SQL2DescendantSearchTest.png
>
>
> Using the latest source code, I have noticed very poor performance on SQL-2
> queries that use the ISDESCENDANTNODE constraint on a large subtree. For
> example, the query:
>     select * from [jnt:news] as news where ISDESCENDANTNODE(news,'/root/site')
>     order by news.[date] desc
> executes in 600ms, whereas
>     select * from [jnt:news] as news order by news.[date] desc
> executes in 4ms.
> Looking at the problem in the YourKit profiler, the culprit seems to be the
> constraint building, which uses recursive Lucene searches to build the list
> of descendant node IDs:
>     private Query getDescendantNodeQuery(
>             DescendantNode dn, JackrabbitIndexSearcher searcher)
>             throws RepositoryException, IOException {
>         BooleanQuery query = new BooleanQuery();
>         try {
>             LinkedList<NodeId> ids = new LinkedList<NodeId>();
>             NodeImpl ancestor = (NodeImpl) session.getNode(dn.getAncestorPath());
>             ids.add(ancestor.getNodeId());
>             while (!ids.isEmpty()) {
>                 String id = ids.removeFirst().toString();
>                 Query q = new JackrabbitTermQuery(new Term(FieldNames.PARENT, id));
>                 QueryHits hits = searcher.evaluate(q);
>                 ScoreNode sn = hits.nextScoreNode();
>                 if (sn != null) {
>                     query.add(q, SHOULD);
>                     do {
>                         ids.add(sn.getNodeId());
>                         sn = hits.nextScoreNode();
>                     } while (sn != null);
>                 }
>             }
>         } catch (PathNotFoundException e) {
>             query.add(new JackrabbitTermQuery(new Term(
>                     FieldNames.UUID, "invalid-node-id")), // never matches
>                     SHOULD);
>         }
>         return query;
>     }
> In the above example this generates over 2800 Lucene queries, which accounts
> for the slowdown. I wonder if it wouldn't be faster to use the JCR API to
> retrieve the list of child IDs?
> This was probably also missed because I couldn't find any performance tests
> for this constraint.
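
To illustrate why the query count grows with the size of the subtree, here is a minimal sketch of the same breadth-first walk over a toy in-memory index. It is not Jackrabbit code: the class DescendantQueryCount, the parentIndex map and the buildTree helper are hypothetical stand-ins, and a map lookup takes the place of the term query on FieldNames.PARENT. As in the quoted method, every node taken off the queue costs one lookup, whether or not it has children, so the number of lookups equals the number of nodes in the subtree, including the ancestor itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (hypothetical, not part of Jackrabbit): the "index" maps a parent id
// to its child ids, standing in for a Lucene term query on the PARENT field.
class DescendantQueryCount {

    /** Counts the index lookups needed to walk the subtree below rootId breadth-first. */
    public static int countLookups(Map<String, List<String>> parentIndex, String rootId) {
        Deque<String> pending = new ArrayDeque<String>();
        pending.add(rootId);
        int lookups = 0;
        while (!pending.isEmpty()) {
            String id = pending.removeFirst();
            lookups++; // one lookup per dequeued node, even for leaves
            List<String> children = parentIndex.getOrDefault(id, Collections.<String>emptyList());
            pending.addAll(children);
        }
        return lookups;
    }

    /** Builds a complete tree with the given fan-out and depth; the root id is "r". */
    public static Map<String, List<String>> buildTree(int fanOut, int depth) {
        Map<String, List<String>> index = new HashMap<String, List<String>>();
        addChildren(index, "r", fanOut, depth);
        return index;
    }

    private static void addChildren(Map<String, List<String>> index, String id,
                                    int fanOut, int depth) {
        if (depth == 0) {
            return; // leaves have no entry in the index
        }
        List<String> children = new ArrayList<String>();
        for (int i = 0; i < fanOut; i++) {
            String child = id + "/" + i;
            children.add(child);
            addChildren(index, child, fanOut, depth - 1);
        }
        index.put(id, children);
    }

    public static void main(String[] args) {
        // Fan-out 3, depth 4: 1 + 3 + 9 + 27 + 81 nodes in total.
        Map<String, List<String>> index = buildTree(3, 4);
        System.out.println("lookups: " + countLookups(index, "r"));
    }
}
```

Under this model, the "over 2800 Lucene queries" reported above would correspond to a subtree of roughly that many nodes below /root/site. The sketch only counts lookups; it says nothing about the relative cost of a Lucene term query versus a JCR child-node traversal, which would have to be measured against a real repository.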