[jira] Updated: (JCR-2835) Poor performance of ISDESCENDANTNODE on SQL 2 queries

Serge Huber (JIRA) Fri, 17 Dec 2010 02:59:29 -0800

     [ 
https://issues.apache.org/jira/browse/JCR-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Serge Huber updated JCR-2835:
-----------------------------

    Fix Version/s: 2.2.1

> Poor performance of ISDESCENDANTNODE on SQL 2 queries
> -----------------------------------------------------
>
>                 Key: JCR-2835
>                 URL: https://issues.apache.org/jira/browse/JCR-2835
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core, query
>    Affects Versions: 2.2.0, 2.2.1, 2.3.0
>            Reporter: Serge Huber
>             Fix For: 2.2.1, 2.3.0
>
>         Attachments: DescendantSearchTest.png, 
> JCR-2835-use-DescendantSelfAxisQuery.patch, JCR-2835_PerformanceTests.patch, 
> JCR-2835_Poor_performance_on_ISDESCENDANTNODE_constraint_v1.patch, 
> SQL2DescendantSearchTest.png
>
>
> Using the latest source code, I have noticed very poor performance on SQL-2 
> queries that use the ISDESCENDANTNODE constraint on a large sub-tree. For 
> example, the query: 
> select * from [jnt:news] as news where ISDESCENDANTNODE(news,'/root/site') 
> order by news.[date] desc 
> executes in 600ms, whereas the same query without the constraint: 
> select * from [jnt:news] as news order by news.[date] desc
> executes in 4ms.
> Profiling with YourKit suggests that the culprit is the constraint 
> building, which runs a Lucene search for every node in the sub-tree to 
> build the list of descendant node IDs: 
>     private Query getDescendantNodeQuery(
>             DescendantNode dn, JackrabbitIndexSearcher searcher)
>             throws RepositoryException, IOException {
>         BooleanQuery query = new BooleanQuery();
>         try {
>             LinkedList<NodeId> ids = new LinkedList<NodeId>();
>             NodeImpl ancestor =
>                     (NodeImpl) session.getNode(dn.getAncestorPath());
>             ids.add(ancestor.getNodeId());
>             while (!ids.isEmpty()) {
>                 String id = ids.removeFirst().toString();
>                 Query q = new JackrabbitTermQuery(
>                         new Term(FieldNames.PARENT, id));
>                 QueryHits hits = searcher.evaluate(q);
>                 ScoreNode sn = hits.nextScoreNode();
>                 if (sn != null) {
>                     query.add(q, SHOULD);
>                     do {
>                         ids.add(sn.getNodeId());
>                         sn = hits.nextScoreNode();
>                     } while (sn != null);
>                 }
>             }
>         } catch (PathNotFoundException e) {
>             query.add(new JackrabbitTermQuery(new Term(
>                     FieldNames.UUID, "invalid-node-id")), // never matches
>                     SHOULD);
>         }
>         return query;
>     }
> In the above example this generates over 2800 Lucene queries, which accounts 
> for the slowdown. I wonder whether it would be faster to retrieve the list 
> of child IDs through the JCR API instead?
> This was probably missed because I could not find any performance tests 
> for this constraint.
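
To illustrate why the loop quoted above scales with the size of the sub-tree, here is a minimal, self-contained sketch. It is not Jackrabbit code: the class DescendantLookupSketch, its methods, and the in-memory map standing in for the Lucene PARENT field are all hypothetical and exist only to count how many lookups the breadth-first expansion performs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the index: maps a parent node id to the ids of
// its direct children, in place of Lucene's PARENT field.
class DescendantLookupSketch {
    private final Map<String, List<String>> childrenByParent = new HashMap<>();
    private int lookups = 0;

    void addChild(String parentId, String childId) {
        childrenByParent.computeIfAbsent(parentId, k -> new ArrayList<>()).add(childId);
    }

    // Assumption: each call models one searcher.evaluate(...) on a PARENT term query.
    private List<String> findChildren(String parentId) {
        lookups++;
        return childrenByParent.getOrDefault(parentId, Collections.emptyList());
    }

    // Same breadth-first expansion as the quoted loop: every visited node,
    // including each leaf, costs one lookup.
    List<String> collectDescendants(String ancestorId) {
        List<String> descendants = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(ancestorId);
        while (!pending.isEmpty()) {
            String id = pending.removeFirst();
            for (String child : findChildren(id)) {
                descendants.add(child);
                pending.add(child);
            }
        }
        return descendants;
    }

    int lookupCount() {
        return lookups;
    }
}
```

In this model a sub-tree with N descendants always costs N + 1 lookups (one for the ancestor plus one per descendant), which is consistent with the "over 2800 Lucene queries" observed for a large sub-tree.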

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
