[jira] [Updated] (JCR-4057) Improve performance of skipping offset nodes for Lucene queries

Nils Breunese (JIRA) Fri, 11 Nov 2016 04:11:46 -0800

     [ https://issues.apache.org/jira/browse/JCR-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Nils Breunese updated JCR-4057:
-------------------------------
    Description: 
When doing Lucene-based queries with large offset values like 12000 I see 
pretty bad performance in our system. I have already enabled the 
{{sizeEstimate}} option to improve performance, but still see queries taking 6 
to 66 seconds.

I have identified the call to {{collectScoreNodes}} for offset nodes in 
{{org.apache.jackrabbit.core.query.lucene.QueryResultImpl#getResults}} as the 
cause. For the offset nodes, {{collectScoreNodes}} builds an anonymous 
{{ArrayList<ScoreNode[]>}} that is never used afterwards, so it wastes memory, 
and it also performs access checks on nodes that are never returned.

I have attached a patch against Jackrabbit 2.10.4 that instead just calls 
{{skip}} on the {{MultiColumnQueryHits result}}. With this patch, our query 
times seem to stay under 2 seconds.

  was:
When doing Lucene-based queries with large offset values like 12000 I see 
pretty bad performance in our system. I have already enabled the 
{{sizeEstimate}} option to improve performance, but still see queries taking 6 
to 66 seconds.

I have identified the call to {{collectScoreNodes}} in 
{{org.apache.jackrabbit.core.query.lucene.QueryResultImpl#getResults}} as the 
cause. For the offset nodes, {{collectScoreNodes}} builds an anonymous 
{{ArrayList<ScoreNode[]>}} that is never used afterwards, so it wastes memory, 
and it also performs access checks on nodes that are never returned.

I have attached a patch against Jackrabbit 2.10.4 that instead just calls 
{{skip}} on the {{MultiColumnQueryHits result}}. With this patch, our query 
times seem to stay under 2 seconds.


> Improve performance of skipping offset nodes for Lucene queries
> ---------------------------------------------------------------
>
>                 Key: JCR-4057
>                 URL: https://issues.apache.org/jira/browse/JCR-4057
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 2.10.4
>            Reporter: Nils Breunese
>              Labels: performance
>         Attachments: JCR-4057-test.patch, JCR-4057.patch
>
>
> When doing Lucene-based queries with large offset values like 12000 I see 
> pretty bad performance in our system. I have already enabled the 
> {{sizeEstimate}} option to improve performance, but still see queries taking 
> 6 to 66 seconds.
> I have identified the call to {{collectScoreNodes}} for offset nodes in 
> {{org.apache.jackrabbit.core.query.lucene.QueryResultImpl#getResults}} as 
> the cause. For the offset nodes, {{collectScoreNodes}} builds an anonymous 
> {{ArrayList<ScoreNode[]>}} that is never used afterwards, so it wastes 
> memory, and it also performs access checks on nodes that are never returned.
> I have attached a patch against Jackrabbit 2.10.4 that instead just calls 
> {{skip}} on the {{MultiColumnQueryHits result}}. With this patch, our query 
> times seem to stay under 2 seconds.
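
To illustrate the difference described above, here is a minimal, self-contained
sketch. It is not the attached patch and does not use the Jackrabbit classes:
{{Hits}}, {{ListHits}}, {{pageByCollecting}} and {{pageBySkipping}} are
hypothetical stand-ins invented for this example, and rows are plain
{{String[]}} values rather than {{ScoreNode[]}}. It only shows why collecting
offset rows into a throwaway list costs one read and one access check per
skipped row, while calling {{skip}} costs neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class OffsetSkipSketch {

    /** Hypothetical, simplified hits cursor; NOT the Jackrabbit interface. */
    public interface Hits {
        /** Returns the next row, or null when there are no more rows. */
        String[] next();

        /** Advances past n rows without materializing them. */
        void skip(int n);
    }

    /** In-memory cursor that counts how many rows were actually read. */
    public static final class ListHits implements Hits {
        private final List<String[]> rows;
        private int position = 0;
        public int materialized = 0;

        public ListHits(List<String[]> rows) {
            this.rows = rows;
        }

        @Override
        public String[] next() {
            if (position >= rows.size()) {
                return null;
            }
            materialized++;
            return rows.get(position++);
        }

        @Override
        public void skip(int n) {
            position = Math.min(rows.size(), position + n);
        }
    }

    /** Reads and access-checks the offset rows, then discards them. */
    public static List<String[]> pageByCollecting(
            Hits hits, Predicate<String[]> canRead, int offset, int limit) {
        collect(hits, canRead, new ArrayList<>(), offset); // list is never used
        List<String[]> page = new ArrayList<>();
        collect(hits, canRead, page, limit);
        return page;
    }

    /** Skips the offset rows on the cursor, then collects only the page. */
    public static List<String[]> pageBySkipping(
            Hits hits, Predicate<String[]> canRead, int offset, int limit) {
        hits.skip(offset);
        List<String[]> page = new ArrayList<>();
        collect(hits, canRead, page, limit);
        return page;
    }

    /** Adds up to max readable rows from hits to out. */
    private static void collect(
            Hits hits, Predicate<String[]> canRead, List<String[]> out, int max) {
        String[] row;
        while (out.size() < max && (row = hits.next()) != null) {
            if (canRead.test(row)) {
                out.add(row);
            }
        }
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < 12010; i++) {
            rows.add(new String[] {"node" + i});
        }
        ListHits collecting = new ListHits(rows);
        pageByCollecting(collecting, row -> true, 12000, 10);
        ListHits skipping = new ListHits(rows);
        pageBySkipping(skipping, row -> true, 12000, 10);
        System.out.println("rows read when collecting: " + collecting.materialized);
        System.out.println("rows read when skipping:   " + skipping.materialized);
    }
}
```

In this sketch the two variants return the same page only when every skipped
row is readable: the collecting variant counts the offset in readable rows,
the skipping variant counts it in raw rows. Whether that distinction matters
for the real code is something to check when reviewing the attached patch.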



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
