Nils Breunese created JCR-4057:
----------------------------------
Summary: Improve performance of skipping offset nodes for Lucene
queries
Key: JCR-4057
URL: https://issues.apache.org/jira/browse/JCR-4057
Project: Jackrabbit Content Repository
Issue Type: Improvement
Components: core
Affects Versions: 2.10.4
Reporter: Nils Breunese
Attachments: JCR-4057.patch
When running Lucene-based queries with large offset values (for example
12000), we see poor performance in our system. We have already enabled the
{{sizeEstimate}} option to improve performance, but queries still take 6 to
66 seconds.
We identified the call to {{collectScoreNodes}} in
{{org.apache.jackrabbit.core.query.lucene.QueryResultImpl#getResults}} as the
cause. For the offset nodes, {{collectScoreNodes}} builds an anonymous
{{ArrayList<ScoreNode[]>}} that is never used after it is built, so the memory
is wasted. It also performs access checks on these nodes, even though they are
never returned.
I have attached a patch against Jackrabbit 2.10.4 that simply calls {{skip}}
on the {{MultiColumnQueryHits result}} instead. With this patch, our query
times seem to stay under 2 seconds.
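
To illustrate the idea (this is not the attached patch), here is a minimal,
self-contained sketch of the two strategies. The {{Hits}} interface,
{{ListHits}}, {{isAccessGranted}} and the method names are hypothetical
stand-ins, not Jackrabbit classes, and the sketch assumes every hit passes the
access check, so both strategies return the same page.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetSkipSketch {

    /** Hypothetical stand-in for a stream of query hits (not a Jackrabbit type). */
    interface Hits {
        String next();     // next hit, or null when exhausted
        void skip(int n);  // advance past n hits without materializing them
    }

    /** In-memory hit stream used only for this illustration. */
    static final class ListHits implements Hits {
        private final List<String> items;
        private int pos = 0;

        ListHits(List<String> items) {
            this.items = items;
        }

        public String next() {
            return pos < items.size() ? items.get(pos++) : null;
        }

        public void skip(int n) {
            pos = Math.min(items.size(), pos + Math.max(0, n));
        }
    }

    /** Counts how many (simulated) access checks were performed. */
    static int accessChecks = 0;

    /** Assumption for this sketch: every hit is readable. */
    static boolean isAccessGranted(String hit) {
        accessChecks++;
        return true;
    }

    /** Collects up to max hits, running an access check on each one. */
    static List<String> collect(Hits hits, int max) {
        List<String> out = new ArrayList<>();
        while (out.size() < max) {
            String hit = hits.next();
            if (hit == null) {
                break;
            }
            if (isAccessGranted(hit)) {
                out.add(hit);
            }
        }
        return out;
    }

    /** Slow variant: builds a list of offset hits and throws it away. */
    static List<String> pageByCollecting(Hits hits, int offset, int limit) {
        collect(hits, offset); // allocated and access-checked, then discarded
        return collect(hits, limit);
    }

    /** Fast variant: skips the offset hits, then collects only the page. */
    static List<String> pageBySkipping(Hits hits, int offset, int limit) {
        hits.skip(offset);
        return collect(hits, limit);
    }
}
```

In this sketch, an offset of 12000 with a small limit costs 12000 extra access
checks and a throwaway list in the collecting variant, while the skipping
variant only checks the hits that are actually returned.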