[
https://issues.apache.org/jira/browse/JCR-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574307#comment-13574307
]
Tom Quellenberg commented on JCR-3513:
--------------------------------------
Hello Alex,
my stack trace looks like this:
Term.compareTo(Term) line: 114
TermInfosReader.get(Term, boolean) line: 212
TermInfosReader.get(Term) line: 179
SegmentTermDocs.seek(Term) line: 57
DirectoryReader$MultiTermDocs.termDocs(int) line: 1224
DirectoryReader$MultiTermDocs.read(int[], int[]) line: 1177
ReadOnlyIndexReader$FilteredTermDocs.read(int[], int[]) line: 257
DirectoryReader$MultiTermDocs.read(int[], int[]) line: 1182
MultiTermQueryWrapperFilter<Q>.getDocIdSet(IndexReader) line: 122
ConstantScoreQuery$ConstantScorer.<init>(ConstantScoreQuery, Similarity, IndexReader, Weight) line: 122
ConstantScoreQuery$ConstantWeight.scorer(IndexReader, boolean, boolean) line: 86
BooleanQuery$BooleanWeight.scorer(IndexReader, boolean, boolean) line: 306
JackrabbitIndexSearcher(IndexSearcher).search(Weight, Filter, Collector) line: 210
JackrabbitIndexSearcher(Searcher).search(Query, Collector) line: 67
My code ends up in a TermInfosReader, too. The conclusion that the Lucene code
does not use a cache sounds reasonable to me.
I see two solutions:
# Change the code so that Lucene uses a cached reader (I have no idea how to
achieve this).
# Avoid using the MultiTermQueryWrapperFilter.
We went with the second solution and removed the method
org.apache.jackrabbit.core.query.lucene.RangeQuery.rewrite(IndexReader). In the
superclass, this method returns 'this', so the Jackrabbit RangeQuery is
always used. I'm not sure whether this will solve your problem.
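To illustrate why removing the override has this effect, here is a minimal sketch using simplified stand-in classes. These are not the real Lucene or Jackrabbit classes: every class and method name below is hypothetical, and rewrite() takes no IndexReader argument here to keep the example self-contained.

```java
// Simplified stand-ins; NOT the real Lucene or Jackrabbit API.
class RewriteSketch {

    /** Stand-in for a Lucene-style query base class. */
    static class Query {
        // Default behavior as described above: rewrite returns the query itself.
        Query rewrite() {
            return this;
        }

        String describe() {
            return "base query";
        }
    }

    /** Stand-in for the Lucene fall-back path (filter-backed query). */
    static class FilterBackedQuery extends Query {
        @Override
        String describe() {
            return "lucene fall-back";
        }
    }

    /** Range query whose rewrite override hands execution to the fall-back. */
    static class DelegatingRangeQuery extends Query {
        @Override
        Query rewrite() {
            return new FilterBackedQuery();
        }

        @Override
        String describe() {
            return "custom range query";
        }
    }

    /** The same range query with the rewrite override removed. */
    static class PatchedRangeQuery extends Query {
        @Override
        String describe() {
            return "custom range query";
        }
    }

    /** Reports which implementation ends up executing after rewriting. */
    static String executedBy(Query q) {
        return q.rewrite().describe();
    }

    public static void main(String[] args) {
        System.out.println(executedBy(new DelegatingRangeQuery())); // lucene fall-back
        System.out.println(executedBy(new PatchedRangeQuery()));    // custom range query
    }
}
```

Without the override, the inherited rewrite returns the query itself, so the custom implementation is the one that runs.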
> Slower range query execution
> ----------------------------
>
> Key: JCR-3513
> URL: https://issues.apache.org/jira/browse/JCR-3513
> Project: Jackrabbit Content Repository
> Issue Type: Improvement
> Affects Versions: 2.4.3
> Reporter: Tom Quellenberg
> Assignee: Alex Parvulescu
>
> After switching from Jackrabbit 1.6.4 to 2.4.3 we experienced extremely slow
> query executions. Range queries on date fields are often 10 times slower than
> before.
> In our repositories more than 1 million documents are stored which all
> contain for example a creation date. Typical queries look like this:
> //element(*, sophora-nt:story)[@sophora:creationDate > ...]
> Jackrabbit has its own RangeQuery implementation which is used when Lucene
> throws a TooManyBooleanClauses exception (and in some other situations, too).
> This worked well in Jackrabbit 1.6. In newer versions a different Lucene
> library is used which never throws TooManyBooleanClauses exceptions. Instead,
> it has its own fall-back for situations where a BooleanQuery does not work.
> This fall-back with a MultiTermQueryWrapperFilter seems to us much slower
> than the fall-back implementation in Jackrabbit (does anybody know the
> reason?). The situation is the same in Jackrabbit 2.6.0 (with Lucene 3.6.0).
> We patched org.apache.jackrabbit.core.query.lucene.RangeQuery to never use
> org.apache.lucene.search.TermRangeQuery but always use the Jackrabbit
> implementation. This leads to query executions as fast as in older Jackrabbit
> versions.
> Do other people experience this problem? Are there any drawbacks to always
> using the Jackrabbit implementation for range queries?