[
https://issues.apache.org/jira/browse/JCR-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573581#comment-13573581
]
Alex Parvulescu commented on JCR-3513:
--------------------------------------
Hi Tom,
Interesting how we are looking at the same problem :)
I think the slowness may come from the fact that, as you pointed out, Lucene
now uses a MultiTermQueryWrapperFilter, which seems to rely on a
TermInfosReader that cannot leverage the terms cache.
This results in a *lot* of extra term reads (see [0] for the TermInfosReader
source).
Here is a stack trace from a query test that points to the same problem
you've noticed [1].
[0]
http://svn.us.apache.org/viewvc/lucene/dev/tags/lucene_solr_3_6_2/lucene/core/src/java/org/apache/lucene/index/TermInfosReader.java?view=markup#l206
[1]
{code}
at org.apache.lucene.index.TermInfosReaderIndex.compareField(TermInfosReaderIndex.java:249)
at org.apache.lucene.index.TermInfosReaderIndex.compareTo(TermInfosReaderIndex.java:225)
at org.apache.lucene.index.TermInfosReaderIndex.compareTo(TermInfosReaderIndex.java:206)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:196)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
at org.apache.lucene.index.FilterIndexReader$FilterTermDocs.seek(FilterIndexReader.java:49)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.termDocs(DirectoryReader.java:1287)
at org.apache.lucene.index.DirectoryReader$MultiTermDocs.read(DirectoryReader.java:1240)
at org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:127)
at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:139)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:298)
at org.apache.jackrabbit.core.query.lucene.DescendantSelfAxisQuery$DescendantSelfAxisWeight.scorer(DescendantSelfAxisQuery.java:396)
{code}
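To make the suspected cost more concrete, here is a minimal toy model in plain
Java. It is not Lucene or Jackrabbit code: the class RangeFilterSketch and its
methods are hypothetical, and a TreeMap stands in for the term dictionary. It
only illustrates the assumption above, that a range filter performs one
uncached term lookup per distinct term in the range, so a field with one
distinct value per document (such as a creation date) pays one lookup per
matching document.

```java
import java.util.BitSet;
import java.util.TreeMap;

public class RangeFilterSketch {
    // Toy term dictionary: term text -> ids of the documents containing it.
    private final TreeMap<String, int[]> postings = new TreeMap<>();
    // Counts dictionary lookups, standing in for uncached term reads.
    private int termLookups = 0;

    public void add(String term, int... docIds) {
        postings.put(term, docIds);
    }

    // Stand-in for seeking to one term's postings; each call is one lookup.
    private int[] seek(String term) {
        termLookups++;
        return postings.get(term);
    }

    // Collects all documents whose term lies in [lower, upper],
    // performing one seek per distinct term in the range.
    public BitSet docsInRange(String lower, String upper) {
        BitSet result = new BitSet();
        for (String term : postings.subMap(lower, true, upper, true).keySet()) {
            for (int doc : seek(term)) {
                result.set(doc);
            }
        }
        return result;
    }

    public int termLookups() {
        return termLookups;
    }

    public static void main(String[] args) {
        RangeFilterSketch index = new RangeFilterSketch();
        // One distinct, fixed-width "creation date" term per document.
        for (int doc = 0; doc < 1000; doc++) {
            index.add(String.format("2013-%04d", doc), doc);
        }
        BitSet hits = index.docsInRange("2013-0100", "2013-0999");
        // prints "900 hits, 900 term lookups"
        System.out.println(hits.cardinality() + " hits, "
                + index.termLookups() + " term lookups");
    }
}
```

In this model the number of lookups grows with the number of distinct terms in
the range, not with the size of the result alone, which would match a large
slowdown on a repository with more than 1 million distinct date values.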
> Slower range query execution
> ----------------------------
>
> Key: JCR-3513
> URL: https://issues.apache.org/jira/browse/JCR-3513
> Project: Jackrabbit Content Repository
> Issue Type: Improvement
> Affects Versions: 2.4.3
> Reporter: Tom Quellenberg
>
> After switching from Jackrabbit 1.6.4 to 2.4.3 we experienced extremely slow
> query execution. Range queries on date fields are often 10 times slower than
> before.
> Our repositories store more than 1 million documents, all of which contain,
> for example, a creation date. Typical queries look like this:
> //element(*, sophora-nt:story)[@sophora:creationDate > ...]
> Jackrabbit has its own RangeQuery implementation, which is used when Lucene
> throws a TooManyBooleanClauses exception (and in some other situations, too).
> This worked well in Jackrabbit 1.6. Newer versions use a different Lucene
> library, which never throws TooManyBooleanClauses exceptions. Instead,
> it has its own fall-back for situations where a BooleanQuery does not work.
> This fall-back, based on a MultiTermQueryWrapperFilter, seems to us much slower
> than the fall-back implementation in Jackrabbit (does anybody know the
> reason?). The situation is the same in Jackrabbit 2.6.0 (with Lucene 3.6.0).
> We patched org.apache.jackrabbit.core.query.lucene.RangeQuery to never use
> org.apache.lucene.search.TermRangeQuery and always use the Jackrabbit
> implementation instead. This makes queries as fast as in older Jackrabbit
> versions.
> Do other people experience this problem? Are there any drawbacks to always
> using the Jackrabbit implementation for range queries?