[ 
https://issues.apache.org/jira/browse/LUCENE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519358#comment-14519358
 ] 

Adrien Grand commented on LUCENE-6458:
--------------------------------------

I did some benchmarking: higher values (several hundred) could make 
luceneutil run slower, so I ended up with the same value that we currently 
use for FuzzyQuery's max expansions.

bq. This reminds me of when I was working on the Solr "Terms" QParser that 
supports 3-4 different options, to include BooleanQuery & TermsQuery

Maybe we could change TermsQuery.rewrite to produce a BooleanQuery (wrapped 
in a ConstantScoreQuery) when there are few terms? This would avoid having to 
handle this case in every query parser.

bq. I have a feeling that the appropriate threshold is a function of the number 
of indexed terms, instead of just a constant.

Hmm, what makes you think so? In my opinion, the issue with rewriting to a 
BooleanQuery is that its scorer needs to rebalance the priority queue whenever 
it advances, which is O(log(#clauses)). So it gets slower as you add more 
optional clauses, whereas the way TermsQuery works doesn't care much about the 
number of matching terms. I don't think the total number of indexed terms is 
relevant?
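
To make the trade-off concrete, here is a minimal plain-Java sketch (not 
Lucene code; the class and method names are hypothetical and postings lists 
are modelled as sorted int arrays). It contrasts a heap-based disjunction, 
which pays a queue reordering on every step, with a bit-set union, whose cost 
depends on the postings consumed rather than on the number of clauses.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; Lucene's real scorers are more involved.
class DisjunctionSketch {

    /** Cursor over one sorted postings list (doc IDs in increasing order). */
    static final class Cursor {
        final int[] docs;
        int pos = 0;

        Cursor(int[] docs) { this.docs = docs; }

        int doc() { return docs[pos]; }

        /** Moves to the next doc; returns false once the list is exhausted. */
        boolean next() { return ++pos < docs.length; }
    }

    /** BooleanQuery-like union: each step reorders a heap, O(log(#clauses)). */
    static int[] unionWithHeap(int[][] postings) {
        PriorityQueue<Cursor> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a.doc(), b.doc()));
        for (int[] p : postings) {
            if (p.length > 0) {
                pq.add(new Cursor(p));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            Cursor top = pq.poll();            // O(log(#clauses))
            int doc = top.doc();
            if (out.isEmpty() || out.get(out.size() - 1) != doc) {
                out.add(doc);                  // skip duplicates across clauses
            }
            if (top.next()) {
                pq.add(top);                   // O(log(#clauses)) again
            }
        }
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    /** TermsQuery-like union: consume every postings list into a bit set. */
    static int[] unionWithBitSet(int[][] postings, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] p : postings) {
            for (int doc : p) {
                bits.set(doc);                 // constant work per posting
            }
        }
        return bits.stream().toArray();        // set bits in increasing order
    }
}
```

Both methods return the same sorted doc IDs. The heap version is lazy, so it 
can stop or skip ahead without reading everything, but each step costs more 
as clauses are added; the bit-set version always reads every posting up front.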

> MultiTermQuery's FILTER rewrite method should support skipping whenever 
> possible
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-6458
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6458
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6458.patch
>
>
> Today MultiTermQuery's FILTER rewrite always builds a bit set from all 
> matching terms. This means that we need to consume the entire postings lists 
> of all matching terms. Instead we should try to execute like regular 
> disjunctions when there are few terms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
