[ 
https://issues.apache.org/jira/browse/LUCENE-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-6360:
---------------------------------
    Attachment: LUCENE-6360.patch

bq. I noticed what looks like a bug in TermsQuery.createWeight.scorer:

Good catch, it's a bad bug indeed! Here is an updated patch with a test that 
checks that we only pull one iterator per unique field.

bq. Secondly... I think the needsScores param should arguably not pass through 
to the ConstantScoreQuery wrapped BooleanQuery, since this should be constant 
scoring; no? Or maybe it's moot since it's CSQ after all.

Actually I think it does need to pass through to the CSQ. The current contract 
is that if you pass needsScores=false then scores are going to be undefined, so 
if the user passed needsScores=true we need to make sure that we build a query 
that will return the same scores.

By the way, if we did not pass it through, it would probably break the scores, 
since ConstantScoreQuery.createWeight returns the inner weight directly when 
scores are not needed.
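
To illustrate the point above, here is a minimal sketch of that behaviour. It 
is a toy model, not Lucene code: ToyQuery, ToyWeight, VaryingQuery and 
ConstantScoreWrapper are hypothetical names, and the only thing taken from the 
discussion is that the constant-score wrapper hands back the inner weight 
unchanged when needsScores is false.

```java
public class ConstantScoreSketch {

    /** Hypothetical stand-in for a weight: maps a doc id to a score. */
    public interface ToyWeight {
        float score(int doc);
    }

    /** Hypothetical stand-in for a query that can build a weight. */
    public interface ToyQuery {
        ToyWeight createWeight(boolean needsScores);
    }

    /** Inner query whose scores differ from one document to the next. */
    public static final class VaryingQuery implements ToyQuery {
        @Override
        public ToyWeight createWeight(boolean needsScores) {
            return doc -> 1f + doc;
        }
    }

    /** Toy constant-score wrapper modelled on the behaviour described above. */
    public static final class ConstantScoreWrapper implements ToyQuery {
        private final ToyQuery inner;
        private final float constant;

        public ConstantScoreWrapper(ToyQuery inner, float constant) {
            this.inner = inner;
            this.constant = constant;
        }

        @Override
        public ToyWeight createWeight(boolean needsScores) {
            // The inner weight is only used for matching, so it never needs scores.
            ToyWeight innerWeight = inner.createWeight(false);
            if (!needsScores) {
                // Scores are undefined in this mode: return the inner weight as-is.
                return innerWeight;
            }
            // Scores were requested: every matching document gets the same score.
            return doc -> constant;
        }
    }

    public static void main(String[] args) {
        ToyQuery query = new ConstantScoreWrapper(new VaryingQuery(), 1f);
        ToyWeight scored = query.createWeight(true);
        ToyWeight unscored = query.createWeight(false);
        System.out.println("needsScores=true:  " + scored.score(0) + ", " + scored.score(7));
        System.out.println("needsScores=false: " + unscored.score(0) + ", " + unscored.score(7));
    }
}
```

In this sketch, dropping the flag on the way in would leave the caller with the 
inner weight's varying scores even though constant scores were requested.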

bq. Did you mean for the changes in CoalescedUpdates and FrozenBufferedUpdates 
to be in this patch?

Yes: I needed to have the term count to rewrite, so I changed PrefixCodedTerms 
to store the number of wrapped terms. I then noticed that these other classes 
were maintaining this number of terms on the side, so I refactored them to use 
PrefixCodedTerms.size() instead.
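
As a rough sketch of the idea (not the actual PrefixCodedTerms code), a 
prefix-coded term list can count terms as they are added so that callers do not 
need their own counter. The class name PrefixCodedSketch, its Builder, and the 
String-based encoding below are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixCodedSketch {
    private final List<Integer> prefixLengths;
    private final List<String> suffixes;
    private final long size;

    private PrefixCodedSketch(List<Integer> prefixLengths, List<String> suffixes, long size) {
        this.prefixLengths = prefixLengths;
        this.suffixes = suffixes;
        this.size = size;
    }

    /** Number of terms that were added to the builder. */
    public long size() {
        return size;
    }

    /** Rebuilds the original terms from the shared-prefix encoding. */
    public List<String> decode() {
        List<String> terms = new ArrayList<>();
        String last = "";
        for (int i = 0; i < suffixes.size(); i++) {
            String term = last.substring(0, prefixLengths.get(i)) + suffixes.get(i);
            terms.add(term);
            last = term;
        }
        return terms;
    }

    public static final class Builder {
        private final List<Integer> prefixLengths = new ArrayList<>();
        private final List<String> suffixes = new ArrayList<>();
        private String last = "";
        private long size;

        /** Stores only the part of the term that differs from the previous one. */
        public Builder add(String term) {
            int shared = 0;
            int max = Math.min(last.length(), term.length());
            while (shared < max && last.charAt(shared) == term.charAt(shared)) {
                shared++;
            }
            prefixLengths.add(shared);
            suffixes.add(term.substring(shared));
            last = term;
            size++; // the count lives with the data instead of on the side
            return this;
        }

        public PrefixCodedSketch finish() {
            return new PrefixCodedSketch(prefixLengths, suffixes, size);
        }
    }

    public static void main(String[] args) {
        PrefixCodedSketch terms = new Builder().add("apple").add("apply").add("banana").finish();
        System.out.println(terms.size() + " terms: " + terms.decode());
    }
}
```

With the count stored next to the encoded terms, a caller can compare size() 
against whatever threshold is chosen without decoding anything.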

> TermsQuery should rewrite to a ConstantScoreQuery over a BooleanQuery when 
> there are few terms
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6360
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6360
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6360.patch, LUCENE-6360.patch
>
>
> TermsQuery helps when there are a lot of terms from which you would like to 
> compute the union, but it is a bit harmful when you have few terms since it 
> cannot really skip: it always consumes all documents matching the underlying 
> terms.
> It would certainly help to rewrite this query to a ConstantScoreQuery over a 
> BooleanQuery when there are few terms in order to have actual skip support.
> As usual the hard part is probably to figure out the threshold. :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
