[ https://issues.apache.org/jira/browse/LUCENE-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-6360:
---------------------------------
    Attachment: LUCENE-6360.patch

bq. I noticed what looks like a bug in TermsQuery.createWeight.scorer:

Good catch, it's a bad bug indeed! Here is an updated patch with a test that verifies we only pull one iterator per unique field.

bq. Secondly... I think the needsScores param should arguably not pass through to the ConstantScoreQuery wrapped BooleanQuery, since this should be constant scoring; no? Or maybe it's moot since it's CSQ after all.

Actually I think it does need to pass through to the CSQ. The current contract is that if you pass needsScores=false then scores are undefined, so if the user passed needsScores=true we need to make sure that we build a query that returns the same scores. By the way, if we did not pass it through, it would probably break the scores, since ConstantScoreQuery.createWeight returns the inner weight directly when scores are not needed.

bq. Did you mean for the changes in CoalescedUpdates and FrozenBufferedUpdates to be in this patch?

Yes: I needed the term count in order to rewrite, so I changed PrefixCodedTerms to store the number of wrapped terms. I then noticed that these other classes were maintaining this number of terms on the side, so I refactored them to use PrefixCodedTerms.size() instead.

> TermsQuery should rewrite to a ConstantScoreQuery over a BooleanQuery when there are few terms
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6360
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6360
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6360.patch, LUCENE-6360.patch
>
>
> TermsQuery helps when there are a lot of terms from which you would like to
> compute the union, but it is a bit harmful when you have few terms since it
> cannot really skip: it always consumes all documents matching the underlying
> terms.
> It would certainly help to rewrite this query to a ConstantScoreQuery over a
> BooleanQuery when there are few terms in order to have actual skip support.
> As usual the hard part is probably to figure out the threshold. :)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
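
For readers following the issue, here is a minimal plain-Java sketch of the rewrite decision described above. It is only a model and deliberately uses no Lucene classes: the class name TermsRewriteSketch, the rewrite helper, the string output format, and the cutoff of 16 are all hypothetical (the issue itself leaves the threshold open). The sketch also assumes the terms are deduplicated and sorted before being counted, so that the count of unique terms drives the decision.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TermsRewriteSketch {
    // Hypothetical cutoff: the issue does not settle on a value.
    static final int HYPOTHETICAL_THRESHOLD = 16;

    /**
     * Returns a textual description of the query this model would pick.
     * Few unique terms: a constant-score disjunction, which can skip.
     * Many unique terms: the query is left as a TermsQuery.
     */
    static String rewrite(String field, List<String> terms) {
        // Assumption: duplicates are dropped and terms are kept sorted.
        TreeSet<String> unique = new TreeSet<>(terms);
        if (unique.size() > HYPOTHETICAL_THRESHOLD) {
            return "TermsQuery(" + field + ", " + unique.size() + " terms)";
        }
        // One optional (SHOULD-like) clause per unique term.
        List<String> clauses = new ArrayList<>();
        for (String term : unique) {
            clauses.add(field + ":" + term);
        }
        return "ConstantScore(" + String.join(" ", clauses) + ")";
    }

    public static void main(String[] args) {
        // Two unique terms, so the model picks the disjunction.
        System.out.println(rewrite("id", Arrays.asList("b", "a", "b")));
    }
}
```

With two unique terms the model picks the constant-score disjunction; with more than 16 unique terms it leaves the query as a TermsQuery. In a real patch the threshold would need benchmarking, as the issue notes.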