[ https://issues.apache.org/jira/browse/SOLR-14166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296964#comment-17296964 ]
David Smiley commented on SOLR-14166:
-------------------------------------

CC [~yonik] [~jbernste] [~hossman] as possible reviewers for the attached PR. It is rather technical and touches code that few people have worked on, though all three of you have in some shape or form. Please review the issue description and take a look at the PR. Each commit in the PR is well isolated to what its commit message says, so you may prefer to go commit by commit, or you could just look at the thing as a whole.

In a comment above I pondered "Maybe we could make a wrapping query that wraps the underlying TPI.matchCost"; as you'll see in the PR, I did that. The test validates that match() isn't called more often than it needs to be. It used to be called more often, which can be verified by copying the test to the 8x line (if I recall, it was called two additional times). I suspect the test doesn't verify that MatchCostQuery is having an effect; I may need to think a bit more about how to do that.

I suspect someone will ask whether I ran performance tests. No, I did not. My goal is the removal of tech debt (Filter), and in the process I expect some performance improvements that Filter was blocking. So with this issue, anyone with non-cached filter queries may see a benefit, especially when those queries have TwoPhaseIterators (phrase queries, frange, spatial, and more). The benefit may be more pronounced if the main query also has TPIs, because Lucene cleverly sees through the boolean queries to group the TPIs of required clauses in the tree.
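To illustrate why the number of match() calls matters, here is a minimal, self-contained Java sketch. It is a simplified model only: it does not use Lucene's or Solr's actual classes and is not the code in the PR. The Filter class, the opaque() and twoPhase() methods, and the sample predicates and costs are all hypothetical names and values chosen for illustration. The first method lets each filter confirm its own candidates before the next filter is consulted; the second requires every cheap approximation to agree first and then runs the exact checks cheapest-first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

public class TwoPhaseSketch {

    /** Hypothetical stand-in for a filter with a cheap approximation and a costly exact check. */
    static final class Filter {
        final IntPredicate approximation; // cheap; may accept false positives
        final IntPredicate exact;         // expensive confirmation step
        final float matchCost;            // assumed cost of one exact check
        int matchCalls = 0;               // counts how often match() ran

        Filter(IntPredicate approximation, IntPredicate exact, float matchCost) {
            this.approximation = approximation;
            this.exact = exact;
            this.matchCost = matchCost;
        }

        boolean match(int doc) {
            matchCalls++;
            return exact.test(doc);
        }
    }

    /** Each filter fully confirms a doc (approximation, then match) before the next filter is asked. */
    static List<Integer> opaque(int maxDoc, List<Filter> filters) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            boolean ok = true;
            for (Filter f : filters) {
                if (!f.approximation.test(doc) || !f.match(doc)) {
                    ok = false;
                    break;
                }
            }
            if (ok) hits.add(doc);
        }
        return hits;
    }

    /** All approximations must agree first; exact checks then run in ascending matchCost order. */
    static List<Integer> twoPhase(int maxDoc, List<Filter> filters) {
        List<Filter> byCost = new ArrayList<>(filters);
        byCost.sort(Comparator.comparingDouble((Filter f) -> f.matchCost));
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            boolean ok = true;
            for (Filter f : filters) {
                if (!f.approximation.test(doc)) {
                    ok = false;
                    break;
                }
            }
            for (int i = 0; ok && i < byCost.size(); i++) {
                ok = byCost.get(i).match(doc);
            }
            if (ok) hits.add(doc);
        }
        return hits;
    }

    /** Made-up sample data: one expensive filter and one cheap filter. */
    static List<Filter> sampleFilters() {
        return List.of(
            new Filter(d -> d % 2 == 0, d -> d % 4 == 0, 50f),
            new Filter(d -> d >= 6, d -> d != 7, 5f));
    }

    static int totalMatchCalls(List<Filter> filters) {
        int total = 0;
        for (Filter f : filters) total += f.matchCalls;
        return total;
    }

    public static void main(String[] args) {
        List<Filter> a = sampleFilters();
        List<Filter> b = sampleFilters();
        System.out.println(opaque(10, a) + " match calls: " + totalMatchCalls(a));   // [8] match calls: 6
        System.out.println(twoPhase(10, b) + " match calls: " + totalMatchCalls(b)); // [8] match calls: 4
    }
}
```

Both methods return the same hits, but in this toy data the expensive filter's match() runs five times in the first approach and only twice in the second, because docs rejected by the other filter's approximation never reach it.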
> Use TwoPhaseIterator for non-cached filter queries
> --------------------------------------------------
>
>                 Key: SOLR-14166
>                 URL: https://issues.apache.org/jira/browse/SOLR-14166
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> "fq" filter queries that have cache=false and which aren't processed as a
> PostFilter (thus either aren't a PostFilter or have a cost < 100) are
> processed in SolrIndexSearcher using a custom Filter thingy which uses a
> cost-ordered series of DocIdSetIterators. This is not TwoPhaseIterator
> aware, and thus the match() method may be called on docs that ideally would
> have been filtered by lower-cost filter queries.