Michael McCandless created LUCENE-6890:
------------------------------------------

             Summary: Specialize 1D dimensional values intersection
                 Key: LUCENE-6890
                 URL: https://issues.apache.org/jira/browse/LUCENE-6890
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless


I tried implementing the same specialization we had before LUCENE-6881 for the 
1D case, but after testing it, I don't think it's worth it.

I'll upload the patch here for posterity (tests pass), but net/net it adds 
non-trivial code complexity in exchange for only a minor query speedup (5.39 
sec -> 5.25 sec for 225 queries, about 2.6%).  Maybe in the future someone 
could improve this so it's more compelling... but I don't think the tradeoff 
is worth it today.

Furthermore, the optimization 1) requires an API change, and 2) is not even 
admissible in the current patch, since the query could be a union of multiple 
disjoint ranges while the optimization assumes it's just a single range.

The gist of the idea is to locate the start leaf block and end leaf block, make 
an informed estimate of the expected result set size, and then do a linear scan 
of the leaf blocks, instead of the recursion and "grow per leaf block" we do 
today.  I think the conclusion is that this used to be a more sizable win, but 
{{DocIdSetBuilder}} has improved so that it is plenty fast without "upfront" 
growing, which is nice :)
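
For readers who want the shape of the idea, here is a minimal, self-contained 
sketch.  It is not the attached patch and does not use Lucene's API: the 
{{Leaf}} class, {{findStartLeaf}}, {{findEndLeaf}} and {{intersect}} are all 
hypothetical names, and it assumes a single closed range over leaf blocks that 
are globally sorted by value (exactly the single-range assumption noted above).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical types, not Lucene's dimensional values API.
final class OneDimRangeSketch {

    /** Hypothetical leaf block: values sorted ascending, docIds parallel to values. */
    static final class Leaf {
        final long[] values;
        final int[] docIds;

        Leaf(long[] values, int[] docIds) {
            this.values = values;
            this.docIds = docIds;
        }

        long min() { return values[0]; }
        long max() { return values[values.length - 1]; }
    }

    /** Index of the first leaf whose max >= lower, or leaves.size() if none. */
    static int findStartLeaf(List<Leaf> leaves, long lower) {
        int lo = 0, hi = leaves.size();
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (leaves.get(mid).max() < lower) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Index of the last leaf whose min <= upper, or -1 if none. */
    static int findEndLeaf(List<Leaf> leaves, long upper) {
        int lo = -1, hi = leaves.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) / 2;
            if (leaves.get(mid).min() > upper) hi = mid - 1; else lo = mid;
        }
        return lo;
    }

    /** Collects doc ids whose value lies in [lower, upper] with one linear scan. */
    static int[] intersect(List<Leaf> leaves, long lower, long upper) {
        int start = findStartLeaf(leaves, lower);
        int end = findEndLeaf(leaves, upper);
        if (start > end) return new int[0];

        // Upfront size estimate (an upper bound): total size of the leaves in play.
        int estimate = 0;
        for (int i = start; i <= end; i++) estimate += leaves.get(i).values.length;

        int[] hits = new int[estimate];
        int count = 0;
        for (int i = start; i <= end; i++) {
            Leaf leaf = leaves.get(i);
            // Interior leaves lie fully inside the range; only the two
            // boundary leaves need a per-value check.
            boolean boundary = (i == start || i == end);
            for (int j = 0; j < leaf.values.length; j++) {
                long v = leaf.values[j];
                if (!boundary || (v >= lower && v <= upper)) {
                    hits[count++] = leaf.docIds[j];
                }
            }
        }
        return Arrays.copyOf(hits, count);
    }
}
```

The sketch allocates its result array once from the estimate, which is the 
"upfront" growing referred to above; the existing approach instead recurses 
and grows the result per leaf block.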

Or maybe my benchmark is bogus ;)

I'll commit the minor code comment / TODOs / test improvements from the patch 
...



