Michael McCandless created LUCENE-6890:
------------------------------------------
Summary: Specialize 1D dimensional values intersection
Key: LUCENE-6890
URL: https://issues.apache.org/jira/browse/LUCENE-6890
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
I tried implementing the same specialization we had before LUCENE-6881 for the
1D case, but after testing it, I don't think it's worth it.
I'll upload the patch here for posterity (tests pass), but net/net it adds
non-trivial code complexity in exchange for a minor query speedup (5.39 sec ->
5.25 sec for 225 queries, about 2.6%). Maybe in the future someone could improve
this so it's more compelling... but I don't think the tradeoff is worth it today.
Furthermore, the optimization 1) requires an API change, and 2) is not even
admissible as the patch stands, since the query could be a union of multiple
disjoint ranges, while the optimization assumes it's just a single range.
The gist of the idea is to locate the start leaf block and end leaf block, make
an informed estimate of the expected result set size, and then do a linear scan
of the leaf blocks, versus the recursion and "grow per leaf block" we do today.
I think the conclusion is that this used to be a more sizable win, but
{{DocIdSetBuilder}} has improved so that it is plenty fast without "upfront"
growing, which is nice :)
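To make the idea above concrete, here is a rough, self-contained sketch. It is not the attached patch and does not use the Lucene API: the class {{OneDimRangeScan}}, its fields, and the in-memory int[][] leaf layout are all hypothetical, and it assumes a single inclusive range over non-empty, globally sorted leaf blocks.

```java
import java.util.Arrays;

/** Hypothetical illustration only; this is not the Lucene API or the attached patch. */
final class OneDimRangeScan {
  private final int[][] leafValues; // leafValues[i]: sorted values stored in leaf block i
  private final int[][] leafDocs;   // leafDocs[i][j]: doc ID for leafValues[i][j]
  private final int[] leafMin;      // smallest value in each leaf block

  /** Assumes every leaf block is non-empty and values are sorted across blocks. */
  OneDimRangeScan(int[][] leafValues, int[][] leafDocs) {
    this.leafValues = leafValues;
    this.leafDocs = leafDocs;
    this.leafMin = new int[leafValues.length];
    for (int i = 0; i < leafValues.length; i++) {
      leafMin[i] = leafValues[i][0];
    }
  }

  /** Last leaf whose minimum is below bound (or equal, if inclusive); 0 when none. */
  private int lastLeaf(int bound, boolean inclusive) {
    int lo = 0, hi = leafMin.length - 1, answer = 0;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      boolean below = inclusive ? leafMin[mid] <= bound : leafMin[mid] < bound;
      if (below) {
        answer = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return answer;
  }

  /** Returns doc IDs whose value lies in the single inclusive range [min, max]. */
  public int[] range(int min, int max) {
    if (leafValues.length == 0 || min > max) {
      return new int[0];
    }
    // Strict comparison for the start leaf: a value equal to min may also
    // sit at the tail of the leaf before the one whose minimum equals min.
    int start = lastLeaf(min, false);
    int end = lastLeaf(max, true);

    // "Upfront" estimate: count every point in the candidate leaves once,
    // instead of growing the result buffer per leaf block.
    int estimate = 0;
    for (int leaf = start; leaf <= end; leaf++) {
      estimate += leafValues[leaf].length;
    }
    int[] docs = new int[estimate];
    int count = 0;

    // Linear scan of the candidate leaves; no recursion over the tree.
    for (int leaf = start; leaf <= end; leaf++) {
      int[] values = leafValues[leaf];
      for (int i = 0; i < values.length; i++) {
        if (values[i] >= min && values[i] <= max) {
          docs[count++] = leafDocs[leaf][i];
        }
      }
    }
    return Arrays.copyOf(docs, count);
  }
}
```

Because the sketch takes exactly one [min, max] pair, it shows the same limitation noted above: a query that is a union of disjoint ranges would need one scan per range, or a different approach.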
Or maybe my benchmark is bogus ;)
I'll commit the minor code comment / TODOs / test improvements from the patch
...