I opened https://issues.apache.org/jira/browse/LUCENE-10207 about these ideas.
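
To make the cost idea from the thread below concrete, here is a rough sketch in plain Java. It is not the TermInSetQuery implementation and does not use any Lucene classes: the class and method names are hypothetical, and the two statistics are assumed to be the values that terms.size() and terms.getSumDocFreq() would return. The generalization is only one possible reading of "just based on these two stats": since every indexed term has at least one posting, the terms that are not in the query account for at least (termCount - queryTermCount) postings, so the query can match at most the remainder.

```java
/**
 * Hypothetical sketch: an upper bound on the cost of a terms-in-set query
 * derived only from two field-level statistics.
 */
final class TermInSetCostSketch {

    private TermInSetCostSketch() {}

    /**
     * @param termCount      number of distinct terms in the field (assumed terms.size())
     * @param sumDocFreq     total number of postings in the field (assumed terms.getSumDocFreq())
     * @param queryTermCount number of terms in the query
     * @return an upper bound on the number of postings the query can visit
     */
    static long upperBoundCost(long termCount, long sumDocFreq, long queryTermCount) {
        if (termCount < 0 || sumDocFreq < 0 || queryTermCount < 0) {
            // This sketch does not handle unknown or invalid statistics.
            throw new IllegalArgumentException("statistics must be non-negative");
        }
        // Terms outside the query each hold at least one posting,
        // so those postings can never be matched by the query.
        long termsNotInQuery = Math.max(0L, termCount - queryTermCount);
        return sumDocFreq - termsNotInQuery;
    }

    /** True when every term has exactly one posting (the unique key case). */
    static boolean isUniqueKeyField(long termCount, long sumDocFreq) {
        return termCount >= 0 && termCount == sumDocFreq;
    }
}
```

In the unique key case (termCount == sumDocFreq) the bound collapses to the number of query terms, which matches the specialization suggested in the thread. For fields with very skewed postings the bound can be loose, so it should be read as a worst case rather than an estimate.
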

On Tue, Oct 26, 2021 at 7:52 PM Robert Muir <rcm...@gmail.com> wrote:

> On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand <jpou...@gmail.com> wrote:
> >
> > > And then we could make an IndexOrDocValuesQuery with both the
> > > TermInSetQuery and this SDV.newSlowInSetQuery?
> >
> > Unfortunately IndexOrDocValuesQuery relies on the fact that the "index"
> > query can evaluate its cost (ScorerSupplier#cost) without doing anything
> > costly, which isn't the case for TermInSetQuery.
> >
> > So we'd need to make some changes. Estimating the cost of a
> > TermInSetQuery in general without seeking the terms is a hard problem, but
> > maybe we could specialize the unique key case to return the number of terms
> > as the cost?
>
> Yes we know each term in terms dict only has a single document, when
> terms.size() == terms.getSumDocFreq(): there's only one posting for
> each term.
> But we can probably generalize a cost estimation a bit more, just
> based on these two stats?

--
Adrien