If the list of ASINs is presorted, you can quickly merge it with the
SortedDocValues and produce a FixedBitSet of the top-level ordinals, which
can be used as the post filter. This is a nice approach for things like
passing in a long list of access control predicates.
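
A rough sketch of the merge step, to make the idea concrete. It does not use
the Lucene API: a sorted String[] stands in for the SortedDocValues term
dictionary (index == ordinal), java.util.BitSet stands in for FixedBitSet,
and the class and method names are made up for illustration. It also assumes
ASCII keys, so that String.compareTo agrees with the byte order Lucene uses
for terms.

```java
import java.util.BitSet;

public class SortedMergeFilter {

    /**
     * Walks two ascending-sorted sequences in lockstep and records the
     * ordinal of every dictionary term that also appears in sortedKeys.
     * termDict[ord] is the (hypothetical) term for ordinal ord.
     */
    static BitSet matchingOrdinals(String[] sortedKeys, String[] termDict) {
        BitSet ords = new BitSet(termDict.length);
        int k = 0;
        int ord = 0;
        while (k < sortedKeys.length && ord < termDict.length) {
            int cmp = sortedKeys[k].compareTo(termDict[ord]);
            if (cmp == 0) {
                ords.set(ord);   // key exists in the dictionary
                k++;
                ord++;
            } else if (cmp < 0) {
                k++;             // key is not in the dictionary, skip it
            } else {
                ord++;           // dictionary term was not requested
            }
        }
        return ords;
    }

    /** Post-filter check: accept a document if its ordinal is in the set. */
    static boolean accept(BitSet ords, int docOrd) {
        return docOrd >= 0 && ords.get(docOrd);
    }

    public static void main(String[] args) {
        String[] termDict = {"B001", "B002", "B005", "B009"};
        String[] keys = {"B002", "B003", "B009"};
        BitSet ords = matchingOrdinals(keys, termDict);
        System.out.println(ords);
    }
}
```

The merge is a single pass over both sequences, and the per-document check
in the post filter is then just a bit lookup on the document's ordinal.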


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 26, 2021 at 3:52 PM Adrien Grand <jpou...@gmail.com> wrote:

> I opened https://issues.apache.org/jira/browse/LUCENE-10207 about these
> ideas.
>
> On Tue, Oct 26, 2021 at 7:52 PM Robert Muir <rcm...@gmail.com> wrote:
>
>> On Tue, Oct 26, 2021 at 1:37 PM Adrien Grand <jpou...@gmail.com> wrote:
>> >
>> > > And then we could make an IndexOrDocValuesQuery with both the
>> > > TermInSetQuery and this SDV.newSlowInSetQuery?
>> >
>> > Unfortunately IndexOrDocValuesQuery relies on the fact that the "index"
>> > query can evaluate its cost (ScorerSupplier#cost) without doing anything
>> > costly, which isn't the case for TermInSetQuery.
>> >
>> > So we'd need to make some changes. Estimating the cost of a
>> > TermInSetQuery in general without seeking the terms is a hard problem, but
>> > maybe we could specialize the unique key case to return the number of terms
>> > as the cost?
>>
>> Yes, we know each term in the terms dict has only a single document when
>> terms.size() == terms.getSumDocFreq(): there's only one posting for
>> each term.
>> But we can probably generalize the cost estimation a bit more, just
>> based on these two stats?
>>
>
> --
> Adrien
>
