Generally you work through the required terms from least frequent to most
frequent, using the least frequent one as the pivot.  As queries get more
and more complicated, the cost of ordering the query results tends to
dominate.  This also means there are two kinds of measurement: one for
running the query alone, and one for running the query and returning the
results in order.
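
To make the pivot idea concrete, here is a minimal sketch, not Lucene code.
It assumes each clause is just a sorted array of doc IDs; the names
ConjunctionSketch, CountingIterator and intersect are made up for
illustration.  The cheapest clause leads, the others are only advanced to
the lead's candidates, and each iterator counts how often it was advanced,
which is roughly the kind of per-clause statistic asked about below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical iterator over sorted doc IDs that counts advance() calls. */
    static final class CountingIterator {
        final String name;
        final int[] docs;      // sorted, distinct doc IDs
        int pos = -1;          // -1 means "not positioned yet"
        int advanceCalls = 0;  // statistic gathered for later analysis

        CountingIterator(String name, int... docs) {
            this.name = name;
            this.docs = docs;
        }

        /** Stand-in for a cost estimate: the number of matching docs. */
        long cost() {
            return docs.length;
        }

        int docID() {
            if (pos < 0) return -1;
            return pos >= docs.length ? NO_MORE_DOCS : docs[pos];
        }

        /** Moves to the first doc ID >= target (linear scan, no skip data). */
        int advance(int target) {
            advanceCalls++;
            if (pos < 0) pos = 0;
            while (pos < docs.length && docs[pos] < target) pos++;
            return docID();
        }
    }

    /** Intersects all clauses, leading with the lowest-cost one. */
    static List<Integer> intersect(CountingIterator... clauses) {
        CountingIterator[] sorted = clauses.clone();
        Arrays.sort(sorted, Comparator.comparingLong(CountingIterator::cost));
        CountingIterator lead = sorted[0];
        List<Integer> hits = new ArrayList<>();

        int candidate = lead.advance(0);
        outer:
        while (candidate != NO_MORE_DOCS) {
            for (int i = 1; i < sorted.length; i++) {
                CountingIterator other = sorted[i];
                int doc = other.docID() < candidate
                        ? other.advance(candidate)
                        : other.docID();
                if (doc > candidate) {
                    // This clause skipped past the candidate: move the lead
                    // up to that doc and re-check every clause from scratch.
                    candidate = lead.advance(doc);
                    continue outer;
                }
            }
            hits.add(candidate);  // every clause is positioned on candidate
            candidate = lead.advance(candidate + 1);
        }
        return hits;
    }
}
```

After calling intersect(...), reading advanceCalls on each iterator shows
how much work each clause did.  Timing that call alone versus timing it
plus a sort of the hits would give the two measurements mentioned above.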

tim

On Thu, May 6, 2021 at 12:20 PM Michael Sokolov <[email protected]> wrote:

> Do we have a way to understand how BooleanQuery (and other composite
> queries) are advancing their child queries? For example, a simple
> conjunction of two queries advances the more restrictive (lower
> cost()) query first, enabling the more costly query to skip over more
> documents. But we may not be making the best choice in every case, and
> I would like to know, for some query, how we are doing. For example,
> we could execute in a debugging mode, interposing something that wraps
> or observes the Scorers in some way, gathering statistics about how
> many documents are visited by each Scorer, which can be aggregated for
> later analysis.
>
> This is motivated by a use case we have in which we currently
> post-filter our query results in a custom collector using some filters
> that we know to be expensive (they must be evaluated on every
> document), but we would rather express these post-filters as Queries
> and have them advanced during the main Query execution. However when
> we tried to do that, we saw some slowdowns (in spite of marking these
> Queries as high-cost) and I suspect it is due to the iteration order,
> but I'm not sure how to debug.
>
> Suggestions welcome!
>
> -Mike
