Re: Computing multiple different aggregations over a match-set in one pass

Greg Miller Mon, 13 Feb 2023 14:46:23 -0800

Hi Stefan-

That helps, thanks. I'm not sure I follow where the concern about iterating
over the match set multiple times comes from. Is this a situation where the
ordinals you want to facet over are stored in different index fields, so
you have to create multiple Facets instances (one per field) to compute the
aggregations? If that's the case, then yes, you have to iterate over the
match set multiple times (once per field). I'm not sure that's such a big
issue, given that you're doing novel work during each iteration; the only
repeated cost is iterating the hits themselves. If the ordinals are
"packed" into the same field though (which is the default in Lucene if
you're using taxonomy faceting), then you should only need to do a single
iteration over that field.
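
To make the "single iteration" idea concrete, here is a minimal sketch in
plain Java of computing the three aggregations from the original question
(max date, sum of prices, count) in one loop over a match set. It
deliberately does not use the Lucene facets API: Book, Result, aggregate and
sampleHits are hypothetical names made up for illustration, and the sample
values (including prices) are invented. It only shows the shape of the
computation, assuming all the values needed are available per hit.

```java
import java.util.Arrays;
import java.util.List;

final class OnePassAggregation {

    // Hypothetical stand-in for one hit in the match set (not a Lucene type).
    static final class Book {
        final String genre;
        final String nationality;
        final String author;
        final int publishYear;
        final double price;

        Book(String genre, String nationality, String author,
             int publishYear, double price) {
            this.genre = genre;
            this.nationality = nationality;
            this.author = author;
            this.publishYear = publishYear;
            this.price = price;
        }
    }

    // One accumulator per aggregation; all are updated in the same pass.
    static final class Result {
        int latestTwainYear = Integer.MIN_VALUE; // max(date) for Mark Twain
        double americanPriceSum = 0.0;           // sum(price) for American authors
        int sciFiCount = 0;                      // count of sci-fi books
    }

    // Visits every hit exactly once and updates each matching accumulator.
    static Result aggregate(List<Book> matchSet) {
        Result r = new Result();
        for (Book b : matchSet) {
            if ("Mark Twain".equals(b.author)) {
                r.latestTwainYear = Math.max(r.latestTwainYear, b.publishYear);
            }
            if ("American".equals(b.nationality)) {
                r.americanPriceSum += b.price;
            }
            if ("Sci-Fi".equals(b.genre)) {
                r.sciFiCount++;
            }
        }
        return r;
    }

    // Invented sample data; the prices in particular are placeholders.
    static List<Book> sampleHits() {
        return Arrays.asList(
            new Book("Adventure", "American", "Mark Twain", 1876, 10.0),
            new Book("Adventure", "American", "Mark Twain", 1884, 12.5),
            new Book("Sci-Fi", "American", "Isaac Asimov", 1951, 8.0),
            new Book("Sci-Fi", "Polish", "Stanislaw Lem", 1961, 9.5));
    }

    public static void main(String[] args) {
        Result r = aggregate(sampleHits());
        System.out.println("latest Twain year: " + r.latestTwainYear);
        System.out.println("sum of American prices: " + r.americanPriceSum);
        System.out.println("sci-fi count: " + r.sciFiCount);
    }
}
```

Whether something like this is possible in a single pass with the facets
module still depends on the point above, i.e. whether the ordinals live in
one index field or several.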


Cheers,
-Greg

On Sat, Feb 11, 2023 at 2:27 AM Stefan Vodita <stefan.vod...@gmail.com>
wrote:

> Hi Greg,
>
> I’m assuming we have one match-set which was not constrained by any
> of the categories we want to aggregate over, so it may have books by
> Mark Twain, books by American authors, and sci-fi books.
>
> Maybe we can imagine we obtained it by searching for a keyword, say
> “Washington”, which is present in Mark Twain’s writing, and those of other
> American authors, and in sci-fi novels too.
>
> Does that make the example clearer?
>
>
> Stefan
>
>
> On Sat, 11 Feb 2023 at 00:16, Greg Miller <gsmil...@gmail.com> wrote:
> >
> > Hi Stefan-
> >
> > Can you clarify your example a little bit? It sounds like you want to
> > facet over three different match sets (one constrained by "Mark Twain"
> > as the author, one constrained by "American authors" and one constrained
> > by the "sci-fi" genre). Is that correct?
> >
> > Cheers,
> > -Greg
> >
> > On Fri, Feb 10, 2023 at 11:33 AM Stefan Vodita <stefan.vod...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > Let’s say I have an index of books, similar to the example in the facet
> > > demo [1] with a hierarchical facet field encapsulating
> > > `Genre / Author’s nationality / Author’s name`.
> > >
> > > I might like to find the latest publish date of a book written by Mark
> > > Twain, the sum of the prices of books written by American authors, and
> > > the number of sci-fi novels.
> > >
> > > As far as I understand, this would require faceting 3 times over the
> > > match-set, one iteration for each aggregation of a different type
> > > (max(date), sum(price), count). That seems inefficient if we could
> > > instead compute all aggregations in one pass.
> > >
> > > Is there a way to do that?
> > >
> > >
> > > Stefan
> > >
> > > [1] https://javadoc.io/doc/org.apache.lucene/lucene-demo/latest/org/apache/lucene/demo/facet/package-summary.html
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >
> > >
>
