A follow-up question on this, Hoss:
If I have a set of documents, let's say this email thread. Each email has a
unique author. All emails in the thread are indexed with "threadid=33". If I
want to count the number of unique authors in this email thread, I could go
along the lines you mention at the end:
rows=0&q=threadid:33&facet=true&facet.field=author&facet.limit=-1
then count all returned facets. This works, but becomes infeasible when the
number of unique author values in the index is large. Right?
So the limit=-1 solution just doesn't work for such fields, though it would
work well for "category" if the number of unique categories is low.
It's almost faster to retrieve all the emails in the thread and count the
unique authors programmatically. But obviously, I don't want to do that!
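
To make the counting step concrete, here is a rough Python sketch of what I
mean by "count all returned facets". It assumes the response is requested
as JSON and that the author facet comes back in Solr's default flat layout
(value, count, value, count, ...). The helper names are made up for
illustration, and the q=threadid:33 / facet.limit=-1 spelling is my reading
of the request above, so treat it as a sketch rather than a tested client:

```python
from urllib.parse import urlencode


def facet_query(thread_id):
    """Build the query string for the facet request (hypothetical helper)."""
    # Assumption: the thread filter is expressed as q=threadid:<id> and the
    # "no limit" setting is spelled facet.limit=-1.
    return urlencode({
        "q": f"threadid:{thread_id}",
        "rows": 0,
        "facet": "true",
        "facet.field": "author",
        "facet.limit": -1,
        "wt": "json",
    })


def count_distinct(flat_facets):
    """Count facet values that match at least one document."""
    # Flat layout: [value1, count1, value2, count2, ...]; every second
    # element is a count. Zero counts are authors outside this thread.
    counts = flat_facets[1::2]
    return sum(1 for c in counts if c > 0)


# Hand-written sample standing in for facet_counts.facet_fields.author
sample = ["aleks", 3, "hoss", 2, "someone_else", 0]
print(count_distinct(sample))  # 2
```

The cost I'm worried about is visible here: the list being walked has one
entry per author value returned, not one per author in the thread.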

So, how would you go about finding the number of unique authors in this
scenario?

Cheers,
 Aleks

On Wed, Sep 2, 2009 at 12:57 AM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:

>
> : lets say you filter your query on something and want to know how many
> : distinct "categories" that your results comprise.
> : then you can facet on the category field and count the number of facet
> : values that are returned, right?
>
> if you count the number of facet values returned you are getting a "count
> of distinct values"
>
> if you just want the list of distinct values in a field (for your whole
> index) then TermsComponent is the fastest way.
>
> if you want the list of distinct values across a set of documents, then
> facet on that field when doing your query.
>
> "select distinct category from books where bookInStock='true'" is analogous
> to looking at the facet section of...
>
>   rows=0&q=bookInStock:true&facet=true&facet.field=category
>
>
> -Hoss
>
>
