I'm OK with the returned count being an estimate. In this simple example, if it returned 1 (just Joe) or 3 (John, Joe, and Jack), that would be fine too. I'm also OK with restructuring my data in any way that lets me get this number more efficiently.
You mentioned creating a reference-count document. What would that look like? One doc per unique author, with a count of the total number of books that author wrote, so that I can then run a range aggregation on that number?

What if I wanted to find "the number of authors who have written between 2-3 books that have a title containing E, F, H, or I" (still 2 in this case, John and Joe)?

On Thursday, June 19, 2014 6:43:41 PM UTC-4, Itamar Syn-Hershko wrote:
> This is a Map/Reduce operation, you'll be better off maintaining a
> ref-count document IMO than trying to hack the aggregations framework to
> support this
>
> Another reason for doing it that way is in a distributed environment some
> aggregations can't be computed to an exact value - the Terms bucketing is
> one example. So if you need exact values, I'd go for a model that does it.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko <https://twitter.com/synhershko>
> Freelance Developer & Consultant
> Author of RavenDB in Action <http://manning.com/synhershko/>
>
> On Fri, Jun 20, 2014 at 1:34 AM, Mike <[email protected]> wrote:
>
>> Assume each document is a book:
>> { title: "A", author: "Mike" }
>> { title: "B", author: "Mike" }
>> { title: "C", author: "Mike" }
>> { title: "D", author: "Mike" }
>>
>> { title: "E", author: "John" }
>> { title: "F", author: "John" }
>> { title: "G", author: "John" }
>>
>> { title: "H", author: "Joe" }
>> { title: "I", author: "Joe" }
>>
>> { title: "J", author: "Jack" }
>>
>> What is the best way to find the number of authors who have written
>> between 2-3 books? In this case it would be 2, John and Joe.
>>
>> I know I can do a terms aggregation on author, set size to be very very
>> large, and then on the client side traverse through the thousands of
>> authors and count how many had between 2-3. Is there a more efficient way
>> to do this? The cardinality aggregation is almost what I want, if only I
>> could specify a min and max term count.
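
To make the ref-count question concrete, here is a rough sketch of how I imagine the idea, in plain Python rather than actual Elasticsearch calls. Everything in it is an assumption on my part: the `book_count` and `titles` field names and both helper functions are hypothetical, and the in-memory list just stands in for an index of one document per author. It also suggests that a bare total count would not be enough for the title-filtered variant, so the sketch keeps the titles on the author document as well.

```python
from collections import defaultdict

# The sample books from the original question.
BOOKS = [
    {"title": "A", "author": "Mike"},
    {"title": "B", "author": "Mike"},
    {"title": "C", "author": "Mike"},
    {"title": "D", "author": "Mike"},
    {"title": "E", "author": "John"},
    {"title": "F", "author": "John"},
    {"title": "G", "author": "John"},
    {"title": "H", "author": "Joe"},
    {"title": "I", "author": "Joe"},
    {"title": "J", "author": "Jack"},
]


def build_author_docs(books):
    """Fold book documents into one (hypothetical) ref-count doc per author."""
    titles_by_author = defaultdict(list)
    for book in books:
        titles_by_author[book["author"]].append(book["title"])
    return [
        {"author": author, "book_count": len(titles), "titles": titles}
        for author, titles in titles_by_author.items()
    ]


def count_authors_in_range(author_docs, low, high, title_filter=None):
    """Count authors whose book count falls in [low, high].

    With title_filter set, only books whose title is in that set are counted,
    which is why the author doc carries the titles and not just the total.
    """
    matching_authors = 0
    for doc in author_docs:
        if title_filter is None:
            n = doc["book_count"]
        else:
            n = sum(1 for title in doc["titles"] if title in title_filter)
        if low <= n <= high:
            matching_authors += 1
    return matching_authors


author_docs = build_author_docs(BOOKS)
print(count_authors_in_range(author_docs, 2, 3))                        # 2
print(count_authors_in_range(author_docs, 2, 3, {"E", "F", "H", "I"}))  # 2
```

In Elasticsearch terms, I assume the unfiltered case would become a range query on `book_count` over the author documents, with the hit total as the answer. The filtered case is the part I am unsure how to express, since the count then depends on the query.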
