Thanks a lot for your fast response, Adrien!

* I noticed the cardinality aggregation, but I was worried by the "an approximate count of distinct values." part of the documentation. I need an exact value, not an approximate one :) However, I've read more of the documentation and it may not be a real problem in practice, especially if I use a precision threshold of 40000 (apparently the max). BTW, I couldn't find the default precision value in the documentation.
* From your answer I gather that using aggregations is the only solution to my problem and that there's no way to use the Query DSL to solve it.
Thanks, it helps a lot!
-Vincent

On Wednesday, April 2, 2014 11:17:17 AM UTC+2, Adrien Grand wrote:
>
> Hi Vincent,
>
> I left some replies inline:
>
> On Wed, Apr 2, 2014 at 10:02 AM, Vincent Massol <[email protected]> wrote:
>
>> Hi guys,
>>
>> I'd like to count all entries in my ES instance, having a timestamp from
>> the *last day* and *group together all entries having the same
>> "instanceId"*. With the data below, the count result should be 1 (and
>> not 2) since 2 entries are within the last day but they have the same
>> instanceId of "def".
>>
>> I tried the following:
>>
>> curl -XPOST "http://localhost:9200/installs/install/_search?pretty=1&fields=_source,_timestamp" -d'
>> {
>>   "aggs": {
>>     "lastday" : {
>>       "filter" : {
>>         "range" : {
>>           "_timestamp" : {
>>             "gt" : "now-1d"
>>           }
>>         }
>>       },
>>       "aggs" : {
>>         "instanceids" : {
>>           "terms" : { "field" : "instanceId" }
>>         }
>>       }
>>     }
>>   }
>> }'
>>
>> But I have 3 problems with this:
>> * It's not a count but a search. "aggs" don't seem to work with _count
>> * It returns all entries in the result before the aggs data
>>
>
> For these two issues, you probably want to check out the count search
> type[1] which works with aggregations. It's like a regular search, but
> doesn't perform the fetch phase in order to fetch the top hits.
>
> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#count
>
>> * In the aggs I don't get a direct count value and I have to count the
>> number of buckets to get my answer
>>
>
> We recently (Elasticsearch 1.1.0) added a cardinality[2] aggregation, that
> allows for counting unique values. In previous versions of Elasticsearch,
> counting was indeed only possible through the terms aggregation with a high
> `size` parameter, but this was inefficient on high-cardinality fields.
>
> [2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#search-aggregations-metrics-cardinality-aggregation
>
> Here is a gist that gives an example of the count search_type and the
> cardinality aggregation:
> https://gist.github.com/jpountz/9930690
>
> --
> Adrien Grand
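For readers who want to see the two suggestions from this thread combined, here is a minimal sketch in Python. It is not the content of the gist linked above. It only builds the request URL and body for a search with `search_type=count` and a `cardinality` aggregation nested under the "last day" filter, assuming Elasticsearch 1.1.x, the `installs`/`install` index and type from the original curl command, and the `precision_threshold` value of 40000 mentioned in the reply. The function names and the `distinct_instances` aggregation name are hypothetical, chosen for illustration only.

```python
import json
from urllib.parse import urlencode


def build_distinct_count_request(
    base_url="http://localhost:9200",
    index="installs",
    doc_type="install",
    field="instanceId",
    since="now-1d",
    precision_threshold=40000,
):
    """Return (url, body) for a count-type search with a cardinality aggregation.

    Assumes Elasticsearch 1.1.x, where the `count` search type and the
    `cardinality` aggregation are both available.
    """
    # search_type=count skips the fetch phase, so no hits are returned.
    query = urlencode({"search_type": "count", "pretty": "1"})
    url = f"{base_url}/{index}/{doc_type}/_search?{query}"
    body = {
        "aggs": {
            "lastday": {
                # Same filter as in the original request: entries from the last day.
                "filter": {"range": {"_timestamp": {"gt": since}}},
                "aggs": {
                    # "distinct_instances" is a hypothetical name for this sketch.
                    "distinct_instances": {
                        "cardinality": {
                            "field": field,
                            "precision_threshold": precision_threshold,
                        }
                    }
                },
            }
        }
    }
    return url, body


def extract_distinct_count(response):
    """Read the (approximate) distinct count out of a parsed JSON response."""
    return response["aggregations"]["lastday"]["distinct_instances"]["value"]


if __name__ == "__main__":
    url, body = build_distinct_count_request()
    print(url)
    print(json.dumps(body, indent=2))
```

The printed body can be sent with the same `curl -XPOST ... -d` form used in the original message. The value returned by the cardinality aggregation is still an approximation in general, so if an exact figure is strictly required, counting the buckets of a `terms` aggregation (as in the original request) remains the fallback described above.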
