Hi Binh Ly,

Thanks for the response.

I'm aware that the numbers are not exact (hence the link to issue #1305 in 
my initial post), and I have been telling my colleagues and customers to 
expect slightly incorrect numbers for some time already, to prepare them for 
the moment we provide analytics with ES. What bothers me is that the numbers 
are *inconsistent*.

If you look at my gist, you will see that I ran the same aggs three times in 
quick succession. Looking at just the top item, we get the following results:

   1. { "key": "totaltrafficbos", "doc_count": 2880 }
   2. { "key": "totaltrafficbos", "doc_count": 2552 }
   3. { "key": "totaltrafficbos", "doc_count": 2179 }
   
These results were taken within seconds of each other, without any change to 
the number of documents in the index. If I run the aggs more often, the count 
rotates between a handful of numbers. Is this also behavior one would expect 
from the aggs? And if so, why do the facets show the same number over and over 
again?
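
To check my understanding of the approximation you describe, here is a minimal 
sketch of how I picture it: each shard returns only its own top "shard_size" 
terms, and the coordinating node sums whatever it receives. This is a toy 
model and not the actual ES implementation; the function names and the sample 
data are made up for illustration.

```python
from collections import Counter


def shard_top_terms(docs, shard_size):
    """Return only the top `shard_size` (term, count) pairs of one shard."""
    return Counter(docs).most_common(shard_size)


def merged_terms(shards, shard_size, size):
    """Sum the per-shard candidates and keep the overall top `size` terms."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    return merged.most_common(size)


# Hypothetical index with three shards; each list holds one term per document.
shards = [
    ["x"] * 6 + ["y"] * 3 + ["z"] * 2,
    ["y"] * 5 + ["z"] * 4 + ["x"] * 3,
    ["z"] * 5 + ["y"] * 4 + ["x"] * 3,
]

# With shard_size=2, "x" is cut off on two of the three shards.
truncated = dict(merged_terms(shards, shard_size=2, size=3))

# With shard_size=3, every shard reports all of its terms.
complete = dict(merged_terms(shards, shard_size=3, size=3))

print(truncated)
print(complete)
```

In this model a small shard_size undercounts "x" (6 instead of 12), which 
matches what you describe. But the model is deterministic: the same query 
against the same data always gives the same number. That is why the changing 
counts surprise me.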

Anyway, I will try to work my way through the aggs code this weekend to get a 
better grasp of what we can and cannot do with it.

-- Nils

On Friday, January 31, 2014 6:18:43 PM UTC+1, Binh Ly wrote:
>
> Nils,
>
> This is just the nature of splitting data around in shards. Actually the 
> terms facet has the same limitations (i.e. it will also give "approximate 
> counts"). Neither the terms facet nor the terms aggregation is better or 
> worse than the other - they are both approximations (using different 
> implementations). It is correct that if you put all your data in 1 shard, 
> then all the counts are exact. If you need to shard, you can increase the 
> "shard_size" parameter inside the terms aggregation to "improve accuracy". 
> Play with that number until it suits your purposes but the important thing 
> is they are just approximations the more documents you have in the index - 
> so just don't expect absolute numbers from them if you have more than 1 
> shard.
>
> {
>   "size": 0,
>   "aggs": {
>     "a": {
>       "terms": {
>         "field": "actor.displayName",
>         "shard_size": 10000
>       }
>     }
>   }
> }
>
