Hello,
I'm trying to produce the distribution of documents that match vs
don't match a query, and get the cardinality of a field for both sets. The
idea is "Users who did" vs "Users who did not". In reality I'm actually
running another aggregation under "did not" (otherwise I'd just subtract
one count from the total), but the query here illustrates the issue I'm
having:
*Query*
"aggs": {
  "total_distinct_count": { "cardinality": { "field": "UserId" } },
  "has_thing": {
    "filter": { "term": { "State": "thing" } },
    "aggs": {
      "distinct_count": { "cardinality": { "field": "UserId" } }
    }
  },
  "does_not_have_thing": {
    "filter": {
      "not": { "term": { "State": "thing" } }
    },
    "aggs": {
      "distinct_count": { "cardinality": { "field": "UserId" } }
    }
  }
}
*Response*
"hits": {
  "total": 3309709,
  "max_score": 0,
  "hits": []
},
"aggregations": {
  "total_distinct_count": {
    "value": 654556
  },
  "does_not_have_thing": {
    "doc_count": 2575512,
    "distinct_count": {
      "value": 563371
    }
  },
  "has_thing": {
    "doc_count": 734197,
    "distinct_count": {
      "value": 223128
    }
  }
}
I would expect (aggregations.has_thing.distinct_count.value +
aggregations.does_not_have_thing.distinct_count.value) to be close to
aggregations.total_distinct_count.value, but in reality it's pretty far off
(~+20%). Note that the doc_count values add up exactly to hits.total, so I
don't think this is an issue with the query, but I could be wrong.
Any ideas what's up? Have I structured the query incorrectly? Is this a bug?
Or is this just expected behavior?
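To make the discrepancy concrete, here is a small Python sketch of the arithmetic, using only the numbers from the response above. The variable names are mine and purely illustrative. One assumption worth flagging: if a single UserId can have documents on both sides of the filter, that user is counted in both buckets, so some excess would be expected even with exact counts. Whether that accounts for the gap here is not established by this output.

```python
# Values copied verbatim from the response above.
hits_total = 3309709
total_distinct = 654556
has_thing = {"doc_count": 734197, "distinct_count": 223128}
does_not_have_thing = {"doc_count": 2575512, "distinct_count": 563371}

# The two filter buckets partition the documents, so doc_count sums to hits.total.
doc_sum = has_thing["doc_count"] + does_not_have_thing["doc_count"]

# The distinct counts, however, sum to more than the top-level distinct count.
distinct_sum = has_thing["distinct_count"] + does_not_have_thing["distinct_count"]
excess = distinct_sum - total_distinct
relative_excess = excess / total_distinct

print("doc_count sum matches hits.total:", doc_sum == hits_total)
print("sum of per-bucket distinct counts:", distinct_sum)
print("excess over total_distinct_count:", excess)
print("relative excess: {:.1%}".format(relative_excess))
```

Under the assumption above, the excess would be an estimate of the number of users present in both buckets, subject to the approximation error of the cardinality aggregation.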
Some notes:
- UserId's data type is a *long*, but the values only fill up integer
space (510,539 to 418,346,844).
- I'm running Elasticsearch 1.1.0.
- I've tried playing around with the precision threshold, but it doesn't
appear to make a difference.
Thanks in advance,
Cheers
Phil